Theoretical Physics 


An Advanced Text 


Volume 1: 


THEORY OF THE ELECTROMAGNETIC FIELD 
THEORY OF RELATIVITY 


Benjamin G. Levich 





Institute of Electrochemistry
Academy of Sciences of the USSR, Moscow










© NORTH-HOLLAND PUBLISHING COMPANY, AMSTERDAM, 1970 


All rights reserved. No part of this book may be reproduced, stored in a retrieval
system, or transmitted, in any form or by any means, electronic, mechanical, photo- 
copying, recording or otherwise without the prior permission of the Copyright owner. 


Library of Congress Catalog Card Number: 68 54501 


ISBN complete set: 0 7204 0176 3 
Vol. 1: 0 7204 0177 1 


Printed in The Netherlands


Title of the Russian edition: 
KURS TEORETICHESKOJ FIZIKI 


Russian edition published by: 
IZDATELSTVO ‘NAUKA’, GLAVNAJA REDAKCIJA
FIZIKO-MATEMATICESKOJ LITERATURY (MOSKVA, 1969)




FOREWORD 


The first Russian edition of ‘Theoretical Physics’, which appeared in 1962, 
has been widely used as a textbook. 

Numerous comments from colleagues, lecturers and students have been 
taken into account in preparing this new edition, which is the first one in 
English and which will also appear as the second Russian edition. 

The material has now been divided into four volumes covering the following
subjects:


Volume 1
Part I Theory of the Electromagnetic Field
Part II Theory of Relativity


Volume 2
Part III Statistical Physics
Part IV Electromagnetic Processes in Matter


Volume 3
Part V Quantum Mechanics


Volume 4
Part VI Quantum Statistics and Physical Kinetics


The rapid development of physics and the present wide interest in
non-equilibrium and non-stationary processes have compelled us to expand the
section on physical kinetics. It has also been transferred to the end of
Volume 4, as it is practically impossible to expound this topic without using
quantum mechanics.

Part IV, ‘Electromagnetic Processes in Matter’, has been substantially
revised. Interest in this field has increased recently, mainly in connection with
the study of plasmas and plasma-like media, which now have sections devoted
to them.







The methods of calculating electrostatic and direct-current fields, and 
other problems of classical electrodynamics in a medium, are covered very 
briefly as we have assumed that students will be able to consult the many 
monographs and handbooks on general physics, electrical- and radio- 
technology, and the equations of mathematical physics. 

As for other modifications and additions, we should draw attention to the
introduction of tensor notation, to new ideas in the theories of relativity and
electromagnetic fields, the broadening of the introduction to the theory of
probability, a brief presentation of the method of correlation functions in
statistical physics, the exposition of the thermodynamic theory of
ferromagnetism and the theory of propagation of electromagnetic waves in plasma.
A number of paragraphs have been rewritten. We have tried to bring the
content of the book even closer to the interests of present-day theoretical
physics.

The general level of the book has been preserved and it is still intended to 
form an introduction to theoretical physics. Problems requiring the use of 
cumbersome or special mathematical apparatus are still excluded, and the 
most difficult sections are marked by an asterisk. These may be skipped at 
will, since there is no reference to them in the main text. 

In conclusion we would like to express our gratitude to all those who 
helped us in preparing this book, in particular to A.M. Brodsky, A.M. 
Golovin, B.M. Grafov, R.R. Dogonadze, V.S. Krylov and especially V.S. 
Markin and V.V. Tolmachev. I.V. Savelyev discovered a number of misprints 
which have now been corrected. 

L.D. Konkina helped us in editing the manuscript.

We are grateful to the readers and students who used the first Russian
edition of the book for sending us their valuable comments, which have been
taken into account in this edition.


August 1970 


FOREWORD TO THE FIRST RUSSIAN EDITION 


The continuous development of theoretical physics and the regular 
expansion of its areas of application create increasing demand for textbooks 
and manuals. 

The rapid development and the complexity of the most recent experimental
methods of physical investigation, and the corresponding development
and extension of the mathematical apparatus of theoretical physics, have
meant that one man usually cannot combine the two methods of investigation.
The end of the 19th century and particularly the 20th century therefore
saw physicists divided into ‘experimentalists’ and ‘theoreticians’, the latter
studying physical laws by means of the mathematical methods of theoretical
physics.

Obviously, a background in theoretical physics is essential in the education
of experimental as well as theoretical physicists. 

The experimental and theoretical methods of physical investigation have 
penetrated into a number of branches of science related to physics (physical 
chemistry, biophysics, geophysics, astrophysics, and so on) and into technolo- 
gy (metal physics and metallurgical science, thermophysics, electrical technol- 
ogy, radiotechnology, computation, the instrument-making industry etc.). 
Workers in these branches of science and technology also need a certain 
minimum knowledge of theoretical physics. 

The compilation of a modern textbook on theoretical physics is inevitably
associated with certain logical and methodological difficulties. It is impossible
at present to divide theoretical physics into classical and quantum parts, so
that it is also impossible to divide it into separate chapters and sections. For
example, the exposition of statistical physics without taking into account the
quantum properties of atomic systems is impossible, for it would mean that
the general theory remained without practical application. In the theory of
electromagnetic processes in matter one has of necessity to make use of the
ideas of statistical physics, and so on. It may be that the maximum
consistency of composition would be obtained if the book were founded on




quantum mechanics, but this is completely inadmissible in a book intended as
an introductory treatise. Quantum mechanics requires a certain preparedness
and the student must be convinced of the necessity of renouncing obvious
classical representations. Compromise solutions, which have justified
themselves during many years of teaching theoretical physics at the Moscow
Engineering-Physical Institute and Moscow State University, are therefore
inevitable.

The following general principles have been applied. 

(1) The book is written as an introduction to theoretical physics so that 
aspects requiring the use of cumbersome or special mathematical apparatus 
have not been included. 

(2) As it is to be used for a systematic study of the subject the course is a 
unique whole and all material necessary for understanding the later sections is 
contained in the earlier ones. 

(3) It would not be feasible to elucidate experimental facts in addition to 
problems concerning purely theoretical physics. However, physics is a single 
science, and an attempt to expound the theoretical aspects without taking 
experiment into account would be quite wrong. The reader is assumed to 
have some basic experimental knowledge from university courses in general 
and atomic physics so that we have confined ourselves to references and, in a 
few instances, to a schematic description of basic experiments. 

(4) The acquaintance assumed with general courses in general and atomic 
physics has allowed us to rely on a certain (very restricted) knowledge of 
quantum mechanics in our treatment of statistical physics. 

(5) Classical mechanics usually forms a separate course so that this topic 
has been omitted although detailed reference has been made to handbooks of 
mechanics. 

(6) The book similarly does not cover hydrodynamics, aerodynamics, the
theory of heat transfer, or problems related to electrical- and
radio-technology.

(7) Detailed reference is made to mathematical manuals. The mathematical
apparatus utilized, except in the sections marked by an asterisk, is covered by
the usual courses in analysis. In the case of quantum mechanics, however, the
mathematical apparatus has been included, since it is of a specific character
and is not taught in traditional mathematical courses.

(8) As the book is intended as a systematic course in theoretical physics no
attempt has been made to achieve the same level of accessibility in all
sections. It is a well-known fact that a student’s comprehension and
assimilation of difficult material increases as a course progresses, and that this
is also true for the associated mathematical apparatus. Moreover,
experimental physicists will constantly encounter new problems in quantum
mechanics which can only be handled using advanced methods of treatment.
The section on quantum mechanics (Part V) therefore deals with some topics
having a more advanced character than those in other sections. The analysis
of applications of the kinetic equations is similarly treated rather extensively.


The uniqueness of the book’s objectives has affected the content of individual 
sections, so that some topics in modern physics have been included at the 
expense of more traditional material. 

Part I contains the foundations of the theory of the electromagnetic field 
in a vacuum, based on the system of Maxwell-Lorentz equations. A basic 
knowledge of electromagnetism is assumed. The focus of attention is the 
theory of radiation and the motion of charged particles in external fields. 

In Part II, devoted to the theory of relativity, a four-dimensional form of 
representation is adopted which not only corresponds to the spirit of the 
theory but also predominates in contemporary literature. The problems of 
dynamics in the theory of relativity are treated in some detail. A number of 
the most recent applications of the theory of relativity, particularly those 
related to nuclear physics, are covered here for the first time in a textbook. 

Part III is a revised version of Levich’s ‘Introduction to Statistical Physics’ 
and treats statistical physics and the fundamentals of statistical thermo- 
dynamics. Classical thermodynamics would require too much space, and did 
not seem indispensable. 

Part IV contains the theory of electromagnetic processes in matter. 
Relatively little attention is paid to problems in theoretical electrical- and 
radio-technology. The phenomenological theory of electric and magnetic 
properties of matter is analyzed in some detail, and the notion of the physics 
of the plasma state of matter is given. 

In Part V the basic ideas of present-day relativistic quantum mechanics are 
included as well as the traditional problems of non-relativistic quantum 
mechanics. Applications to solid-state theory are considered at length. 

Part VI contains the essential concepts of physical kinetics, which are not 
usually presented in a general course on theoretical physics. 


The experience of teaching theoretical physics shows that the greatest 
difficulties are often encountered not in understanding new physical ideas but 
in the actual mathematical treatments. All mathematical operations have 
therefore been performed in sufficient detail. 

For convenience we have presented a brief derivation of those formulae of
vector analysis which are encountered throughout, as well as the necessary
data on Fourier integrals and δ-function theory.

The numbering of formulae and sections starts afresh in each Part and 
references to appendices have been given Roman numerals. 

The author hopes that the readers, after making themselves familiar with
the foundations of theoretical physics expounded in this book, will be able to
proceed to a more profound study using the many-volume treatise of Landau
and Lifshitz. The scientific and educational ideas of their work were of great
influence on the author, who is a disciple of Landau.

Parts I-IV and Part VI were written by B.G. Levich. Part V was written by 
Y.A. Vdovin and V.A. Myamlin under the general scientific guidance of B.G. 
Levich. Chapter XV * of Part V was written by A.I. Naumov. 

The author expresses his gratitude to the colleagues who read the book
and the manuscripts, and made a number of valuable remarks: B.M. Grafov,
R.R. Dogonadze, V.A. Kiryanov, V.S. Krylov, V.S. Markin, V.P. Smilga, Y.A.
Chizmadzhev and Y.I. Yalamov.

The creation of a textbook on theoretical physics sufficiently comprehen- 
sive in content and clear in presentation is a very complex task. The author is 
therefore conscious of the fact that shortcomings and errors will be discover- 
ed and would be grateful to receive an account of them which can be taken 
into consideration in the next edition of the book. 


1962 


* Chapter 13 of the English edition. 







Theoretical Physics:
Outline of Vols. 1-4


Volume 1 (for details see the Contents of Volume 1)


Part I Theory of the Electromagnetic Field


Chapter 1 General theory of the electromagnetic field
        2 The electrostatic field
        3 The quasistationary magnetic field
        4 The electromagnetic field of arbitrarily moving charges
        5 Radiation theory
        6 Electromagnetic field in a vacuum and electromagnetic wave
          scattering
        7 The motion of particles in electromagnetic fields


Part II Theory of Relativity


Chapter 1 General principles of the theory of relativity
        2 Relativistic mechanics
        3 Relativistic electrodynamics


Appendices I, II and III


Subject index

Volume 2


Part III Statistical Physics


Chapter 1 The basic concepts of the theory of probability
        2 The kinetic theory of gases
        3 Statistical distribution
        4 Statistical and phenomenological thermodynamics
        5 Ideal gases
        6 Systems of interacting particles
        7 Crystals
        8 The theory of fluctuations
        9 Systems with a variable number of particles
       10 Statistical distributions in quantum statistics and some of their
          applications


Part IV Electromagnetic Processes in Matter


Chapter 1 Electromagnetic fields in matter
        2 Electrostatics
        3 Direct electric current and the magnetic properties of matter
        4 Quasistationary electromagnetic fields
        5 High-frequency fields
        6 Matter in the plasma state


Appendix IV


Subject index


Volume 3


Part V Quantum Mechanics


Chapter 1 The basic concepts of quantum mechanics
        2 The Schrödinger equation
        3 The mathematical apparatus of quantum mechanics
        4 Motion in a centrally symmetric field
        5 The quasi-classical approximation
        6 The matrix form of quantum mechanics
        7 Perturbation theory
        8 Spin and identity of particles
        9 Applications of quantum mechanics to the consideration of the
          properties of atomic and nuclear systems
       10 The theory of diatomic molecules
       11 Scattering theory
       12 The method of second-quantization and radiation theory
       13 Relativistic quantum mechanics
       14 Some problems of quantum electrodynamics
       15 Fundamentals of the theory of elementary particles


Subject index


Volume 4


Part VI Quantum Statistics and Physical Kinetics


Chapter 1 Quantum statistics
        2 Physical kinetics
        3 Kinetic theory of gases and gas-like systems
        4 Time correlative function method and Onsager’s theory
        5 Solid-state theory


Subject index








Contents of Volume 1


Part I Theory of the electromagnetic field


Chapter 1 General theory of the electromagnetic field

§ 1 Problems of theoretical physics
  2 The determination of the vector field from its integral characteristics
  3 Charges and particles
  4 The field of charges at rest
  5 The equation of continuity
  6 The electromagnetic field of charges moving with a constant velocity
  7 The electromagnetic field of moving charges. The general case
  8 Maxwell-Lorentz system of equations
  9 The displacement current
 10 The electromagnetic field potentials
 11 Gauge invariance of the potentials
 12 Energy conservation law of the electromagnetic field
 13 Momentum conservation law of the electromagnetic field


Chapter 2 The electrostatic field

§14 The electrostatic field
 15 The electrostatic field of a system of point charges
 16 Quadrupole moment
 17 Work and energy in an external electrostatic field
 18 The interaction energy of a system of charges and the electrostatic
    field energy


Chapter 3 The quasistationary magnetic field

§19 The field of a system of charges undergoing a slow quasistationary
    motion
 20 The field of a point charge undergoing a slow uniform motion
 21 The field of a system of charges undergoing quasistationary motion at a
    large distance from the system
 22 The magnetic moment


Chapter 4 The electromagnetic field of arbitrarily moving charges

§23 The electromagnetic field of a system of arbitrarily moving charges
 24 General solution of D’Alembert’s equation in the form of retarded
    potentials
 25 The field of a point charge moving arbitrarily


Chapter 5 Radiation theory

§26 The potentials of the electromagnetic field at a large distance from
    the emitter in the dipole approximation
 27 The electromagnetic field of dipole radiation at a large distance from
    the emitter
 28 Dipole radiation of simple systems
 29 Radiation reaction
 30 Line width of emitted radiation
 31 Quadrupole and magnetic dipole radiation
 32 General case of electromagnetic radiation. The spectral decomposition
    of fields. The radiation zone and induction zone. Effect of the proper
    retardation


Chapter 6 Electromagnetic field in a vacuum and electromagnetic wave
          scattering

§33 The propagation of electromagnetic waves at a large distance from the
    emitter
 34 Plane wave polarization
 35 Interference and the formation of wave packets
 36 Scattering of the electromagnetic waves by a free charge and by a bound
    electron
 37 Absorption of radiation
 38 Canonical form of the field equations


Chapter 7 The motion of particles in electromagnetic fields

§39 The motion of charged particles in constant electric and magnetic
    fields
 40 The motion of charged particles in slowly varying magnetic fields
 41 The Lagrangian and Hamiltonian for a particle moving in an
    electromagnetic field
 42 The motion of a system of two charged particles and the radiation from
    them
 43 The scattering of particles and associated bremsstrahlung


Part II Theory of relativity


Chapter 1 General principles of the theory of relativity

§ 1 The creation and significance of the theory of relativity
  2 Galilean transformations
  3 Attempts to determine an absolute velocity
  4 Postulates of the Einstein theory of relativity
  5 The Lorentz transformation
  6 Consequences of the Lorentz transformation. Space and time intervals
  7 Einstein’s law of addition of velocities and angular transformations
  8 Simultaneity, short-range action and action at a distance
  9 Absolute values in the theory of relativity. Intervals and proper time
 10 The invariance of physical laws under Lorentz transformations.
    Four-dimensional formulation of the theory of relativity
 11 Four-dimensional vectors and tensors. Four-dimensional velocity and
    acceleration


Chapter 2 Relativistic mechanics

§12 The dynamical equations of a material point
 13 Momentum, energy and mass in relativistic mechanics
 14 Lagrange’s equations; the Lagrangian and Hamiltonian
 15 The mechanics of a system of particles in the theory of relativity
 16 The energy-momentum conservation law in nuclear physics
 17 The theory of collisions between relativistic particles. Compton effect


Chapter 3 Relativistic electrodynamics

§18 Charge conservation, the four-dimensional current and the equation of
    continuity
 19 The relativistically invariant formulation of the equations of the
    electromagnetic field potentials
 20 The field of a moving charge
 21 The electromagnetic field tensor and Maxwell’s equations
 22 Some applications: Doppler effect, Mössbauer effect, observation of
    rapidly moving bodies, transformation of angles, intensities and cross
    sections
 23 The Lorentz force; the Lagrangian and the Hamiltonian for a particle
    moving in an electromagnetic field
 24 The motion of particles in constant electric and magnetic fields
 25 A system of weakly interacting charged particles
 26 The radiation emitted by a moving charge


Appendix I Vector analysis

Appendix II The Fourier integral

Appendix III The delta-function and its properties


Subject index





PART I


THEORY OF THE ELECTROMAGNETIC FIELD


Chapter 1


General Theory of the Electromagnetic Field


§ 1. Problems of theoretical physics 


Physics is first of all an experimental science. However, in the work of 
Newton and other founders of contemporary physics, mathematical methods 
were successfully applied to obtain the quantitative formulation of physical 
laws, at first mainly in mechanics. 

In the last century the application of mathematical methods to physics 
was so extensive that there arose the particular branch of physics known as 
theoretical physics. Problems facing theoretical physics are of two kinds: 

1. The expression of physical laws in the form of quantitative relations, 
and the establishment of underlying correlations among experimental facts. 

2. The application of mathematical methods of investigation to find new
physical laws and the prediction of new, as yet not experimentally observed 
connections between physical phenomena. 

Thus, theoretical physics is, in its methods, a mathematical science, and, in 
its content, a physical science. 

From the above it is clear that it is in theoretical physics that general 
theoretical views concerning the essence of various physical processes are 
embodied and made complete. 

It is best to illustrate this by a simple example. Investigators who
established experimentally the planetary atomic model, the presence of
discrete allowed energy values in atoms, and similar facts made a paramount
contribution to theoretical physics. However, theoretical physics could not
confine itself to the qualitative model representations of the atomic structure.
From this it follows that theoretical physics seeks to formulate the most 
general quantitative physical laws expressing the essence of as wide a range of 
phenomena as possible. Mechanical laws (Newtonian laws), the laws of the 
electromagnetic field (Maxwell-Lorentz equations), and the laws of quantum 
mechanics etc. serve as examples. 

Logical reasoning cannot be the basis of general physical laws. Only
experimental facts can be this. Hence the most general quantitative relations of
theoretical physics are not ‘derived’, but represent a generalized formulation
of observed physical regularities. On the other hand, as we shall see in
concrete examples, in a number of cases the quantitative expression of physical
laws appeared as a result of scientific prediction.

Having at its disposal a quantitative formulation of general physical laws, 
theoretical physics can undertake the second part of its program — the estab- 
lishment of new laws and relations by means of mathematical methods. In 
this way theoretical physics has achieved such great successes that in compar- 
ison with them even such examples of scientific prediction as Leverrier’s dis- 
covery of the planet Neptune in the nineteenth century appear to be of minor 
importance. 

As examples one can point to the discovery of displacement current by
Maxwell and the consequent establishment of the electromagnetic nature of
light; the foundation of the theory of relativity by Einstein and, in particular,
establishment of the mass-energy relation; the prediction by quantum
mechanics (founded by de Broglie, Schrödinger and Heisenberg) of the existence
of the wave properties of microparticles (electrons, protons, etc.); the
prediction by Dirac’s theory of the existence and properties of the positron and
other antiparticles, and so forth. The role of theoretical physics in the recent
development of nuclear physics and in the application of atomic energy is
well-known.

It is necessary to stress that the methods of calculation of theoretical 
physics are of a special character. Theoretical physics is not a branch of 
mathematics. In theoretical physics one does not try to find the exact physi- 
cal laws defining the behaviour of even relatively simple systems. An exact 
calculation of all possible effects and interrelations would make even the 
most simple problems insoluble. The necessity of taking into account essen- 
tial relations and disregarding unessential ones is borne in mind in every stage 
of investigation in theoretical physics. The relations and equations of
theoretical physics are so complicated that one must practically always proceed by
approximate calculations. In order to find out which approximations are
possible and advisable and which ones are unjustifiable and physically senseless,
one must often proceed from available experimental data. At the same time, 
formulae and relations which in principle cannot be checked experimentally 
are not considered at all in theoretical physics. All efforts of theoretical as 
well as experimental physics are directed to explaining objectively existing 
relations, i.e. the physical laws of nature. 

A physical theory explaining known facts but unable to predict new ones 
is always considered unsatisfactory. On the other hand, the highest appraisal 
of the validity of a physical theory is the experimental confirmation of the 
facts predicted by it. In its turn, the elucidation of new phenomena ob- 
served experimentally serves as a stimulus for further development of theo- 
retical physics. Thus, experimental and theoretical physics make a single and 
inseparable whole. 


§2. The determination of the vector field from its integral characteristics 


We shall see in what follows that the state of an electromagnetic field is 
specified by its vector characteristics — i.e. the strengths of the electric and 
magnetic fields. For this reason, in putting forward the general theory and in 
solving definite problems in the theory of the electromagnetic field, extensive 
use is made of a specific mathematical apparatus, the so-called vector analysis. 

A description of the electromagnetic field which is not based on vector 
analysis is possible in principle, but would require very cumbersome calcula- 
tions and complicated transformations. Hence the subsequent exposition is 
carried out solely on the basis of vector analysis. Although we assume that 
the reader is familiar with its fundamentals, Appendix I gives a brief deriva- 
tion of all the formulae and transformations which are to be encountered. 

We shall analyse here an important problem of the mathematical theory of 
the arbitrary vector field. The importance of this problem in the theory of 
the electromagnetic field lies in the fact that the general scheme of calcula- 
tion of the field theory is constructed according to the calculation of the 
arbitrary vector field presented below. 

Let there be a vector field a(r) over all space. Some assumptions, which 
will be mentioned below, will be made about the behaviour of the vector 
a(r) at infinitely distant points of space. 

Assume that at every point of space the integral characteristics of the field,
the vector flux $\oint \mathbf{a}\cdot d\mathbf{S}$ and the vector circulation
$\oint \mathbf{a}\cdot d\mathbf{l}$, are given. We shall see in what follows
that for electromagnetic fields it is just these characteristics which contain
the quantities which can be directly measured experimentally. We shall show
that, if these field characteristics are given, then the vector field a(r) can
itself be found. If $\oint \mathbf{a}\cdot d\mathbf{S} = \int f(\mathbf{r})\,dV$,
where f(r) is a known function of coordinates, then on the basis of the
Gauss-Ostrogradsky theorem

$$\oint \mathbf{a}\cdot d\mathbf{S} = \int \nabla\cdot\mathbf{a}\,dV = \int f(\mathbf{r})\,dV, \tag{2.1}$$

and in view of the arbitrary character of the integration range over a volume
V, we have

$$\nabla\cdot\mathbf{a} = f(\mathbf{r}). \tag{2.2}$$


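Thus, the definition of the vector flux through a closed surface at every point
of space is equivalent to the definition of the divergence of the vector.
Further, on the basis of Stokes’ theorem,

$$\oint \mathbf{a}\cdot d\mathbf{l} = \int (\nabla\times\mathbf{a})\cdot d\mathbf{S} = \int \boldsymbol{\omega}(\mathbf{r})\cdot d\mathbf{S}, \tag{2.3}$$

where $\boldsymbol{\omega}(\mathbf{r})$ is a known vector function of coordinates. Hence

$$\nabla\times\mathbf{a} = \boldsymbol{\omega}(\mathbf{r}). \tag{2.4}$$

The definition of the vector circulation is equivalent to the definition of its
curl.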
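The equivalence between flux and divergence expressed by (2.1) and (2.2) can be checked numerically. The following is a minimal sketch, not part of the original text: the sample field a = (xy, yz, zx), the unit cube as the integration volume, and all function names are illustrative assumptions. It compares the outward flux through the six faces of the cube with the volume integral of the divergence, using the midpoint rule.

```python
# Midpoint-rule check of the Gauss-Ostrogradsky theorem on the unit cube.
# The field a = (xy, yz, zx) is an illustrative assumption, not from the text.


def field(x, y, z):
    """Sample vector field a(r)."""
    return (x * y, y * z, z * x)


def divergence(x, y, z):
    """Analytic divergence of the sample field: f(r) = y + z + x."""
    return y + z + x


def volume_integral(n=20):
    """Integrate div a over the cube [0, 1]^3 using n^3 midpoint cells."""
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        for j in range(n):
            for k in range(n):
                total += divergence((i + 0.5) * h, (j + 0.5) * h, (k + 0.5) * h)
    return total * h ** 3


def outward_flux(n=20):
    """Sum the outward flux of a through the six faces of the cube."""
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        for j in range(n):
            u, v = (i + 0.5) * h, (j + 0.5) * h
            # Faces x = 1 and x = 0 (outward normals +x and -x).
            total += field(1.0, u, v)[0] - field(0.0, u, v)[0]
            # Faces y = 1 and y = 0.
            total += field(u, 1.0, v)[1] - field(u, 0.0, v)[1]
            # Faces z = 1 and z = 0.
            total += field(u, v, 1.0)[2] - field(u, v, 0.0)[2]
    return total * h ** 2


if __name__ == "__main__":
    print(outward_flux(), volume_integral())
```

For this particular field both integrands are linear in the coordinates, so the midpoint rule introduces no discretization error and the two numbers should agree to rounding accuracy. For a general field the agreement would only be approximate and would improve as n grows.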

We shall show how the vector field a(r) can be found if the divergence and 
curl of the vector a are known over all space. 

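We resolve the field a into two fields, $\mathbf{a} = \mathbf{a}_1 + \mathbf{a}_2$,
so that the following relations hold:

$$\nabla\cdot\mathbf{a}_1 = f(\mathbf{r}), \tag{2.5}$$
$$\nabla\times\mathbf{a}_1 = 0, \tag{2.5'}$$
$$\nabla\cdot\mathbf{a}_2 = 0, \tag{2.6}$$
$$\nabla\times\mathbf{a}_2 = \boldsymbol{\omega}(\mathbf{r}). \tag{2.6'}$$

The vector field $\mathbf{a}_1$ is vortex-free; its lines begin and end in the sources and
sinks whose intensity is given by the function f(r). The vector field $\mathbf{a}_2$ has no
sources and no sinks, and is a solenoidal field.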

Let us begin with the consideration of the field $\mathbf{a}_1$. Since the field $\mathbf{a}_1$ is
vortex-free, the vector $\mathbf{a}_1$ can be written in the form of the gradient of a
certain auxiliary scalar function

$$\mathbf{a}_1 = \nabla\varphi(\mathbf{r}), \tag{2.7}$$

where $\varphi(\mathbf{r})$ is a function called the scalar potential. Substituting eq. (2.7) into
(2.5), we find $\nabla\cdot\nabla\varphi = f(\mathbf{r})$, or

$$\nabla^2\varphi = f(\mathbf{r}), \tag{2.8}$$

where

$$\nabla^2 = \frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2} + \frac{\partial^2}{\partial z^2}$$

is the Laplacian.

Eq. (2.8) represents a second order partial differential equation, called
Poisson’s equation. We shall obtain its general solution in § 24. Here we shall
give only the final result and convince ourselves of the fact that this solution
satisfies eq. (2.8). It turns out that the solution of Poisson’s equation has the
form

$$\varphi(\mathbf{r}) = -\frac{1}{4\pi}\int \frac{f(\mathbf{r}_0)\,dV_0}{|\mathbf{r}-\mathbf{r}_0|} = -\frac{1}{4\pi}\iiint \frac{f(x_0,y_0,z_0)\,dx_0\,dy_0\,dz_0}{\sqrt{(x-x_0)^2+(y-y_0)^2+(z-z_0)^2}}, \tag{2.9}$$

where x, y, z are the coordinates of the point of observation, i.e. of that point
at which the value of the function $\varphi$ is sought, while $x_0, y_0, z_0$ are the
variables of integration. The quantity

$$|\mathbf{r}-\mathbf{r}_0| = \sqrt{(x-x_0)^2+(y-y_0)^2+(z-z_0)^2}$$

represents the distance from the point $\mathbf{r}_0$ to the point of observation r.
Substituting (2.9) into (2.8) one can easily verify that the expression given indeed
satisfies the initial equation:
























6 GENERAL THEORY OF ELECTROMAGNETIC FIELD Ch. 1 


£&0; Yo» 20) d*o dyodzo _ 
eee) 
Wace -Y' xs Ir—rol 





1 


1 i) 
-mJ 1E0070) MoV lr=rol 


Éi = [fæ z) —Xg)d0 —Yo)5E Er Zo) dxo dyo dzo = 
=fœ,y,z). 


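Here we have made use of the properties of the δ-function and the function
$1/|\mathbf{r}-\mathbf{r}_0|$ (see Appendix III), namely $\nabla^2 (1/|\mathbf{r}-\mathbf{r}_0|) = -4\pi\,\delta(\mathbf{r}-\mathbf{r}_0)$.
The operator $\nabla^2$ denotes differentiation with respect to the coordinates $\mathbf{r}$ and
can be brought under the integration sign with respect to the coordinates $\mathbf{r}_0$.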
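
The verification above can also be illustrated numerically. The following is a minimal sketch, not part of the original text: it assumes a particular source, $f = f_0$ inside a ball of radius $R$ and zero outside, for which the solution of (2.8) vanishing at infinity is known in closed form ($-f_0R^3/3r$ outside the ball, $-f_0(R^2/2 - r^2/6)$ inside). The function names `potential` and `exact_potential`, the grid size and the choice of source are illustrative assumptions.

```python
import math

def potential(point, f0=1.0, radius=1.0, n=40):
    """Midpoint-rule estimate of eq. (2.9) for f = f0 inside a ball, 0 outside."""
    h = 2.0 * radius / n                      # edge length of one cubic cell
    x, y, z = point
    total = 0.0
    for i in range(n):
        x0 = -radius + (i + 0.5) * h
        for j in range(n):
            y0 = -radius + (j + 0.5) * h
            for k in range(n):
                z0 = -radius + (k + 0.5) * h
                # Only cells whose centre lies inside the ball carry the source.
                if x0 * x0 + y0 * y0 + z0 * z0 <= radius * radius:
                    dist = math.sqrt((x - x0) ** 2 + (y - y0) ** 2 + (z - z0) ** 2)
                    total += f0 / dist
    return -total * h ** 3 / (4.0 * math.pi)

def exact_potential(r, f0=1.0, radius=1.0):
    """Closed-form solution of (2.8) for the same source, vanishing at infinity."""
    if r >= radius:
        return -f0 * radius ** 3 / (3.0 * r)
    return -f0 * (radius ** 2 / 2.0 - r ** 2 / 6.0)

# Compare the integral (2.9) with the closed form at one interior and one exterior point.
for r in (0.5, 2.0):
    print(r, potential((r, 0.0, 0.0)), exact_potential(r))
```

The observation points are chosen so that they never coincide with a cell centre, which avoids a division by zero; the two printed columns should agree to within the accuracy of the coarse grid.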

If the integration is carried out over all space, it is necessary for the 
existence and convergence of the integral (2.9) that the integrand f (ro) should 
satisfy the obvious requirement 


$$|f(\mathbf{r}_0)|\,r_0^{2+\alpha} < A \quad \text{as } r_0\to\infty, \tag{2.10}$$

where $A$ is a finite quantity and $\alpha > 0$. In other words, the function $f(\mathbf{r}_0)$
must decrease more rapidly than the function $1/r_0^2$ as $r_0\to\infty$. When this
condition is fulfilled the integral (2.9) converges, and the function $\varphi(\mathbf{r})$
decreases as its argument increases indefinitely according to the law

$$|\varphi(\mathbf{r})| < 1/r. \tag{2.11}$$

Assuming the condition (2.10) to be fulfilled, we can state that (2.9) represents
the solution of eq. (2.8).

Knowing the function $\varphi(\mathbf{r})$ and making use of the definition (2.7), we find

$$\mathbf{a}_1(\mathbf{r}) = \nabla\varphi(\mathbf{r}) = -\frac{1}{4\pi}\,\nabla\int\frac{f(\mathbf{r}_0)\,dV_0}{|\mathbf{r}-\mathbf{r}_0|}. \tag{2.12}$$


We now pass on to the determination of the vector $\mathbf{a}_2(\mathbf{r})$. The vector $\mathbf{a}_2$ is
of a solenoidal character and, consequently, can be written in the form of the
curl of a certain auxiliary vector $\mathbf{A}(\mathbf{r})$:

$$\mathbf{a}_2 = \nabla\times\mathbf{A}(\mathbf{r}). \tag{2.13}$$

The vector function $\mathbf{A}(\mathbf{r})$ is called the vector potential. From the definition
(2.13) it is clear that eq. (2.6) is satisfied automatically:

$$\nabla\cdot[\nabla\times\mathbf{A}(\mathbf{r})] = 0.$$

To determine the vector potential $\mathbf{A}$ completely it is necessary in addition
to give the value of its divergence $\nabla\cdot\mathbf{A}$, which for the present remains
arbitrary. We assume

$$\nabla\cdot\mathbf{A} = 0. \tag{2.14}$$
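Somewhat later we shall verify the fact that the above assumption does not
restrict the general character of our reasoning. Substituting (2.13) into (2.6'),
we have

$$\nabla\times[\nabla\times\mathbf{A}(\mathbf{r})] = \nabla[\nabla\cdot\mathbf{A}(\mathbf{r})] - \nabla^2\mathbf{A}(\mathbf{r}) = \boldsymbol{\omega}(\mathbf{r}).$$

Taking into account (2.14), we find

$$\nabla^2\mathbf{A}(\mathbf{r}) = -\boldsymbol{\omega}(\mathbf{r}), \tag{2.15}$$

or, in the scalar form,

$$\nabla^2 A_x = -\omega_x, \qquad \nabla^2 A_y = -\omega_y, \qquad \nabla^2 A_z = -\omega_z.$$

The components of the vector potential satisfy equations of the same form as
eq. (2.8) for the scalar potential $\varphi$, with $f$ replaced by $-\omega_x$, $-\omega_y$, $-\omega_z$.
Their solutions read:

$$A_x = \frac{1}{4\pi}\int\frac{\omega_x(\mathbf{r}_0)\,dV_0}{|\mathbf{r}-\mathbf{r}_0|}, \tag{2.16}$$
$$A_y = \frac{1}{4\pi}\int\frac{\omega_y(\mathbf{r}_0)\,dV_0}{|\mathbf{r}-\mathbf{r}_0|}, \tag{2.16'}$$
$$A_z = \frac{1}{4\pi}\int\frac{\omega_z(\mathbf{r}_0)\,dV_0}{|\mathbf{r}-\mathbf{r}_0|}. \tag{2.16''}$$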


If the functions $\omega_x$, $\omega_y$ and $\omega_z$ satisfy the same conditions at infinity as
those which must be satisfied by the function $f(\mathbf{r})$ in (2.9), then the integrals
in the expressions (2.16)-(2.16'') converge. In this case formulae (2.16)-(2.16'')
determine the vector potential $\mathbf{A}$. Knowing the vector potential $\mathbf{A}$,
one can find the vector $\mathbf{a}_2$ by a simple differentiation:

$$\mathbf{a}_2(\mathbf{r}) = \nabla\times\left[\frac{1}{4\pi}\int\frac{\boldsymbol{\omega}(\mathbf{r}_0)\,dV_0}{|\mathbf{r}-\mathbf{r}_0|}\right]. \tag{2.17}$$


By differentiation one can also show that the vector potential obtained satis- 
fies the condition (2.14). 

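Thus, the vector field $\mathbf{a}(\mathbf{r})$ is completely determined if the values of its
divergence, $f(\mathbf{r}_0)$, and curl, $\boldsymbol{\omega}(\mathbf{r}_0)$, in all space are given:

$$\mathbf{a}(\mathbf{r}) = \mathbf{a}_1 + \mathbf{a}_2 = \nabla\varphi + \nabla\times\mathbf{A}
= -\frac{1}{4\pi}\,\nabla\int\frac{f(\mathbf{r}_0)\,dV_0}{|\mathbf{r}-\mathbf{r}_0|}
+ \frac{1}{4\pi}\,\nabla\times\int\frac{\boldsymbol{\omega}(\mathbf{r}_0)\,dV_0}{|\mathbf{r}-\mathbf{r}_0|}. \tag{2.18}$$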

Since the divergence and curl of the vector a are uniquely related to the 
flux and circulation of this vector, it can also be stated that the vector field 
a(r) is completely determined by the flux and circulation of this vector. 

We shall now dwell on the important problem of the possibility of choosing
$\nabla\cdot\mathbf{A}$ in the form of (2.14). The vector potential $\mathbf{A}$ is not given uniquely
by the definition (2.13). One can add to it the gradient of an arbitrary function
$\psi$, i.e. it can be assumed that

$$\mathbf{A}' = \mathbf{A} + \nabla\psi. \tag{2.19}$$

We have, obviously,

$$\nabla\times\mathbf{A}' = \nabla\times\mathbf{A}.$$

Thus, the addition of the gradient of an arbitrary function $\psi$ to $\mathbf{A}$ leads to
the previous value of the vector $\mathbf{a}$.

Let, contrary to the condition (2.14), $\nabla\cdot\mathbf{A} \neq 0$. Then it is always possible
to transform to a new vector potential $\mathbf{A}'$ according to formula (2.19). For it
we have

$$\nabla\cdot\mathbf{A}' = \nabla\cdot\mathbf{A} + \nabla\cdot\nabla\psi = \nabla\cdot\mathbf{A} + \nabla^2\psi.$$

It is always possible, without restricting the general character of the
reasoning, to choose the arbitrary function $\psi$ in such a way that for any
$\nabla\cdot\mathbf{A} \neq 0$ the equality

$$\nabla^2\psi = -\nabla\cdot\mathbf{A}$$

will hold. This means that it can always be assumed that

$$\nabla\cdot\mathbf{A}' = 0$$

and, consequently, that condition (2.14) is of a general character.

Finally, it is easy to show that the expression obtained for a is the only 
solution of eqs. (2.2)—(2.4) *. 

The expression which we have found for the vector field a, depending on 
the values of its divergence, f(r), and curl, w(r), is not connected with any 
assumptions about the physical meaning and character of the quantities under 
consideration. At the same time, it represents the prototype of those calcula- 
tions which one usually has to make in the theory of the electromagnetic 
field in order to find the electric and magnetic fields. 


§3. Charges and particles 


According to contemporary ideas, there are in nature elementary particles 
and systems which have a complex structure made up of elementary particles,
e.g. atoms and molecules. Elementary particles and systems consisting of a 
relatively small number of elementary particles — individual atoms and mole- 
cules — are called microparticles and microsystems. 

At present a large number of elementary particles, exceeding 20, is known. 
The interrelations between elementary particles are very far from the simple 
scheme which was adopted in physics a relatively short time ago, when only 
two elementary particles, the proton and the electron, were known. 

We shall acquaint ourselves with the basic properties of microparticles and 
microsystems in what follows, and mainly in Part V of the book. The most 
profound problems concerning the structure and properties of elementary 
particles are not as yet elucidated in contemporary physics, and a number of 
established principles are so complex that we cannot present them within the 
framework of this book. 


* See, for example, N. E. Kochin, Vektornoe ischislenie i nachala tenzornovo ischis- 
leniya (Vector calculus and the principles of tensor calculus) (Izdatelstvo AN SSSR, 
1965) p. 213; H. Lass, Vector and tensor analysis (McGraw-Hill, New York, 1950) 
p. 119. 
















In Part I of the book some properties of microparticles and microsystems 
will be considered in the approximation of classical physics. What this con- 
sists of and the limits of its applicability will be seen from what follows. 

The most important characteristic of all microparticles is the law of inter- 
action between them. Microparticles can interact when they are at a distance 
from each other. At present it is known that there are several different kinds 
of interaction between microparticles: the electromagnetic, gravitational and 
nuclear interactions. Each kind of interaction is associated with a definite 
characteristic of the particle. In Part I of this book we shall be interested 
solely in the electromagnetic interaction. This is understood to be a well 
defined interaction force between certain particles, the character of which 
will be considered below. The electromagnetic interaction does not depend 
on the masses of the particles, which determine their gravitational interaction. 
It turns out that in the simplest case of identical particles at rest with respect 
to each other the force of interaction is determined only by the distance be- 
tween them and the unique characteristic of the particles called their charge. 

The law of interaction between particles at rest with respect to each other
is expressed by the well-known formula

$$F = \frac{e_1 e_2}{r^2},$$

where $F$ is the force, $r$ is the distance between the particles, and the $e_i$ are
their charges. This is the so-called Coulomb law.

The charge of a particle of given type is one of its fundamental character- 
istics. The force of interaction between particles of a given type — electrons, 
protons and so on — is always a force of repulsion between them. The inter- 
action between particles of different types can have the character of repulsion 
as well as attraction. The convention is adopted that electrons are negatively 
charged, and that protons are positively charged. The sign of the charge of 
other charged elementary particles — muons, pions, kaons and hyperons — is 
determined with respect to electrons and protons in the following way: par- 
ticles having a charge of the same sign repel each other, whereas oppositely 
charged particles attract each other. 

Neutrons, neutrinos and neutral π-mesons may serve as examples of neutral
particles. No charge differing from zero can be ascribed to any neutral par- 
ticle. 

A striking feature of the charge is the fact that it has one and the same
absolute value for all elementary particles. In the CGSE system, which we
shall use in what follows, the elementary charge is equal to

$$|e| = 4.77\times 10^{-10}\ \mathrm{g^{1/2}\,cm^{3/2}\,sec^{-1}}.$$


Another property of the charge, expressing its fundamental importance as 
a characteristic of particles, is that it is conserved. In all processes occurring in 
nature the algebraic sum of charges does not change (the charge conservation 
law). The charge conservation law is one of the most important laws of na- 
ture. 

Most bodies in terrestrial conditions are made of atoms and molecules — 
quasi-neutral systems for which the positive charge of the nucleus is equal to 
the negative charge of the electron shell. When an atom is ionized, i.e. when it
becomes charged, it loses one or more electrons. Because of the charge
conservation law, on ionization there arises a positive ion with a charge $N|e|$ and
$N$ electrons each with a charge $-|e|$, where $N$ is an integer. When an extra
electron is added to an atom, the latter may be transformed into a negative
ion with the charge $-|e|$. Thus the charge of any system is an integer
multiple of the elementary charge $e$.

In the microscopic classical field theory we shall study the behaviour of 
systems consisting of a relatively small number of particles, for example, 
individual electrons or protons, ions and so on. We shall assume individual 
elementary particles to have no extension and to move according to the laws 
of classical mechanics. We shall not be interested in the internal structure of 
elementary particles. As will be clear from what follows, such an idealization 
appears to be too crude in a number of cases. The laws of classical physics 
have a restricted applicability to microsystems, and sometimes are not appli- 
cable to them at all. In particular, they are unsuitable for the consideration of 
phenomena taking place in a very small region of space near the charge. 
Hence in a number of cases, which will be considered later, our simplifying 
assumptions will lead to difficulties and contradictions. 

In quantum mechanics (Part V) the concepts of the laws determining the 
motion and properties of microscopic particles will essentially be developed 
and improved. 

Since for the present we shall be interested only in the properties of 
particles associated with their electromagnetic interaction, we shall simply
speak about the interaction of charges. 


§4. The field of charges at rest


Let there be fixed charges $e_i$ at certain points of space $\mathbf{r}_i$. In the region of
space near this set of charges we put a charge $e$ so small that the change in the
properties of the system caused by $e$ can be disregarded. We shall call such a
charge the test charge. Observing the test charge, we shall find out that at
each point of space $\mathbf{r}$ it is acted upon by a force $\mathbf{F}$ proportional to the value
of the charge $e$, i.e.

$$\mathbf{F} = e\,\mathbf{E}(\mathbf{r}). \tag{4.1}$$


Strictly speaking, a force F acts on the test charge located at any distance 
from the fixed charges. However, since the value of the force decreases (see 
below) rapidly with increasing distance, the action of the force F is mani- 
fested practically only in the vicinity of the charges. That region of space in 
which a force F acts on the test charge will be called by us the region of the 
electric field of charges at rest or the electrostatic field. Sufficiently distant 
regions of space in which the force F becomes negligibly small will be as- 
sumed, approximately, to be infinitely distant, and it will be considered that 
there is no field in them. 

Since the test charge does not affect the properties of the field of the sys- 
tem of charges, a vector E characterizes the properties of this field. We shall 
henceforth call E the electric field strength or, briefly, the electric field. 

Investigating the force F acting on the test charge one can determine the 
value of the vector E at every point of the field and thus establish the proper- 
ties of the electric field of the fixed charges. It turns out that it is possible to 
find certain general properties of the fields of fixed charges which do not 
depend on the exact nature of the disposition or the values of the charges. 

Experiment shows that the electric field of a system of charges at rest
possesses additive properties: the strength of the total electric field produced
by several charges is equal to the vector sum of the fields $\mathbf{E}_i$ produced by
each charge, i.e.

$$\mathbf{E} = \sum_i \mathbf{E}_i. \tag{4.2}$$
This most important property of electric fields is usually called the property 
of superposition. 


Investigating the motion of the test charge one can find the vector lines of 
the electrostatic field E. 

Knowing the distribution of the field E in space, one can determine the
distribution and interaction of the charges producing it. The introduction of
a field as a mathematical method may seem merely a natural, convenient way of
describing an interaction. We shall see below that in fact this is not so and that
the field is as real as the particles. One can ascribe to the field the same
characteristics as to particles: energy, momentum, mass and so on. Moreover,
we shall show that spatially separated particles cannot act directly on
each other (the so-called long-range action). The particle changes the state of
the field in the immediate vicinity of itself. This change in the state of the
field, a perturbation, is propagated from point to point and reaches the other
particle. Such is the concept of the field action, or the theory of short-range
action. The theory of short-range action will be considered in more detail in
§ 24 of Part I and §8 of Part II.

It turns out that the work performed by the electrostatic field on the test
charge as the latter is displaced from a point $\mathbf{r}_1$ to a point $\mathbf{r}_2$ does not depend
on the path over which this displacement takes place. This means that the
work of displacement of the charge over a closed path is equal to zero, i.e.

$$W = e\oint\mathbf{E}\cdot d\mathbf{l} = 0.$$

Thus, for the lines of the electrostatic field the equality

$$\oint\mathbf{E}\cdot d\mathbf{l} = 0 \tag{4.3}$$

always holds in integration over an arbitrary closed contour.

In connection with what was said earlier we pass over from the equality
(4.3) to the differential characteristic of the field. For this we make use of
Stokes' theorem, which gives

$$\oint\mathbf{E}\cdot d\mathbf{l} = \int(\nabla\times\mathbf{E})\cdot d\mathbf{S} = 0. \tag{4.4}$$

Since the surface of integration in (4.4) is arbitrary, it follows from (4.4) that

$$\nabla\times\mathbf{E} = 0. \tag{4.5}$$


This formula shows that the electrostatic field is vortex-free. In it there are 
no closed field lines. Consequently, there must exist sources and sinks in 
which the lines of the field begin and end. 

Experiment shows that in the electrostatic field the sources and sinks of
the field lines are electric charges. It is assumed that the lines of the field
originate on positive charges and end on negative ones. Since charges are the
sources and sinks, the flux of the vector E across any closed surface surrounding
each charge is different from zero. If a certain amount of charge $\sum_i e_i$,
where the summation denotes algebraic summation over all the charges, lies
within a surface of integration S, then the flux of the vector E must be
proportional to this sum, i.e.

$$\oint\mathbf{E}\cdot d\mathbf{S} = \mathrm{const}\sum_i e_i. \tag{4.6}$$


This statement is called Gauss’ theorem. 

We stress that within the framework of our approximation the charges 
have a point character, and the value of the charge of any system is a multiple 
of the elementary charge. 

To simplify the mathematical operations we take an important step and
pass over from the discrete, discontinuous distribution of charges to the
continuous one.

Let there be in a certain small volume $\delta V$ a sufficiently large number of
charges. Since the charges are located at small distances from each other it is
convenient for the mathematical description of them to replace the true
distribution of the discrete point charges by a fictitious continuous distribution.
Namely, replacing the volume $\delta V$ by an infinitely small volume $dV$ and assuming
that an infinitely small charge $de$ is confined in the infinitely small volume,
one can write that

$$de = \rho\,dV,$$

where $\rho = de/dV$ is the charge density, i.e. the ratio of the charge to the volume
it occupies at the given point of space. For charges at rest, $\rho$ is a continuous
function of the position, $\rho(\mathbf{r})$.

It should be emphasized that the transition to the continuous distribution 
is of a purely mathematical character. It should not be confused with an 
analogous operation with which we shall become acquainted in Part IV where 
the electromagnetic processes in matter will be studied. 

The connection between the mathematical description of the discrete
distribution of point charges and the continuous function $\rho(\mathbf{r})$ can be established
by means of the δ-function (see Appendix III). Namely, since the total charge
in an arbitrary volume can be expressed in the form

$$e = \int\rho(\mathbf{r})\,dV = \sum_i e_i,$$

where the summation is carried out over all the charges which are in the volume
V, we can write

$$\rho(\mathbf{r}) = \sum_i e_i\,\delta(\mathbf{r}-\mathbf{r}_i),$$

where $\mathbf{r}_i$ is the radius vector of the ith charge. Indeed, substituting this value
of $\rho$ we have

$$\int\rho(\mathbf{r})\,dV = \sum_i e_i\int\delta(\mathbf{r}-\mathbf{r}_i)\,dV = \sum_i e_i.$$

In particular, the charge density corresponding to one charge located at the
point $\mathbf{r}_0$ can be written in the form

$$\rho(\mathbf{r}) = e\,\delta(\mathbf{r}-\mathbf{r}_0).$$


The importance of the introduction of the continuous charge density lies 
in the fact that by its introduction the field itself as well as the charge distri- 
bution are described by continuous functions of position. 


Making use of the definition of charge density, we can write Gauss' theorem
in the form

$$\oint\mathbf{E}\cdot d\mathbf{S} = \mathrm{const}\int\rho\,dV, \tag{4.7}$$

where the integral on the right is taken over the volume confined by the surface
S. According to the Gauss-Ostrogradsky theorem,

$$\oint\mathbf{E}\cdot d\mathbf{S} = \int\nabla\cdot\mathbf{E}\,dV. \tag{4.8}$$

Hence from formulae (4.7) and (4.8) we obtain

$$\nabla\cdot\mathbf{E} = \mathrm{const}\,\rho. \tag{4.9}$$

Formula (4.9) determines the field divergence at every point of space. The
value of the factor of proportionality in formula (4.9) can be determined
only from experimental data (for example, from the Coulomb law).

In the CGSE system, which we shall use, this constant is equal to $4\pi$, so
that

$$\nabla\cdot\mathbf{E} = 4\pi\rho. \tag{4.10}$$


For reasons which will be explained below, we shall call eqs. (4.5) and 
(4.10) Maxwell’s equations for the electrostatic field or, briefly, the equations 
of electrostatics. 
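
As a numerical illustration of (4.2), (4.6) and the constant $4\pi$, the sketch below sums the Coulomb fields of a few point charges and integrates the flux of E through a unit sphere. It is only a sketch under stated assumptions: the point-charge field $e\mathbf{R}/R^3$ is taken from the Coulomb law in CGSE units, and the charge values, positions and the function names `field` and `flux_through_sphere` are hypothetical choices made for the example.

```python
import math

def field(point, charges):
    """Total field at `point` as the vector sum (4.2) of Coulomb fields e * R / R^3."""
    ex = ey = ez = 0.0
    for e, (cx, cy, cz) in charges:
        dx, dy, dz = point[0] - cx, point[1] - cy, point[2] - cz
        r3 = (dx * dx + dy * dy + dz * dz) ** 1.5
        ex += e * dx / r3
        ey += e * dy / r3
        ez += e * dz / r3
    return ex, ey, ez

def flux_through_sphere(charges, radius=1.0, n=200):
    """Midpoint-rule estimate of the flux of E through a sphere centred at the origin."""
    du = 2.0 / n                     # step in u = cos(theta)
    dphi = 2.0 * math.pi / n         # step in the azimuthal angle
    total = 0.0
    for i in range(n):
        u = -1.0 + (i + 0.5) * du
        s = math.sqrt(1.0 - u * u)
        for j in range(n):
            phi = (j + 0.5) * dphi
            nx, ny, nz = s * math.cos(phi), s * math.sin(phi), u   # outward normal
            ex, ey, ez = field((radius * nx, radius * ny, radius * nz), charges)
            total += (ex * nx + ey * ny + ez * nz) * radius ** 2 * du * dphi
    return total

# Two charges inside the unit sphere, one outside (values and positions are arbitrary).
charges = [(1.0, (0.3, 0.0, 0.0)), (-2.0, (-0.2, 0.4, 0.1)), (5.0, (3.0, 0.0, 0.0))]
enclosed = 1.0 - 2.0
flux = flux_through_sphere(charges)
print(flux, 4.0 * math.pi * enclosed)
```

The charge lying outside the sphere contributes to E at every point of the surface, yet its net flux vanishes, so the two printed numbers should agree closely.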


Since the electrostatic field is vortex-free, in correspondence with the general
methods of describing the vector field one can introduce a scalar $\varphi$ called
the electrostatic potential and determined by the relation

$$\mathbf{E} = -\nabla\varphi. \tag{4.11}$$

The minus sign means that the vector E is oriented in the direction of the
most rapid decrease of the potential $\varphi$. The choice of such a direction is
arbitrary.

The quantity

$$\int_1^2\mathbf{E}\cdot d\mathbf{l} = \varphi_1 - \varphi_2, \tag{4.12}$$


called the electromotive force or, briefly, e.m.f., will occur often in what 
follows. (It should be noted that the electromotive force is not a force either 
in its nature or its dimensionality. The term e.m.f. has the vindication of 
historical tradition.) In the electrostatic field the electromotive force is equal 
to the difference between the electrostatic potentials at corresponding points. 
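
The path independence behind (4.3) and (4.12) can be checked numerically. The sketch below is an illustration under assumptions not stated in the text at this point: it takes a single point charge at the origin with field $e\mathbf{r}/r^3$ and potential $e/r$ (the CGSE Coulomb form), and integrates $\mathbf{E}\cdot d\mathbf{l}$ along two different, arbitrarily chosen paths joining the same end points. The names `field`, `line_integral`, `straight` and `detour` are hypothetical.

```python
import math

def field(point, charge=1.0):
    """Coulomb field e * r / r^3 of a point charge at the origin (CGSE units)."""
    x, y, z = point
    r3 = (x * x + y * y + z * z) ** 1.5
    return (charge * x / r3, charge * y / r3, charge * z / r3)

def line_integral(path, n=20000):
    """Estimate the integral of E.dl along path(t), 0 <= t <= 1, using short chords."""
    total = 0.0
    for k in range(n):
        t0, t1 = k / n, (k + 1) / n
        p0, p1 = path(t0), path(t1)
        e = field(path(0.5 * (t0 + t1)))          # field at the middle of the step
        total += sum(e[i] * (p1[i] - p0[i]) for i in range(3))
    return total

def straight(t):
    """Straight segment from point 1 = (1, 0, 0) to point 2 = (0, 2, 0)."""
    return (1.0 - t, 2.0 * t, 0.0)

def detour(t):
    """Same end points, but the path leaves the xy-plane on the way."""
    return (1.0 - t, 2.0 * t, math.sin(math.pi * t))

emf_straight = line_integral(straight)
emf_detour = line_integral(detour)
potential_difference = 1.0 / 1.0 - 1.0 / 2.0      # phi_1 - phi_2 with phi = e / r
print(emf_straight, emf_detour, potential_difference)
```

Both integrals should reproduce the potential difference to within the discretization error, whatever intermediate path is chosen, provided it avoids the charge itself.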

Substituting the definition (4.11) into the equations of the electrostatic
field (4.5) and (4.10), we see that (4.5) is satisfied identically, while (4.10)
gives

$$\nabla\cdot\nabla\varphi = -4\pi\rho,$$

or, using formula (I.49),

$$\nabla^2\varphi = -4\pi\rho. \tag{4.13}$$


This equation, called Poisson’s equation, will be discussed in §14. 


§5. The equation of continuity 


Later on we shall have to pass over to the consideration of the more
complicated case of the fields of moving charges. The motion of electric charges
in space leads to the charge transport called electric current or, briefly,
current. We characterize electric current by the current density vector $\mathbf{j}(\mathbf{r}, t)$,
defined by the equality

$$\mathbf{j} = \sum_i e_i\mathbf{v}_i,$$

where $e_i$ is the value of the charge, and $\mathbf{v}_i$ is the velocity vector of the ith
charge. The summation is carried out over all charges present at time $t$ in a
unit volume surrounding the point $\mathbf{r}$.

In the case of a continuous charge distribution the current density can be
written in the form

$$\mathbf{j} = \rho\mathbf{v}. \tag{5.1}$$


The current density vector obviously represents the charge crossing, per
second, an imaginary unit area perpendicular to the velocity at the point $\mathbf{r}$
at time $t$.

The values of the functions $\rho$ and $\mathbf{v}$, i.e. the charge density and the velocity of
displacement of charge, cannot be arbitrary but must satisfy the requirements
of the charge conservation law.

Consider a closed surface S, inside which there is a certain charge $e = \int\rho\,dV$,
and find the derivative

$$-\frac{\partial}{\partial t}\int\rho\,dV.$$

Here the integration is carried out over the volume V enclosed within the
surface S. The value of the derivative (taken with the minus sign) represents
the decrease per unit time of the charge inside the surface S. Since electric
charges do not vanish, and do not arise spontaneously, the rate of decrease of
the charge in the volume V is equal to the flow of charge per second out of
the surface S enclosing this volume. Consequently, the equality

$$-\frac{\partial}{\partial t}\int\rho\,dV = \oint\rho\mathbf{v}\cdot d\mathbf{S} \tag{5.2}$$

holds. Passing over, in the last integral, to integration over the volume, we
obtain

$$-\frac{\partial}{\partial t}\int\rho\,dV = \int\nabla\cdot(\rho\mathbf{v})\,dV.$$

On changing the order of the independent operations of integration and
differentiation with respect to volume and time respectively, we have

$$-\int\frac{\partial\rho}{\partial t}\,dV = \int\nabla\cdot(\rho\mathbf{v})\,dV.$$

Because the integration volume is arbitrary, this equality gives

$$\frac{\partial\rho}{\partial t} + \nabla\cdot(\rho\mathbf{v}) = 0,$$

or

$$\frac{\partial\rho}{\partial t} + \nabla\cdot\mathbf{j} = 0. \tag{5.3}$$


Formula (5.3), representing the mathematical expression of the charge con- 
servation law, is called the equation of continuity. 
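
A one-dimensional discrete analogue makes the content of (5.2) and (5.3) concrete: if the charge leaving one cell through its face is exactly the charge entering the next cell, the total charge cannot change. The following sketch is an illustration only; the upwind difference scheme, the periodic row of cells, the constant velocity and the names `continuity_step` and `total_charge` are assumptions introduced for the example, not part of the text.

```python
def continuity_step(rho, v, dx, dt):
    """One explicit step of d(rho)/dt + d(j)/dx = 0 with j = rho * v, v > 0, periodic cells."""
    j = [r * v for r in rho]                  # current density (5.1) in each cell
    # Flux form: what leaves cell i-1 through the shared face enters cell i.
    # The index i - 1 wraps around for i = 0, which closes the row of cells periodically.
    return [rho[i] - dt / dx * (j[i] - j[i - 1]) for i in range(len(rho))]

def total_charge(rho, dx):
    """Discrete analogue of the integral of rho over the volume."""
    return sum(rho) * dx

n, dx, v = 100, 0.01, 1.0
dt = 0.5 * dx / v                             # time step small enough for stability
rho = [1.0 if 20 <= i < 40 else 0.0 for i in range(n)]

q_before = total_charge(rho, dx)
for _ in range(300):
    rho = continuity_step(rho, v, dx, dt)
q_after = total_charge(rho, dx)
print(q_before, q_after)
```

The charge distribution moves and spreads numerically, but the two printed totals coincide up to rounding, which is the discrete counterpart of the charge conservation law.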

For steady state processes, when the charge density distribution is not
changing in time, the equation of continuity reads

$$\nabla\cdot(\rho\mathbf{v}) = 0,$$

or

$$\nabla\cdot\mathbf{j} = 0. \tag{5.4}$$

Equations (5.4) show that, for steady state processes, the current density
vector is of solenoidal character. The trajectories of moving charges are
closed, and the lines representing the vector $\mathbf{j}$ form closed, non-intersecting
current tubes *.

In what follows we shall make use of the notion of the total current $I$ through
a surface S. By definition

$$I = \int\mathbf{j}\cdot d\mathbf{S} = \int j_n\,dS,$$

where the integration is carried out with respect to the surface S. The current
$I$ gives the value of the total charge passing per second through the surface S.


* In the particular case of a system of spreading charges, current tubes are not closed,
but go off at one end to infinity.


§6. The electromagnetic field of charges moving with a constant velocity


We now go on to the study of the field of moving charges. We shall call this
field the electromagnetic field. The properties of the electromagnetic field
are essentially more complex than those of the electrostatic field. The
establishment of the basic rules determining the behaviour of electromagnetic
fields was in part the result of an experimental investigation of electromagnetic
phenomena (Oersted, Ampère, Ohm and Faraday) and in part the result
of a theoretical prediction (Maxwell) which was only later confirmed
experimentally (Hertz).

The exposition of the history of the development of electromagnetism is
outside the scope of this book. However, it should be stressed that, since the
atomic character of charge was discovered only in experiments at the end of
the 19th and the beginning of the 20th century, all previous experiments and
their theoretical interpretation referred to phenomena in material media. We
shall present the results of these experiments in the language of microscopic
physics which deals with charges moving in vacuum. In other words, without
dwelling on the set-up of the experiments themselves, we shall present their
results in a general form, in which the effect of the medium and actual
conditions of the performance of the experiments are excluded.

The basic laws of the electromagnetic field, which will be set out below,
at present not only rest on numerous and various experimental data but also
constitute the basis of contemporary electrical and radio technology.

We shall consider first of all the motion with a constant velocity of a set of
charges along a tube or a line L (fig. I.1). In other words, we assume that an
electric current, whose density $\mathbf{j}$ does not depend on time, flows along the
line. In practice an electric current can be realized most simply in metal
conductors. However, since in this chapter we shall not deal with the motion of
charges in material media, the current line will be understood to be not a
metal conductor but a certain imaginary surface, a tube enclosing a set of
curves, the trajectories of charged particles moving in vacuum.

Fig. I.1

We put a test charge in the vicinity of the path of the constant current. We 
are not interested in the effect of the charges moving along the path on the 
test charge e when at rest. If the test charge moves with respect to the current 
line with velocity v, new facts are observed. It turns out that a moving test 
charge allows one to detect a field which is inseparably associated with mov- 
ing charges in the path of the current and whose character differs from that 
of the electrostatic field. This field is called the magnetic field. This term is 
connected with the fact that a similar field is produced by permanent mag- 
nets. 

The magnetic field, like the electrostatic field, has a vector character. It is 
characterized by a certain vector H, called the magnetic field strength or, 
briefly, the magnetic field. 

Experiment shows that the test charge is acted upon by a force

$$\mathbf{F} = \frac{e}{c}\,[\mathbf{v}\times\mathbf{H}]. \tag{6.1}$$

This force is called the Lorentz force *.

As is seen from formula (6.1), the Lorentz force is perpendicular to the
test charge velocity $\mathbf{v}$ and to the vector $\mathbf{H}$, and forms with them a right-handed
system.

The numerical factor of proportionality is determined experimentally, if it
is required that the vector H should have the same dimensions as the vector E.
In the CGSE system the constant $c$ is equal to $3\times 10^{10}$ cm sec$^{-1}$; it is a
universal constant numerically equal to the velocity of light in vacuum.
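
A short numerical check of formula (6.1) shows the perpendicularity stated above. This is a minimal sketch: the values of the charge, velocity and field are arbitrary illustrative numbers in CGSE units, and the helper names `cross`, `dot` and `lorentz_force` are hypothetical.

```python
C = 3.0e10  # cm/sec, the value of c quoted in the text

def cross(a, b):
    """Vector product [a x b] of two 3-component tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dot(a, b):
    """Scalar product of two 3-component tuples."""
    return sum(x * y for x, y in zip(a, b))

def lorentz_force(e, v, h):
    """Magnetic force (6.1): F = (e / c) [v x H]."""
    return tuple(e / C * component for component in cross(v, h))

e = 4.77e-10                  # elementary charge as quoted in the text
v = (1.0e8, 2.0e8, 0.0)       # illustrative velocity, cm/sec
h = (0.0, 0.0, 1.0e3)         # illustrative field along z

f = lorentz_force(e, v, h)
print(f, dot(f, v), dot(f, h))
```

Because the force is always perpendicular to the velocity, the scalar product of F and v vanishes, so a static magnetic field does no work on the charge.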

As for any vector field, we introduce the basic characteristic of the magnetic
field, the integral $\oint\mathbf{H}\cdot d\mathbf{l}$, called the magnetomotive force **.

* We shall often call the Lorentz force the total force acting on a charged particle in
an electric and magnetic field, equal to the sum of the forces caused by both fields.

** By analogy with the electromotive force in electrostatics. It should be stressed
that the magnetomotive force, like the electromotive one, is a scalar and is called a force
only by tradition.

A study of the magnetic fields of direct currents has shown that the value
of the magnetomotive force is equal to

$$\oint\mathbf{H}\cdot d\mathbf{l} = \frac{4\pi}{c}\,I, \tag{6.2}$$

where $I = \int\mathbf{j}\cdot d\mathbf{S}_0$ represents the total current, i.e. the charge passing per
second through the cross section $S_0$ of the contour (tube) of the current.

Formula (6.2) shows that the magnetomotive force differs from zero only
over a path which encloses the tube of current. In the simplest case of a
rectilinear current (or a rectilinear segment of current), the vector lines of the
magnetic field form a system of concentric circles about the current line
(fig. I.2). The value of the magnetomotive force is proportional to the total
current $I$ in the line. Formula (6.2) expresses the law of Oersted, who
established in the years 1820-1826 the connection between the electric current
and magnetic phenomena.
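
Formula (6.2) can be illustrated for the rectilinear current mentioned above. The sketch below rests on an assumption that is not written out in the text: for an infinitely long straight current along the z axis, symmetry together with (6.2) gives a field of magnitude $2I/cr$ directed along the concentric circles. The loop shapes, the value of the current and the names `wire_field` and `circulation` are likewise illustrative.

```python
import math

C = 3.0e10  # cm/sec, the value of c quoted in the text

def wire_field(x, y, current):
    """Field of a straight current along z through the origin: magnitude 2I/(c r), azimuthal."""
    k = 2.0 * current / (C * (x * x + y * y))
    return (-k * y, k * x)

def circulation(corners, current, n=2000):
    """Midpoint-rule estimate of the closed integral of H.dl around a polygon (counter-clockwise)."""
    total = 0.0
    m = len(corners)
    for a in range(m):
        (x0, y0), (x1, y1) = corners[a], corners[(a + 1) % m]
        dx, dy = (x1 - x0) / n, (y1 - y0) / n
        for k in range(n):
            xm, ym = x0 + (k + 0.5) * dx, y0 + (k + 0.5) * dy
            hx, hy = wire_field(xm, ym, current)
            total += hx * dx + hy * dy
    return total

current = 1.0e9                                                   # illustrative value, CGSE units
enclosing = [(-1.0, -2.0), (3.0, -2.0), (3.0, 1.0), (-1.0, 1.0)]  # rectangle around the wire
outside = [(2.0, 1.0), (4.0, 1.0), (4.0, 3.0), (2.0, 3.0)]        # rectangle missing the wire

expected = 4.0 * math.pi * current / C
mmf_enclosing = circulation(enclosing, current)
mmf_outside = circulation(outside, current)
print(mmf_enclosing, expected, mmf_outside)
```

The value obtained for the enclosing rectangle does not depend on its shape or on where the wire pierces it, while the loop that does not enclose the current gives a magnetomotive force that vanishes to within the integration error.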


Fig. I.2


As in electrostatics, experiment leads to an integral relation between the
characteristics of the charge (the current $I$) and the field H. In order to obtain
a differential relation characterizing the field, we replace the integration
surface $S_0$ in the expression for the total current by an arbitrary surface S drawn
through the path of the current (fig. I.1). The current density $\mathbf{j}$ differs from
zero only when the integration is carried out with respect to the cross section
$S_0$ of the tube of current. Outside this cross section the current density is
equal to zero, so that it can be written that

$$I = \int\mathbf{j}\cdot d\mathbf{S}_0 = \int\mathbf{j}\cdot d\mathbf{S}.$$

We then have

$$\oint\mathbf{H}\cdot d\mathbf{l} = \frac{4\pi}{c}\int\mathbf{j}\cdot d\mathbf{S}.$$








Making use of Stokes' theorem, we find

$$\int(\nabla\times\mathbf{H})\cdot d\mathbf{S} = \frac{4\pi}{c}\int\mathbf{j}\cdot d\mathbf{S},$$

whence, in view of the arbitrariness of the surface S, it follows that

$$\nabla\times\mathbf{H} = \frac{4\pi}{c}\,\mathbf{j}. \tag{6.3}$$

We see that the magnetic field has vortex character. Eq. (6.3) determines the
curl of the magnetic field at every point of space as a function of the value of
the current density $\mathbf{j}$ at every point of space.

Eq. (6.3) agrees with the stationary equation of continuity (5.4) and, 
hence, is self-consistent. Indeed, calculating the divergence of the two sides 
of (6.3) we arrive at the equality 


$$\nabla\cdot(\nabla\times\mathbf{H}) = \frac{4\pi}{c}\,\nabla\cdot\mathbf{j} = 0. \tag{6.4}$$


Formula (6.3) shows that the motion of electric charges is inseparably
associated with a magnetic field or, less rigorously, the motion of charges
gives rise to a magnetic field.

In order to determine a magnetic field uniquely it is necessary to know
the second differential relation, $\nabla\cdot\mathbf{H}$. To find it we shall consider the flux
of the vector H through an arbitrary closed surface, $\oint\mathbf{H}\cdot d\mathbf{S}$. The experimental
study of the distribution of the magnetic fields of direct currents shows that
the magnetic fields are always of a purely solenoidal character, and

$$\oint\mathbf{H}\cdot d\mathbf{S} = 0. \tag{6.5}$$

Consequently,

$$\oint\mathbf{H}\cdot d\mathbf{S} = \int\nabla\cdot\mathbf{H}\,dV = 0$$

or, because of the arbitrariness of the volume of integration,

$$\nabla\cdot\mathbf{H} = 0. \tag{6.6}$$

Thus, the magnetic field has neither sources nor sinks. The lines of the
magnetic field are always closed or go off to infinity *.

Eqs. (6.3) and (6.6) determine completely the magnetic field of direct 
currents. The magnetic field of direct currents possesses additive properties, 
as does the electrostatic field. This follows, in particular, from the linear 
character of the field eqs. (6.3) and (6.6). The magnetic and electrostatic
fields are independent of each other. An electrostatic field (for a given dis- 
tribution of moving charges) has no effect on the magnetic field of the 
charges. 

In conclusion we note one more very important feature of the magnetic 
field H. In contrast to the electrostatic field E, which is characterized by a 
polar vector, the vector of the strength of the magnetic field H is an axial 
vector or a pseudo-vector. This is seen from the formula for the Lorentz 
force (6.1). Indeed, from the definition of the field strength by formula (6.1) 
it is seen that the behaviour of the vector H in the reflection $\mathbf{r}\to-\mathbf{r}$ through
the origin is determined by the behaviour of the polar vectors F and v and
the properties of their vector product. In substituting $\mathbf{r}\to-\mathbf{r}$ the directions
of the vectors F and v become reversed, and the sign of the vector product is
changed. Consequently, in substituting $\mathbf{r}\to-\mathbf{r}$ the vector H must remain
unchanged. It is just this property that is an indication of an axial vector. 
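
The reflection argument can be checked with a few numbers. The following is a minimal sketch, assuming that the magnetic part of the Lorentz force has the form F = (e/c) v × H, as in eq. (8.6); the helper names `cross`, `negate` and `lorentz_force` and the sample values are illustrative and are not taken from the text.

```python
def cross(a, b):
    """Vector product of two 3-vectors given as tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def negate(a):
    """Reflection through the origin applied to a polar vector."""
    return tuple(-x for x in a)


def lorentz_force(e, c, v, H):
    """Magnetic part of the Lorentz force, F = (e/c) v x H (assumed form)."""
    return tuple(e / c * comp for comp in cross(v, H))


e, c = 1.0, 1.0            # illustrative units
v = (1.0, 2.0, 3.0)        # velocity, a polar vector
H = (0.5, -1.0, 2.0)       # magnetic field before the reflection

F = lorentz_force(e, c, v, H)

# Under r -> -r the polar vectors F and v change sign.
# Leaving H unchanged (axial vector) reproduces the reflected force ...
F_if_axial = lorentz_force(e, c, negate(v), H)
# ... whereas reversing H as if it were a polar vector does not.
F_if_polar = lorentz_force(e, c, negate(v), negate(H))

print(F_if_axial == negate(F))
print(F_if_polar == negate(F))
```

Only the first comparison holds: the force law keeps its form under reflection when H is left unchanged, which is the defining property of an axial vector.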


§7. The electromagnetic field of moving charges. The general case 


In considering the non-steady state motion of charges or, what is the same, 
the non-steady currents in certain contours, further results are obtained. 

It should be stressed, first of all, that eq. (6.3) cannot remain valid for 
non-steady currents. As we have just seen, eq. (6.3) leads to the charge con- 
servation law in the form of (5.4). However, for non-steady state processes 
the law is expressed by formula (5.3). Thus, for these processes the relation 
(6.3) contradicts the charge conservation law. 

The most important fact which in essence distinguishes non-stationary 
magnetic and electric fields from stationary ones is the existence of an inter- 
relation between them. 

Faraday observed that a change of a magnetic field in time entails the
appearance of an electric field (the phenomenon of electromagnetic induction).
Maxwell predicted theoretically that a change of an electric field in
time leads to the appearance of a magnetic field. This theoretical prediction
subsequently found confirmation in Hertz’ experiments.

* Strictly speaking, for complex configurations of currents there may also exist lines
of field which are not closed but fill a surface densely. See: I. E. Tamm, Osnovy teorii
elektrichestva (Introduction to the theory of electricity) (Gostekhizdat, 1954) p. 53.

Faraday’s experiments established that a change in time of the flux of a 
magnetic field vector through an arbitrary surface S is accompanied by the 
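appearance of an electromotive force in the contour enclosing this surface,
i.e.

$$ \oint \mathbf{E}\cdot d\mathbf{l} = -\frac{1}{c}\frac{\partial}{\partial t}\int \mathbf{H}\cdot d\mathbf{S}. \tag{7.1} $$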


The constant c entering the factor of proportionality turned out to be
numerically equal to $3\times10^{10}$ cm sec$^{-1}$, i.e. to the velocity of light in vacuum.
Fig. I.3 shows schematically the relation between the change in the magnetic
field $\partial\mathbf{H}/\partial t$ and the electromotive force. If the lines of the field vector
$\partial\mathbf{H}/\partial t$ are represented by
straight lines going from the left to the right, then the lines of the electric
field are represented by concentric circles embracing the corresponding
straight lines. The directions of the vectors of the electric field E and $\partial\mathbf{H}/\partial t$
are shown in fig. I.3 by arrows.


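[Fig. I.3: lines of the electric field E encircling the lines of $\partial\mathbf{H}/\partial t$.
Fig. I.4: lines of the magnetic field H encircling the lines of $\partial\mathbf{E}/\partial t$.]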


Formula (7.1) represents the generalized Faraday law of induction (1831). 
The generalization consists in the following. The experimental data of 
Faraday referred to a circuit of a metal wire. The appearance of an induced 
electric field in a conductor corresponds to the appearance of a current in 
the circuit. It was this current that was measured directly. In formula (7.1) 
the integration is carried out over a completely arbitrary contour independent 
of the presence of conductors. This means that the primary factor is the ap- 
pearance of the field in the contour. The electric current is some secondary 
phenomenon associated with the material nature of the contour, namely the
presence of a conductivity in it. The generalized Faraday law of induction in the
form (7.1) establishes the interrelation between the magnetic and electric
field. The sign in formula (7.1) corresponds to the well-known Lenz rule.
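
As an illustration of formula (7.1), consider a uniform field H(t) = H0 sin(ωt) directed along the normal to a circular contour of radius R. The sketch below compares the electromotive force obtained from a numerical time derivative of the flux with the closed expression −(1/c) πR² H0 ω cos(ωt). The field law, the numerical values and the function names are assumptions made only for this example; Gaussian units are used, as in the text.

```python
import math

C = 2.998e10  # velocity of light in vacuum, cm/sec


def flux(t, H0, omega, radius):
    """Flux of the uniform field H0*sin(omega*t) through a circle of given radius."""
    return math.pi * radius ** 2 * H0 * math.sin(omega * t)


def emf_numeric(t, H0, omega, radius, dt=1e-6):
    """Electromotive force from (7.1), with a central difference for d(flux)/dt."""
    dflux = flux(t + dt, H0, omega, radius) - flux(t - dt, H0, omega, radius)
    return -dflux / (2 * dt) / C


def emf_exact(t, H0, omega, radius):
    """Closed form of -(1/c) d(flux)/dt for the sinusoidal field."""
    return -math.pi * radius ** 2 * H0 * omega * math.cos(omega * t) / C


H0 = 100.0                    # field amplitude (illustrative)
omega = 2 * math.pi * 50.0    # angular frequency, 1/sec (illustrative)
radius = 10.0                 # radius of the contour, cm (illustrative)
t = 1.0e-3                    # instant of time, sec

numeric = emf_numeric(t, H0, omega, radius)
exact = emf_exact(t, H0, omega, radius)

# For an axially symmetric arrangement the circulation equals 2*pi*R*E_tangential.
e_tangential = exact / (2 * math.pi * radius)
print(numeric, exact, e_tangential)
```

At the chosen instant the flux is increasing, so the electromotive force comes out negative, in agreement with the sign in (7.1).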

Maxwell put forward the hypothesis, subsequently confirmed experimen- 
tally, that in addition to the law (7.1) there is an analogous relation between 
the change of an electric field in time and the magnetic field. 


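Namely, when the flux of an electric field through a surface S changes in
time,

$$ \frac{\partial}{\partial t}\int \mathbf{E}\cdot d\mathbf{S}, $$

there arises in the contour embracing the surface S a magnetomotive force
equal to

$$ \oint \mathbf{H}\cdot d\mathbf{l} = \frac{1}{c}\frac{\partial}{\partial t}\int \mathbf{E}\cdot d\mathbf{S}, \tag{7.2} $$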


where the coefficient c has the same value as in formula (7.1). Fig. I.4
illustrates the relation between the change of the electric field, characterized by
the vector $\partial\mathbf{E}/\partial t$, and the lines of the magnetic field. If the vector
$\partial\mathbf{E}/\partial t$ at
every point of the field is represented by a family of straight lines, then the
lines of the magnetic field are represented by concentric circles about these
straight lines.

We shall postpone till §9 the discussion of the considerations which lead
to the establishment of the relation between the change of the electric field
and the magnetic field circulation. As to the connection of the law (7.2) with
experimental data, it turns out that the existence of electromagnetic waves is
associated with (7.2). If the line L, along which the magnetic field circulation
is calculated, links the contour defining the electric current, then the total
magnetomotive force is expressed by the formula (see (6.2) and (7.2))

$$ \oint \mathbf{H}\cdot d\mathbf{l} = \frac{4\pi}{c}\int \mathbf{j}\cdot d\mathbf{S} + \frac{1}{c}\frac{\partial}{\partial t}\int \mathbf{E}\cdot d\mathbf{S}. \tag{7.3} $$


From formulae (7.1) and (7.3), containing the integral characteristics of the
fields E and H, one can pass over to differential characteristics. By means of
Stokes’ theorem formula (7.1) can be rewritten in the form

$$ \int (\nabla\times\mathbf{E})\cdot d\mathbf{S} = -\frac{1}{c}\frac{\partial}{\partial t}\int \mathbf{H}\cdot d\mathbf{S} = -\frac{1}{c}\int \frac{\partial\mathbf{H}}{\partial t}\cdot d\mathbf{S}. \tag{7.4} $$

We have changed the order of carrying out the independent operations of 
differentiation with respect to time and integration with respect to a surface 
fixed in space (see, however, Part IV, §23). 

In view of the arbitrariness of the integration surface in formula (7.4), 
there follows from the equality of the integrals the equality of the integrands: 


$$ \nabla\times\mathbf{E} = -\frac{1}{c}\frac{\partial\mathbf{H}}{\partial t}. \tag{7.5} $$


We see that a time varying electric field is, in general, rotational, in contrast 
to the electrostatic field. The value of the curl of the vector E at a given point
is determined by the rate of change in time of the magnetic field at the same 
point. 

Transforming the integral relation (7.3) in a completely analogous way, we 
have 


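$$ \nabla\times\mathbf{H} = \frac{4\pi}{c}\,\mathbf{j} + \frac{1}{c}\frac{\partial\mathbf{E}}{\partial t}. \tag{7.6} $$
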
This relation shows that the curl of the magnetic field at every point of space 
is determined by the current density of the charges j and the change of the 
electric field in time. 

The symmetry of the equations for the curls of the electric and magnetic 
fields (7.5) and (7.6) is striking. A change in the magnetic field is accom- 
panied by the appearance of a rotational electric field. Conversely, a change 
in the electric field is associated with the appearance of a rotational magnetic 
field. However, besides the similarity between these equations there is also an 
extremely important difference between them. First, eqs. (7.5) and (7.6) 
differ from each other by the sign before the time derivatives. Second, in 
general the curl of the magnetic field depends not only on the rate of change 
of the electric field but also on the current density j. Eq. (7.5) contains no
analogous term, since there are no ‘magnetic charges’ which can move in space
and produce a ‘current of magnetic charges’.

Eqs. (7.5) and (7.6) determine the curls of the electric and magnetic fields.
For a single-valued determination of the fields there must also be given the
divergences of the electric and magnetic fields.

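In order to determine these we shall find the divergences of both sides of
eqs. (7.5) and (7.6). We start from the latter equation. We have

$$ \nabla\cdot(\nabla\times\mathbf{H}) = 0 = \frac{4\pi}{c}\,\nabla\cdot\mathbf{j} + \frac{1}{c}\,\nabla\cdot\frac{\partial\mathbf{E}}{\partial t} = \frac{4\pi}{c}\,\nabla\cdot\mathbf{j} + \frac{1}{c}\frac{\partial}{\partial t}(\nabla\cdot\mathbf{E}). $$
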
By means of the charge conservation law (5.3) the above equation can easily
be transformed into

$$ \frac{\partial}{\partial t}(\nabla\cdot\mathbf{E} - 4\pi\rho) = 0, $$

whence

$$ \nabla\cdot\mathbf{E} = 4\pi\rho + f(\mathbf{r}), $$

where the function f(r) is any function depending only on coordinates but
not on time.

Let us assume that at an arbitrary initial instant of time, the charges were
at rest. Then at the initial instant the electrostatic field equation (4.10) must
have been satisfied, and the function f(r) was equal to zero. Since f(r) does
not depend on time, this means that f(r) is always equal to zero. The electric
field divergence for stationary as well as non-stationary fields is determined
by the formula

$$ \nabla\cdot\mathbf{E} = 4\pi\rho. \tag{7.7} $$


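Similarly, taking the divergence of (7.5), we find

$$ \nabla\cdot(\nabla\times\mathbf{E}) = 0 = -\nabla\cdot\left[\frac{1}{c}\frac{\partial\mathbf{H}}{\partial t}\right] $$

or

$$ \nabla\cdot\mathbf{H} = f_1(\mathbf{r}). $$

Assuming again that at the initial moment the magnetic field had a stationary
character, and repeating the reasoning just presented, we arrive at the
conclusion

$$ \nabla\cdot\mathbf{H} = 0. \tag{7.8} $$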


The magnetic field for stationary as well as non-stationary fields has a purely 
rotational (solenoidal) character. Neither in stationary nor in variable fields 
are there magnetic charges on which the lines of the magnetic field begin and 
end. 







§8. Maxwell-Lorentz system of equations 


The system of eqs. (7.5)–(7.8) which we have obtained for the electromagnetic
field in a vacuum is called Maxwell’s equations. They were established
by Maxwell in 1873 for the more general case of electromagnetic fields
in material media, and by Lorentz in 1895 for a system of charges moving in 
vacuum. 

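Let us write once again Maxwell’s equations, grouping them into two
pairs (differential form on the left, integral form on the right):

Pair I

$$ \nabla\times\mathbf{E} = -\frac{1}{c}\frac{\partial\mathbf{H}}{\partial t}, \qquad \oint \mathbf{E}\cdot d\mathbf{l} = -\frac{1}{c}\frac{\partial}{\partial t}\int \mathbf{H}\cdot d\mathbf{S}; \tag{8.1} $$

$$ \nabla\cdot\mathbf{H} = 0, \qquad \oint \mathbf{H}\cdot d\mathbf{S} = 0. \tag{8.2} $$

Pair II

$$ \nabla\times\mathbf{H} = \frac{1}{c}\frac{\partial\mathbf{E}}{\partial t} + \frac{4\pi}{c}\,\mathbf{j}, \qquad \oint \mathbf{H}\cdot d\mathbf{l} = \frac{4\pi}{c}\int \mathbf{j}\cdot d\mathbf{S} + \frac{1}{c}\frac{\partial}{\partial t}\int \mathbf{E}\cdot d\mathbf{S}; \tag{8.3} $$

$$ \nabla\cdot\mathbf{E} = 4\pi\rho, \qquad \oint \mathbf{E}\cdot d\mathbf{S} = 4\pi e. \tag{8.4} $$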


Assuming that the distributions of the currents and charges are known, 
one can find by means of Maxwell’s equations the six unknown components 
of the field vectors E and H. As we have seen in the preceding paragraph, the 
equations for the divergence of E and H follow from those for the curl and 
the initial conditions. Hence in the Maxwell system of eight scalar differential 
equations there are only six independent equations. 

Eq. (8.1), representing a generalization of the Faraday induction law, es- 
tablishes that a magnetic field changing with time gives rise to a rotational 
electric field. Eq. (8.2) shows that the magnetic field has a solenoidal charac- 
ter and that the lines of the magnetic field are either closed or go off to 
infinity. 

From eq. (8.3) it follows that a rotational magnetic field is produced when
charges are moving and when the electric field changes with time. By analogy
with the electric current, the quantity $(1/4\pi)(\partial\mathbf{E}/\partial t)$ in the first term on the
right is called the displacement current, while the sum of the two terms is
called the total current. It can then be said that the rotational magnetic field
is produced by the total current into which the two terms enter equivalently.
Finally, eq. (8.4) shows that the sources of the electric field are electric
charges. For given density distribution of charges and currents Maxwell’s
equations determine completely the electric field E(r, t) and the magnetic
field H(r, t).

Maxwell’s equations represent a system of linear partial differential equations.
By virtue of the linearity of the equations, the principle of superposition
of electromagnetic fields holds. If $\mathbf{E}_i$ and $\mathbf{H}_i$ are solutions of the
Maxwell equations, then $\mathbf{E} = \sum_i \mathbf{E}_i$ and $\mathbf{H} = \sum_i \mathbf{H}_i$ are also solutions of these
equations.

The integration of partial differential equations becomes definite only in
the case where a set of boundary and initial conditions is given. We shall dis- 
cuss boundary and initial conditions later, in §24. 

Up to now we have said nothing about the distribution of the charges and 
their motion in space. However, the distribution of the charges and their 
velocities cannot be given quite arbitrarily. The charge density and the current 
density (the velocities of the charges) are interrelated by the charge conserva- 
tion law: 


$$ \frac{\partial\rho}{\partial t} + \nabla\cdot\mathbf{j} = 0. \tag{8.5} $$


Charges moving in an electromagnetic field are acted upon by the Lorentz 
force. The equations of motion of a charge can be written in the form 


$$ \frac{d\mathbf{p}}{dt} = e\left(\mathbf{E} + \frac{1}{c}\,\mathbf{v}\times\mathbf{H}\right), \tag{8.6} $$


where p is the momentum of the particle. 
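
The equation of motion (8.6) is easily integrated numerically. The following is a minimal sketch, assuming a slow particle with p = mv (an assumption not made in the text), a uniform magnetic field along the z axis, no electric field and no self field. Units with e = m = c = 1 and all function names are chosen only for illustration. The integration uses the ordinary fourth-order Runge-Kutta scheme.

```python
import math


def cross(a, b):
    """Vector product of two 3-vectors given as tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def acceleration(v, E, H, e, m, c):
    """dv/dt from (8.6), assuming p = m*v (non-relativistic particle)."""
    vxH = cross(v, H)
    return tuple(e / m * (E[i] + vxH[i] / c) for i in range(3))


def axpy(a, b, s):
    """Return a + s*b component-wise."""
    return tuple(a[i] + s * b[i] for i in range(3))


def rk4_step(r, v, dt, E, H, e, m, c):
    """One Runge-Kutta step for dr/dt = v, dv/dt = a(v) in uniform fields."""
    k1v = acceleration(v, E, H, e, m, c)
    v2 = axpy(v, k1v, dt / 2)
    k2v = acceleration(v2, E, H, e, m, c)
    v3 = axpy(v, k2v, dt / 2)
    k3v = acceleration(v3, E, H, e, m, c)
    v4 = axpy(v, k3v, dt)
    k4v = acceleration(v4, E, H, e, m, c)
    r_new = tuple(r[i] + dt / 6 * (v[i] + 2 * v2[i] + 2 * v3[i] + v4[i])
                  for i in range(3))
    v_new = tuple(v[i] + dt / 6 * (k1v[i] + 2 * k2v[i] + 2 * k3v[i] + k4v[i])
                  for i in range(3))
    return r_new, v_new


# Illustrative units: e = m = c = 1, field H = 1 along z, no electric field.
e, m, c = 1.0, 1.0, 1.0
E = (0.0, 0.0, 0.0)
H = (0.0, 0.0, 1.0)
omega = e * H[2] / (m * c)        # frequency of revolution eH/(mc)

r = (-1.0, 0.0, 0.0)              # start on a circle of radius m*v*c/(e*H) = 1
v = (0.0, 1.0, 0.0)

steps = 1000
dt = 2 * math.pi / omega / steps  # integrate over one full revolution
for _ in range(steps):
    r, v = rk4_step(r, v, dt, E, H, e, m, c)

speed = math.sqrt(sum(x * x for x in v))
print(r, v, speed)
```

After one period the particle returns to its starting point with unchanged speed: the magnetic part of the Lorentz force only turns the velocity.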

The field which must be substituted into (8.6) represents the total field 
including that produced by the charge itself as well as the external field pro- 
duced by other charges. The former must also have an effect on the motion 
of the particle. However, in most cases one can assume the self field to be 
weak and not take it into account (see §29). In this approximation the vec- 
tors E and H in (8.6) stand for the external field acting on the particle. The 
law of motion (8.6) can also be written for continuously distributed charges, 
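if $\mathbf{p}_0$ is understood to be the momentum of the particles in unit volume and
if the force density (i.e. the force per unit volume) is substituted for the
Lorentz force. Then

$$ \frac{d}{dt}\int \mathbf{p}_0\, dV = \int \rho\left(\mathbf{E} + \frac{1}{c}\,\mathbf{v}\times\mathbf{H}\right) dV, \tag{8.7} $$

or

$$ \frac{d\mathbf{p}_0}{dt} = \rho\left(\mathbf{E} + \frac{1}{c}\,\mathbf{v}\times\mathbf{H}\right). \tag{8.8} $$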


We have already mentioned that field equations were initially formulated by 
Maxwell for electromagnetic processes in matter. Lorentz established their 
applicability to a system consisting of a field and charges, and supplemented 
them by the equation of motion of the charges. 

Hence the eqs. (8.1)—(8.8) are often called the Maxwell-Lorentz equations. 
The Maxwell-Lorentz equations contain the complete description of the
behaviour of a system consisting of fields and charges. If the values of the
functions $\rho$ and v and the initial values of the fields E and H are given, then the
integration of these equations allows one to find the electric and magnetic 
field distribution in space at any subsequent instant of time. Thus, in electro- 
dynamics, as in mechanics, if the state of a system at the initial moment and 
the law of change of state are given one can determine uniquely the state at a 
subsequent moment. 

It should be noted that, within the framework of this chapter, the Lorentz 
force acting on a moving charge should be assumed as an empirical formula. 
In Chapter II it will be shown that the expression for the Lorentz force fol- 
lows, as a consequence, from more general laws of physics. 

The range of applicability of the Maxwell-Lorentz equations is extremely 
large. They determine the character of electromagnetic processes on the cos- 
mic scale, constitute the basis of contemporary electrical and radio technol- 
ogy, and allow one to investigate electromagnetic phenomena taking place 
with individual charges. But, nevertheless, as we shall see in what follows, the 
Maxwell-Lorentz equations and the classical field theory based on them are 
not an expression of the universal laws of nature, and have a restricted region 
of applicability. 

A number of electromagnetic processes and, above all, intra-atomic ones,
lie beyond the limits of applicability of the Maxwell-Lorentz equations. The
problem of establishment of these limits will be discussed more than once
later on.


§9. The displacement current 


In contrast with the electric current j, which has a very simple and obvious
meaning, the displacement current $(1/4\pi)(\partial\mathbf{E}/\partial t)$ is not associated with the
motion of any charges.





§9 THE DISPLACEMENT CURRENT 31 


The displacement current was introduced by Maxwell who interpreted it 
in terms of ether theory, which was generally accepted at the time but which 
has now been abandoned as erroneous. 

It is readily understood why, in the middle of the 19th century, one could 
not have had recourse to experimental data in order to obtain formula (7.6). 
The velocities of motion of electric charges are, as a rule, very large. Hence
the current $\rho\mathbf{v}$ is always large in comparison with the displacement current
$(1/4\pi)(\partial\mathbf{E}/\partial t)$, provided the electric field is not changing very rapidly in time.
Estimates show that the two terms of the total current may be of the same
order of magnitude for periodic variation of the vector E with a frequency of
the order of $10^6$–$10^7$ sec$^{-1}$ or higher.

This is a range of radio frequencies which was unknown in physics in the 
middle of the 19th century. It was only in 1888 that Hertz was the first to 
establish experimentally the existence of electromagnetic waves, and hence 
prove the reality of the displacement current. 

The fact that the current j cannot always be responsible for the appearance 
of a magnetic field is very clearly illustrated by the following reasoning. Let 
an electric charge e be moving in space. We seek to determine the strength of 
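the magnetic field produced by this charge along a certain contour L (fig. I.5).
According to Stokes’ theorem,

$$ \oint \mathbf{H}\cdot d\mathbf{l} = \int (\nabla\times\mathbf{H})\cdot d\mathbf{S} = \frac{4\pi}{c}\int \mathbf{j}\cdot d\mathbf{S}, $$

where the integration surface is any surface enclosed by the contour L.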

[Fig. I.5: the contour L, two surfaces $S_1$ and $S_2$ spanning it, and the charge e
moving with velocity v.]

Let us consider at a certain instant two surfaces $S_1$ and $S_2$ enclosed by the
contour L. It is clear that from the standpoint of the electric current passing
through them they are not equivalent. No current passes through the surface
$S_1$ at the given instant because the charge e has not yet reached it. On the
contrary, the surface $S_2$ is crossed by the charge and the current through it
differs from zero. Thus, we arrive at the obvious contradiction:


$$ \int_{S_1} (\nabla\times\mathbf{H})\cdot d\mathbf{S}_1 = 0 \quad\text{and}\quad \int_{S_2} (\nabla\times\mathbf{H})\cdot d\mathbf{S}_2 \neq 0, $$

although according to Stokes’ theorem these integrals must be equal to each
other. In reality the moving charge produces at the surface $S_1$ a certain
electric field changing in time. A calculation which will be performed in §20
shows that the derivative of this electric field with respect to time, integrated
with respect to the surface $S_1$, gives exactly the same value for $\oint \mathbf{H}\cdot d\mathbf{l}$ in the
contour L as the integral $4\pi\int \mathbf{j}\cdot d\mathbf{S}$ with respect to the surface $S_2$. Thus, it is
necessary to assume equivalence (in the sense of producing circulation of the
magnetic field in a certain contour) of the charge current and the change in
the electric field for any moving charge.

The introduction of displacement current can be made on the basis of the 
following formal reasoning: it is necessary to find a generalization of the law 
(6.3) such that it does not contradict the charge conservation law (5.3) for a 
non-stationary change of the field. 

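Writing (6.3) in the form

$$ \nabla\times\mathbf{H} = \frac{4\pi}{c}\,(\mathbf{j} + \mathbf{C}), \tag{9.1} $$

where C is an unknown vector, and taking the divergence of both sides of
(9.1), we find

$$ \nabla\cdot(\nabla\times\mathbf{H}) = 0 = \frac{4\pi}{c}\,(\nabla\cdot\mathbf{j} + \nabla\cdot\mathbf{C}), $$

or

$$ \nabla\cdot(\mathbf{j} + \mathbf{C}) = 0. \tag{9.2} $$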


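The unknown vector C supplements the current density j in such a way that
the quantity (j + C) possesses the properties of a solenoidal vector having
closed current lines.

The divergence of the vector C can be found from eqs. (9.2), (5.3) and
(4.10). We have

$$ \nabla\cdot\mathbf{C} = -\nabla\cdot\mathbf{j} = \frac{\partial\rho}{\partial t} = \frac{1}{4\pi}\frac{\partial}{\partial t}(\nabla\cdot\mathbf{E}) = \frac{1}{4\pi}\,\nabla\cdot\frac{\partial\mathbf{E}}{\partial t}, $$

whence it follows that the vector C is equal to

$$ \mathbf{C} = \frac{1}{4\pi}\frac{\partial\mathbf{E}}{\partial t} + \mathbf{b}(\mathbf{r}, t), $$

where b(r, t) is any vector satisfying the condition

$$ \nabla\cdot\mathbf{b}(\mathbf{r}, t) = 0. $$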


There are no grounds to assume beforehand that the vector b is equal to zero. 

The brilliant idea of Maxwell consisted in his assumption that a profound 
symmetry and interrelation exist between the electric and magnetic fields. 
We emphasized this symmetry in considering the properties of Maxwell’s 
equations. To obtain symmetry between the fields it is necessary to assume 
b= 0. Then 


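$$ \nabla\times\mathbf{H} = \frac{4\pi}{c}\,\mathbf{j} + \frac{1}{c}\frac{\partial\mathbf{E}}{\partial t} \tag{9.3} $$

and, in particular, in the space region in which there are no moving charges
j = 0 and the equation for $\nabla\times\mathbf{H}$ is of the form

$$ \nabla\times\mathbf{H} = \frac{1}{c}\frac{\partial\mathbf{E}}{\partial t}. $$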


The curl of the vector H is determined by the rate of change of the vector E, 
whereas, according to Faraday’s law of induction (7.5), the curl of the vector 
E is determined by the rate of change of the vector H. Only in those regions 
of space where the current density differs from zero is the symmetry between 
electric and magnetic field violated. Since there are no real magnetic charges
and since the magnetic field is always of purely solenoidal character, then in
the equation for the electromagnetic induction (7.5) there is no term analogous
to the $(4\pi/c)\mathbf{j}$ of eq. (9.3). On the contrary, if the vector b were different
from zero, no symmetry would exist between the electric and magnetic
fields. 
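
The way in which the displacement current closes the lines of the total current can be seen in a simple case which is not considered in the text and is given here only as an illustration: a charge Q(t) enclosed in a sphere of radius r leaks away symmetrically in all directions. Charge conservation gives the radial current density j = −(dQ/dt)/(4πr²), while eq. (8.4) and the symmetry give E = Q(t)/r². The sketch below, with a hypothetical decay law for Q(t) and illustrative function names, checks numerically that j and (1/4π)∂E/∂t cancel, so that the total current (with b = 0) vanishes and no magnetic field is produced.

```python
import math


def enclosed_charge(t, q0=1.0, tau=2.0):
    """Charge inside the sphere at time t (hypothetical exponential decay)."""
    return q0 * math.exp(-t / tau)


def conduction_current_density(r, t, dt=1e-5):
    """Radial j from charge conservation: 4*pi*r^2 * j = -dQ/dt."""
    dq_dt = (enclosed_charge(t + dt) - enclosed_charge(t - dt)) / (2 * dt)
    return -dq_dt / (4 * math.pi * r ** 2)


def displacement_current_density(r, t, dt=1e-5):
    """(1/4 pi) dE/dt for the radial field E = Q(t)/r^2."""
    def e_field(s):
        return enclosed_charge(s) / r ** 2
    return (e_field(t + dt) - e_field(t - dt)) / (2 * dt) / (4 * math.pi)


r, t = 3.0, 1.0   # illustrative radius and instant of time
j = conduction_current_density(r, t)
j_disp = displacement_current_density(r, t)
total = j + j_disp
print(j, j_disp, total)
```

The outward current of the charges is compensated at every point by the displacement current of the decreasing field, in accordance with (9.2).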

In conclusion we emphasize that since the displacement current in vacuum
$(1/4\pi)(\partial\mathbf{E}/\partial t)$ is not connected with displacement or change of state of any
particles, it cannot be compared with any mechanical model allowing one to
picture this physical quantity in a simple way. For the sake of clarity, particles
at rest or in motion (e.g. small spheres or pellets) were associated
with the ideas of “charge density” or “current density”. The field vectors E
and H were represented by lines of force or field lines, which were previously 
pictured in the form of tensions in an elastic medium — the electromagnetic 
ether. Such mechanical models were undoubtedly useful, because they helped 
one to understand clearly the meaning of the corresponding quantities. Dis- 
placement current in a vacuum is the first quantity we encounter which 
cannot be described by means of a mechanical analogue. In what follows we 
shall have to deal with a very large number of important physical notions and 
quantities having no mechanical analogues. Like the displacement current in 
vacuum, they cannot be represented by any obvious model. 


§10. The electromagnetic field potentials 


In §2 we have seen that it is convenient to introduce auxiliary quantities
defined by relations (2.7) and (2.13), in order to find the stationary vector
field from the values of divergence and curl given in each point of space. It 
turns out that one can proceed in exactly the same way in the more general 
case of a non-stationary system of vector fields — electric and magnetic. 

The magnetic field vector is always solenoidal and its divergence is equal 
to zero. Hence, as in §2, we assume that 


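$$ \mathbf{H} = \nabla\times\mathbf{A}, \tag{10.1} $$

where the auxiliary vector A is called the vector potential. The vector potential
A(r, t) is a function of space and time. The eq. (8.2) is automatically
satisfied, since for any A(r, t)

$$ \nabla\cdot(\nabla\times\mathbf{A}) = 0. $$

Substitution of the relation (10.1) into (8.1) gives

$$ \nabla\times\mathbf{E} = -\frac{1}{c}\,\nabla\times\frac{\partial\mathbf{A}}{\partial t} $$

or

$$ \nabla\times\left(\mathbf{E} + \frac{1}{c}\frac{\partial\mathbf{A}}{\partial t}\right) = 0. $$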


The last equation shows that the vector

$$ \mathbf{E} + \frac{1}{c}\frac{\partial\mathbf{A}}{\partial t} $$

is a potential vector, i.e. it can be written in the form

$$ \mathbf{E} + \frac{1}{c}\frac{\partial\mathbf{A}}{\partial t} = -\nabla\varphi, $$

where $\varphi$ is a function of coordinates and time which we shall call the scalar
potential.

In contrast with the electrostatic case, the electric field vector, having a
vortex nature, cannot be written in the form of the gradient of a potential. It
is expressed through the totality of scalar and vector potentials by the formula

$$ \mathbf{E} = -\nabla\varphi - \frac{1}{c}\frac{\partial\mathbf{A}}{\partial t}, \tag{10.2} $$

where the second term, connecting the electric field with magnetic quantities, 
expresses the law of electromagnetic induction. 

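For the determination of A and $\varphi$ we have in addition eqs. (8.3) and (8.4).
The former equation gives

$$ \nabla\times(\nabla\times\mathbf{A}) = -\frac{1}{c^2}\frac{\partial^2\mathbf{A}}{\partial t^2} - \frac{1}{c}\,\nabla\frac{\partial\varphi}{\partial t} + \frac{4\pi}{c}\,\mathbf{j}, $$

or, according to formula (I. 50),

$$ \nabla(\nabla\cdot\mathbf{A}) - \nabla^2\mathbf{A} = -\frac{1}{c^2}\frac{\partial^2\mathbf{A}}{\partial t^2} - \frac{1}{c}\,\nabla\frac{\partial\varphi}{\partial t} + \frac{4\pi}{c}\,\mathbf{j}. \tag{10.3} $$

Eq. (8.4) gives in turn

$$ \nabla^2\varphi = -4\pi\rho - \frac{1}{c}\frac{\partial}{\partial t}(\nabla\cdot\mathbf{A}). \tag{10.4} $$
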
The potentials A and $\varphi$ are auxiliary quantities, introduced in order to
simplify the field equations. We shall impose upon them those conditions
which will allow us to make eqs. (10.3) and (10.4) independent without
changing relations (10.1) and (10.2). Relation (10.1) determines the curl of
the vector potential A. However, the vector potential A is in itself still
undetermined, since its divergence is not given. If the divergence of A is given by
the relation

$$ \nabla\cdot\mathbf{A} + \frac{1}{c}\frac{\partial\varphi}{\partial t} = 0, \tag{10.5} $$

called the Lorentz relation, then (10.3) will turn into

$$ \nabla^2\mathbf{A} - \frac{1}{c^2}\frac{\partial^2\mathbf{A}}{\partial t^2} = -\frac{4\pi}{c}\,\mathbf{j}. \tag{10.6} $$

Eq. (10.4) will assume the completely analogous form

$$ \nabla^2\varphi - \frac{1}{c^2}\frac{\partial^2\varphi}{\partial t^2} = -4\pi\rho. \tag{10.7} $$


The equations obtained for the electromagnetic potentials are equivalent 
to Maxwell’s equations. If the distributions of the charge p(r, t) and the 
currents j(r, t) (satisfying the continuity equation) are given, then integration
of eqs. (10.6) and (10.7) allows us to find the vector potential and scalar 
potential. The field vectors will be found by differentiating according to 
formulae (10.1) and (10.2). 

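An equation of the type

$$ \nabla^2\varphi - \frac{1}{c^2}\frac{\partial^2\varphi}{\partial t^2} = \psi(\mathbf{r}, t), $$

where $\psi(\mathbf{r}, t)$ is a function of coordinates and time, is called D’Alembert’s
equation. It is often written by means of the so-called D’Alembert differential
operator (D’Alembertian):

$$ \Box = \nabla^2 - \frac{1}{c^2}\frac{\partial^2}{\partial t^2} = \frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2} + \frac{\partial^2}{\partial z^2} - \frac{1}{c^2}\frac{\partial^2}{\partial t^2} $$

in the more compact form:

$$ \Box\varphi = \psi(\mathbf{r}, t). $$

In the particular case of the homogeneous equation for which $\psi(\mathbf{r}, t) = 0$,
the so-called wave equation,

$$ \nabla^2\varphi - \frac{1}{c^2}\frac{\partial^2\varphi}{\partial t^2} = 0, $$

is obtained from D’Alembert’s equation. In another particular case, that of
time-independent functions $\varphi(\mathbf{r})$ and $\psi(\mathbf{r})$, D’Alembert’s equation reduces to
Poisson’s equation which is already known (see (2.8)) from electrostatics:

$$ \nabla^2\varphi = \psi(\mathbf{r}). $$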


From the mathematical standpoint the second-order equations — Poisson’s 
equation, the wave equation and D’Alembert’s equation — are simpler than 
Maxwell’s first-order equations in partial derivatives. As we shall see in what 
follows, the general solution of D’Alembert’s equation and the wave equation 
can be obtained in integral form, as for Poisson’s equation (see (2.9)). It is for 
this reason that, in investigating the properties of electromagnetic fields, as
well as in solving a number of concrete problems, the use of potentials is very 
convenient and the potential method represents the basic mathematical ap- 
paratus of the field theory. 


§11. Gauge invariance of the potentials


We have already emphasized that the potentials A and $\varphi$ represent auxiliary
quantities having no direct physical meaning. The field strengths E and H,
which have a definite value at every point of the field and at any instant, have
a real meaning. They can be measured by means of the forces acting on a test
charge moving in the field.

The values of A and $\varphi$ cannot be measured and, therefore, the potentials
by themselves should not be involved in the final expressions of the field
theory. Indeed, determination of the potentials A and $\varphi$ from formulae (10.1)
and (10.2) is not unique and allows a certain arbitrariness.

We now have to discuss the question of the degree of arbitrariness in the 
determination of the potentials in the general case. From the definition of the 
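vector potential (10.1) it follows that if we perform the transformation

$$ \mathbf{A} \to \mathbf{A}' + \nabla\psi(\mathbf{r}, t), \tag{11.1} $$

where $\psi(\mathbf{r}, t)$ is an arbitrary function of coordinates and time, then we arrive
at the same value of the field strength H:

$$ \mathbf{H} = \nabla\times\mathbf{A} = \nabla\times(\mathbf{A}' + \nabla\psi(\mathbf{r}, t)) = \nabla\times\mathbf{A}' + \nabla\times\nabla\psi = \nabla\times\mathbf{A}'. $$

Now consider the determination of the scalar potential $\varphi$ given by formula
(10.2). The transformation (11.1) leads to the value

$$ \mathbf{E} = -\nabla\varphi - \frac{1}{c}\frac{\partial\mathbf{A}}{\partial t} = -\nabla\left(\varphi + \frac{1}{c}\frac{\partial\psi}{\partial t}\right) - \frac{1}{c}\frac{\partial\mathbf{A}'}{\partial t}. $$

Substituting

$$ \varphi \to \varphi' - \frac{1}{c}\frac{\partial\psi}{\partial t}, \tag{11.2} $$

we arrive at the previous expression for the electric field strength.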

Thus, the vector potential is determined to within a vector representing
the gradient of an arbitrary function of coordinates and time $\psi(\mathbf{r}, t)$, while
the scalar potential is determined to within the derivative of the same function
with respect to time. In particular, in the case of a time-independent
electric field, one can add an arbitrary constant to the potential $\varphi$ and arrive
at the same value of the field strength $\mathbf{E} = -\nabla(\varphi + \mathrm{const}) = -\nabla\varphi$.

In general it can be said that two fields described by the system of potentials
$\mathbf{A}'$ and $\varphi'$ and A and $\varphi$ respectively, are physically identical if A and $\mathbf{A}'$,
$\varphi$ and $\varphi'$ can be connected with each other by the relations (11.1) and (11.2).
The same fact can be expressed by the words: electromagnetic field equations
are invariant under the transformations (11.1) and (11.2).
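
The invariance of E and H can be verified numerically. The sketch below computes the fields from the potentials by formulae (10.1) and (10.2) with central differences, first for a pair A, φ and then for the transformed pair A′ = A − ∇ψ, φ′ = φ + (1/c)∂ψ/∂t. The particular potentials, the gauge function ψ, the unit c = 1 and all function names are assumptions chosen only for illustration.

```python
C = 1.0  # velocity of light in illustrative units


def shifted(r, axis, h):
    """Point r displaced by h along one coordinate axis."""
    return tuple(x + h if i == axis else x for i, x in enumerate(r))


def grad(f, r, t, h=1e-3):
    """Gradient of a scalar function f(r, t) by central differences."""
    return tuple((f(shifted(r, i, h), t) - f(shifted(r, i, -h), t)) / (2 * h)
                 for i in range(3))


def time_derivative(f, r, t, h=1e-3):
    """Partial derivative of f(r, t) with respect to t by central differences."""
    return (f(r, t + h) - f(r, t - h)) / (2 * h)


def fields(A, phi, r, t):
    """E = -grad(phi) - (1/c) dA/dt and H = curl A, eqs. (10.2) and (10.1)."""
    comp = [lambda r_, t_, i=i: A(r_, t_)[i] for i in range(3)]
    dA = [grad(comp[i], r, t) for i in range(3)]      # dA[i][j] = dA_i/dx_j
    H = (dA[2][1] - dA[1][2], dA[0][2] - dA[2][0], dA[1][0] - dA[0][1])
    g = grad(phi, r, t)
    E = tuple(-g[i] - time_derivative(comp[i], r, t) / C for i in range(3))
    return E, H


def A(r, t):
    x, y, z = r
    return (-y * t, x * t, 0.0)      # sample vector potential


def phi(r, t):
    x, y, z = r
    return x * z                     # sample scalar potential


def psi(r, t):
    x, y, z = r
    return x * y * t + z * t ** 2    # sample gauge function


def A_new(r, t):
    """A' = A - grad(psi), i.e. A = A' + grad(psi) as in (11.1)."""
    g = grad(psi, r, t)
    a = A(r, t)
    return tuple(a[i] - g[i] for i in range(3))


def phi_new(r, t):
    """phi' = phi + (1/c) dpsi/dt, i.e. phi = phi' - (1/c) dpsi/dt as in (11.2)."""
    return phi(r, t) + time_derivative(psi, r, t) / C


point, instant = (0.3, -0.7, 1.1), 0.5
E_old, H_old = fields(A, phi, point, instant)
E_new, H_new = fields(A_new, phi_new, point, instant)
max_diff = max(abs(a - b) for a, b in zip(E_old + H_old, E_new + H_new))
print(E_old, H_old, max_diff)
```

The two sets of potentials differ at every point, yet the computed field strengths coincide to within the rounding error of the difference scheme.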

Different ways of choosing the potentials A and $\varphi$, leaving the field
strengths E and H unchanged, are called different gauges of the potentials.
The invariance of the fields E and H and all other relations of the field
theory with respect to different gauges is called gauge invariance. The property
of gauge invariance permits one, by allowing a certain degree of arbitrariness
in choosing electromagnetic potentials, to select them in such a way that
the relations of the field theory may take the simplest form.

Lorentz’s condition, introduced in the previous paragraph, serves as an 
example of such a selection. 

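We can now show that Lorentz’s condition corresponds to a definite gauge
of potentials (Lorentz gauge). Let the Lorentz condition not be fulfilled for
certain $\mathbf{A}_0$ and $\varphi_0$, so that

$$ \nabla\cdot\mathbf{A}_0 + \frac{1}{c}\frac{\partial\varphi_0}{\partial t} = \chi(\mathbf{r}, t) \neq 0. $$

We perform the gauge transformation (11.1) and (11.2):

$$ \mathbf{A}_0 \to \mathbf{A} + \nabla\psi, \qquad \varphi_0 \to \varphi - \frac{1}{c}\frac{\partial\psi}{\partial t}. $$

Then we have

$$ \nabla\cdot\mathbf{A} + \frac{1}{c}\frac{\partial\varphi}{\partial t} = \chi(\mathbf{r}, t) - \nabla^2\psi + \frac{1}{c^2}\frac{\partial^2\psi}{\partial t^2}. \tag{11.3} $$

If it is required that the function $\psi$ should satisfy D’Alembert’s equation

$$ \nabla^2\psi - \frac{1}{c^2}\frac{\partial^2\psi}{\partial t^2} = \chi(\mathbf{r}, t), \tag{11.4} $$

then the Lorentz condition will be fulfilled for the transformed potentials A
and $\varphi$. We have already mentioned that D’Alembert’s equation always has a
solution. Hence, for a given value of $\chi$ it is always possible to select a function
$\psi$ satisfying (11.4).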

It should be stressed that the arbitrary function $\psi(\mathbf{r}, t)$ is not determined
completely by eq. (11.4). One can add to it a function $\psi_0(\mathbf{r}, t)$ representing
the solution of the homogeneous equation

$$ \nabla^2\psi_0 - \frac{1}{c^2}\frac{\partial^2\psi_0}{\partial t^2} = 0. $$

Performing the transformations

$$ \mathbf{A} \to \mathbf{A}' + \nabla\psi_0, \qquad \varphi \to \varphi' - \frac{1}{c}\frac{\partial\psi_0}{\partial t}, $$

we arrive again at the same values of field strengths E and H:

$$ \mathbf{H} = \nabla\times\mathbf{A}', \qquad \mathbf{E} = -\nabla\varphi' - \frac{1}{c}\frac{\partial\mathbf{A}'}{\partial t}. $$

Therefore, remaining within the framework of the Lorentz gauge, one can
choose the function $\psi_0$ in such a way that an additional condition, imposed
upon one of the four values $\varphi$, $A_x$, $A_y$, $A_z$, can be fulfilled.

In addition to the Lorentz gauge use is sometimes made (particularly in
quantum field theory) of another gauge, the so-called Coulomb gauge, for
which

$$ \nabla\cdot\mathbf{A} = 0. $$


In this gauge the equations for potentials (eqs. (10.3) and (10.4)) assume the
form

$$ \nabla^2\mathbf{A} - \frac{1}{c^2}\frac{\partial^2\mathbf{A}}{\partial t^2} = \frac{1}{c}\,\nabla\frac{\partial\varphi}{\partial t} - \frac{4\pi}{c}\,\mathbf{j}, $$

$$ \nabla^2\varphi = -4\pi\rho, $$

and the scalar potential $\varphi$ is determined by the distribution of charges as if
they were at rest. It goes without saying that the field strengths E and H
found from the solutions of the equations for the potentials with the Coulomb
gauge and with the Lorentz gauge are the same. In this book we shall, as
a rule, make use of the Lorentz gauge.


§12. Energy conservation law of the electromagnetic field 


The first important general consequence following from Maxwell’s equations
is the existence of the electromagnetic field energy. In order to find this
energy we shall consider a closed system consisting of a field and particles.
Let us find the work W done by the field forces on the particles in a volume
V. By considering the rate at which this work is done, assuming the charges to
be continuously distributed in space, and making use of (8.8), we can write


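$$\begin{aligned}
\frac{dW}{dt} &= \int \mathbf{F}\cdot\mathbf{v}\,dV = \int \rho\left(\mathbf{E} + \frac{1}{c}\,\mathbf{v}\times\mathbf{H}\right)\cdot\mathbf{v}\,dV \\
&= \int \rho\,\mathbf{E}\cdot\mathbf{v}\,dV + \frac{1}{c}\int \rho\,(\mathbf{v}\times\mathbf{H})\cdot\mathbf{v}\,dV \\
&= \int \mathbf{j}\cdot\mathbf{E}\,dV + \frac{1}{c}\int \rho\,(\mathbf{v}\times\mathbf{v})\cdot\mathbf{H}\,dV = \int \mathbf{j}\cdot\mathbf{E}\,dV.
\end{aligned} \tag{12.1}$$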


The work of the magnetic field force is equal to zero, since this force is per- 
pendicular to the velocity of the particle. 
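
The vanishing of the magnetic contribution in (12.1) can be checked numerically. The following is a minimal sketch, assuming a single point charge in Gaussian units; the helper names `cross`, `dot` and `lorentz_power` are illustrative and do not come from the text.

```python
# Speed of light in cm/s (Gaussian units, as used in the text).
C = 2.998e10

def cross(a, b):
    """Vector product a x b of two 3-component vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dot(a, b):
    """Scalar product of two 3-component vectors."""
    return sum(x * y for x, y in zip(a, b))

def lorentz_power(e, E, H, v):
    """Rate of work F.v done by the field on a charge e moving with velocity v."""
    v_cross_h = cross(v, H)
    force = [e * (E[k] + v_cross_h[k] / C) for k in range(3)]
    return dot(force, v)

if __name__ == "__main__":
    E = (1.0, 0.0, 0.0)
    H = (4.0, 5.0, 6.0)
    v = (1.0, 2.0, 3.0)
    # Power with and without the magnetic field: only E contributes.
    print(lorentz_power(1.0, E, H, v))
    print(lorentz_power(1.0, E, (0.0, 0.0, 0.0), v))
```

The two printed values agree to rounding accuracy, because (v × H)·v vanishes identically.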

We transform the relation (12.1), making use of one of Maxwell’s equa- 
tions. Expressing the current density in terms of the field vectors by means of 
(8.3), we have 


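$$\frac{dW}{dt} = \int \mathbf{j}\cdot\mathbf{E}\,dV = \frac{c}{4\pi}\int \mathbf{E}\cdot(\nabla\times\mathbf{H})\,dV - \frac{1}{4\pi}\int \mathbf{E}\cdot\frac{\partial\mathbf{E}}{\partial t}\,dV. \tag{12.2}$$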


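As we have stressed more than once, there must exist a symmetry between
the electric and magnetic fields. However, eq. (12.2) is asymmetric. We can
bring it to a symmetric form by adding to its right-hand side the expression

$$-\frac{c}{4\pi}\int \mathbf{H}\cdot\left(\nabla\times\mathbf{E} + \frac{1}{c}\frac{\partial\mathbf{H}}{\partial t}\right)dV,$$

which, from the Maxwell eq. (8.1), is equal to zero. This gives

$$\frac{dW}{dt} = \int \mathbf{j}\cdot\mathbf{E}\,dV = \frac{c}{4\pi}\int \left[\mathbf{E}\cdot(\nabla\times\mathbf{H}) - \mathbf{H}\cdot(\nabla\times\mathbf{E})\right]dV - \frac{1}{8\pi}\int \frac{\partial}{\partial t}\left(E^2 + H^2\right)dV. \tag{12.3}$$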
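The first integral on the right-hand side of eq. (12.3) can be transformed
into a surface integral. Namely, according to formula (I.44) we have

$$\mathbf{E}\cdot(\nabla\times\mathbf{H}) - \mathbf{H}\cdot(\nabla\times\mathbf{E}) = -\nabla\cdot(\mathbf{E}\times\mathbf{H}).$$

Hence

$$\int \left[\mathbf{E}\cdot(\nabla\times\mathbf{H}) - \mathbf{H}\cdot(\nabla\times\mathbf{E})\right]dV = -\int \nabla\cdot(\mathbf{E}\times\mathbf{H})\,dV = -\oint (\mathbf{E}\times\mathbf{H})\cdot d\mathbf{S},$$

and instead of (12.3) one can write

$$\frac{dW}{dt} = \int \mathbf{j}\cdot\mathbf{E}\,dV = -\frac{c}{4\pi}\oint (\mathbf{E}\times\mathbf{H})\cdot d\mathbf{S} - \frac{\partial}{\partial t}\int \frac{E^2 + H^2}{8\pi}\,dV. \tag{12.4}$$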


Consider the case when the integration volume V increases indefinitely
until it covers all space. If the field vectors E and H tend to zero more rapidly
than 1/r as r → ∞, then the surface integral reduces to zero. Indeed, the
integrand decreases more rapidly than 1/r², while the area of the surface
increases as r². Then (12.4) reduces to the equality

$$\frac{dW}{dt} = -\frac{d}{dt}\int u\,dV, \tag{12.5}$$

where

$$u = \frac{E^2 + H^2}{8\pi}. \tag{12.6}$$

Since the left-hand side of (12.5) represents the work done per second, the
right-hand side represents the decrease in field energy per unit time.

In a closed system consisting of a field and particles the work done by the
electromagnetic field on the particles is equal to the decrease in the energy
of the field itself. In this case one has to ascribe to the electromagnetic field
an energy whose density u is expressed by formula (12.6). The expression

$$\int \frac{E^2 + H^2}{8\pi}\,dV$$


cannot be reduced to quantities which are determined only by the relative 
position and motion of charges. Hence it cannot be assumed to be the poten- 
tial energy of the system of interacting particles. In particular the field 
energy density differs from zero in a region of space which is free of particles. 




The possession of an energy by the electromagnetic field obviously shows 
that the field can in no way be considered as a mathematical fiction, a con- 
venient method of calculating the interaction between charged particles. On 
the contrary, the field is as real as the particles. We shall convince ourselves 
more than once of the reality of the electromagnetic field on the basis of 
other facts also. However, within the framework of classical electrodynamics 
the interrelation between the field and the charges remains unexplained. Only 
quantum electrodynamics, which will be expounded briefly in Part V of this 
book, allows one to comprehend more profoundly the essence of the inter- 
relation between the field and particles. 

Let us now consider a region of the field which has a finite volume V and a
bounded surface S. Then eq. (12.4), expressing the energy conservation law,
shows that the decrease in the field energy per unit time

$$-\frac{\partial}{\partial t}\int \frac{E^2 + H^2}{8\pi}\,dV$$

in a certain volume V is equal to the work dW/dt done by the field forces per
unit time on the charges contained in the same volume, together with the
flux

$$\frac{c}{4\pi}\oint (\mathbf{E}\times\mathbf{H})\cdot d\mathbf{S},$$

flowing through the closed surface S surrounding the volume V. It is obvious
that this flux must be interpreted as an energy flux flowing out of the volume
V. The energy flux is the electromagnetic field energy flux, since it also
differs from zero when no particles pass through the surface carrying away
energy. The electromagnetic field energy flux is characterized by a vector σ,
called the Poynting vector, which is equal to

$$\boldsymbol{\sigma} = \frac{c}{4\pi}\,\mathbf{E}\times\mathbf{H}. \tag{12.7}$$


The Poynting vector σ represents the field energy flux flowing per unit time
through 1 cm² in the direction perpendicular to the field vectors E and H, and
forms with them a right-handed screw system of coordinates.
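
As a small illustration of (12.6) and (12.7), the sketch below evaluates the energy density and the Poynting vector for given field vectors. The sample fields are an assumption made only for this example: mutually perpendicular E and H of equal magnitude, as in a plane wave in vacuum in Gaussian units. The function names are hypothetical.

```python
import math

# Speed of light in cm/s (Gaussian units).
C = 2.998e10

def cross(a, b):
    """Vector product a x b."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dot(a, b):
    """Scalar product a.b."""
    return sum(x * y for x, y in zip(a, b))

def energy_density(E, H):
    """Field energy density (E^2 + H^2) / (8 pi), formula (12.6)."""
    return (dot(E, E) + dot(H, H)) / (8.0 * math.pi)

def poynting(E, H):
    """Poynting vector (c / 4 pi) E x H, formula (12.7)."""
    return tuple(C / (4.0 * math.pi) * s for s in cross(E, H))

if __name__ == "__main__":
    # Assumed sample fields: E along x, H along y, equal magnitudes.
    E = (3.0, 0.0, 0.0)
    H = (0.0, 3.0, 0.0)
    print(poynting(E, H))            # directed along z
    print(C * energy_density(E, H))  # equals the z component above
```

For these sample fields the flux is directed along z and its magnitude equals c times the energy density, i.e. the field energy is carried along with velocity c.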

Later we shall present a number of examples of the calculation of the
Poynting vector. Here we shall confine ourselves only to the following
remark. The vector σ is determined formally only to within the curl of a
certain vector a. Assuming σ′ = σ + ∇ × a, we have

$$\oint \boldsymbol{\sigma}'\cdot d\mathbf{S} = \oint \boldsymbol{\sigma}\cdot d\mathbf{S} + \oint (\nabla\times\mathbf{a})\cdot d\mathbf{S} = \oint \boldsymbol{\sigma}\cdot d\mathbf{S},$$

since the integral of ∇ × a over a closed surface is always equal to zero.

In reality, however, as will be shown in Part II, σ should be interpreted as
the density of the field energy flux, assuming ∇ × a = 0. Sometimes, as an
example of the inadequacy of such an interpretation of σ, the case of crossed
static electric and magnetic fields is quoted. Formally, in this case σ ≠ 0,
although there is no energy flux. However, it is forgotten that the vector σ
must enter into the energy conservation law expressed by formula (12.4). The
latter loses any sense in static fields, if it is assumed that σ ≠ 0.

In deriving formula (12.5) we have assumed that the integral ∮σ·dS reduces
to zero in integrating over a closed surface of infinitely large radius. We
shall see that in the problems of radiation theory fields are encountered which
decrease with increasing distance according to the law |E| ~ |H| ~ 1/r as
r → ∞. In this case the integral ∮σ·dS, taken over an infinitely distant
surface, will have a finite value. Physically this means that the system,
losing a part of its energy, emits radiation.

Writing the energy conservation law in the differential form

$$\frac{\partial}{\partial t}\,\frac{E^2 + H^2}{8\pi} = -\mathbf{j}\cdot\mathbf{E} - \nabla\cdot\boldsymbol{\sigma}, \tag{12.8}$$

we draw attention to its analogy with the continuity eq. (8.5), expressing the
charge conservation law. The left-hand side of formula (12.8) represents the
change in the field energy of unit volume (change of a conserved quantity),
while on the right-hand side there stands the work produced on the charges
contained in this volume, as well as the divergence of the flux of the
conserved quantity (the energy density).


§13. Momentum conservation law of the electromagnetic field


The electromagnetic field possesses momentum density as well as energy
density.
We shall consider the change in the momentum of the particles confined in 
a volume V. The calculations are carried out in the same way as in deriving 
the energy conservation law. 
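From eq. (8.8) we have

$$\frac{d\mathbf{P}_{\text{part}}}{dt} = \int \rho\left(\mathbf{E} + \frac{1}{c}\,\mathbf{v}\times\mathbf{H}\right)dV = \int \rho\,\mathbf{E}\,dV + \frac{1}{c}\int \mathbf{j}\times\mathbf{H}\,dV, \tag{13.1}$$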


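where P_part is the total momentum of the particles. Expressing ρ and j in
terms of the field strengths according to (8.4) and (8.3), we find

$$\frac{d\mathbf{P}_{\text{part}}}{dt} = \frac{1}{4\pi}\int \mathbf{E}(\nabla\cdot\mathbf{E})\,dV - \frac{1}{4\pi c}\int \frac{\partial\mathbf{E}}{\partial t}\times\mathbf{H}\,dV + \frac{1}{4\pi}\int (\nabla\times\mathbf{H})\times\mathbf{H}\,dV. \tag{13.2}$$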


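We symmetrize the above equation by adding to its right-hand side the
expression

$$\frac{1}{4\pi}\int \left\{\left(\nabla\times\mathbf{E} + \frac{1}{c}\frac{\partial\mathbf{H}}{\partial t}\right)\times\mathbf{E} + \mathbf{H}(\nabla\cdot\mathbf{H})\right\}dV, \tag{13.3}$$

which is equal to zero. Then we have

$$\frac{d\mathbf{P}_{\text{part}}}{dt} = -\frac{1}{4\pi c}\frac{d}{dt}\int \mathbf{E}\times\mathbf{H}\,dV + \frac{1}{4\pi}\int \left\{\mathbf{E}(\nabla\cdot\mathbf{E}) + \mathbf{H}(\nabla\cdot\mathbf{H}) - \mathbf{E}\times(\nabla\times\mathbf{E}) - \mathbf{H}\times(\nabla\times\mathbf{H})\right\}dV. \tag{13.4}$$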


The second integral can be transformed into a surface integral. This
transformation will be carried out below. It is clear that a surface integral
containing field vectors to the second power will tend to zero as the surface
increases indefinitely, provided the field vectors decrease more rapidly than
the function 1/r. Then, passing over to an infinitely large volume and
discarding the second integral in (13.4), we arrive at the expression

$$\mathbf{P}_{\text{part}} + \frac{1}{4\pi c}\int \mathbf{E}\times\mathbf{H}\,dV = \text{const}. \tag{13.5}$$


Formula (13.5) shows that the total momentum of a closed system consisting
of a field and particles is conserved. The quantity

$$\mathbf{g} = \frac{1}{4\pi c}\,\mathbf{E}\times\mathbf{H} \tag{13.6}$$

represents the momentum density (the momentum per unit volume) of the
electromagnetic field. For the interaction between a field and particles there
holds, in addition to the total energy conservation law, the total momentum
conservation law. The transfer of momentum to the particles is accompanied
by a decrease in the momentum of the field. A momentum loss of the
particles (for example, in the emission of radiation) leads to an increase in the
momentum of the field.

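Now we shall show that the second integral in formula (13.4) can be
reduced to a surface integral. Since the integral

$$\int \left\{\mathbf{E}(\nabla\cdot\mathbf{E}) + \mathbf{H}(\nabla\cdot\mathbf{H}) - \mathbf{E}\times(\nabla\times\mathbf{E}) - \mathbf{H}\times(\nabla\times\mathbf{H})\right\}dV$$

is symmetric with respect to the vectors E and H, we shall consider only the
integral

$$\mathbf{I} = \int \left\{\mathbf{E}(\nabla\cdot\mathbf{E}) - \mathbf{E}\times(\nabla\times\mathbf{E})\right\}dV. \tag{13.7}$$

Making use of the vector identities (I.53) and (I.48) we can write

$$\int (\mathbf{E}\cdot\nabla)\mathbf{E}\,dV = \oint (\mathbf{E}\cdot\mathbf{n})\,\mathbf{E}\,dS - \int \mathbf{E}(\nabla\cdot\mathbf{E})\,dV, \tag{13.8}$$

$$\int (\mathbf{E}\cdot\nabla)\mathbf{E}\,dV = \int \nabla\left(\tfrac{1}{2}E^2\right)dV - \int \mathbf{E}\times(\nabla\times\mathbf{E})\,dV. \tag{13.9}$$

Subtracting (13.9) from (13.8) we find

$$\int \left\{\mathbf{E}(\nabla\cdot\mathbf{E}) - \mathbf{E}\times(\nabla\times\mathbf{E})\right\}dV = -\int \nabla\left(\tfrac{1}{2}E^2\right)dV + \oint (\mathbf{E}\cdot\mathbf{n})\,\mathbf{E}\,dS.$$

Taking into account (I.23), we obtain

$$\mathbf{I} = \oint \left\{(\mathbf{E}\cdot\mathbf{n})\,\mathbf{E} - \tfrac{1}{2}\,\mathbf{n}E^2\right\}dS.$$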


An analogous expression can be written for the magnetic part of the integral
which interests us. Thus, finally,

$$\int \left\{\mathbf{E}(\nabla\cdot\mathbf{E}) - \mathbf{E}\times(\nabla\times\mathbf{E}) + \mathbf{H}(\nabla\cdot\mathbf{H}) - \mathbf{H}\times(\nabla\times\mathbf{H})\right\}dV = \oint \left\{(\mathbf{E}\cdot\mathbf{n})\,\mathbf{E} - \tfrac{1}{2}\,\mathbf{n}E^2 + (\mathbf{H}\cdot\mathbf{n})\,\mathbf{H} - \tfrac{1}{2}\,\mathbf{n}H^2\right\}dS. \tag{13.10}$$

Letting the radius of the integration surface go to infinity and assuming
that the fields E and H decrease at infinity more rapidly than 1/r, we arrive at
the statement expressed earlier that the whole surface integral is equal to
zero.

We shall not dwell on the consideration of a more complex case when the 
integration in (13.4) is carried out over a finite volume *. 

We shall only state the result for such a case: the change in the total
momentum of the field in a certain volume, ∫g dV, is equal to the change in
the momentum of the particles confined in the volume and to the momentum
flux through the surface bounding the volume.

The theoretical prediction of the existence of field momentum was first 
confirmed experimentally by P. N. Lebedev in 1901 in the form of the pres- 
sure of light. The electromagnetic field momentum is small under ordinary 
conditions and often lies below the limit of experimental errors. However, in 
the realm of atomic phenomena the electromagnetic field momentum be- 
comes comparable with the momentum of particles and plays a paramount 
role in all processes of interaction between radiation and matter. Besides
the realm of atomic phenomena, radiation pressure plays a most essential
role in processes taking place inside stars and stellar atmospheres and in
other astrophysical phenomena.

It is interesting to note that between the momentum density vector g and
the Poynting vector σ there is the relation

$$\mathbf{g} = \frac{\boldsymbol{\sigma}}{c^2}. \tag{13.11}$$

In the chapter devoted to the theory of relativity we shall see that there is a
very general relation between the energy and momentum, from which the
formula (13.11) is obtained as a consequence.


* See R. Becker, Electromagnetic fields and interactions (Blackie, London, 1964);
I. E. Tamm, Osnovy teorii elektrichestva (Introduction to the theory of electricity)
(Gostekhizdat, 1954) p. 506; and in particular, Y. I. Frenkel, Elektrodinamika
(Electrodynamics) (GTTI, 1934) p. 235, where a more complete interpretation of the
expression (13.6) is given.







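In addition to the vector of the field momentum density g one can consider
the angular momentum density

$$\mathbf{k} = \mathbf{r}\times\mathbf{g} = \frac{1}{4\pi c}\,\mathbf{r}\times(\mathbf{E}\times\mathbf{H}).$$

The field angular momentum in a volume V is equal to

$$\mathbf{K} = \frac{1}{4\pi c}\int \mathbf{r}\times(\mathbf{E}\times\mathbf{H})\,dV.$$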


It can be shown that a conservation law holds for the angular momentum, as
well as for the energy and linear momentum. The angular momentum of the
electromagnetic field plays an important role in atomic processes. In
phenomena on the macroscopic scale the angular momentum has been measured
relatively recently *.


* I. E. Tamm, Osnovy teorii elektrichestva (Introduction to the theory of electricity)
(Gostekhizdat, 1954) p. 502.








The Electrostatic Field 


§ 14. The electrostatic field 


Having formulated the general equations of the electromagnetic field and 
discussed the basic consequences following from them, we can proceed to the 
discussion of particular cases of electromagnetic fields. We shall progress 
successively from the most simple to more complex cases. 

The simplest example of the electromagnetic field is the field of charges at
rest.

We write Maxwell’s equations for this case 


Differential form              Integral form

∇·E = 4πρ,   (14.1)            ∮E·dS = 4π Σ e_i,   (14.1')
∇×E = 0,     (14.2)            ∮E·dl = 0,          (14.2')
∇·H = 0,     (14.3)            ∮H·dS = 0,          (14.3')
∇×H = 0,     (14.4)            ∮H·dl = 0.          (14.4')


The system of equations of the electromagnetic field reduces here to systems
of independent equations for the electric and magnetic fields. The solution
of the equations for the magnetic field, which does not depend on time, has
the trivial form

$$\mathbf{H} = 0.$$

This means that charges at rest are not surrounded by a magnetic field. The
electric field associated with charges at rest is, as we have already mentioned,
irrotational. Its sources and sinks are the charges.

In practice one most often needs to find the electric field distribution 
when the distribution of the charge density in space, p(r), is known. For this 
it is necessary to integrate the system of differential eqs. (14.1) and (14.2) for 
a given function p(r). This is the so-called direct problem of electrostatics. 

An incomparably simpler, but more seldom encountered problem of
electrostatics is the inverse one: finding the charge density ρ(r) from a given
field distribution E(r). In order to solve the inverse problem of electrostatics,
it is sufficient, according to (14.1), to find the divergence of the given field.
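
The inverse problem can be sketched numerically by taking the divergence with central differences and dividing by 4π, as (14.1) prescribes. The test field below is an assumption chosen for illustration: the well-known field of a uniformly charged ball of radius a and density ρ₀ (E = (4π/3)ρ₀ r inside, E = q r/r³ outside). All function names are hypothetical.

```python
import math

def divergence(field, point, h=1e-4):
    """Central-difference estimate of div E at the given point."""
    total = 0.0
    for k in range(3):
        plus = list(point)
        minus = list(point)
        plus[k] += h
        minus[k] -= h
        total += (field(plus)[k] - field(minus)[k]) / (2.0 * h)
    return total

def charge_density(field, point, h=1e-4):
    """Inverse problem of electrostatics: rho = div E / (4 pi), from (14.1)."""
    return divergence(field, point, h) / (4.0 * math.pi)

def ball_field(rho0, a):
    """Field of a uniformly charged ball (assumed test case, Gaussian units)."""
    q = 4.0 * math.pi * rho0 * a ** 3 / 3.0
    def field(p):
        r = math.sqrt(sum(x * x for x in p))
        if r <= a:
            return [4.0 * math.pi * rho0 * x / 3.0 for x in p]
        return [q * x / r ** 3 for x in p]
    return field

if __name__ == "__main__":
    E = ball_field(rho0=2.0, a=1.0)
    print(charge_density(E, (0.2, 0.1, -0.3)))  # close to rho0 inside the ball
    print(charge_density(E, (1.5, 0.5, 2.0)))   # close to zero outside the ball
```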

As we have already mentioned in §4, to find the general solution of the
equations of the electrostatic field it is convenient to make use of the method
of the electrostatic potential. According to (4.11) or (10.2), it can be assumed
that

$$\mathbf{E} = -\nabla\varphi. \tag{14.5}$$

From (14.1) we obtain Poisson’s equation:

$$\nabla^2\varphi = -4\pi\rho. \tag{14.6}$$

Eq. (14.2) will be satisfied automatically by introducing the potential
according to formula (14.5), since for an arbitrary form of the function φ(r)
there holds the equality

$$\nabla\times\nabla\varphi = 0.$$

Consequently, Maxwell’s equations for the electrostatic field (14.1) and
(14.2) are completely equivalent to Poisson’s equation. Knowing its solution,
the scalar potential φ(r), one can determine the strength of the field E by
differentiation.

It should be stressed that only the field strength E has a physical meaning.
The scalar potential is only an auxiliary, though very convenient, quantity.
The value of the potential is determined in electrostatics to within an
arbitrary constant: adding any constant a to the potential φ, we arrive at a
potential φ′ = φ + a, which corresponds to a field E = −∇φ′ = −∇φ. This
transformation is a particular case of the gauge transformation considered
in §11. Because of the incomplete determination of the potential it is
senseless to speak about the numerical value of the potential φ at a given
point of the field.

Henceforth, in considering solutions of Poisson’s equation, and in
discussing other properties of the potential, we shall assume a definite
behaviour of the potential φ at infinity. If it is assumed that all the charges
are distributed in a finite region of space surrounding the point chosen as the
origin, then as r → ∞ the field strength will not decrease more slowly than
1/r². In accordance with this, the solutions of the Poisson equation must
satisfy the requirement

$$\varphi \to 0 \quad \text{as} \quad r \to \infty. \tag{14.7}$$


From the mathematical standpoint Poisson’s equation, a second order
partial differential equation, is in some respects more convenient and simpler
for the calculation than the field eqs. (14.1) and (14.2), which are first order
partial differential equations. If the potential φ at infinity satisfies the
condition (14.7), then the solution of Poisson’s equation can be written in a
general form.

In §2 the general solution of the eq. (14.6),

$$\varphi(x, y, z) = \int \frac{\rho(\mathbf{r}')\,dV'}{|\mathbf{r} - \mathbf{r}'|} = \int \frac{\rho(x', y', z')\,dx'\,dy'\,dz'}{\sqrt{(x - x')^2 + (y - y')^2 + (z - z')^2}}, \tag{14.8}$$

was given without proof (the latter will be given in §24). Knowing the
distribution of the charge density in space ρ(x′, y′, z′) and integrating over all
space, one can find the value of φ at any point (x, y, z).

The actual calculation of the field according to formula (14.8), requiring
the calculation of a triple integral, often turns out to be impracticable. Later,
in Part IV, we shall discuss briefly the basic methods of solving the problems
of electrostatics taking into account the specific properties of physical bodies,
dielectrics and metals. Here we shall confine ourselves only to the simplest
system, a system of point charges.
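
To give a feeling for what the triple integral (14.8) involves, here is a rough midpoint-rule sketch for one simple, assumed charge distribution: a uniformly charged ball. The grid size `n` and the function name are illustrative choices, not part of the text. Outside the ball the result should be close to (total charge)/(distance), which serves as a check.

```python
import math

def potential_of_ball(rho0, a, obs, n=10):
    """Midpoint-rule approximation to (14.8) for a uniform ball of radius a.

    Returns (phi, q): the potential at the point obs and the total charge
    actually carried by the grid cells that were summed.
    """
    h = a / n                      # cell size of the cubic grid
    phi = 0.0
    q = 0.0
    for i in range(-n, n):
        x = (i + 0.5) * h
        for j in range(-n, n):
            y = (j + 0.5) * h
            for k in range(-n, n):
                z = (k + 0.5) * h
                if x * x + y * y + z * z <= a * a:
                    dq = rho0 * h ** 3          # charge of one cell
                    q += dq
                    phi += dq / math.dist((x, y, z), obs)
    return phi, q

if __name__ == "__main__":
    phi, q = potential_of_ball(1.0, 1.0, (0.0, 0.0, 5.0))
    print(phi, q / 5.0)   # the two numbers should nearly coincide
```

Even this coarse grid needs several thousand terms for a single observation point, which illustrates why direct use of (14.8) is rarely practical.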


§15. The electrostatic field of a system of point charges


The contrivance of considering the charges as spread out in space and the
description of the properties of a system of charges by means of the
continuous function ρ(r′) allowed us to pass over from the integral relation
(14.1’) to the differential equation (14.1). The importance of this transition is
clear from the fact that it made it possible to formulate differential field
equations. Nevertheless, in some cases it is inadmissible to disregard the point
structure of the charges forming real systems. Moreover, in a number of cases
it turns out that it is more convenient to carry out the calculations for point
systems than for distributed systems.

We write the charge density, characterizing a system of point charges, in
the form

$$\rho = \sum_i e_i\,\delta(\mathbf{r}' - \mathbf{r}_i),$$

where r_i is the radius-vector of the charge e_i. Substituting this expression into
(14.8), we find the field potential of a system of point charges,

$$\varphi = \int \frac{\rho\,dV'}{|\mathbf{r} - \mathbf{r}'|} = \sum_i e_i \int \frac{\delta(\mathbf{r}' - \mathbf{r}_i)\,dV'}{|\mathbf{r} - \mathbf{r}'|} = \sum_i \frac{e_i}{R_i}, \tag{15.1}$$

where R_i = |r − r_i|, and r is the radius-vector of the observation point. Here
we have made use of the basic property of the delta-function (see III.3).
Thus, the solution of the equation

$$\nabla^2\varphi = -4\pi e\,\delta(\mathbf{r} - \mathbf{r}_0) \tag{15.2}$$

is the function

$$\varphi = \frac{e}{|\mathbf{r} - \mathbf{r}_0|}. \tag{15.3}$$


Formula (15.3) represents a useful relation which we shall use in what
follows.

The field of a system of point charges is given by the formula

$$\mathbf{E} = -\nabla\varphi = \sum_i \frac{e_i\,\mathbf{R}_i}{R_i^3}, \tag{15.4}$$

where R_i = r − r_i. In the case of a single charge, formula (15.4) gives

$$\mathbf{E} = \frac{e}{R^3}\,\mathbf{R}. \tag{15.5}$$

For the force acting on a test charge e′ placed in the field of a single charge
one obtains the Coulomb law

$$\mathbf{F} = \frac{e e'}{R^3}\,\mathbf{R}. \tag{15.6}$$
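
The sum (15.1) and the superposition of single-charge fields of the form (15.5) translate directly into code. This is a minimal sketch in Gaussian units; representing the system as a list of `(charge, position)` pairs is an assumption of the example.

```python
import math

def potential(charges, r):
    """Potential (15.1): sum of e_i / R_i over all point charges."""
    return sum(e / math.dist(r, ri) for e, ri in charges)

def field(charges, r):
    """Field as a superposition of Coulomb fields e_i R_i / R_i^3."""
    E = [0.0, 0.0, 0.0]
    for e, ri in charges:
        R = [r[k] - ri[k] for k in range(3)]          # vector from charge to r
        dist = math.sqrt(sum(c * c for c in R))
        for k in range(3):
            E[k] += e * R[k] / dist ** 3
    return E

if __name__ == "__main__":
    single = [(2.0, (0.0, 0.0, 0.0))]
    print(potential(single, (0.0, 0.0, 4.0)))  # e / R
    print(field(single, (0.0, 0.0, 4.0)))      # e / R^2 along z
```

Both sums cost one term per charge, which is why the text goes on to look for a simplification at large distances.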

If the number of charges in the system is large, the sums in formulae (15.1)
and (15.4) contain a large number of terms, and these formulae become of
little use for practical calculations. However, formula (15.1) allows a
substantial simplification at distances from the system which substantially
exceed its spatial extension. Distances which are large in comparison with the
dimensions of the system will be called, for brevity, large distances. If the
observation point N is located at a large distance from the system, then there
holds the inequality (fig. 1.6)

$$|\mathbf{r}| \gg |\mathbf{r}_i|.$$
Let us consider one of the terms in formula (15.1). In order not to encumber
subsequent formulae by indices, we shall not write the sign of the sum,
and the distance from the ith charge to the observation point will be written
in the form

$$\frac{1}{R_i} = \frac{1}{|\mathbf{r} - \mathbf{r}'_i|} = \frac{1}{\sqrt{(x - x'_i)^2 + (y - y'_i)^2 + (z - z'_i)^2}} = \frac{1}{\sqrt{\sum_{\alpha=1}^{3}(x_\alpha - x'_{i\alpha})^2}}, \tag{15.7}$$

where the index α signifies the three components of the corresponding
vectors. Since |r′_i| ≪ |r|, expanding (15.7) in a Taylor series we have

$$\frac{1}{R_i} = \frac{1}{|\mathbf{r} - \mathbf{r}'_i|} = \frac{1}{r} - \sum_\alpha x'_{i\alpha}\frac{\partial}{\partial x_\alpha}\left(\frac{1}{r}\right) + \frac{1}{2}\sum_{\alpha,\beta} x'_{i\alpha}x'_{i\beta}\frac{\partial^2}{\partial x_\alpha\,\partial x_\beta}\left(\frac{1}{r}\right) + \dots \tag{15.8}$$
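Substituting the expansion (15.8) into formula (15.1), we find

$$\varphi = \sum_i \frac{e_i}{R_i} = \frac{\sum_i e_i}{r} - \sum_\alpha \left(\sum_i e_i x'_{i\alpha}\right)\frac{\partial}{\partial x_\alpha}\left(\frac{1}{r}\right) + \frac{1}{2}\sum_{\alpha,\beta}\left(\sum_i e_i x'_{i\alpha}x'_{i\beta}\right)\frac{\partial^2}{\partial x_\alpha\,\partial x_\beta}\left(\frac{1}{r}\right) + \dots = \varphi_0 + \varphi_1 + \varphi_2 + \dots, \tag{15.9}$$

where

$$\varphi_0 = \frac{\sum_i e_i}{r}, \tag{15.10}$$

$$\varphi_1 = -\sum_\alpha \left(\sum_i e_i x'_{i\alpha}\right)\frac{\partial}{\partial x_\alpha}\left(\frac{1}{r}\right), \tag{15.11}$$

$$\varphi_2 = \frac{1}{2}\sum_{\alpha,\beta}\left(\sum_i e_i x'_{i\alpha}x'_{i\beta}\right)\frac{\partial^2}{\partial x_\alpha\,\partial x_\beta}\left(\frac{1}{r}\right). \tag{15.12}$$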

The summation with respect to i is carried out over all charges of the
system. Hence Σ e_i = e represents the total charge of the system. The first
term of the expansion, φ₀, is the same as the potential of the field produced
at the given point N by a charge e equal to the total charge of the system.

We see that at large distances the ratio between two successive terms of
the expansion of the potential is of the order of magnitude of the ratio

    (size of the system) / (distance to the observation point).

Every subsequent term of the expansion contains the above mentioned ratio,
of the size of the system (~|r′|) to the distance to the observation point |r|,
to a higher power.

In the case of an electrically neutral system the total charge e = Σ e_i = 0,
and the first term of the series (15.9) vanishes. We shall deal with such systems
very often. It is sufficient to point out, for example, that all atoms and
molecules are electrically neutral systems. The potential of the field produced
by an electrically neutral system of charges is given by the expansion (15.9)
beginning with the second term φ₁. Let us consider it in more detail. We write
φ₁ in the vector form

$$\varphi_1 = -\sum_\alpha \left(\sum_i e_i x'_{i\alpha}\right)\frac{\partial}{\partial x_\alpha}\left(\frac{1}{r}\right) = -\left(\nabla\frac{1}{r}\right)\cdot\sum_i e_i\mathbf{r}'_i, \tag{15.13}$$

where the gradient is taken with respect to the coordinates of the observation
point. The quantity

$$\mathbf{d} = \sum_i e_i\mathbf{r}'_i = \int \rho\,\mathbf{r}'\,dV' \tag{15.14}$$

is called the dipole moment of the system. In the particular case of a system
consisting of two charges of the same magnitude but opposite sign, which is
called a dipole, the dipole moment is equal to

$$\mathbf{d} = e_1\mathbf{r}'_1 + e_2\mathbf{r}'_2 = |e|\,(\mathbf{r}'_1 - \mathbf{r}'_2),$$

i.e. to the product of the value of the charge and the vector l = r′₁ − r′₂.
The field of an electrically neutral system in the first approximation
(called the dipole approximation) is written in the form

$$\varphi \approx \varphi_1 = -\mathbf{d}\cdot\nabla\frac{1}{r} = \frac{\mathbf{d}\cdot\mathbf{r}}{r^3} = \frac{d\cos\theta}{r^2}, \tag{15.15}$$

where θ is the angle between the dipole moment and the radius-vector drawn
to the observation point. Thus, the potential of the field of an electrically
neutral system decreases (at large distances from the system) according to the
law φ ~ 1/r².
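
The quality of the dipole approximation (15.15) can be judged by comparing it with the exact sum of Coulomb potentials. The sketch below does this for an assumed test system, two opposite unit charges a unit distance apart, observed from a distance of 50 units; the names are illustrative.

```python
import math

def dipole_moment(charges):
    """Dipole moment (15.14): sum of e_i r'_i."""
    return [sum(e * pos[k] for e, pos in charges) for k in range(3)]

def exact_potential(charges, r):
    """Exact potential: sum of e_i / |r - r'_i|."""
    return sum(e / math.dist(r, pos) for e, pos in charges)

def dipole_potential(d, r):
    """Dipole approximation (15.15): d.r / r^3."""
    rr = math.sqrt(sum(x * x for x in r))
    return sum(d[k] * r[k] for k in range(3)) / rr ** 3

if __name__ == "__main__":
    # Assumed test system: charges +1 and -1 on the z axis, separation 1.
    charges = [(1.0, (0.0, 0.0, 0.5)), (-1.0, (0.0, 0.0, -0.5))]
    r = (30.0, 0.0, 40.0)                       # |r| = 50, cos(theta) = 0.8
    d = dipole_moment(charges)
    print(exact_potential(charges, r))
    print(dipole_potential(d, r))               # d cos(theta) / r^2 = 3.2e-4
```

At this distance the two values differ only by a small fraction of a percent, in line with the statement that successive terms are suppressed by powers of (size of the system)/(distance).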

Just as we have passed over from point charges to a charge distributed
continuously in space with density ρ, one can introduce the notion of the dipole
moment density p distributed continuously in space. By definition, p is the
dipole moment per unit volume. The potential of the field produced by the
overall system can, obviously, be written in the form

$$\varphi = -\int \mathbf{p}\cdot\left(\nabla\frac{1}{r}\right)dV',$$

where −p·(∇(1/r)) dV′ is the potential produced at the point N by the dipole
moment contained in a volume dV′, and the integration is carried out over
the entire volume of the system. It should be noted that instead of the
differentiation with respect to the coordinates of the observation point one
often makes use of the differentiation with respect to the coordinates of the
source. Then according to (I.17) we have ∇(1/r) = −∇′(1/r) and instead of
(15.15) we can write

$$\varphi = \mathbf{d}\cdot\left(\nabla'\frac{1}{r}\right), \tag{15.16}$$

or

$$\varphi = \int \mathbf{p}\cdot\left(\nabla'\frac{1}{r}\right)dV'. \tag{15.17}$$

We shall need formula (15.17) later (see Part IV).

Consider the problem of the dependence of the dipole moment on the
choice of the origin. Assume that we locate a new origin at an arbitrary
constant vector a with respect to the old origin, i.e. we perform the
transformation

$$\mathbf{r}'_i = \mathbf{r}''_i + \mathbf{a}.$$

Then the dipole moment will be equal to

$$\mathbf{d} = \sum_i e_i\mathbf{r}'_i = \sum_i e_i\mathbf{r}''_i + \sum_i e_i\mathbf{a} = \mathbf{d}' + \mathbf{a}\sum_i e_i,$$

where d′ = Σ e_i r″_i. If the system as a whole is electrically neutral, then
Σ e_i = 0 and d′ = d. In this case the value of the dipole moment does not
change when the origin is displaced. If, on the contrary, the system possesses
a total charge, then d′ ≠ d and, consequently, the dipole moment of the
system depends on the choice of the origin. It is then always possible to find a
value of a such that the dipole moment reduces to zero. Thus, the dipole
moment of any system possessing a total charge may be considered as equal
to zero.
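
The behaviour of the dipole moment under a shift of the origin is easy to verify directly. In the sketch below the helper names and the sample charges are assumptions made for illustration; `shift_origin` returns the positions measured from a new origin displaced by a, and `centre_of_charge` gives the displacement that makes the dipole moment of a charged system vanish.

```python
def total_charge(charges):
    """Sum of all charges e_i."""
    return sum(e for e, _ in charges)

def dipole_moment(charges):
    """Dipole moment: sum of e_i r_i."""
    return [sum(e * pos[k] for e, pos in charges) for k in range(3)]

def shift_origin(charges, a):
    """Positions measured from a new origin located at the vector a."""
    return [(e, tuple(pos[k] - a[k] for k in range(3))) for e, pos in charges]

def centre_of_charge(charges):
    """Origin for which the dipole moment of a charged system vanishes."""
    q = total_charge(charges)
    d = dipole_moment(charges)
    return [d[k] / q for k in range(3)]

if __name__ == "__main__":
    neutral = [(1.0, (1.0, 0.0, 0.0)), (-1.0, (0.0, 1.0, 0.0))]
    print(dipole_moment(neutral))
    print(dipole_moment(shift_origin(neutral, (5.0, -2.0, 3.0))))  # unchanged

    charged = [(1.0, (1.0, 0.0, 0.0)), (2.0, (0.0, 2.0, 0.0)), (-1.0, (0.0, 0.0, 1.0))]
    a = centre_of_charge(charged)
    print(dipole_moment(shift_origin(charged, a)))                 # zero vector
```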

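We now determine the field of an electrically neutral system in the dipole
approximation.

We have

$$\mathbf{E} = -\nabla\varphi \approx -\nabla\varphi_1 = -\nabla\left(\frac{\mathbf{d}\cdot\mathbf{r}}{r^3}\right) = -\frac{1}{r^3}\nabla(\mathbf{d}\cdot\mathbf{r}) - (\mathbf{d}\cdot\mathbf{r})\,\nabla\frac{1}{r^3}.$$

Calculation by means of formulae (I.47) gives

$$\mathbf{E} = \frac{3\mathbf{r}(\mathbf{r}\cdot\mathbf{d}) - r^2\mathbf{d}}{r^5}. \tag{15.18}$$

At large distances the field of an electrically neutral system decreases
according to the law E ~ 1/r³ and has a strongly pronounced asymmetry. In polar
coordinates (r, θ) its components (according to (I.71)) have the form

$$E_r = -\frac{\partial\varphi}{\partial r} = \frac{2d\cos\theta}{r^3} \quad \text{(radial component)},$$

$$E_\theta = -\frac{1}{r}\frac{\partial\varphi}{\partial\theta} = \frac{d\sin\theta}{r^3} \quad \text{(meridional component)}.$$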
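
Formula (15.18) can be checked against a numerical gradient of the dipole potential d·r/r³. The following sketch uses central differences; the sample values of d and r are arbitrary assumptions and the function names are illustrative.

```python
import math

def dipole_potential(d, r):
    """Dipole potential d.r / r^3."""
    rr = math.sqrt(sum(x * x for x in r))
    return sum(d[k] * r[k] for k in range(3)) / rr ** 3

def dipole_field(d, r):
    """Dipole field (15.18): (3 r (r.d) - r^2 d) / r^5."""
    rr2 = sum(x * x for x in r)
    rd = sum(r[k] * d[k] for k in range(3))
    return [(3.0 * r[k] * rd - rr2 * d[k]) / rr2 ** 2.5 for k in range(3)]

def numerical_field(d, r, h=1e-5):
    """Minus the central-difference gradient of the dipole potential."""
    E = []
    for k in range(3):
        plus = list(r)
        minus = list(r)
        plus[k] += h
        minus[k] -= h
        E.append(-(dipole_potential(d, plus) - dipole_potential(d, minus)) / (2.0 * h))
    return E

if __name__ == "__main__":
    d = (0.3, -0.2, 1.0)
    r = (1.0, 2.0, -1.5)
    print(dipole_field(d, r))
    print(numerical_field(d, r))
    # On the dipole axis (theta = 0) the field is radial and equals 2 d / r^3.
    print(dipole_field((0.0, 0.0, 1.0), (0.0, 0.0, 2.0)))
```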


§16. Quadrupole moment 


If the dipole moment of an electrically neutral system of charges is equal
to zero, then in the expansion of the potential (15.9) one has to take into
account the expansion term φ₂.

An example of an electrically neutral system with dipole moment zero is
the system of two dipoles of the same magnitude with opposite directions of
the dipole moments placed an infinitely small distance apart. Such a system is
called a quadrupole.

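The potential of the field produced by a quadrupole has the form

$$\varphi_2 = \frac{1}{2}\sum_i \sum_{\alpha,\beta} e_i x'_{i\alpha}x'_{i\beta}\frac{\partial^2}{\partial x_\alpha\,\partial x_\beta}\left(\frac{1}{r}\right). \tag{16.1}$$

In order to obtain φ₂ one has to calculate the expression

$$\frac{\partial^2}{\partial x_\alpha\,\partial x_\beta}\left(\frac{1}{r}\right) = \frac{\partial}{\partial x_\alpha}\left(\frac{\partial}{\partial x_\beta}\frac{1}{r}\right) = -\frac{\partial}{\partial x_\alpha}\left(\frac{x_\beta}{r^3}\right) = -\frac{1}{r^3}\frac{\partial x_\beta}{\partial x_\alpha} - x_\beta\frac{\partial}{\partial x_\alpha}\left(\frac{1}{r^3}\right) = -\frac{\delta_{\alpha\beta}}{r^3} + \frac{3x_\alpha x_\beta}{r^5},$$

where δ_αβ is the Kronecker symbol, α, β assume the values 1, 2, 3, and x_α, x_β
represent an abbreviated notation of the coordinates x₁ = x, x₂ = y, x₃ = z.
We then have

$$\varphi_2 = \frac{1}{2}\sum_{\alpha,\beta}\sum_i e_i x'_{i\alpha}x'_{i\beta}\left(\frac{3x_\alpha x_\beta}{r^5} - \frac{\delta_{\alpha\beta}}{r^3}\right).$$

The whole set of the quantities Σ_i e_i x′_iα x′_iβ is a second-rank tensor. This
tensor is called the quadrupole moment of the system and is denoted by D_αβ.
Then

$$\varphi_2 = \frac{1}{2}\sum_{\alpha,\beta} D_{\alpha\beta}\left(\frac{3x_\alpha x_\beta}{r^5} - \frac{\delta_{\alpha\beta}}{r^3}\right), \tag{16.2}$$

$$D_{\alpha\beta} = \sum_i e_i x'_{i\alpha}x'_{i\beta}. \tag{16.3}$$

If one passes over to a continuous distribution of charge, the quadrupole
moment can be written in the form

$$D_{\alpha\beta} = \int \rho\,x'_\alpha x'_\beta\,dV'. \tag{16.3'}$$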


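Omitting for brevity the index of summation over all the particles, we can
write the expression for φ₂ in the coordinate representation:

$$\varphi_2 = \frac{1}{2}\sum e\left\{x'^2\left(\frac{3x^2}{r^5} - \frac{1}{r^3}\right) + y'^2\left(\frac{3y^2}{r^5} - \frac{1}{r^3}\right) + z'^2\left(\frac{3z^2}{r^5} - \frac{1}{r^3}\right) + 2x'y'\frac{3xy}{r^5} + 2x'z'\frac{3xz}{r^5} + 2y'z'\frac{3yz}{r^5}\right\}.$$

This expression is usually rearranged by adding to it the quantity

$$-\frac{1}{2}\sum e\,\frac{r'^2}{3}\left[\left(\frac{3x^2}{r^5} - \frac{1}{r^3}\right) + \left(\frac{3y^2}{r^5} - \frac{1}{r^3}\right) + \left(\frac{3z^2}{r^5} - \frac{1}{r^3}\right)\right],$$

which is equal to zero (since x² + y² + z² = r²), and by re-grouping the terms
in the form

$$\begin{aligned}
\varphi_2 = \frac{1}{2}\sum e\Big\{&\left(x'^2 - \tfrac{1}{3}r'^2\right)\left(\frac{3x^2}{r^5} - \frac{1}{r^3}\right) + \left(y'^2 - \tfrac{1}{3}r'^2\right)\left(\frac{3y^2}{r^5} - \frac{1}{r^3}\right) + \left(z'^2 - \tfrac{1}{3}r'^2\right)\left(\frac{3z^2}{r^5} - \frac{1}{r^3}\right) \\
&+ 2x'y'\frac{3xy}{r^5} + 2x'z'\frac{3xz}{r^5} + 2y'z'\frac{3yz}{r^5}\Big\} = \frac{1}{2}\sum_{\alpha,\beta} D_{\alpha\beta}\left(\frac{3x_\alpha x_\beta}{r^5} - \frac{\delta_{\alpha\beta}}{r^3}\right).
\end{aligned} \tag{16.4}$$

In this case the quadrupole moment D_αβ is defined as

$$D_{\alpha\beta} = \sum_i e_i\left(x'_{i\alpha}x'_{i\beta} - \tfrac{1}{3}r'^2_i\,\delta_{\alpha\beta}\right). \tag{16.5}$$

The set of the quantities D_αβ is easily written in the explicit form:

$$D_{xx} = \sum_i e_i\left(x'^2_i - \tfrac{1}{3}r'^2_i\right), \qquad D_{xy} = D_{yx} = \sum_i e_i x'_i y'_i,$$

$$D_{yy} = \sum_i e_i\left(y'^2_i - \tfrac{1}{3}r'^2_i\right), \qquad D_{xz} = D_{zx} = \sum_i e_i x'_i z'_i,$$

$$D_{zz} = \sum_i e_i\left(z'^2_i - \tfrac{1}{3}r'^2_i\right), \qquad D_{yz} = D_{zy} = \sum_i e_i y'_i z'_i.$$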






All of the nine quantities D_αβ forming the quadrupole moment depend,
obviously, only on the positions and values of the charges of the system.

From the definition (16.5) it is clear that the quadrupole moment tensor
is symmetric, so that

$$D_{\alpha\beta} = D_{\beta\alpha}.$$

The symmetric second-rank tensor has six independent components.
Further we note that the sum of all diagonal components of the quadrupole
moment is equal to zero:

$$D_{xx} + D_{yy} + D_{zz} = 0. \tag{16.6}$$

This reduces the number of independent components of the quadrupole
moment to five.
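
The definition (16.5) and the potential (16.4) can be tried out on a small example. The sketch below builds the tensor for an assumed test system, four alternating unit charges at the corners of a square (zero total charge and zero dipole moment), and compares the quadrupole term with the exact potential at a distance of 20 units. Function names are illustrative.

```python
import math

def quadrupole_tensor(charges):
    """Quadrupole moment (16.5): sum of e (x_a x_b - r^2 delta_ab / 3)."""
    D = [[0.0] * 3 for _ in range(3)]
    for e, pos in charges:
        r2 = sum(x * x for x in pos)
        for a in range(3):
            for b in range(3):
                delta = 1.0 if a == b else 0.0
                D[a][b] += e * (pos[a] * pos[b] - r2 * delta / 3.0)
    return D

def quadrupole_potential(D, r):
    """Quadrupole term (16.4): (1/2) sum D_ab (3 x_a x_b / r^5 - delta_ab / r^3)."""
    rr = math.sqrt(sum(x * x for x in r))
    phi = 0.0
    for a in range(3):
        for b in range(3):
            delta = 1.0 if a == b else 0.0
            phi += 0.5 * D[a][b] * (3.0 * r[a] * r[b] / rr ** 5 - delta / rr ** 3)
    return phi

def exact_potential(charges, r):
    """Exact potential: sum of e_i / |r - r_i|."""
    return sum(e / math.dist(r, pos) for e, pos in charges)

if __name__ == "__main__":
    # Assumed test system: alternating charges at the corners of a square.
    square = [(1.0, (1.0, 0.0, 0.0)), (-1.0, (0.0, 1.0, 0.0)),
              (1.0, (-1.0, 0.0, 0.0)), (-1.0, (0.0, -1.0, 0.0))]
    D = quadrupole_tensor(square)
    print(D)                                         # symmetric, zero trace
    r = (20.0, 0.0, 0.0)
    print(quadrupole_potential(D, r), exact_potential(square, r))
```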

Like any symmetric tensor, D_αβ can be reduced to principal axes. This
procedure is completely analogous to the reduction of the inertia tensor to
principal axes in mechanics. Namely, we perform a rotation of the coordinate
frame. Then to the coordinates x′_i, y′_i, z′_i of each charge there will
correspond new coordinates x′₁, x′₂, x′₃. We choose the coefficients of the
corresponding linear transformation in such a way that the components of D_αβ
with different values of the indices α and β reduce to zero. It can be shown
that such a choice of the coefficients is always possible.

With the new axes we shall have:

$$D_1 = \tfrac{1}{3}\sum e\left(2x'^2_1 - x'^2_2 - x'^2_3\right) = D_{11},$$

$$D_2 = \tfrac{1}{3}\sum e\left(2x'^2_2 - x'^2_1 - x'^2_3\right) = D_{22},$$

$$D_3 = -(D_1 + D_2) = D_{33}.$$

An important case is the system of charges whose arrangement possesses
axial symmetry. Let the symmetry axis be the x₃-axis. The condition of
symmetry with respect to the x₃-axis allows one to write

$$\sum e\,x'^2_1 = \sum e\,x'^2_2,$$

so that the quadrupole moment has the components

$$D_1 = \tfrac{1}{3}\sum e\left(x'^2_1 - x'^2_3\right) = -\tfrac{1}{2}D,$$

$$D_2 = \tfrac{1}{3}\sum e\left(x'^2_2 - x'^2_3\right) = -\tfrac{1}{2}D,$$

$$D_3 = D.$$

The sign of the quantity D is called the sign of the quadrupole moment.
According to (16.4), the potential of the quadrupole field is equal to

$$\varphi_2 = -\frac{3}{4}D\,\frac{r^2 - 3x_3^2}{r^5} = -\frac{3}{4}D\,\frac{1 - 3\cos^2\theta}{r^3},$$

where θ is the angle between the symmetry axis x₃ and the radius-vector r of
the observation point. The general character of the law of decrease of the
quadrupole potential with increasing distance r is

$$\varphi_2 \sim 1/r^3.$$

Correspondingly, the electric field decreases according to the law

$$|\mathbf{E}| \sim 1/r^4.$$
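
For an axially symmetric system the quadrupole potential depends on a single number D. The sketch below assumes the form φ₂ = −(3/4) D (1 − 3cos²θ)/r³, which follows from (16.4) when the principal values are −D/2, −D/2 and D, and checks it on an assumed test system: charges +1, −2, +1 placed on the symmetry axis at z = 1, 0, −1. The names are illustrative.

```python
import math

def axial_moment(charges):
    """Principal value D = D_33 = sum of e (z^2 - r^2 / 3) along the symmetry axis."""
    return sum(e * (z * z - (x * x + y * y + z * z) / 3.0)
               for e, (x, y, z) in charges)

def axial_quadrupole_potential(D, r, theta):
    """Assumed axial form of the quadrupole potential, derived from (16.4)."""
    return -0.75 * D * (1.0 - 3.0 * math.cos(theta) ** 2) / r ** 3

def exact_potential(charges, point):
    """Exact potential: sum of e_i / |r - r_i|."""
    return sum(e / math.dist(point, pos) for e, pos in charges)

if __name__ == "__main__":
    # Assumed test system: a linear quadrupole on the z axis.
    charges = [(1.0, (0.0, 0.0, 1.0)), (-2.0, (0.0, 0.0, 0.0)), (1.0, (0.0, 0.0, -1.0))]
    D = axial_moment(charges)
    r, theta = 20.0, math.pi / 3.0
    point = (r * math.sin(theta), 0.0, r * math.cos(theta))
    print(D)
    print(axial_quadrupole_potential(D, r, theta), exact_potential(charges, point))
```

The approximate and exact values agree to better than a percent at this distance; the remaining difference comes from higher multipole terms.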


As we have already mentioned in the preceding section, if a system of
charges as a whole is not electrically neutral, then the value of the dipole
moment depends on the choice of the origin. The same also holds for the
quadrupole moment. For the case of a system with a total charge differing
from zero it is convenient to locate the origin at a point with the coordinate

$$\mathbf{r}_0 = \frac{\sum_i e_i\mathbf{r}_i}{\sum_i e_i}.$$

This point can be called the centre of charge of the system. If the origin is
located at the centre of charge, then the dipole moment of the system of
charges reduces automatically to zero. However, this does not hold for its
quadrupole moment. Namely, if the disposition of the charges in the system
is not spherically symmetric, then some or all components of the quadrupole
moment differ from zero. Hence the presence of a quadrupole moment in a
system of charges allows one to infer something about the symmetry of the
system. Thus, for example, the presence of axial symmetry leads to the field
distribution written above.

























In connection with this fact the discovery of a quadrupole moment in a
number of atomic nuclei is of great significance. The presence of the
quadrupole moment of nuclei showed that their form is non-spherical.

If the distribution of charges in a system possesses a very high symmetry,
its quadrupole moment may turn out to be equal to zero. As an example, we
cite a system of eight charges situated at the vertices of an infinitely small
parallelepiped with a regular alternation of the signs of the charges. Such a
system of charges, called an octupole, has neither dipole nor quadrupole
moment. The potential of the octupole field is obtained by taking into
account the fourth term of the expansion (15.9).

Continuing the process of expansion (15.9) one can obtain the field poten- 
tial of multipoles of an arbitrary order. 


§17. Work and energy in an external electrostatic field 


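According to the above, the work of displacement of a test charge from
one point of a field to another can be expressed in terms of the change in the
potential in the form

$$W = \int_1^2 \mathbf{F}\cdot d\mathbf{l} = e\int_1^2 \mathbf{E}\cdot d\mathbf{l} = -e\int_1^2 \nabla\varphi\cdot d\mathbf{l} = e\left[\varphi(\mathbf{r}_1) - \varphi(\mathbf{r}_2)\right] = -e\,\Delta\varphi. \tag{17.1}$$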


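If a displacement of the charges of a system takes place, then the work of
the displacement by vectors $d\mathbf{l}_i$ is equal to

$$dW = \sum_i e_i\,\mathbf{E}\cdot d\mathbf{l}_i.$$

Later we shall need the expression for the rate at which work is done on a
system of charges. We find

$$\frac{dW}{dt} = \sum_i e_i\,\mathbf{E}\cdot\frac{d\mathbf{l}_i}{dt} = \sum_i e_i\,\mathbf{E}\cdot\mathbf{v}_i. \tag{17.2}$$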


For the case of distributed charges 


$$\frac{dW}{dt} = \int \rho\,\mathbf{v}\cdot\mathbf{E}\,dV = \int \mathbf{j}\cdot\mathbf{E}\,dV. \tag{17.3}$$







Knowing the work of displacement of a test charge, we can write its potential
energy in the electrostatic field in the form

$$-\delta U(\mathbf{r}) = \delta W = -e\,\delta\varphi(\mathbf{r}),$$

where $U(\mathbf{r})$ is the potential energy at the point $\mathbf{r}$, and
$\varphi(\mathbf{r})$ is the electrostatic potential at the same point. The
form of the potential energy does not depend on the choice of the coordinate
frame. Hence the relation

$$U = e\varphi \tag{17.4}$$

is valid not only in Cartesian coordinates but also in any generalized
coordinates $q_i$. The generalized forces acting on the test charge can be
written in the form

$$Q_i = -\frac{\partial U}{\partial q_i} = -e\,\frac{\partial\varphi}{\partial q_i}. \tag{17.5}$$


Formulae (17.4) and (17.5) can be applied to the case where instead of a 
test charge an arbitrary system of charges is placed in an external field. In this 
case it is assumed that the external field E is strong in comparison with the 
field produced by the charges of the system. It is assumed moreover that the 
potential of the external field varies sufficiently slowly from point to point. 
The potential energy of the system of charges can be written in the form 


$$U = \sum_i e_i\,\varphi(\mathbf{r}_i), \tag{17.6}$$

where $\varphi(\mathbf{r}_i)$ is the potential of the external field at a point
$\mathbf{r}_i$. Choosing the origin inside the system, we can write the
potential in the form

$$\varphi(\mathbf{r}') = \varphi(x', y', z'),$$

where $x', y', z'$ are the coordinates of the charge relative to the origin.
We now make use of the slowness of the variation of the potential of the
external field in the region of space occupied by the charges. The slowly
varying function $\varphi$ can be expanded in a series in the quantities
$x', y', z'$ characterizing the extension of the system, and we can confine
ourselves to the first terms of the expansion. This gives


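$$\varphi(\mathbf{r}') = \varphi(0,0,0) + x'\frac{\partial\varphi}{\partial x} + y'\frac{\partial\varphi}{\partial y} + z'\frac{\partial\varphi}{\partial z} = \varphi(0) + \mathbf{r}'\cdot\nabla\varphi = \varphi(0) - \mathbf{r}'\cdot\mathbf{E}(0).$$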


Here $\varphi(0)$ and $\mathbf{E}(0)$ are respectively the potential and
strength of the external field at the origin. Substituting the last expression
for $\varphi(\mathbf{r}')$ into (17.6), we find

$$U = \sum_i e_i\,\varphi(0) - \sum_i e_i\,\mathbf{r}_i\cdot\mathbf{E}(0) = \varphi(0)\sum_i e_i - \mathbf{E}(0)\cdot\sum_i e_i\mathbf{r}_i = e\varphi(0) - \mathbf{E}(0)\cdot\mathbf{d}. \tag{17.7}$$

In the first approximation the potential energy of a system of charges in
an external field is equal to the energy of one charge of the value
$e = \sum_i e_i$ located at the origin. In the case of an electrically neutral
system $e = 0$ and

$$U = -\mathbf{d}\cdot\mathbf{E} = -dE\cos\theta, \tag{17.8}$$

where $\theta$ is the angle between the dipole moment of the system and the
external field vector.
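The quality of the approximation (17.7) can be illustrated numerically. The sketch below assumes, purely for illustration, that the external field is produced by a single distant point charge Q (Gaussian units), and compares the exact energy (17.6) of a small neutral pair with the dipole expression; all names and numbers are hypothetical.

```python
import math


def point_charge_potential(q, source, point):
    """Coulomb potential q/R of a point charge located at `source`."""
    return q / math.dist(source, point)


def point_charge_field(q, source, point):
    """Coulomb field q R / R^3 of a point charge located at `source`."""
    distance = math.dist(source, point)
    return [q * (p - s) / distance ** 3 for p, s in zip(point, source)]


# Distant source of the external field and a small neutral system near the origin.
Q, source = 5.0, (0.0, 0.0, -100.0)
charges = [1.0, -1.0]
positions = [(0.0, 0.0, 0.05), (0.0, 0.0, -0.05)]
origin = (0.0, 0.0, 0.0)

# Exact energy, formula (17.6).
exact = sum(e * point_charge_potential(Q, source, r) for e, r in zip(charges, positions))

# First-order expansion, formula (17.7): U = e*phi(0) - E(0).d
total_charge = sum(charges)
dipole = [sum(e * r[k] for e, r in zip(charges, positions)) for k in range(3)]
E0 = point_charge_field(Q, source, origin)
approx = total_charge * point_charge_potential(Q, source, origin) - sum(
    d * E for d, E in zip(dipole, E0)
)
```

With the system a thousand times smaller than its distance from the source, the two values differ only in about the seventh significant figure, which is the sense in which the external potential must vary "sufficiently slowly".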

Let us find the generalized forces acting on a system (assuming the latter 
to be undeformable, so that the distribution of the charges in the system is 
fixed). The generalized force corresponding to coordinates x, y,z is equal to 


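$$\mathbf{F} = -\nabla U = \nabla(\mathbf{d}\cdot\mathbf{E}),$$

or, evaluating the gradient of the product by means of formula (1.47) and
taking into account that $\mathbf{d}$ is a constant vector and that
$\nabla\times\mathbf{E} = 0$ for an electrostatic field, we obtain

$$\mathbf{F} = (\mathbf{d}\cdot\nabla)\mathbf{E} + \mathbf{d}\times(\nabla\times\mathbf{E}) = (\mathbf{d}\cdot\nabla)\mathbf{E}. \tag{17.9}$$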


In a uniform field (E= const) an electrically neutral system with a dipole 
moment is not acted upon by any forces tending to displace it in space. Such 
forces exist only in a field which is non-uniform. 

The generalized force corresponding to the generalized coordinate $\theta$,
which determines the orientation of the dipole moment vector, according to a
well-known proposition of classical mechanics * represents a couple:

$$M = -\frac{\partial U}{\partial\theta} = -dE\sin\theta. \tag{17.10}$$

* L. D. Landau and E. M. Lifshitz, Mechanics (Pergamon, Oxford, 1960).


The couple tends to turn the system in such a way that its dipole moment 
will be orientated parallel to the field. 
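The relation between the energy (17.8) and the couple can be checked by differentiating numerically. This is a small sketch under the assumption that the field lies along the x-axis and the dipole lies in the xy-plane at an angle θ to it; the function names and sample values are illustrative.

```python
import math


def energy(d, E, theta):
    """Potential energy of a dipole in a uniform field, U = -d E cos(theta)."""
    return -d * E * math.cos(theta)


def couple_from_energy(d, E, theta, h=1e-6):
    """Generalized force -dU/dtheta, evaluated by a central difference."""
    return -(energy(d, E, theta + h) - energy(d, E, theta - h)) / (2.0 * h)


def couple_from_cross_product(d, E, theta):
    """z-component of d x E for d = d(cos theta, sin theta, 0), E = (E, 0, 0)."""
    dx, dy = d * math.cos(theta), d * math.sin(theta)
    Ex, Ey = E, 0.0
    return dx * Ey - dy * Ex


d, E, theta = 2.0, 3.0, 0.7
M_energy = couple_from_energy(d, E, theta)
M_cross = couple_from_cross_product(d, E, theta)
```

Both routes give the same negative value for 0 < θ < π, i.e. the couple acts to decrease θ and so to align the dipole with the field.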

The formulae found above allow one to find easily the law of charge- 
dipole and dipole-dipole interaction. For this E(0) is understood to be the 
field produced at the point 0 by the charge or the dipole respectively. 

For the potential energy of the charge-dipole interaction we find 


$$U = -\mathbf{d}\cdot\mathbf{E} = -\frac{e\,(\mathbf{r}\cdot\mathbf{d})}{r^3}, \tag{17.11}$$

where $\mathbf{r}$ is the vector directed from the charge to the system, the
magnitude of which is equal to the distance from the charge to the system (in
this approximation we have to disregard the spatial dimensions of the system).

The dipole-dipole potential energy, according to (15.18), is 


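$$U = -\mathbf{d}_1\cdot\mathbf{E}_2 = \frac{(\mathbf{d}_1\cdot\mathbf{d}_2)\,r^2 - 3(\mathbf{d}_1\cdot\mathbf{r})(\mathbf{d}_2\cdot\mathbf{r})}{r^5}, \tag{17.12}$$

where $\mathbf{r}$ is the vector connecting the two dipoles.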


§18. The interaction energy of a system of charges 
and the electrostatic field energy 


We pass on to the calculation of the energy of a system of interacting
charges. This energy can be calculated most simply in the following way. Let
a charge $e_1$ be fixed at a certain point of space. A charge $e_2$, which was
initially at infinity, is displaced to a point located at a distance $r_{12}$
from the first charge. In this case the external source must do work against
the forces of the field:

$$W_{12} = e_2[\varphi_1(r_{12}) - \varphi_1(r\to\infty)] = e_2\,\varphi_1(r_{12}).$$

Since the potential of the field of the first charge at infinity is equal to
zero, $\varphi_1(r_{12})$ represents the potential of the field of the first
charge at the point $r_{12}$, which is equal to

$$\varphi_1(r_{12}) = e_1/r_{12}.$$

Hence the work of displacement of the second charge is equal to

$$W_{12} = \frac{e_1 e_2}{r_{12}}.$$


If a third charge is added to a system of two charges, it is necessary to
perform the work

$$\frac{e_1 e_3}{r_{13}} + \frac{e_2 e_3}{r_{23}}.$$





Continuing with such a procedure for a system of N charges, we find that for
this one has to expend the work

$$W = e_1\left[\frac{e_2}{r_{12}} + \frac{e_3}{r_{13}} + \dots + \frac{e_N}{r_{1N}}\right] + e_2\left[\frac{e_3}{r_{23}} + \dots + \frac{e_N}{r_{2N}}\right] + \dots = \frac12\sum_{i,k}\frac{e_i e_k}{r_{ik}} \qquad (i\neq k). \tag{18.1}$$

The coefficient $\tfrac12$ is introduced because in the double sum the same
terms, corresponding to each pair of particles, for example $e_1e_2/r_{12}$ and
$e_2e_1/r_{21}$, are encountered twice. In order not to introduce a complicated
restriction upon the performance of the summation, in (18.1) all such terms
are taken into account and the result is reduced by a factor of two.
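The role of the factor one half can be seen by computing the energy in two ways: as a sum over distinct pairs, and as half the sum of e_i φ_i over all charges. A minimal sketch in Gaussian units follows; the function names and sample charges are illustrative assumptions.

```python
import math
from itertools import combinations


def energy_over_pairs(charges, positions):
    """Sum of e_i e_k / r_ik over distinct pairs, each pair counted once."""
    return sum(
        charges[i] * charges[k] / math.dist(positions[i], positions[k])
        for i, k in combinations(range(len(charges)), 2)
    )


def energy_half_double_sum(charges, positions):
    """One half of the sum of e_i * phi_i, where phi_i excludes the i-th charge."""
    n = len(charges)
    total = 0.0
    for i in range(n):
        phi_i = sum(
            charges[k] / math.dist(positions[i], positions[k])
            for k in range(n)
            if k != i
        )
        total += charges[i] * phi_i
    return 0.5 * total


charges = [1.0, -2.0, 3.0]
positions = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 2.0, 0.0)]

U_pairs = energy_over_pairs(charges, positions)
U_double = energy_half_double_sum(charges, positions)
```

The two results coincide to rounding accuracy; without the factor one half the second function would return twice the pair sum.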

Introducing $\varphi_i$, which is the potential produced by all the charges
except the $i$th one at the position of the latter, we can rewrite (18.1) in
the form

$$W = \frac12\sum_i e_i\varphi_i.$$

The work done in forming the charge distribution is equal to the potential
energy stored in the system of particles. Thus,

$$U = \frac12\sum_i e_i\varphi_i = \frac12\sum_{i\neq k}\frac{e_i e_k}{r_{ik}}. \tag{18.2}$$

Passing over from point charges to a continuous charge density distribution,
one can write (18.2) in the form

$$U = \frac12\int\rho\varphi\,dV = \frac12\iint\frac{\rho(\mathbf{r})\,\rho(\mathbf{r}')}{|\mathbf{r}-\mathbf{r}'|}\,dV\,dV'. \tag{18.3}$$


The potential energy of the interaction (18.2) is determined by the
instantaneous positions of all charges in the system. Formulae (18.2) and
(18.3) can be interpreted in the following way: each charge contained in the
system possesses a potential energy $\tfrac12 e_i\varphi_i$, and the energy of
the system is made up of the energies of the charges constituting it.

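We now transform formula (18.3), making use of the field equations.
Expressing $\rho$ in terms of $\mathbf{E}$ according to (14.1), we find by
means of (18.3)

$$U = \frac{1}{8\pi}\int\varphi\,\nabla\cdot\mathbf{E}\,dV = \frac{1}{8\pi}\left[\int\nabla\cdot(\varphi\mathbf{E})\,dV - \int\mathbf{E}\cdot(\nabla\varphi)\,dV\right] = \frac{1}{8\pi}\int E^2\,dV + \frac{1}{8\pi}\oint\varphi\,\mathbf{E}\cdot d\mathbf{S}.$$

The integral over an infinite surface vanishes, since as $r\to\infty$ we have
$\varphi\sim 1/r$, $E\sim 1/r^2$, $S\sim r^2$.

Hence we find finally

$$U = \frac{1}{8\pi}\int E^2\,dV. \tag{18.4}$$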


Formula (18.4) is completely equivalent to (18.3). However, it does not con- 
tain any quantities characterizing the electric charges. 

It is quite clear that the expression (18.4) is a particular case of the gen- 
eral expression for the electromagnetic field energy, and its derivation is a 
particular case of the proof presented in §12. However, it is inherent within 
the framework of electrostatics that neither of the two alternative formulae 
(18.3) and (18.4) is more fundamental. Since formula (18.3) does not contain 
any field characteristics, then in electrostatics the field can be treated as an 
auxiliary, purely mathematical method of describing the interaction between 
particles. The state of the system and its energy in electrostatics are deter- 
mined solely by the values of the charges and their relative positions. 

We have already emphasized earlier that in the general case of a system of 
moving charges and fields varying in time the situation differs radically from
that in electrostatics. The electromagnetic field cannot in general be treated
as a purely mathematical concept. It is a physical object as real as charged
particles.

It is interesting to apply formula (18.4) to a single elementary charge, an
electron or proton. Its energy is equal to

$$U = e\varphi(0),$$

where $\varphi(0)$ is the field potential at the point at which the charge is
located. Since the field considered is the field of the charge itself, its
potential $\varphi = e/r$ increases indefinitely as $r$ tends to zero. This
means that a point particle would have an infinitely large self energy.
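The divergence can also be seen from the field form (18.4). The sketch below integrates the energy density of the Coulomb field E = e/r² over the region outside a sphere of radius a and compares it with the closed form e²/(2a), which grows without bound as a tends to zero. The cut-off radius a, the integration parameters and the function name are assumptions introduced for illustration.

```python
import math


def field_energy_outside(e, a, s_max=30.0, steps=20000):
    """Energy (1/8 pi) * integral of E^2 dV for E = e/r^2 in the region r > a.

    The radial integrand e^2 / (2 r^2) dr is summed with the midpoint rule in
    the variable s = ln(r/a), truncated at r = a * exp(s_max).
    """
    h = s_max / steps
    total = 0.0
    for n in range(steps):
        r = a * math.exp((n + 0.5) * h)
        total += e * e / (2.0 * r) * h   # e^2/(2 r^2) dr with dr = r ds
    return total


e = 1.0
energies = {a: field_energy_outside(e, a) for a in (1.0, 0.1, 0.01)}
```

Each tenfold reduction of the cut-off radius multiplies the stored field energy by ten, in agreement with e²/(2a).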

Thus, the concept of particles as point objects having no spatial extension 
leads to a physically senseless result. In this connection a number of attempts 
have been made to construct an electrodynamic theory of elementary par- 
ticles having finite dimensions (theory of the extended electron). As we shall 
see in Part II, this theory turned out to be incompatible with the basic prop- 
ositions of the theory of relativity. 

In the problem of the self energy of an elementary charge, classical electro- 
dynamics encountered an insurmountable difficulty. It was clear that the 
laws of classical electrodynamics, which had been in excellent agreement with 
experimental facts in the field of macroscopic physics, had a limited region of 
applicability. In going down to very small distances they must undergo funda- 
mental changes. We shall speak about the limits of applicability of classical 
electrodynamics in later paragraphs of Part I. 








The Quasistationary Magnetic Field 


§ 19. The field of a system of charges undergoing 
a slow quasistationary motion 


Next in degree of complexity is the case of the field of charges performing 
a slow quasistationary motion. 

We shall call the motion of a system of charges slow when the velocities
$|\mathbf{v}|$ are small in comparison with the value of $c$, the unique
characteristic quantity with the dimension of velocity contained in Maxwell's
equations. We shall see later that the velocity $c$ represents the velocity of
propagation in space of all electromagnetic interactions.

Thus, the assumption of a slow motion of charges means that the finite 
velocity of propagation of electromagnetic fields can be disregarded (see 
§23). For a slow motion it can be assumed that the field at any instant of 
time is determined approximately by the instantaneous disposition of the 
charges. 

We shall understand quasistationary motion to be a motion of charges in
a certain limited region which the charges do not leave during the motion. In
this region the charges may move periodically or non-periodically. In the
latter case, however, over a very long time the particles will inevitably
either pass through the same sequences of states as in a periodic motion, or,
in any case, through sequences of similar states. In other words, the motion
will be almost periodic. It will be shown below that under these conditions
the time derivatives of fields in Maxwell's equations are small compared with
spatial derivatives. Hence the term quasistationary motion.

Since the particles cannot go out of the region, the following condition 
must be fulfilled at the surface S} bounding the region: 


$$j_n = 0. \tag{19.1}$$

Here $j_n$ is the component of the current density normal to the surface.

For a slow motion of charges the variation of the charge density in time
can be considered small, i.e. it can be assumed that

$$\nabla\cdot\mathbf{j} = 0. \tag{19.2}$$

Thus, for a quasistationary motion the current density vector has a solenoidal
character. In other words, the quasistationary character of the motion of the
charges allows one to represent their trajectories in the form of closed tubes
or threads. Each of these tubes is closed on itself inside the region of
motion. Such a representation is particularly obvious in the case of
macroscopic direct currents flowing, for example, along closed conductors (see
§17, Part IV).

For every closed tube of current there holds the equality

$$\mathbf{j}\,dV = \mathbf{j}\,dS\,dl = j\,dS\,d\mathbf{l} = dI\,d\mathbf{l}, \tag{19.3}$$







where $dS$ is the cross section, $j\,dS = dI$ is the direct current flowing
through the cross section of the tube, and $d\mathbf{l}$ is an element of the
tube length. The directions of the vectors $\mathbf{j}$ and $d\mathbf{l}$ are
obviously the same. The integral of the current density over the entire volume
is

$$\int\mathbf{j}\,dV = \iint dI\,d\mathbf{l} = \int dI\oint d\mathbf{l} = 0, \tag{19.4}$$

since the integral along the closed tube is $\oint d\mathbf{l} = 0$. The
meaning of this equality is very simple. Let us consider, for example, its
x-component. The integral $\int j_x\,dy\,dz$ represents the total current
through the plane (yz) crossing the current tube (fig. 1.7). In a
quasistationary state of the system the total
current through any cross section is equal to zero. The number of charges 
passing along the normal to the cross section through all current tubes in both 
directions must be the same, since the charges are moving in a limited region 
of space. 

We now pass over to the formulation of Maxwell’s equations for the field 
of a system of charges performing a slow quasistationary motion. 

In order to find out which simplifications can be introduced into Maxwell’s 
equations for such a motion, we shall estimate the order of magnitude of the 
terms of the equations. Such methods of estimation are widely used in theo- 
retical physics. 

We begin with the estimation of the time derivatives appearing in Maxwell's
equations. Since the system considered performs a periodic or almost periodic
motion, the order of magnitude of the quantities
$|\partial\mathbf{E}/\partial t|$ and $|\partial\mathbf{H}/\partial t|$ can be
estimated by writing

$$\left|\frac{\partial\mathbf{E}}{\partial t}\right| \sim \frac{E}{T}, \qquad \left|\frac{\partial\mathbf{H}}{\partial t}\right| \sim \frac{H}{T},$$

where $T$ is the characteristic period of motion. The quantities $E$ and $H$
are the characteristic mean absolute values of field strengths in the region
of space occupied by the system of charges. Of course, it makes no sense to
seek to define these quantities more precisely by relating them to a definite
instant of time or to a definite point of space. The relations written have
the meaning of rough estimates of the order of magnitude.

Further, let us find the order of magnitude of the spatial derivatives of the
field in the same region of space. The fields $\mathbf{E}$ and $\mathbf{H}$ in
real systems performing a quasistationary motion vary from point to point in
general reasonably smoothly. If the mean dimensions of the system are denoted
by $L$, then the order of magnitude of all spatial derivatives can be written
in the form


$$\left|\frac{\partial E_i}{\partial x_k}\right| \sim \frac{E}{L}, \qquad \left|\frac{\partial H_i}{\partial x_k}\right| \sim \frac{H}{L}.$$

In this rough estimate we disregard the field distribution in the system and
the specific dependence on different coordinates.

The quasistationary condition is that the temporal variations of the fields
should be sufficiently slow that in Maxwell's equations one can discard the
terms containing the time derivatives, which have small coefficients compared
with those of the terms characterizing the spatial variation of the fields.
For this the following inequalities must be fulfilled (to order of magnitude):




















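$$\left|\frac{\partial E_i}{\partial x_k}\right| \gg \frac1c\left|\frac{\partial H_i}{\partial t}\right|, \qquad \left|\frac{\partial H_i}{\partial x_k}\right| \gg \frac1c\left|\frac{\partial E_i}{\partial t}\right|,$$

or

$$\frac{E}{L} \gg \frac1c\,\frac{H}{T}, \qquad \frac{H}{L} \gg \frac1c\,\frac{E}{T}. \tag{19.4'}$$

However, in this case the following equalities must be fulfilled:

$$\frac{\partial E_i}{\partial x_k} \approx \frac{\partial E_k}{\partial x_i}, \qquad \frac{\partial H_i}{\partial x_k} \approx \frac{\partial H_k}{\partial x_i},$$

so that the differences between the spatial derivatives contained in the
Maxwell equations compensate mutually for each other, and the derivatives with
respect to time (with the coefficient $1/c$) turn out to be quantities of a
higher degree of smallness.

Multiplying the inequalities (19.4') we arrive at the condition of
quasistationarity

$$T \gg L/c. \tag{19.5}$$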


The inequality (19.5) and the equivalent inequality

$$c \gg L/T \sim v, \tag{19.5'}$$

where the quantity $v \sim L/T$ can be interpreted as the characteristic
velocity of the motion of charges in the system, have an obvious meaning.
Later, in §23, it will be shown that the velocity $c$ contained in Maxwell's
equations represents the velocity of propagation of the electromagnetic field
in vacuum. For a quasistationary motion of charges their velocities must be
small in comparison with the velocity of propagation of the field.

In this case the variation of the electromagnetic field in time is slow, so
that the derivatives of the field with respect to time are of a higher degree
of smallness than the spatial derivatives, and may be discarded.

Those electromagnetic fields for which the inequality (19.5) is valid and 
in which the displacement current can be regarded as small are called quasi- 
stationary fields. 
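The condition (19.5) is easy to evaluate for concrete systems. The sketch below is only an order-of-magnitude check in CGS units; the safety factor `margin` that stands in for "much greater than", and the two example systems, are assumptions introduced here and are not part of the text.

```python
SPEED_OF_LIGHT_CM_PER_S = 2.99792458e10


def light_crossing_time(size_cm):
    """Time L/c needed for the field to cross a system of size L."""
    return size_cm / SPEED_OF_LIGHT_CM_PER_S


def is_quasistationary(size_cm, period_s, margin=100.0):
    """Rough test of T >> L/c; `margin` is an arbitrary illustrative factor."""
    return period_s >= margin * light_crossing_time(size_cm)


# A circuit 1 m across driven at 50 Hz, and the same size driven at 1 GHz.
slow_circuit = is_quasistationary(size_cm=100.0, period_s=1.0 / 50.0)
fast_circuit = is_quasistationary(size_cm=100.0, period_s=1.0e-9)
```

For a system one metre across the crossing time L/c is about 3·10⁻⁹ s, so a 50 Hz current satisfies the condition by many orders of magnitude, while a period of 10⁻⁹ s does not.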

For quasistationary fields the Maxwell equations assume the following 
form: 


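$$\nabla\times\mathbf{H} = \frac{4\pi}{c}\,\mathbf{j}, \tag{19.6}$$
$$\nabla\cdot\mathbf{H} = 0, \tag{19.7}$$
$$\nabla\times\mathbf{E} = 0, \tag{19.8}$$
$$\nabla\cdot\mathbf{E} = 4\pi\rho. \tag{19.9}$$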


Thus, in the approximation of quasistationary fields the displacement current
does not enter into the field equations. We have already seen that the
absence of the displacement current corresponds to the solenoidal character
of the current lines. Conversely, if current tubes are almost closed, while the
motion of charges takes place in a limited volume and is almost periodic, then
the displacement current must be very small in comparison with the current
of the moving charges.

Maxwell's equations turn out to be resolved into equations for independent
fields: the magnetic field due to currents and the electric field due to
charges.

The charge density $\rho$ in eq. (19.9) depends on time as a parameter. In the
approximation of slowly moving charges the solution of the equations for the
electric field leads to the following obvious result: at every instant of time
the electric field is the same as the electrostatic field of the given
configuration of charges.

The magnetic field of a system of charges performing a slow quasistationary
motion can be found by integrating (19.6) and (19.7). We introduce the
vector potential according to formula (10.1). Since the time dependence of
the charge density $\rho$ and the current $\mathbf{j}$ can be disregarded, the
strengths of the magnetic and electric fields and, consequently, also the
electromagnetic potentials do not depend on time. Hence the equations for the
vector potential (10.6) and the Lorentz condition (10.5) assume the form

$$\nabla^2\mathbf{A} = -\frac{4\pi}{c}\,\mathbf{j}, \tag{19.10}$$

$$\nabla\cdot\mathbf{A} = 0. \tag{19.11}$$

Eq. (19.10) represents a set of three scalar equations:

$$\nabla^2 A_i = -\frac{4\pi}{c}\,j_i \qquad (i = x, y, z),$$

each of which is Poisson’s equation. 

We shall assume that all the components of the vector potential of a sys- 
tem of charges performing a slow quasistationary motion decrease at infinity 
not more slowly than 1/r: 


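$$A_i \sim O(1/r) \to 0 \quad\text{as}\quad r\to\infty. \tag{19.12}$$

Here the symbol $O$ means that the discarded terms are of an order of
magnitude less than $1/r$. The solution of eq. (19.10) satisfying the
requirement (19.12) can be written according to formula (3.16) in the form

$$\mathbf{A} = \frac1c\int\frac{\mathbf{j}(\mathbf{r}')\,dV'}{R} = \frac1c\int\frac{\mathbf{j}(x',y',z')\,dx'\,dy'\,dz'}{\sqrt{(x-x')^2+(y-y')^2+(z-z')^2}}, \tag{19.13}$$

where $\mathbf{j}(\mathbf{r}')$ is the current density at a point
$\mathbf{r}'$, and $R = |\mathbf{r}-\mathbf{r}'|$ is the distance from this
point to the observation point $\mathbf{r}$ at which the value of the vector
potential is sought.

It is easy to see that the solution (19.13) of eq. (19.10) satisfies the
additional condition (19.11). Indeed, we have

$$\nabla\cdot\mathbf{A} = \frac1c\,\nabla_{\mathbf{r}}\cdot\int\frac{\mathbf{j}(\mathbf{r}')}{R}\,dV',$$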


where the divergence is taken with respect to the coordinates of the
observation point. In view of the independence of the operations of
differentiation with respect to the coordinates $\mathbf{r}$ and integration
with respect to the coordinates $\mathbf{r}'$, their order can be altered. The
density of the current $\mathbf{j}(\mathbf{r}')$ could be brought outside the
sign $\nabla_{\mathbf{r}}$, but this serves no purpose. A substitution of the
variable of differentiation according to a formula analogous to (1.17) gives

$$\nabla\cdot\mathbf{A} = -\frac1c\int\nabla_{\mathbf{r}'}\cdot\left(\frac{\mathbf{j}}{R}\right)dV' = -\frac1c\oint\frac{j_n}{R}\,dS = 0,$$

by virtue of the condition (19.1).
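Thus, formula (19.13) gives a solution of the problem satisfying all necessary
conditions. Knowing the vector potential one can find the magnetic field

$$\mathbf{H} = \nabla\times\mathbf{A} = \frac1c\,\nabla\times\int\frac{\mathbf{j}\,dV'}{R} = \frac1c\int\nabla\times\frac{\mathbf{j}}{R}\,dV'. \tag{19.14}$$

In differentiating with respect to the coordinates $\mathbf{r}$ the density of
the current $\mathbf{j}(\mathbf{r}')$ must be assumed to be constant. Then
according to formula (1.43) we find

$$\nabla\times\frac{\mathbf{j}}{R} = \left(\nabla\frac1R\right)\times\mathbf{j} = -\frac{\mathbf{R}\times\mathbf{j}}{R^3} = \frac{\mathbf{j}\times\mathbf{R}}{R^3}.$$

Hence, finally,

$$\mathbf{H} = \frac1c\int\frac{\mathbf{j}\times\mathbf{R}}{R^3}\,dV'. \tag{19.15}$$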


Formula (19.15) is called the Biot-Savart law. It gives, in principle, the
solution of the problem posed. However, the calculation of the integral in
formula (19.15) is rather complicated and can be carried through only for the
simplest systems.
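For systems where the integral cannot be done in closed form, (19.15) can be evaluated numerically. The following sketch does this for a thin circular loop, replacing j dV' by I dl as in (19.3) and summing over short segments. Gaussian units are used, with c left as a parameter; the function name, the discretization and the sample values are illustrative assumptions.

```python
import math


def loop_field(point, radius, current, c=1.0, segments=2000):
    """Field H at `point` of a circular loop lying in the xy-plane.

    Discretized form of (19.15) for a thin wire: H = (I/c) * sum(dl x R / R^3),
    where R points from the current element to the observation point.
    """
    H = [0.0, 0.0, 0.0]
    dphi = 2.0 * math.pi / segments
    for n in range(segments):
        phi = (n + 0.5) * dphi
        source = (radius * math.cos(phi), radius * math.sin(phi), 0.0)
        dl = (-radius * math.sin(phi) * dphi, radius * math.cos(phi) * dphi, 0.0)
        R = tuple(p - s for p, s in zip(point, source))
        R3 = math.sqrt(sum(x * x for x in R)) ** 3
        cross = (
            dl[1] * R[2] - dl[2] * R[1],
            dl[2] * R[0] - dl[0] * R[2],
            dl[0] * R[1] - dl[1] * R[0],
        )
        for k in range(3):
            H[k] += current * cross[k] / (c * R3)
    return H


H_centre = loop_field((0.0, 0.0, 0.0), radius=2.0, current=3.0)
H_axis = loop_field((0.0, 0.0, 1.5), radius=2.0, current=3.0)
```

At the centre of the loop the sum reproduces the known value 2πI/(ca), and on the axis the value 2πIa²/[c(a² + z²)^{3/2}], which gives a simple check of the discretization before it is applied to less symmetric current paths.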

In the next paragraph formula (19.15) is applied to find the field of a
single charge moving in vacuum. However, it is particularly important for the
calculation of the fields of currents flowing in conductors. Hence further
examples of calculations by means of the Biot-Savart law will be analyzed in
§17 of Part IV.

It should be stressed that all the results of this paragraph, in particular the
Biot-Savart law, are of an approximate character. They are a consequence of
the relations (19.5). Quasistationary fields are encountered particularly often
in considering electromagnetic processes in material media. Hence we shall
return to them in Part IV (§22), where the conditions under which a field
can be assumed to be quasistationary will be discussed in more detail.





In conclusion we present a relation which we shall need later. The dimensions
of that region of space in which the action of a magnetic field on a
system is considered are very often sufficiently small that in the limits of
this region the magnetic field can be assumed to be constant and uniform.
Then the vector potential $\mathbf{A}$ of this constant uniform field can be
written in the form

$$\mathbf{A} = \tfrac12\,\mathbf{H}\times\mathbf{r}. \tag{19.16}$$


One can verify the validity of this relation by a direct calculation according 
to formula (1.45). 
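The same verification can be carried out numerically. The sketch below builds A = ½ H × r for an arbitrary constant H and recovers H as the curl of A by central differences; since A is linear in r, the differences are exact up to rounding. The helper names and the sample vectors are illustrative assumptions.

```python
def vector_potential(H, r):
    """A = (1/2) H x r for a constant uniform field H, formula (19.16)."""
    return (
        0.5 * (H[1] * r[2] - H[2] * r[1]),
        0.5 * (H[2] * r[0] - H[0] * r[2]),
        0.5 * (H[0] * r[1] - H[1] * r[0]),
    )


def curl(F, r, h=1e-3):
    """Curl of the vector function F at the point r by central differences."""
    def partial(component, axis):
        plus, minus = list(r), list(r)
        plus[axis] += h
        minus[axis] -= h
        return (F(plus)[component] - F(minus)[component]) / (2.0 * h)

    return (
        partial(2, 1) - partial(1, 2),
        partial(0, 2) - partial(2, 0),
        partial(1, 0) - partial(0, 1),
    )


H = (0.3, -1.2, 2.0)
recovered = curl(lambda r: vector_potential(H, r), (0.7, -0.4, 1.1))
```

The recovered vector agrees with H component by component, independently of the point at which the curl is taken.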


§20. The field of a point charge undergoing a slow uniform motion 


Let us consider a point charge $e$ moving with a constant velocity
$\mathbf{v}_0$ (fig. 1.8). We assume the absolute value of the velocity
$|\mathbf{v}_0|$ to be small in comparison with the characteristic velocity $c$
occurring in Maxwell's equations (see further §23).

The charge density $\rho$ can be written in the form

$$\rho(\mathbf{r}', t) = e\,\delta(\mathbf{r}' - \mathbf{r}_0(t)). \tag{20.1}$$

Here $\mathbf{r}_0(t)$ is the coordinate of the charge at an instant $t$.
Formula (20.1) means that at any given moment the charge is present at a point
of space with the coordinate $\mathbf{r}_0(t)$. Maxwell's equations for the
electric field have the form

$$\nabla\times\mathbf{E} = 0,$$

$$\nabla\cdot\mathbf{E} = 4\pi e\,\delta(\mathbf{r} - \mathbf{r}_0(t)).$$

Since the field is irrotational, a potential $\varphi(\mathbf{r}, t)$ can be
introduced, so that

$$\mathbf{E} = -\nabla\varphi.$$

The potential satisfies the equation

$$\nabla^2\varphi = -4\pi e\,\delta(\mathbf{r} - \mathbf{r}_0(t)).$$


The solution of the equation can, according to (15.1), be written in the 
form 





$$\varphi = \int\frac{\rho\,dV'}{|\mathbf{r}-\mathbf{r}'|} = e\int\frac{\delta(\mathbf{r}'-\mathbf{r}_0(t))}{|\mathbf{r}-\mathbf{r}'|}\,dV' = \frac{e}{|\mathbf{r}-\mathbf{r}_0(t)|} = \frac{e}{R(t)}. \tag{20.2}$$

Here $\mathbf{R}(t)$ stands for a vector connecting the point of observation N
with the instantaneous charge coordinate $\mathbf{r}_0(t)$.

From formula (20.2) it follows that the electric field of the moving charge
is formally the same as that of a charge at rest, but, instead of the fixed
distance from the given point of observation to the charge, the time-dependent
distance $R(t)$ figures in (20.2). The field of the charge is obviously given
by the formula

$$\mathbf{E} = \frac{e\,\mathbf{R}(t)}{R^3(t)}. \tag{20.3}$$


Since the charge is moving uniformly, its position in space can be written 
in the form 


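$$\mathbf{r}_0 = \mathbf{v}_0 t.$$

Hence the electric field strength at a point will depend on time according to
the law

$$\mathbf{E}(\mathbf{r}, t) = \frac{e\,(\mathbf{r}-\mathbf{v}_0 t)}{|\mathbf{r}-\mathbf{v}_0 t|^3}. \tag{20.4}$$

It is obvious that at a certain point $\mathbf{r}$ at time $t$ the field
strength will be the same as that at a point with the coordinate
$\mathbf{r}+\mathbf{v}_0$ at time $t+1$. Indeed,

$$\mathbf{E}(\mathbf{r}+\mathbf{v}_0, t+1) = \frac{e\,[\mathbf{r}+\mathbf{v}_0-\mathbf{v}_0(t+1)]}{|\mathbf{r}+\mathbf{v}_0-\mathbf{v}_0(t+1)|^3} = \mathbf{E}(\mathbf{r}, t).$$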


This means that the point with given value of field strength moves in space
together with the charge. The field retains spherical symmetry with respect to
the instantaneous position of the charge. Later we shall compare formula
(20.4) with the corresponding expression for the field produced by a charge
moving with a velocity $|\mathbf{v}| \sim c$.

Let us now pass on to the determination of the magnetic field. It can be
found by means of formula (19.15), in which the current density for a single
charge can be written in the form

$$\mathbf{j} = e\,\mathbf{v}_0\,\delta(\mathbf{r}-\mathbf{r}_0).$$

Substituting the value of $\mathbf{j}$ into (19.15), we find

$$\mathbf{H} = \frac ec\int\frac{[\mathbf{v}_0\times(\mathbf{r}-\mathbf{r}')]\,\delta(\mathbf{r}'-\mathbf{r}_0)}{|\mathbf{r}-\mathbf{r}'|^3}\,dV' = \frac ec\,\frac{\mathbf{v}_0\times\mathbf{R}}{R^3} = \frac1c\,\mathbf{v}_0\times\mathbf{E}. \tag{20.5}$$

Thus, the magnetic field turns out to be perpendicular to the electric field
and to the velocity of the charge. The absolute value
$|\mathbf{H}| \sim (v_0/c)\,|\mathbf{E}|$, i.e. the absolute value of the
magnetic field is small in comparison with the electric field by the ratio
$v_0/c$. Differentiating according to formula (1.45) one can see that the
vector $\mathbf{H}$ satisfies eq. (19.7).
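The properties of the fields (20.4) and (20.5) stated above can be confirmed numerically. The sketch below is written in units with c = 1 and uses an arbitrary slow velocity; the function names and numerical values are illustrative assumptions.

```python
import math


def cross(a, b):
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def slow_charge_fields(e, v0, r, t, c):
    """E from (20.4) and H from (20.5) for a charge at r0 = v0 t, with |v0| << c."""
    R = [ri - vi * t for ri, vi in zip(r, v0)]
    R3 = math.sqrt(dot(R, R)) ** 3
    E = [e * x / R3 for x in R]
    H = [x / c for x in cross(v0, E)]
    return E, H


e, c = 1.0, 1.0
v0 = (0.01, 0.0, 0.0)
E, H = slow_charge_fields(e, v0, r=(1.0, 2.0, 0.5), t=3.0, c=c)
# The same field is found one unit of time later at the point displaced by v0.
E_shifted, H_shifted = slow_charge_fields(e, v0, r=(1.01, 2.0, 0.5), t=4.0, c=c)
```

The computed H is orthogonal to both E and v0, its magnitude does not exceed (v0/c)|E|, and the field pattern is carried along with the charge.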

In §9 we have discussed, without calculation, the problem of finding the
field of a moving charge. We have shown that if the integration surface $S_0$
(fig. 4) passes through the point at which the moving charge is present at a
given instant, the magnetic field is connected with the charge current
$\mathbf{j}$. This picture corresponds to the calculation we have performed.

Formula (20.5) has been obtained as a result of solving eqs. (19.6) and 
(19.7), in which the displacement current was absent. However, it is possible 
to find the value of the magnetic field from the displacement current on an 
arbitrary surface S}, without making use of formula (19.15). 

We write the equations for the magnetic field in the form 


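$$\nabla\times\mathbf{H} = \frac1c\,\frac{\partial\mathbf{E}}{\partial t} + \frac{4\pi}{c}\,e\,\mathbf{v}_0\,\delta(\mathbf{r}-\mathbf{r}_0), \tag{20.6}$$

$$\nabla\cdot\mathbf{H} = 0. \tag{20.7}$$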




The electric field E of the uniformly moving charge depends on time accord- 
ing to formula (20.4). Differentiating E with respect to time, we find 


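$$\frac{\partial\mathbf{E}}{\partial t} = -\left(\frac{\partial\mathbf{E}}{\partial x}\,v_{0x} + \frac{\partial\mathbf{E}}{\partial y}\,v_{0y} + \frac{\partial\mathbf{E}}{\partial z}\,v_{0z}\right) = -(\mathbf{v}_0\cdot\nabla)\mathbf{E}.$$

From formula (1.45) we have

$$\nabla\times(\mathbf{v}_0\times\mathbf{E}) = (\mathbf{E}\cdot\nabla)\mathbf{v}_0 - (\mathbf{v}_0\cdot\nabla)\mathbf{E} + \mathbf{v}_0(\nabla\cdot\mathbf{E}) - \mathbf{E}(\nabla\cdot\mathbf{v}_0) = -(\mathbf{v}_0\cdot\nabla)\mathbf{E} + \mathbf{v}_0(\nabla\cdot\mathbf{E}),$$

since $\mathbf{v}_0$ is a constant vector. Whence we find

$$\frac{\partial\mathbf{E}}{\partial t} = -(\mathbf{v}_0\cdot\nabla)\mathbf{E} = \nabla\times(\mathbf{v}_0\times\mathbf{E}) - \mathbf{v}_0(\nabla\cdot\mathbf{E}) = \nabla\times(\mathbf{v}_0\times\mathbf{E}) - 4\pi\mathbf{v}_0\,e\,\delta(\mathbf{r}-\mathbf{r}_0), \tag{20.8}$$

by virtue of (20.1).

Substituting (20.8) into (20.6), we have

$$\nabla\times\mathbf{H} = \frac1c\,\nabla\times(\mathbf{v}_0\times\mathbf{E}). \tag{20.9}$$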


Expression (20.5) provides the solution of (20.9) satisfying (20.7). Thus we 
see that the two methods of calculation lead to the same result, as was to be 
expected. 


§21. The field of a system of charges undergoing quasistationary motion at 
a large distance from the system 


Let us assume that a certain set of charges performs a slow and
quasistationary motion in a limited region of space. The electromagnetic field
of this system at a distance large in comparison with the dimensions of the
system (dimensions of the region of motion) is often of great interest. Here,
as well as in the case of electrostatics, the general formula for the vector
potential (19.13) allows us to make a substantial simplification.

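We make use of the expansion (15.8), writing

$$\frac1R = \frac1r - \left(\mathbf{r}'\cdot\nabla\frac1r\right).$$

Substituting $1/R$ into (19.13), we find

$$\mathbf{A} = \frac{1}{cr}\int\mathbf{j}\,dV' - \frac1c\int\mathbf{j}\left(\mathbf{r}'\cdot\nabla\frac1r\right)dV'.$$

By virtue of (19.4) for a system performing stationary motion the first
integral is equal to zero. Hence

$$\mathbf{A} = -\frac1c\int\mathbf{j}\left(\mathbf{r}'\cdot\nabla\frac1r\right)dV'. \tag{21.1}$$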


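We transform the integrand in (21.1) by means of the identity

$$\mathbf{j}\left(\mathbf{r}'\cdot\nabla\frac1r\right) = \frac12\left\{\mathbf{j}\left(\mathbf{r}'\cdot\nabla\frac1r\right) - \mathbf{r}'\left(\mathbf{j}\cdot\nabla\frac1r\right)\right\} + \frac12\left\{\mathbf{j}\left(\mathbf{r}'\cdot\nabla\frac1r\right) + \mathbf{r}'\left(\mathbf{j}\cdot\nabla\frac1r\right)\right\}.$$

The first bracketed pair of terms can be presented in the form of the triple
vector product:

$$\mathbf{j}\left(\mathbf{r}'\cdot\nabla\frac1r\right) - \mathbf{r}'\left(\mathbf{j}\cdot\nabla\frac1r\right) = (\mathbf{r}'\times\mathbf{j})\times\nabla\frac1r = -\frac{(\mathbf{r}'\times\mathbf{j})\times\mathbf{r}}{r^3}.$$

Thus the vector potential can be written in the form

$$\mathbf{A} = \frac{1}{2c}\int\frac{(\mathbf{r}'\times\mathbf{j})\times\mathbf{r}}{r^3}\,dV' - \frac{1}{2c}\int\left\{\mathbf{j}\left(\mathbf{r}'\cdot\nabla\frac1r\right) + \mathbf{r}'\left(\mathbf{j}\cdot\nabla\frac1r\right)\right\}dV'.$$

We go on to the calculation of the second integral. Introducing current tubes,
according to the general theory of §19, we can write

$$\mathbf{j}\,dV' = dI\,d\mathbf{l} = dI\,d\mathbf{r}'.$$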


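Indeed, the change of the charge position $d\mathbf{r}'$ in its motion along
the current tube is identical with $d\mathbf{l}$. Hence

$$\int\left\{\mathbf{j}\left(\mathbf{r}'\cdot\nabla\frac1r\right) + \mathbf{r}'\left(\mathbf{j}\cdot\nabla\frac1r\right)\right\}dV' = \int dI\oint\left\{d\mathbf{r}'\left(\mathbf{r}'\cdot\nabla\frac1r\right) + \mathbf{r}'\left(d\mathbf{r}'\cdot\nabla\frac1r\right)\right\} = \int dI\oint d\left(\mathbf{r}'\left(\mathbf{r}'\cdot\nabla\frac1r\right)\right) = 0,$$

since the integral over a closed contour of the total differential is always
equal to zero.

Thus, finally,

$$\mathbf{A} = \frac{1}{2cr^3}\int(\mathbf{r}'\times\mathbf{j})\times\mathbf{r}\,dV' = \frac{1}{2cr^3}\left[\int(\mathbf{r}'\times\mathbf{j})\,dV'\right]\times\mathbf{r}. \tag{21.2}$$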


We introduce the notation

$$\mathbf{M} = \frac{1}{2c}\int\mathbf{r}'\times\mathbf{j}\,dV'. \tag{21.3}$$


The quantity M is called the magnetic moment of the system of charges. It 
depends solely on the properties of the charge system — the current density 
distribution and the geometry of the system. We shall see further that M is 
indeed, to a certain degree, an analogue of the dipole moment of a system of 
charges at rest. 


The vector potential at a large distance from the system assumes the form 


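$$\mathbf{A} = \frac{\mathbf{M}\times\mathbf{r}}{r^3}. \tag{21.4}$$

The magnetic field is expressed, according to (1.45), by the formula

$$\mathbf{H} = \nabla\times\mathbf{A} = \nabla\times\frac{\mathbf{M}\times\mathbf{r}}{r^3} = \mathbf{M}\left(\nabla\cdot\frac{\mathbf{r}}{r^3}\right) - (\mathbf{M}\cdot\nabla)\frac{\mathbf{r}}{r^3}.$$

Since the differentiation is carried out with respect to the coordinates of the
observation point, the magnetic moment of the system is constant in the
differentiation.

According to (1.42) we find

$$\nabla\cdot\frac{\mathbf{r}}{r^3} = \mathbf{r}\cdot\nabla\frac{1}{r^3} + \frac{1}{r^3}\,\nabla\cdot\mathbf{r} = -\frac{3}{r^3} + \frac{3}{r^3} = 0.$$

Further,

$$(\mathbf{M}\cdot\nabla)\frac{\mathbf{r}}{r^3} = \frac{1}{r^3}\,(\mathbf{M}\cdot\nabla)\mathbf{r} + \mathbf{r}\left(\mathbf{M}\cdot\nabla\frac{1}{r^3}\right) = \frac{\mathbf{M}}{r^3} - \frac{3\mathbf{r}(\mathbf{M}\cdot\mathbf{r})}{r^5}.$$

Hence

$$\mathbf{H} = \frac{3\mathbf{r}(\mathbf{M}\cdot\mathbf{r}) - r^2\mathbf{M}}{r^5}. \tag{21.5}$$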


We see that the magnetic field at a large distance from a system of charges 
in slow and quasistationary motion, is expressed by the same formula as the 
electrostatic field of a system of charges at rest. The difference consists of the 
fact that, instead of the dipole moment d in (15.18), the magnetic moment M 
of the system occurs in formula (21.5). 
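How quickly the dipole formula (21.5) becomes accurate can be judged on a simple example. The sketch below compares it, on the axis of a circular current loop, with the known exact on-axis field 2πIa²/[c(a² + z²)^{3/2}]. The magnetic moment is taken as M = IS/c for a flat loop, the value obtained in §22; the function names, the units (c = 1) and the sample numbers are illustrative assumptions.

```python
import math


def dipole_field(M, r):
    """H = (3 r (M.r) - r^2 M) / r^5, formula (21.5)."""
    r2 = sum(x * x for x in r)
    M_dot_r = sum(m * x for m, x in zip(M, r))
    return [(3.0 * x * M_dot_r - r2 * m) / r2 ** 2.5 for m, x in zip(M, r)]


def loop_axis_field(current, radius, z, c=1.0):
    """Exact on-axis field of a circular current loop (Gaussian units)."""
    return 2.0 * math.pi * current * radius ** 2 / (c * (radius ** 2 + z ** 2) ** 1.5)


current, radius, c = 3.0, 1.0, 1.0
M = (0.0, 0.0, current * math.pi * radius ** 2 / c)   # flat loop: M = I S / c

ratios = {}
for z in (2.0, 10.0, 100.0):
    exact = loop_axis_field(current, radius, z, c)
    far = dipole_field(M, (0.0, 0.0, z))[2]
    ratios[z] = far / exact
```

At two loop radii the dipole formula overestimates the field by roughly 40 per cent; at a hundred radii the two agree to better than one part in a thousand.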


§22. The magnetic moment 


We shall consider in somewhat more detail the properties of the magnetic 
moment of a system. First of all it is easy to verify the fact that the value of 
the magnetic moment does not depend on the choice of the origin. Shifting 
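the origin by a constant vector $\mathbf{a}$, i.e. assuming
$\mathbf{r}' = \mathbf{r}'' + \mathbf{a}$, we find

$$\mathbf{M} = \frac{1}{2c}\int\mathbf{r}'\times\mathbf{j}\,dV' = \frac{1}{2c}\int\mathbf{r}''\times\mathbf{j}\,dV' + \frac{1}{2c}\int\mathbf{a}\times\mathbf{j}\,dV'.$$

Rewriting the additional term in the form

$$\int\mathbf{a}\times\mathbf{j}\,dV' = \mathbf{a}\times\int\mathbf{j}\,dV',$$

we see by virtue of (19.4) that it is equal to zero.

Thus, the magnetic moment, like the dipole moment of a neutral system
of charges, represents a quantity which depends only on the physical
properties of the system but not on the choice of the origin.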

Consider the expression of the magnetic moment in the case when charges
are moving along a thread or a tube (fig. I.9). Making use of (19.3), we find

$$\mathbf{M} = \frac{1}{2c}\int dI\oint\mathbf{r}'\times d\mathbf{l} = \frac{I}{2c}\oint\mathbf{r}'\times d\mathbf{l}. \tag{22.1}$$

It is easy to see that the quantity $\tfrac12\,\mathbf{r}'\times d\mathbf{l}$
represents an area vector. The integral
$\mathbf{S} = \oint\tfrac12\,\mathbf{r}'\times d\mathbf{l}$ represents the area
of the lateral surface of a cone resting on the path of the current. In the
particular case of a flat closed path, one can choose as $\mathbf{S}$ the
vector of the normal to the plane of the path multiplied by the area enclosed
by the path. Then

$$\mathbf{M} = \frac{I}{c}\,\mathbf{S}. \tag{22.2}$$
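Both the origin independence of the magnetic moment and the relation M = IS/c can be checked on a polygonal current loop. In the sketch below the line integral of r' × dl in (22.1) is evaluated exactly for straight segments, for which the contribution from vertex p to vertex q equals p × q. The function names, the units (c = 1) and the sample loop are illustrative assumptions.

```python
def cross(a, b):
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def loop_magnetic_moment(vertices, current, c=1.0, origin=(0.0, 0.0, 0.0)):
    """M = (I/2c) * closed integral of r' x dl for a closed polygon.

    For a straight segment from p to q the integral of r' x dl equals p x q,
    so the sum over segments is exact for a polygonal path.
    """
    M = [0.0, 0.0, 0.0]
    n = len(vertices)
    for k in range(n):
        p = [vertices[k][i] - origin[i] for i in range(3)]
        q = [vertices[(k + 1) % n][i] - origin[i] for i in range(3)]
        pxq = cross(p, q)
        for i in range(3):
            M[i] += current * 0.5 * pxq[i] / c
    return M


# Square loop of side 2 in the xy-plane, traversed counterclockwise.
square = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0), (2.0, 2.0, 0.0), (0.0, 2.0, 0.0)]
M_original = loop_magnetic_moment(square, current=3.0)
M_shifted = loop_magnetic_moment(square, current=3.0, origin=(5.0, -7.0, 3.0))
```

Both calls return a moment directed along the normal to the loop with magnitude I times the enclosed area (here 3 × 4 in units with c = 1), whatever origin is used.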


Formula (22.2) admits an obvious interpretation: every closed current (for
example, one or several charged particles moving along closed trajectories)
possesses a magnetic moment proportional to the value of the current. In this
connection we note that every atom with electrons rotating in an orbit is an
elementary magnet (see Part IV). We now rewrite the expression for the
magnetic moment by specifying the current density in terms of the velocity of
motion of the charges:


$$\mathbf M = \frac{1}{2c}\int \mathbf r'\times\rho\mathbf v\,dV' = \frac{1}{2c}\sum_i \mathbf r_i\times e_i\mathbf v_i, \qquad (22.3)$$


where the summation is carried out over all the charges in the system. 
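
The origin independence shown above and formula (22.2) can be checked numerically for a simple case. The following is a minimal sketch, assuming a closed polygonal line current; the function names, the unit square and the values I = 2, c = 1 are illustrative choices and do not come from the text.

```python
# Sketch: magnetic moment of a closed polygonal line current,
# M = (I / 2c) * (closed-loop integral of r x dl), in Gaussian units.
# The loop, the current and the value of c are illustrative assumptions.

def cross(a, b):
    """Vector product of two 3-component sequences."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def magnetic_moment(vertices, current, c=1.0, origin=(0.0, 0.0, 0.0)):
    """Return (I / 2c) * sum of r x dl over the straight segments of the loop.

    For a straight segment the integral of r x dl equals r_mid x dl exactly,
    where r_mid is the segment midpoint measured from the chosen origin.
    """
    m = [0.0, 0.0, 0.0]
    n = len(vertices)
    for k in range(n):
        p, q = vertices[k], vertices[(k + 1) % n]  # segment p -> q, loop closes
        r_mid = [0.5 * (p[i] + q[i]) - origin[i] for i in range(3)]
        dl = [q[i] - p[i] for i in range(3)]
        rxdl = cross(r_mid, dl)
        for i in range(3):
            m[i] += current / (2.0 * c) * rxdl[i]
    return tuple(m)


# Unit square in the xy-plane, traversed counter-clockwise (area S = 1).
square = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (1.0, 1.0, 0.0), (0.0, 1.0, 0.0)]
m_origin = magnetic_moment(square, current=2.0)
m_shifted = magnetic_moment(square, current=2.0, origin=(3.0, -1.0, 5.0))
```

Both calls should give the same vector, directed along the normal to the square and of magnitude IS/c, because the extra term a × ∮ dl vanishes for a closed path.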


Consider the important case of a system consisting of identical or different 
particles having the same value of the ratio of the charge e to the mass m. In- 
troducing this relation into (22.3), we find 


$$\mathbf M = \frac{1}{2c}\sum_i \frac{e_i}{m_i}\,\mathbf r_i\times m_i\mathbf v_i = \frac{e}{2mc}\sum_i \mathbf r_i\times\mathbf p_i = \frac{e}{2mc}\,\mathbf L, \qquad (22.4)$$

where L is the angular momentum of the system.

Formula (22.4) shows that for a system of particles with a fixed value of
e/m there exists a direct proportionality between the magnetic moment and
the angular momentum of the system.
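
As an illustration of (22.4), the sketch below compares the sum (1/2c) Σ e_i r_i × v_i with (e/2mc) L for a few particles sharing one charge-to-mass ratio. The masses, positions, velocities and the value of the ratio are arbitrary assumed numbers.

```python
# Sketch: particles with a common ratio e/m; compare the magnetic moment
# computed from (22.3) with (e / 2mc) * L from (22.4).
# All numerical values are illustrative assumptions.

def cross(a, b):
    """Vector product of two 3-component sequences."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def moment_and_angular_momentum(particles, c=1.0):
    """particles: list of (charge, mass, position, velocity) tuples."""
    M = [0.0, 0.0, 0.0]
    L = [0.0, 0.0, 0.0]
    for e, m, r, v in particles:
        rxv = cross(r, v)
        for i in range(3):
            M[i] += e * rxv[i] / (2.0 * c)   # (1/2c) * e_i * r_i x v_i
            L[i] += m * rxv[i]               # r_i x m_i v_i
    return tuple(M), tuple(L)


ratio = 0.75          # assumed common value of e/m
c_light = 3.0         # assumed value of c in arbitrary units
masses = [1.0, 2.5, 0.4]
positions = [(1.0, 0.0, 0.5), (-0.3, 2.0, 0.0), (0.2, -1.1, 0.7)]
velocities = [(0.0, 1.0, 0.0), (0.5, 0.0, -0.2), (-0.4, 0.3, 0.9)]
particles = [(ratio * m, m, r, v)
             for m, r, v in zip(masses, positions, velocities)]

M, L = moment_and_angular_momentum(particles, c=c_light)
predicted = tuple(ratio / (2.0 * c_light) * Li for Li in L)
```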

The proportionality between M and L holds also for a system consisting of
two particles with an arbitrary ratio e/m. Indeed, in such a system


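$$\mathbf M = \frac{1}{2c}\left[e_1\,\mathbf r_1'\times\mathbf v_1' + e_2\,\mathbf r_2'\times\mathbf v_2'\right].$$

Locating the origin at the centre of mass, i.e. assuming

$$m_1\mathbf r_1' + m_2\mathbf r_2' = 0,$$

and introducing the relative radius vector and the relative velocity

$$\mathbf r_{\rm rel} = \mathbf r_1' - \mathbf r_2', \qquad \mathbf v_{\rm rel} = \mathbf v_1' - \mathbf v_2',$$

we find

$$\mathbf r_1' = \frac{m_2}{m_1+m_2}\,\mathbf r_{\rm rel}, \quad \mathbf r_2' = -\frac{m_1}{m_1+m_2}\,\mathbf r_{\rm rel}, \quad \mathbf v_1' = \frac{m_2}{m_1+m_2}\,\mathbf v_{\rm rel}, \quad \mathbf v_2' = -\frac{m_1}{m_1+m_2}\,\mathbf v_{\rm rel},$$

$$\mathbf M = \frac{1}{2c}\left[\frac{e_1 m_2^2}{(m_1+m_2)^2} + \frac{e_2 m_1^2}{(m_1+m_2)^2}\right]\mathbf r_{\rm rel}\times\mathbf v_{\rm rel} = \frac{\mu}{2c}\left(\frac{e_1}{m_1^2} + \frac{e_2}{m_2^2}\right)\mathbf L,$$

where

$$\mathbf L = \mathbf r_{\rm rel}\times\mathbf p_{\rm rel} = \mathbf r_{\rm rel}\times\mu\mathbf v_{\rm rel}$$

is the angular momentum of the relative motion, and μ = m₁m₂/(m₁ + m₂) is
the reduced mass.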
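
The two-particle relation can be verified numerically. The sketch below is an illustration under assumed values of the charges, masses, relative position and relative velocity; it computes the magnetic moment directly in the centre-of-mass frame and compares it with (μ/2c)(e₁/m₁² + e₂/m₂²) L.

```python
# Sketch: two particles with different e/m in the centre-of-mass frame.
# Compare the direct sum (1/2c)[e1 r1 x v1 + e2 r2 x v2] with the expression
# (mu / 2c) * (e1/m1^2 + e2/m2^2) * L, where L = r_rel x mu v_rel.
# All numerical inputs are illustrative assumptions.

def cross(a, b):
    """Vector product of two 3-component sequences."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def two_body_moments(e1, m1, e2, m2, r_rel, v_rel, c=1.0):
    total = m1 + m2
    mu = m1 * m2 / total                       # reduced mass
    # Positions and velocities relative to the centre of mass.
    r1 = [m2 / total * x for x in r_rel]
    r2 = [-m1 / total * x for x in r_rel]
    v1 = [m2 / total * x for x in v_rel]
    v2 = [-m1 / total * x for x in v_rel]
    c1, c2 = cross(r1, v1), cross(r2, v2)
    direct = tuple((e1 * c1[i] + e2 * c2[i]) / (2.0 * c) for i in range(3))
    L = tuple(mu * x for x in cross(r_rel, v_rel))
    factor = mu / (2.0 * c) * (e1 / m1**2 + e2 / m2**2)
    via_L = tuple(factor * x for x in L)
    return direct, via_L


direct, via_L = two_body_moments(e1=1.0, m1=1836.0, e2=-1.0, m2=1.0,
                                 r_rel=(1.0, 0.5, -0.2),
                                 v_rel=(0.1, -0.3, 0.4), c=137.0)
```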








The Electromagnetic Field of Arbitrarily Moving Charges


§23. The electromagnetic field of a system of arbitrarily moving charges


Let us consider a system of charges performing an arbitrary motion in a
certain volume V'. The distribution and motion of charges in this volume will
be characterized by the charge density ρ(r, t) and the current density j(r, t),
varying in space and time. We shall assume that the functions ρ(r, t) and
j(r, t) are known for all times (i.e. for −∞ < t < ∞).

Equations for the space-dependent and time-dependent electromagnetic 
potentials are of the form 


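$$\nabla^2\varphi - \frac{1}{c^2}\frac{\partial^2\varphi}{\partial t^2} = -4\pi\rho, \qquad (23.1)$$

$$\nabla^2\mathbf A - \frac{1}{c^2}\frac{\partial^2\mathbf A}{\partial t^2} = -\frac{4\pi}{c}\,\mathbf j, \qquad (23.2)$$

$$\nabla\cdot\mathbf A(\mathbf r,t) + \frac{1}{c}\frac{\partial\varphi(\mathbf r,t)}{\partial t} = 0. \qquad (23.3)$$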


The system of eqs. (23.1)-(23.3) represents a system of linear partial
differential equations. As is known from the theory of partial differential
equations, a single-valued solution of the problem (in this case finding the
particular distribution of the electromagnetic field in space and time as a
function of the values of the known functions ρ(r, t) and j(r, t)) requires
that some further conditions, called boundary and initial conditions, should
be given.

As a rule, the problem of finding the electromagnetic field is defined as
follows: before a certain time t = 0 (i.e. for all t < 0) the charges of the
system were at rest; at time t = 0 the charges are set in an arbitrary motion. A
change or perturbation then arises in the electromagnetic field. We shall
assume that eqs. (23.1)-(23.3) involve the vector and scalar potentials of just
this perturbed field. The functions ρ(r, t) and j(r, t), responsible for the
perturbation of the field for t > 0, are considered as known. For t < 0 one
sets

$$\rho(\mathbf r, 0) = 0, \qquad \mathbf j(\mathbf r, 0) = 0.$$


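Correspondingly, at the initial time t = 0 the vectors of the electric and
magnetic field are equal to zero: E(r, 0) = H(r, 0) = 0. Then the initial
conditions for the potentials read

$$\mathbf A(\mathbf r,t)\big|_{t=0} = 0, \qquad \frac{\partial\mathbf A(\mathbf r,t)}{\partial t}\bigg|_{t=0} = 0, \qquad \varphi(\mathbf r,t)\big|_{t=0} = 0. \qquad (23.4)$$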


Indeed, if the conditions (23.4) are satisfied, then from the definition of the 
field vectors it is seen directly that the field vectors are zero. Thus, the initial 
condition (23.4) is completely equivalent to the requirement that the densi- 
ties of the charge and current should be equal to zero for t = 0 and, in general, 
different from zero for t > 0.

The requirements that at a large distance from the volume V' the field
potentials should decrease not more slowly than the law

$$|\varphi| \sim O(1/r) \quad\text{as}\quad r\to\infty,\ 0<t<\infty,$$

$$|\mathbf A| \sim O(1/r) \quad\text{as}\quad r\to\infty,\ 0<t<\infty, \qquad (23.5)$$

i.e. not more slowly than the function 1/r, will serve as the boundary
conditions.







In solving the system of field equations we shall, in this section, make
use of a simple but not rigorous method, based on the use of the principle of
superposition. In the next section we shall present a method which is more
consistent from the mathematical standpoint and which leads to the same
results.

We divide the whole system into a set of arbitrarily small charges δe_i =
ρ(r, t) δV_i, where δV_i is an arbitrarily small volume within the region V'. We
seek the potential of the field produced by the charge δe_i at a certain point
of observation N, assuming that there are no other charges in the space. The
total field, on the basis of the principle of superposition, represents the sum
of fields produced by all the charges δe_i constituting the system.

It should be emphasized that the charge conservation law does not allow
the existence of a time-dependent and solitary charge δe_i. In reality a change
(for example, an increase) of the charge δe_i in time assumes a simultaneous
decrease of a charge δe_k in another element of the volume, so that the total
charge of the system is conserved. However, in finding the potentials of the
field produced by the charge δe_i we shall not formally take into account the
existence of other charges. This apparent violation of the charge conservation
law will not affect the final solution, in which the summation over all charges
of the system will be performed.

Let us first of all find the solution of the system of equations for the
potentials of the field produced in the entire space outside the small volume
δV_i by the charge δe_i.

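At all points of this space outside the volume δV_i the charge density,
according to our assumption, is equal to zero. Hence outside the volume δV_i
the equations for the electromagnetic field potentials assume the form

$$\nabla^2\varphi - \frac{1}{c^2}\frac{\partial^2\varphi}{\partial t^2} = 0,$$

$$\nabla^2\mathbf A - \frac{1}{c^2}\frac{\partial^2\mathbf A}{\partial t^2} = 0, \qquad (23.6)$$

$$\nabla\cdot\mathbf A + \frac{1}{c}\frac{\partial\varphi}{\partial t} = 0.$$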


We introduce spherical coordinates with the origin located in the volume
δV_i. The field outside the volume δV_i possesses spherical symmetry, so that
the field potentials can depend only on the distance r from the volume δV_i
(the magnitude of the radius vector) and on time. From the expression for the
Laplacian in spherical coordinates, we have


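$$\frac{1}{r}\frac{\partial^2 (r\varphi)}{\partial r^2} - \frac{1}{c^2}\frac{\partial^2\varphi}{\partial t^2} = 0, \qquad (23.7)$$

$$\frac{1}{r}\frac{\partial^2 (r\mathbf A)}{\partial r^2} - \frac{1}{c^2}\frac{\partial^2\mathbf A}{\partial t^2} = 0. \qquad (23.8)$$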


We see that the scalar potential and all the components of the vector po- 
tential are determined by an equation of one type: 


$$\frac{1}{r}\frac{\partial^2 (rf)}{\partial r^2} - \frac{1}{c^2}\frac{\partial^2 f}{\partial t^2} = 0. \qquad (23.9)$$


For reasons which will be clear from what follows, an equation of the type
(23.9) is called a wave equation. The integration of the wave equation can be
carried out most simply by D’Alembert’s method. D’Alembert’s method
consists, roughly speaking, of the reduction of a partial differential equation of
the type (23.9) to an equation with a mixed second derivative (for a more
rigorous presentation of D’Alembert’s method we refer the reader to
mathematical texts *).

We rewrite eq. (23.9) in the form 


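$$\frac{\partial^2 (rf)}{\partial r^2} - \frac{1}{c^2}\frac{\partial^2 (rf)}{\partial t^2} = 0, \qquad (23.10)$$

and introduce a new unknown function

$$\psi = rf. \qquad (23.11)$$

This is always possible since r ≠ 0 outside the volume δV_i. We then have

$$\frac{\partial^2\psi}{\partial r^2} - \frac{1}{c^2}\frac{\partial^2\psi}{\partial t^2} = 0. \qquad (23.12)$$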


* See, for example, A. N. Tikhonov and A. A. Samarskii, Partial differential equations 
of mathematical physics (Holden Day, San Francisco, 1964) Ch. II. 







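Now we change over to new variables in eq. (23.12):

$$\xi = t - \frac{r}{c}, \qquad (23.13)$$

$$\eta = t + \frac{r}{c}. \qquad (23.14)$$

Whence

$$t = \frac{\xi + \eta}{2}, \qquad r = \frac{c}{2}(\eta - \xi),$$

so that

$$\frac{\partial}{\partial\xi} = \frac{\partial r}{\partial\xi}\frac{\partial}{\partial r} + \frac{\partial t}{\partial\xi}\frac{\partial}{\partial t} = -\frac{c}{2}\left(\frac{\partial}{\partial r} - \frac{1}{c}\frac{\partial}{\partial t}\right),$$

$$\frac{\partial}{\partial\eta} = \frac{\partial r}{\partial\eta}\frac{\partial}{\partial r} + \frac{\partial t}{\partial\eta}\frac{\partial}{\partial t} = \frac{c}{2}\left(\frac{\partial}{\partial r} + \frac{1}{c}\frac{\partial}{\partial t}\right).$$

Further

$$\frac{\partial^2}{\partial r^2} - \frac{1}{c^2}\frac{\partial^2}{\partial t^2} = \left(\frac{\partial}{\partial r} + \frac{1}{c}\frac{\partial}{\partial t}\right)\left(\frac{\partial}{\partial r} - \frac{1}{c}\frac{\partial}{\partial t}\right) = -\frac{4}{c^2}\frac{\partial^2}{\partial\xi\,\partial\eta}.$$

Eq. (23.12) in new variables has the form

$$\frac{\partial^2\psi}{\partial\xi\,\partial\eta} = 0. \qquad (23.15)$$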

This equation, containing only the mixed derivative, is integrated directly. It
is obvious that the equation is satisfied by any functions ψ₁(ξ) and ψ₂(η) of
one variable, ξ or η. Hence the general solution of eq. (23.15) can be written
in the form

$$\psi(\xi,\eta) = \psi_1(\xi) + \psi_2(\eta), \qquad (23.16)$$

where ψ₁ and ψ₂ are arbitrary functions of one variable, ξ and η respectively.
Returning to the previous variables r and t, we obtain

$$\psi(r,t) = \psi_1\!\left(t - \frac{r}{c}\right) + \psi_2\!\left(t + \frac{r}{c}\right). \qquad (23.17)$$







This solution has a simple meaning. The value of the function ψ₁ at the
point (r + c) at time (t + 1) is the same as at the point r at time t. This means
that ψ₁(t − (r/c)) describes a disturbance which is displaced in space as time
goes on without changing its form, i.e. a wave process. The wave propagates in
the direction of increasing values of the distance r from the origin with a
velocity equal to c. Similarly ψ₂(t + (r/c)) describes a wave propagating from
large r's to smaller ones in the direction toward the origin. For the function f
we have

$$f = \frac{\psi_1\!\left(t - \frac{r}{c}\right)}{r} + \frac{\psi_2\!\left(t + \frac{r}{c}\right)}{r}. \qquad (23.18)$$


Formula (23.18), giving the general solution of eq. (23.9), represents the
superposition of two waves, the one diverging from the origin (the first term)
and the other converging to the origin (the second term). The surfaces of the
spheres r = const are the surfaces of constant value of the function f or
equal-phase surfaces. Since equal-phase surfaces are spherical surfaces, it is said
that the wave process described by the function f is the combination of a
diverging and a converging spherical wave. The scalar potential φ and all
components of the vector potential A can be presented in the form of formula (23.18).
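
That both terms of (23.18) satisfy eq. (23.9) can be checked by finite differences. The following is a minimal sketch; the Gaussian profile chosen for ψ, the value c = 2 and the sample points are assumptions made only for the illustration.

```python
import math

# Sketch: finite-difference check that psi(t - r/c)/r and psi(t + r/c)/r
# satisfy (1/r) d2(r f)/dr2 - (1/c^2) d2f/dt2 = 0 for r > 0.
# The profile psi and all numerical values are illustrative assumptions.


def psi(s):
    """Arbitrary smooth profile (an assumed Gaussian pulse)."""
    return math.exp(-s * s)


def f_out(r, t, c):
    """Diverging spherical wave, first term of (23.18)."""
    return psi(t - r / c) / r


def f_in(r, t, c):
    """Converging spherical wave, second term of (23.18)."""
    return psi(t + r / c) / r


def wave_residual(f, r, t, c, h=1e-3):
    """Central-difference estimate of the left-hand side of eq. (23.9)."""
    d2_rf = ((r + h) * f(r + h, t, c) - 2.0 * r * f(r, t, c)
             + (r - h) * f(r - h, t, c)) / h**2
    d2_t = (f(r, t + h, c) - 2.0 * f(r, t, c) + f(r, t - h, c)) / h**2
    return d2_rf / r - d2_t / c**2


c = 2.0
res_out = wave_residual(f_out, 1.5, 1.0, c)
res_in = wave_residual(f_in, 1.5, -0.5, c)
# Size of one of the two terms that cancel, for comparison with the residual.
term_size = abs((f_out(1.5, 1.0 + 1e-3, c) - 2.0 * f_out(1.5, 1.0, c)
                 + f_out(1.5, 1.0 - 1e-3, c)) / 1e-6) / c**2
```

The residuals should be smaller than the individual second-derivative terms by several orders of magnitude, limited only by the discretisation error of the difference scheme.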

In order to make clear the meaning of the solutions obtained, we shall 
consider one of the particular solutions, for instance the diverging spherical 
wave. For concreteness we write the expression for the scalar potential: 


$$\varphi(r,t) = \frac{\psi_1\!\left(t - \frac{r}{c}\right)}{r}. \qquad (23.19)$$


For an arbitrary form of the function ψ₁ formula (23.19) gives the particular
solution of eq. (23.1) in the region of space outside the volume δV_i. We now
require that (23.19) should pass over continuously into the solution of eq.
(23.1) near the volume δV_i, i.e. near the locus of the charge δe_i(t). If in eq.
(23.1) a formal transition is made by assuming c → ∞ (the meaning of such a
transition will become clear from what follows), then it will obviously turn
into the equation for the electrostatic potential, the solution of which is

$$\delta\varphi_i = \frac{\delta e_i(t)}{r} = \frac{\rho(\mathbf r,t)\,\delta V_i}{r}. \qquad (23.20)$$


Writing (23.19) in the form

$$\delta\varphi_i(r,t) = \frac{\delta e_i\!\left(t - \frac{r}{c}\right)}{r} = \frac{\rho\!\left(t - \frac{r}{c}\right)\delta V_i}{r}, \qquad (23.21)$$

we arrive at the potential of the field produced by the charge δe_i, which
satisfies eq. (23.7) outside the volume δV_i and turns into (23.20) near the
origin.

Formula (23.21) shows that the field potential at the observation point a
distance r from the origin at time t is determined by the value of the charge
at the preceding instant τ = t − (r/c). The potential (23.21) is therefore called
the retarded potential, while the quantity r/c is called the delay time. The delay
time represents the time interval during which the electromagnetic field,
propagating with the velocity c, traverses the path r.

Introducing the origin at point O, located in the volume V', and integrating
the expression (23.21) over all charges of the system, we arrive at the
following expression for the potential of the field produced at the point of
observation N:








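$$\varphi(\mathbf r,t) = \int \frac{\rho\!\left(\mathbf r', t - \frac{|\mathbf r-\mathbf r'|}{c}\right)}{|\mathbf r-\mathbf r'|}\,dV' = \int \frac{\rho\!\left(\mathbf r', t - \frac{R}{c}\right)}{R}\,dV' = \int \frac{\rho(\mathbf r',\tau)\,dV'}{R}, \qquad (23.22)$$

where τ = t − (R/c), R = |r − r'|.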

According to (23.22), to obtain the potential at the point of observation
N, it is necessary, as in electrostatics, to evaluate the integral of the quantity
ρ/R over the whole volume of the system. In this case, however, the value of
the charge density is taken at time τ = t − |r − r'|/c, where the delay time
|r − r'|/c is determined by the distance from every point r' to the observation
point.
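
The role of the delay time in (23.22) can be illustrated by replacing the integral with a sum over small charge elements. The sketch below is only an illustration: the source function `source_charge`, which switches on at τ = 0, the single element at the origin and the units with c = 1 are all assumptions.

```python
import math

# Sketch: retarded potential as a sum over small charge elements,
# phi(r, t) = sum_i q_i(t - R_i/c) / R_i, with R_i = |r - r_i'|.
# The source function and the geometry are illustrative assumptions.


def source_charge(tau, q0=1.0, omega=1.0):
    """Hypothetical time dependence of a charge element; zero for tau <= 0."""
    return q0 * math.sin(omega * tau) if tau > 0.0 else 0.0


def retarded_potential(r, t, elements, c=1.0):
    """elements: list of (position, charge_function) pairs."""
    phi = 0.0
    for r_src, q in elements:
        R = math.dist(r, r_src)        # distance to the observation point
        phi += q(t - R / c) / R        # charge taken at the retarded time
    return phi


elements = [((0.0, 0.0, 0.0), source_charge)]
observer = (3.0, 0.0, 0.0)

# At t = 2 the perturbation has not yet covered the distance R = 3.
phi_early = retarded_potential(observer, 2.0, elements)
# At t = 3 + pi/2 the retarded time is pi/2, where the assumed source equals q0.
phi_late = retarded_potential(observer, 3.0 + math.pi / 2.0, elements)
```

In this example the potential at the observation point stays zero until t = R/c and afterwards reproduces the source with the delay R/c, divided by R.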

Similarly, for the vector potential the particular solution of eq. (23.2) can 
be written in the form 


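$$\mathbf A(\mathbf r,t) = \frac{1}{c}\int \frac{\mathbf j\!\left(\mathbf r', t - \frac{|\mathbf r-\mathbf r'|}{c}\right)}{|\mathbf r-\mathbf r'|}\,dV' = \frac{1}{c}\int \frac{\mathbf j\!\left(\mathbf r', t - \frac{R}{c}\right)}{R}\,dV' = \frac{1}{c}\int \frac{\mathbf j(\mathbf r',\tau)\,dV'}{R}. \qquad (23.23)$$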










In addition to the solutions of the equations for potentials in the form of
retarded potentials, one can also write other particular solutions
corresponding to the function ψ₂ in the general solution (23.18):

$$\varphi^*(\mathbf r,t) = \int \frac{\rho\!\left(\mathbf r', t + \frac{R}{c}\right)}{R}\,dV', \qquad (23.24)$$

$$\mathbf A^*(\mathbf r,t) = \frac{1}{c}\int \frac{\mathbf j\!\left(\mathbf r', t + \frac{R}{c}\right)}{R}\,dV'. \qquad (23.25)$$

In formulae (23.24) and (23.25) the values of the functions ρ and j,
determining the potentials at the point r at time t, are taken at the instant τ* =
t + (R/c). This means that the potentials at the instant t depend on that
charge density which will appear at the point r' after the time interval R/c.
The potentials (23.24) and (23.25) are called advanced potentials.

From retarded and advanced potentials one can make arbitrary linear
combinations of the form α₁φ + α₂φ*, β₁A + β₂A*, which also satisfy the
field equations. The general solution of the equations for the potentials is
obtained from the particular solutions and general solutions found for eqs.
(23.7) and (23.8).

The appearance of retarded and advanced potentials as equally valid solu- 
tions of the field equations is quite natural. Like the equations of mechanics, 
the electrodynamical equations are symmetric with respect to the future and 
the past. They do not change when t is replaced by (−t) and, hence, must
have a general solution invariant under the change of the sign of time.

The choice of coefficients in these linear combinations is determined by
giving the above mentioned supplementary conditions characterizing the
behaviour of potentials at infinity. In order that these conditions may be
fulfilled it is necessary to reject the solution corresponding to the advanced
potentials. Indeed, let us consider at the moment t = 0 a certain sphere of
radius R₁ outside the volume V'. According to the boundary condition (23.5),
the potential φ(R₁, 0) ~ O(1/R₁). Correspondingly, the retarded potential

$$\varphi(R_1, 0) = \int \frac{\rho\!\left(\mathbf r', -\frac{R}{c}\right)}{R}\,dV' = 0,$$

since according to the condition the charge density is equal to zero for t < 0.
On the contrary, the advanced potential

$$\varphi^*(R_1, 0) = \int \frac{\rho\!\left(\mathbf r', \frac{R}{c}\right)}{R}\,dV' \ne O(1/R_1).$$

In the last formula there corresponds to time t = 0 a non-zero value of the
argument of the function ρ(r', τ), because the charge density ρ(r', R/c) is
not equal to zero.

It should be noted that the properties of the charge density ρ(r', 0) = 0,
ρ(r', t − (R/c)) ≠ 0, used here, correspond to the choice of initial conditions
in the form of (23.4). We see that, for such properties of the density ρ, the
advanced potential does not satisfy the condition (23.5) determining its
behaviour at infinity. Namely, the advanced potential decreases more slowly
than the function 1/r. In order to obtain a solution of the wave equation
which satisfies the system of initial and boundary conditions, one has to
assume that α₂ = β₂ = 0 and retain only the solution of the field equations in
the form of retarded potentials. Uncomplicated, but somewhat cumbersome
calculations allow one to verify the fact that the retarded potentials given by
formulae (23.22) and (23.23) satisfy the Lorentz condition (23.3) for α₁ =
β₁ = 1.

Thus, we arrive at the very important conclusion: a system of charges
which at the moment t = 0 begins to perform a non-steady state motion
produces in the surrounding space an electromagnetic field whose potentials have
the character of retarded potentials. The field potentials have the form of
spherical waves originating from the system and propagating in vacuum with
the velocity c.

We shall say that a system of non-steady state moving charges emits elec- 
tromagnetic waves, and call it briefly the emitter. 

The solution of the electromagnetic field equations in the form of retarded 
potentials is of great significance. It corresponds to a definite concept about 
the character of causal relationship, which differs from the concepts of classi- 
cal mechanics. 

As is well-known, all the propositions of classical mechanics are in accord- 
ance with the Newtonian concept of action at a distance. In classical mechan- 
ics it is assumed that the acceleration of a material point at a given instant is 
completely determined by the force acting on it at that instant. The force 
acting on a given material point depends in turn on the position of other 
material points, located at a finite distance from the observed point. If the 
position of any of the material points is changed at a certain instant t₀, then
the magnitude of the force will be changed also at the same instant. In other
words, the velocity of propagation of the interaction in space is considered in
classical mechanics to be infinitely large.




In the theory of the electromagnetic field the situation is radically altered.
If the position of the charges located at a distance r from the point of
observation is changed, then the potential at the latter point will change only after
the time τ = r/c. This time is needed for the perturbation of the
electromagnetic field, moving from point to point with a finite velocity equal to the
velocity of light c, to traverse in space the path r. The perturbation is
transferred from one point of the field to the neighbouring one. The space in
which the propagation of electromagnetic perturbations occurs is no longer
the empty “nothing” of classical mechanics, but is considered to be filled by
a real electromagnetic field endowed with definite physical properties. Thus,
in the theory of the electromagnetic field the infinitely large velocity of
propagation of interactions and the long-range action of classical mechanics
are replaced by a finite velocity of propagation of interaction and the
concept of short-range action. The cause (change of the field) and the effect
(motion of the test charge at the observation point) refer to one location and
one instant of time. In the next part of the book the standpoint of field
theory will receive further confirmation and extension.

It should be noted that, because the velocity c of propagation of the elec- 
tromagnetic perturbations is very large, it can often in practice be considered 
as infinitely large. The concepts of classical mechanics are by this fact not 
simply discarded as untrue, but are retained as approximate ones, having a 
well defined region of application. 


§ 24*. General solution of D’Alembert’s equation in the form of retarded potentials


We now pass on to the exact solution of D’Alembert’s equation for the 
potentials. We confine ourselves to finding one of the potentials, for instance 
the vector potential. The expression for the scalar potential can be written 
by analogy.

In accordance with what was said in the preceding section, the equation
for the potential

$$\nabla^2\mathbf A - \frac{1}{c^2}\frac{\partial^2\mathbf A}{\partial t^2} = -\frac{4\pi}{c}\,\mathbf j \qquad (24.1)$$

must be supplemented by initial and boundary conditions.


We consider the problem in the following formulation. Before a certain
time t = 0 (i.e. for t < 0) let there, for example, be a system of charges, at
rest or performing a steady state motion. Before the time t = 0 their
configuration is given and the field has a certain distribution in space. At the moment
t = 0 the state of the system is changed and the charges begin a non-steady
state motion. We are interested in the law which gives the change of the field
vectors, connected with this non-steady state motion of the charges, from the
time t = 0 on.

Let E and H be field vectors associated with the motion of the charges for
a time t > 0. In other words, let E and H be the changes in the
electromagnetic field which existed at the moment t = 0, caused by the motion of the
charges for t > 0. Then from the definition of E and H it follows that at the
initial moment t = 0 they must satisfy the initial condition:

$$\mathbf E = \mathbf H = 0 \quad\text{at}\quad t = 0 \quad\text{(over all space)}.$$

In this case the initial condition for the vector potential has the following
form:

$$\mathbf A = 0 \quad\text{at}\quad t = 0,\ \text{for arbitrary } \mathbf r, \qquad (24.2)$$

$$\frac{\partial\mathbf A}{\partial t} = 0 \quad\text{at}\quad t = 0,\ \text{for arbitrary } \mathbf r. \qquad (24.3)$$


The first of these equalities corresponds to the equality H = 0, while the
second one corresponds to the absence of the electric field, E = 0, at the initial
moment. The boundary condition at infinity has the form

$$\mathbf A \to O\!\left(\frac{1}{r}\right) \quad\text{as}\quad r\to\infty,\ t > 0. \qquad (24.4)$$

In other words, we understand A to be the vector potential of the field
associated with the motion of charges beginning at the time t = 0 and taking
place for t > 0.

To obtain the solution of this problem — i.e. finding the solution of eq. 
(24.1) satisfying the system of conditions (24.2)—(24.4) (called in mathe- 
matical physics the Cauchy problem) — it is most convenient to make use of 
the method of Fourier integrals *. We write the vector potential and current 


* R. Courant and D. Hilbert, Methods of mathematical physics (Interscience
Publishers, New York, 1953);
A. A. Vlasov, Makroskopicheskaya elektrodinamika (Macroscopic electrodynamics)
(Gostekhizdat, Moscow, 1955) p. 166.







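density in the form of the triple Fourier integrals *:

$$\mathbf A(\mathbf r,t) = \frac{1}{(2\pi)^{3/2}}\int \mathbf a(\mathbf k,t)\,e^{-i\mathbf k\cdot\mathbf r}\,d\mathbf k, \qquad (24.5)$$

$$\mathbf j(\mathbf r,t) = \frac{1}{(2\pi)^{3/2}}\int \mathbf p(\mathbf k,t)\,e^{-i\mathbf k\cdot\mathbf r}\,d\mathbf k. \qquad (24.6)$$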


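The Fourier transforms of the vector potential and current density are
then given by the inversion formulae:

$$\mathbf a(\mathbf k,t) = \frac{1}{(2\pi)^{3/2}}\int \mathbf A(\mathbf r,t)\,e^{i\mathbf k\cdot\mathbf r}\,d\mathbf r, \qquad (24.7)$$

$$\mathbf p(\mathbf k,t) = \frac{1}{(2\pi)^{3/2}}\int \mathbf j(\mathbf r,t)\,e^{i\mathbf k\cdot\mathbf r}\,d\mathbf r. \qquad (24.8)$$

Substituting (24.5) and (24.6) into (24.1) we find

$$\frac{d^2\mathbf a}{dt^2} + k^2c^2\,\mathbf a = 4\pi c\,\mathbf p. \qquad (24.9)$$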


Eq. (24.9) represents an ordinary differential equation in the time, and its
general solution can be found easily. Substitution of (24.7) into the initial
conditions (24.2) and (24.3) gives:

$$\mathbf a = 0 \quad\text{at}\quad t = 0, \qquad (24.10)$$

$$\frac{d\mathbf a}{dt} = 0 \quad\text{at}\quad t = 0. \qquad (24.11)$$


The solution of eq. (24.9) with the initial conditions (24.10) and (24.11)
reads **:

$$\mathbf a = \frac{4\pi}{k}\int_0^t \mathbf p(\xi)\,\sin[kc(t-\xi)]\,d\xi. \qquad (24.12)$$
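
Formula (24.12) can be tested on a case where eq. (24.9) is solvable in closed form. The sketch below treats one scalar component and assumes the source p(t) = exp(−t), for which the solution with zero initial data is 4πc/(1 + k²c²) · [exp(−t) − cos kct + sin(kct)/(kc)]; the source, the numerical values and Simpson's rule as quadrature are all choices made for the illustration.

```python
import math

# Sketch: compare the integral (24.12) with the closed-form solution of
# a'' + (k c)^2 a = 4 pi c p(t), a(0) = a'(0) = 0, for the assumed source
# p(t) = exp(-t). One scalar component is treated; values are illustrative.


def duhamel_a(t, k, c, p, n=2000):
    """(4 pi / k) * integral_0^t p(xi) sin[k c (t - xi)] d xi (Simpson's rule)."""
    if n % 2:
        n += 1                      # Simpson's rule needs an even interval count
    h = t / n
    total = 0.0
    for i in range(n + 1):
        xi = i * h
        weight = 1.0 if i in (0, n) else (4.0 if i % 2 else 2.0)
        total += weight * p(xi) * math.sin(k * c * (t - xi))
    return 4.0 * math.pi / k * total * h / 3.0


def exact_a(t, k, c):
    """Closed-form solution for p(t) = exp(-t) with zero initial conditions."""
    w = k * c
    return 4.0 * math.pi * c / (1.0 + w * w) * (
        math.exp(-t) - math.cos(w * t) + math.sin(w * t) / w)


k, c, t = 1.3, 2.0, 1.7
numeric = duhamel_a(t, k, c, lambda xi: math.exp(-xi))
closed = exact_a(t, k, c)
```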


* In this section dr stands for the volume element dx dy dz and dk for dk_x dk_y dk_z.
** V. I. Smirnov, A course of higher mathematics (Pergamon, Oxford, 1964). 




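To obtain the function A this value of a must be substituted into (24.5).
We then have

$$\mathbf A = \frac{4\pi}{(2\pi)^{3/2}}\int d\mathbf k\,\frac{e^{-i\mathbf k\cdot\mathbf r}}{k}\int_0^t \mathbf p(\mathbf k,\xi)\,\sin\{kc(t-\xi)\}\,d\xi. \qquad (24.13)$$

The current density j should be introduced instead of the Fourier transform p
in (24.13). By means of (24.8) we find

$$\mathbf A = \frac{1}{2\pi^2}\int d\mathbf k\int d\mathbf r'\int_0^t \frac{e^{-i\mathbf k\cdot(\mathbf r-\mathbf r')}}{k}\,\mathbf j(\mathbf r',\xi)\,\sin\{kc(t-\xi)\}\,d\xi. \qquad (24.14)$$

Integration in formula (24.14) must be carried out with respect to the
variables r', ξ and k.
We change the order of integration, writing

$$\mathbf A = \frac{1}{2\pi^2}\int d\mathbf r'\int_0^t \mathbf j(\mathbf r',\xi)\,d\xi\int e^{-i\mathbf k\cdot(\mathbf r-\mathbf r')}\,\frac{\sin\{kc(t-\xi)\}}{k}\,d\mathbf k. \qquad (24.15)$$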

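We calculate the inner integral and have

$$\int e^{-i\mathbf k\cdot(\mathbf r-\mathbf r')}\,\frac{\sin\{kc(t-\xi)\}}{k}\,d\mathbf k = \int e^{-ik|\mathbf r-\mathbf r'|\cos\theta}\,\frac{\sin\{kc(t-\xi)\}}{k}\,k^2\,dk\,\sin\theta\,d\theta\,d\phi$$

$$= 2\pi\int_0^\infty \sin\{kc(t-\xi)\}\,k\,dk\int_0^\pi e^{-ik|\mathbf r-\mathbf r'|\cos\theta}\,\sin\theta\,d\theta.$$

But

$$\int_0^\pi e^{-ik|\mathbf r-\mathbf r'|\cos\theta}\,\sin\theta\,d\theta = \int_{-1}^{+1} e^{-ik|\mathbf r-\mathbf r'|u}\,du = \frac{2\sin\{k|\mathbf r-\mathbf r'|\}}{k|\mathbf r-\mathbf r'|}.$$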




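Hence

$$\int e^{-i\mathbf k\cdot(\mathbf r-\mathbf r')}\,\frac{\sin\{kc(t-\xi)\}}{k}\,d\mathbf k = \frac{4\pi}{|\mathbf r-\mathbf r'|}\int_0^\infty \sin\{kc(t-\xi)\}\,\sin\{k|\mathbf r-\mathbf r'|\}\,dk = \frac{4\pi I}{|\mathbf r-\mathbf r'|},$$

where I stands for the integral

$$I = \int_0^\infty \sin\{kc(t-\xi)\}\,\sin\{k|\mathbf r-\mathbf r'|\}\,dk = \int_0^\infty \sin\alpha k\,\sin\beta k\,dk$$

$$= \frac{1}{2}\int_0^\infty \cos(\alpha-\beta)k\,dk - \frac{1}{2}\int_0^\infty \cos(\alpha+\beta)k\,dk = \frac{\pi}{2}\,\delta(\alpha-\beta) - \frac{\pi}{2}\,\delta(\alpha+\beta)$$

$$= \frac{\pi}{2}\,\delta[c(t-\xi) - |\mathbf r-\mathbf r'|] - \frac{\pi}{2}\,\delta[c(t-\xi) + |\mathbf r-\mathbf r'|].$$

In calculating I we have made use of one of the definitions of the delta
function given in Appendix III.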
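Finally we find

$$\int e^{-i\mathbf k\cdot(\mathbf r-\mathbf r')}\,\frac{\sin\{kc(t-\xi)\}}{k}\,d\mathbf k = \frac{2\pi^2}{|\mathbf r-\mathbf r'|}\,\{\delta[c(t-\xi) - |\mathbf r-\mathbf r'|] - \delta[c(t-\xi) + |\mathbf r-\mathbf r'|]\}.$$

Substituting this expression into (24.15), we obtain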










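$$\mathbf A = \int \frac{d\mathbf r'}{|\mathbf r-\mathbf r'|}\int_0^t \mathbf j(\mathbf r',\xi)\,\delta[c(t-\xi) - |\mathbf r-\mathbf r'|]\,d\xi - \int \frac{d\mathbf r'}{|\mathbf r-\mathbf r'|}\int_0^t \mathbf j(\mathbf r',\xi)\,\delta[c(t-\xi) + |\mathbf r-\mathbf r'|]\,d\xi.$$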


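We seek to perform the integration with respect to the variable ξ, making
use of the properties of the delta function.
In the first integral:

$$\int_0^t \mathbf j(\mathbf r',\xi)\,\delta[c(t-\xi) - |\mathbf r-\mathbf r'|]\,d\xi = -\frac{1}{c}\int_{ct-|\mathbf r-\mathbf r'|}^{-|\mathbf r-\mathbf r'|} \mathbf j\!\left(\mathbf r', -\frac{u}{c} + t - \frac{|\mathbf r-\mathbf r'|}{c}\right)\delta(u)\,du$$

$$= \frac{1}{c}\int_{-\infty}^{+\infty} \mathbf j\!\left(\mathbf r', -\frac{u}{c} + t - \frac{|\mathbf r-\mathbf r'|}{c}\right)\delta(u)\,du = \frac{1}{c}\,\mathbf j\!\left(\mathbf r', t - \frac{|\mathbf r-\mathbf r'|}{c}\right).$$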

We have chosen u = c(t − ξ) − |r − r'| as a new variable.

If the inequality t > |r − r'|/c holds, then the range of integration can be
extended, as has been done here. If t < |r − r'|/c, then the integration would
be carried out only with respect to negative values of the variable u (both
limits would be negative). The point u = 0, at which δ(u) becomes infinite,
would lie outside the range of integration, and by virtue of (III.3') the
integral would be reduced to zero. This refers, in particular, to the time t = 0,
which corresponds to the fulfillment of the initial condition (24.2). Similarly
the second integral gives

$$\int_0^t \mathbf j(\mathbf r',\xi)\,\delta[c(t-\xi) + |\mathbf r-\mathbf r'|]\,d\xi = -\frac{1}{c}\int_{ct+|\mathbf r-\mathbf r'|}^{|\mathbf r-\mathbf r'|} \mathbf j\!\left(\mathbf r', -\frac{u}{c} + t + \frac{|\mathbf r-\mathbf r'|}{c}\right)\delta(u)\,du = 0,$$

where in this case the point u = 0 turns out to be outside the range of
integration.
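Thus we have obtained

$$\mathbf A(\mathbf r,t) = \frac{1}{c}\int \frac{\mathbf j\!\left(\mathbf r', t - \frac{|\mathbf r-\mathbf r'|}{c}\right)}{|\mathbf r-\mathbf r'|}\,d\mathbf r', \qquad (24.16)$$

i.e. the expression already known for the retarded potential.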

The integral with the second delta function corresponded to the advanced 
potential. The advanced potential gave no contribution to the vector potential 
only because, in obtaining the solution (24.12) of eq. (24.9), we made use of 
the initial conditions (24.10) and (24.11), which determined the form of 
a(k, t). 

If we wrote the general solution of eq. (24.9) without giving initial condi- 
tions, then the final expression for the potential would involve a linear com- 
bination of the retarded and advanced potentials, as well as the general solu- 
tion of the homogeneous equation. 

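Later we shall need an expression for the potential in the special case when
the dependence of j(r, t) on time is expressed by a simple harmonic law:

$$\mathbf j(\mathbf r,t) = \mathbf j_0(\mathbf r)\,e^{i\omega t}. \qquad (24.17)$$

Substitution into (24.16) gives

$$\mathbf A(\mathbf r,t) = \frac{e^{i\omega t}}{c}\int \mathbf j_0(\mathbf r')\,\frac{e^{-i\frac{\omega}{c}|\mathbf r-\mathbf r'|}}{|\mathbf r-\mathbf r'|}\,d\mathbf r' = \mathbf A_0(\mathbf r)\,e^{i\omega t}. \qquad (24.18)$$

As is to be expected, the vector potential depends on time in the same way
as the current. The amplitude A₀(r) is obviously equal to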










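$$\mathbf A_0(\mathbf r) = \frac{1}{c}\int \frac{\mathbf j_0(\mathbf r')\,e^{-i\frac{\omega}{c}|\mathbf r-\mathbf r'|}}{|\mathbf r-\mathbf r'|}\,d\mathbf r'. \qquad (24.19)$$

On the other hand, substituting (24.17) and (24.18) into (24.1), we have

$$\nabla^2\mathbf A_0 + \frac{\omega^2}{c^2}\,\mathbf A_0 = -\frac{4\pi}{c}\,\mathbf j_0. \qquad (24.20)$$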

Thus, formula (24.19) gives the solution of eq. (24.20) which we shall en- 
counter later. 
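
Away from the sources, the kernel exp(−i(ω/c)R)/R of (24.19), with R = |r − r'|, should satisfy the homogeneous form of eq. (24.20). The sketch below checks this by finite differences, using the radial part of the Laplacian for a function of R alone; the values of ω, c and R are arbitrary assumptions.

```python
import cmath

# Sketch: finite-difference check that G(R) = exp(-i k R) / R, k = omega / c,
# satisfies (1/R) d2(R G)/dR2 + k^2 G = 0 for R > 0, i.e. the homogeneous
# form of eq. (24.20) for a spherically symmetric function.
# The numerical values are illustrative assumptions.


def kernel(R, k):
    """Spherical-wave kernel appearing under the integral in (24.19)."""
    return cmath.exp(-1j * k * R) / R


def helmholtz_residual(R, k, h=1e-3):
    """Central-difference estimate of (1/R) d2(R G)/dR2 + k^2 G."""
    d2 = ((R + h) * kernel(R + h, k) - 2.0 * R * kernel(R, k)
          + (R - h) * kernel(R - h, k)) / h**2
    return d2 / R + k**2 * kernel(R, k)


omega, c = 3.0, 2.0
k = omega / c
residual = helmholtz_residual(R=1.2, k=k)
term_size = abs(k**2 * kernel(1.2, k))   # size of one of the cancelling terms
```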

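In conclusion it should be noted that the solution of Poisson’s equation
represents a particular case of the problem. Assuming in (24.20) and (24.19)
that ω = 0 we arrive at the relations:

$$\nabla^2\mathbf A_0 = -\frac{4\pi}{c}\,\mathbf j_0, \qquad (24.21)$$

$$\mathbf A_0 = \frac{1}{c}\int \frac{\mathbf j_0(\mathbf r')\,d\mathbf r'}{|\mathbf r-\mathbf r'|}. \qquad (24.22)$$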


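The expression (24.22) can also be obtained by a direct solution of
(24.21) by Fourier integral expansion. Indeed, assuming that

$$\mathbf A_0 = \frac{1}{(2\pi)^{3/2}}\int e^{-i\mathbf k\cdot\mathbf r}\,\mathbf a_0(\mathbf k)\,d\mathbf k, \qquad (24.23)$$

$$\mathbf j_0 = \frac{1}{(2\pi)^{3/2}}\int e^{-i\mathbf k\cdot\mathbf r}\,\mathbf p_0(\mathbf k)\,d\mathbf k, \qquad (24.24)$$

and substituting (24.23) and (24.24) into (24.21), we have

$$\mathbf a_0 = \frac{4\pi\,\mathbf p_0}{ck^2}. \qquad (24.25)$$

Consequently,

$$\mathbf A_0 = \frac{1}{(2\pi)^{3/2}}\int e^{-i\mathbf k\cdot\mathbf r}\,\frac{4\pi\,\mathbf p_0}{ck^2}\,d\mathbf k. \qquad (24.26)$$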




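By means of the inversion formula we obtain from (24.24)

$$\mathbf p_0 = \frac{1}{(2\pi)^{3/2}}\int e^{i\mathbf k\cdot\mathbf r'}\,\mathbf j_0(\mathbf r')\,d\mathbf r'.$$

Substitution of this expression into (24.26) leads to the formula

$$\mathbf A_0 = \frac{4\pi}{(2\pi)^3 c}\int \mathbf j_0(\mathbf r')\,d\mathbf r'\int \frac{e^{-i\mathbf k\cdot(\mathbf r-\mathbf r')}}{k^2}\,d\mathbf k = \frac{1}{2\pi^2 c}\int \mathbf j_0(\mathbf r')\,d\mathbf r'\int e^{-ik|\mathbf r-\mathbf r'|\cos\theta}\,\sin\theta\,d\theta\,d\phi\,dk$$

$$= \frac{2}{\pi c}\int \mathbf j_0(\mathbf r')\,d\mathbf r'\int_0^\infty \frac{\sin\{k|\mathbf r-\mathbf r'|\}}{k|\mathbf r-\mathbf r'|}\,dk = \frac{1}{c}\int \frac{\mathbf j_0(\mathbf r')\,d\mathbf r'}{|\mathbf r-\mathbf r'|},$$

where we have made use of the equality

$$\int_0^\infty \frac{\sin\{k|\mathbf r-\mathbf r'|\}}{k|\mathbf r-\mathbf r'|}\,dk = \frac{1}{|\mathbf r-\mathbf r'|}\int_0^\infty \frac{\sin u}{u}\,du = \frac{\pi}{2|\mathbf r-\mathbf r'|}.$$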


§ 25*. The field of a point charge moving arbitrarily

The case of a single point charge moving arbitrarily is an important
example of the application of the formula for retarded potentials. If the velocity
of the charge is equal to v₀(t), then its coordinate is r₀ = ∫ v₀ dt. The charge
density and current density can be written in the form

$$\rho = e\,\delta(\mathbf r' - \mathbf r_0),$$

$$\mathbf j = \rho\mathbf v_0 = e\,\delta(\mathbf r' - \mathbf r_0)\,\mathbf v_0(t).$$

Then the following expressions are obtained for the retarded potentials:





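$$\mathbf A(\mathbf r,t) = \frac{1}{c}\int \frac{\mathbf j(\mathbf r',\tau)}{|\mathbf r-\mathbf r'|}\,dV' = \frac{e}{c}\int \frac{\mathbf v_0\!\left(t - \frac{|\mathbf r-\mathbf r'|}{c}\right)\delta\!\left(\mathbf r' - \mathbf r_0\!\left(t - \frac{|\mathbf r-\mathbf r'|}{c}\right)\right)}{|\mathbf r-\mathbf r'|}\,dV', \qquad (25.1)$$

$$\varphi(\mathbf r,t) = e\int \frac{\delta(\mathbf r' - \mathbf r_0(\tau))\,dV'}{|\mathbf r-\mathbf r'|} = e\int \frac{\delta\!\left(\mathbf r' - \mathbf r_0\!\left(t - \frac{|\mathbf r-\mathbf r'|}{c}\right)\right)}{|\mathbf r-\mathbf r'|}\,dV'. \qquad (25.2)$$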


Since r₀ in the integrand of formulae (25.1) and (25.2) is a function of the
retarded time t − |r − r'|/c, one cannot make use of the property of the delta
function directly and assume r' = r₀. In order to carry out the integration we
introduce new variables:

$$l_x = x' - x_0\!\left(t - \frac{|\mathbf r-\mathbf r'|}{c}\right), \quad l_y = y' - y_0\!\left(t - \frac{|\mathbf r-\mathbf r'|}{c}\right), \quad l_z = z' - z_0\!\left(t - \frac{|\mathbf r-\mathbf r'|}{c}\right).$$

The calculation of the Jacobian for the transition to the new variables is in
general rather cumbersome. For simplicity we assume that the charge is
moving along the x axis, so that v₀ₓ = v₀. We then have








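$$\frac{\partial l_x}{\partial x'} = 1 - \frac{\partial x_0}{\partial\tau}\frac{\partial\tau}{\partial x'} = 1 - v_0\,\frac{\partial}{\partial x'}\left(t - \frac{|\mathbf r-\mathbf r'|}{c}\right) = 1 - \frac{v_0}{c}\,\frac{x - x'}{|\mathbf r-\mathbf r'|},$$

$$\frac{\partial l_y}{\partial y'} = \frac{\partial l_z}{\partial z'} = 1, \qquad \frac{\partial l_y}{\partial x'} = \frac{\partial l_y}{\partial z'} = \frac{\partial l_z}{\partial x'} = \frac{\partial l_z}{\partial y'} = 0.$$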




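Thus, the Jacobian is equal to

$$\frac{\partial(l_x, l_y, l_z)}{\partial(x', y', z')} = 1 - \frac{v_0(x - x')}{c|\mathbf r-\mathbf r'|}.$$

An analogous calculation for an arbitrary orientation of the velocity leads
to the expression

$$\frac{\partial(l_x, l_y, l_z)}{\partial(x', y', z')} = 1 - \frac{\mathbf v_0\cdot(\mathbf r-\mathbf r')}{c|\mathbf r-\mathbf r'|}.$$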
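
The general expression for the Jacobian can be checked numerically by differentiating the new variables l = r' − r₀(t − |r − r'|/c) with respect to r'. The sketch below does this with central differences; the trajectory `r0`, its velocity `v0`, the units with c = 1 and the sample points are hypothetical choices made only for the test.

```python
import math

# Sketch: compare a finite-difference Jacobian d(l_x, l_y, l_z)/d(x', y', z')
# with the expression 1 - v0 . (r - r') / (c |r - r'|).
# The trajectory and all numerical values are illustrative assumptions.

C = 1.0  # velocity of light in the units of this sketch


def r0(tau):
    """Hypothetical smooth trajectory with speed below C."""
    return (0.3 * math.sin(tau), 0.2 * math.cos(tau), 0.1 * tau)


def v0(tau):
    """Velocity of the charge, the time derivative of r0."""
    return (0.3 * math.cos(tau), -0.2 * math.sin(tau), 0.1)


def l_vector(rp, r, t):
    """New variables l = r' - r0(t - |r - r'|/c)."""
    pos = r0(t - math.dist(r, rp) / C)
    return [rp[i] - pos[i] for i in range(3)]


def det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def jacobian_numeric(rp, r, t, h=1e-6):
    """Determinant of dl_i/dx'_j estimated by central differences."""
    J = [[0.0] * 3 for _ in range(3)]
    for j in range(3):
        plus, minus = list(rp), list(rp)
        plus[j] += h
        minus[j] -= h
        lp, lm = l_vector(plus, r, t), l_vector(minus, r, t)
        for i in range(3):
            J[i][j] = (lp[i] - lm[i]) / (2.0 * h)
    return det3(J)


def jacobian_formula(rp, r, t):
    """1 - v0 . (r - r') / (c |r - r'|), velocity taken at the retarded time."""
    R = math.dist(r, rp)
    v = v0(t - R / C)
    return 1.0 - sum(v[i] * (r[i] - rp[i]) for i in range(3)) / (C * R)


r_obs, r_prime, t_obs = (2.0, -1.0, 0.5), (0.1, 0.3, -0.2), 1.0
j_num = jacobian_numeric(r_prime, r_obs, t_obs)
j_formula = jacobian_formula(r_prime, r_obs, t_obs)
```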


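Hence one can write

$$dx'\,dy'\,dz' = \frac{\partial(x', y', z')}{\partial(l_x, l_y, l_z)}\,dl_x\,dl_y\,dl_z = \frac{d\mathbf l}{1 - \dfrac{\mathbf v_0\cdot(\mathbf r-\mathbf r')}{c|\mathbf r-\mathbf r'|}},$$

where dl stands for dl_x dl_y dl_z.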

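Transforming the expressions (25.1) and (25.2) to new variables, we obtain

$$\mathbf A(\mathbf r,t) = \frac{e}{c}\int \frac{\mathbf v_0\,\delta(\mathbf l)\,d\mathbf l}{|\mathbf r-\mathbf l-\mathbf r_0|\left[1 - \dfrac{\mathbf v_0\cdot(\mathbf r-\mathbf l-\mathbf r_0)}{c|\mathbf r-\mathbf l-\mathbf r_0|}\right]} = \frac{e\mathbf v_0}{c\left[|\mathbf r-\mathbf r_0| - \dfrac{\mathbf v_0\cdot(\mathbf r-\mathbf r_0)}{c}\right]} = \frac{e\mathbf v_0(\tau)}{c\left[R(\tau) - \dfrac{\mathbf v_0(\tau)\cdot\mathbf R(\tau)}{c}\right]}, \qquad (25.3)$$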

where R(τ) stands for the radius vector drawn from the instantaneous
position of the charge to the point of observation, i.e. R(τ) = r − r₀. The value of
the instantaneous position of the charge must be taken at the time τ:

$$\tau = t - \frac{|\mathbf r-\mathbf r_0|}{c}.$$

The value of the instantaneous velocity v₀(τ) is also taken at the time τ.
Analogously,








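$$\varphi(\mathbf r,t) = e\int \frac{\delta(\mathbf l)\,d\mathbf l}{|\mathbf r-\mathbf l-\mathbf r_0|\left[1 - \dfrac{\mathbf v_0\cdot(\mathbf r-\mathbf l-\mathbf r_0)}{c|\mathbf r-\mathbf l-\mathbf r_0|}\right]} = \frac{e}{|\mathbf r-\mathbf r_0| - \dfrac{\mathbf v_0\cdot(\mathbf r-\mathbf r_0)}{c}} = \frac{e}{R(\tau) - \dfrac{\mathbf v_0(\tau)\cdot\mathbf R(\tau)}{c}}. \qquad (25.4)$$

Between φ and A there is the relation

$$\mathbf A = \frac{\mathbf v_0\,\varphi}{c}. \qquad (25.5)$$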


The potentials of the field of an arbitrarily moving point charge (25.3) and
(25.4) are called the Liénard-Wiechert potentials. If the abbreviated notation

$$r(\tau) = R(\tau) - \frac{\mathbf v_0(\tau)\cdot\mathbf R(\tau)}{c} \qquad (25.6)$$

is introduced, then the Liénard-Wiechert potentials assume the form

$$\mathbf A = \frac{e\mathbf v_0}{cr}, \qquad (25.7)$$

$$\varphi = \frac{e}{r}. \qquad (25.8)$$


The importance of the Liénard-Wiechert potentials lies in the fact that they
characterize the field of a point charge in the most general form, for an
arbitrary value of the velocity and the character of the motion.

It is easy to see that for a velocity of a charge |v₀| which is very small in
comparison with the velocity of light, i.e. for |v₀|/c → 0, the expression for
the scalar potential φ goes over into formula (20.2), and the vector potential
A turns out to be small. However, in deriving the Liénard-Wiechert potentials
we have put no restrictions upon the value of the velocity. Hence the
Liénard-Wiechert potentials characterize the field of a point charge in the most general
case, for an arbitrary character and velocity of motion.
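
To evaluate (25.3) and (25.4) in practice one first has to find the retarded time from the implicit relation τ = t − |r − r₀(τ)|/c. The sketch below does this by simple fixed-point iteration, which converges for speeds below c, and then evaluates the potentials. The function names, the uniformly moving test charge and the units e = c = 1 are assumptions; the closed-form expression used for comparison is the standard result for a uniformly moving charge and is not derived in this section.

```python
import math

# Sketch: Lienard-Wiechert potentials (25.3)-(25.4) for a prescribed motion.
# The retarded time is found by fixed-point iteration of
# tau = t - |r - r0(tau)| / c. Names and test values are illustrative.


def retarded_time(t, r, trajectory, c=1.0, tol=1e-14, max_iter=500):
    """Solve tau = t - |r - r0(tau)|/c; a contraction when the speed is < c."""
    tau = t
    for _ in range(max_iter):
        tau_next = t - math.dist(r, trajectory(tau)) / c
        if abs(tau_next - tau) < tol:
            return tau_next
        tau = tau_next
    return tau


def lienard_wiechert(t, r, trajectory, velocity, e=1.0, c=1.0):
    """Return (phi, A) at the point r and time t."""
    tau = retarded_time(t, r, trajectory, c)
    r0, v0 = trajectory(tau), velocity(tau)
    R_vec = [r[i] - r0[i] for i in range(3)]          # R(tau) = r - r0(tau)
    R = math.sqrt(sum(x * x for x in R_vec))
    denom = R - sum(v0[i] * R_vec[i] for i in range(3)) / c
    phi = e / denom
    A = tuple(e * v0[i] / (c * denom) for i in range(3))
    return phi, A


V = 0.6                                               # assumed speed, c = 1
phi, A = lienard_wiechert(2.0, (3.0, 1.0, -2.0),
                          trajectory=lambda tau: (V * tau, 0.0, 0.0),
                          velocity=lambda tau: (V, 0.0, 0.0))

# Known closed form for uniform motion along x (e = c = 1):
# phi = 1 / sqrt((x - V t)^2 + (1 - V^2)(y^2 + z^2)).
phi_uniform = 1.0 / math.sqrt((3.0 - V * 2.0) ** 2
                              + (1.0 - V ** 2) * (1.0 ** 2 + 2.0 ** 2))
```

The x component of A should equal Vφ, in agreement with relation (25.5).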

The field of a charge moving arbitrarily will be discussed in more detail in 
§ 20 of Part II, since a number of inferences from formulae (25.7)—(25.8) can 
become clear only in the light of the theory of relativity. 














Radiation Theory 


§ 26. The potentials of the electromagnetic field at a large distance 
from the emitter in the dipole approximation 


The general formulae for retarded potentials obtained in §23 are very 
complex. Indeed, since the expressions for the charge density and current 
density contained in (23.22) and (23.23) are functions of the delay time in 
the corresponding integrals, it is necessary to take the values of these quanti- 
ties at different times at each point of the system for the calculation of 
potentials at time t. Hence, except for the case of a single point charge con- 
sidered in the preceding paragraph, one is unable to obtain accurate concrete 
expressions for the potentials by means of the general formulae (23.22) and 
(23.23). 

If, however, the point of observation is located at a sufficiently large
distance from the system of moving charges, so that |r| ≫ L', where
L' ~ (V')^{1/3} is the characteristic linear dimension of the system, then the
expressions (23.22) and (23.23) allow a simplification. Namely, the expression
1/|r − r'| figuring in the integrals can be expanded in a series, as was done in
calculating the fields of motionless (§15) and slowly moving (§21) charges.
We then obtain
















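$$\begin{aligned}
\varphi &= \int \frac{\rho(\mathbf{r}', \tau)}{|\mathbf{r}-\mathbf{r}'|}\,dV' \approx \int \left(\frac{1}{r} + \frac{\mathbf{r}\cdot\mathbf{r}'}{r^3}\right)\rho\!\left(\mathbf{r}',\, t - \frac{|\mathbf{r}-\mathbf{r}'|}{c}\right) dV' \\
&= \int \frac{1}{r}\,\rho\!\left(\mathbf{r}',\, t - \frac{|\mathbf{r}-\mathbf{r}'|}{c}\right) dV' + \int \frac{\mathbf{r}\cdot\mathbf{r}'}{r^3}\,\rho\!\left(\mathbf{r}',\, t - \frac{|\mathbf{r}-\mathbf{r}'|}{c}\right) dV',
\end{aligned} \tag{26.1}$$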


where r is the distance from the point of observation N to the origin O. 


It should be noted that ∫ ρ(r', τ) dV' is by no means the total charge of
the system, i.e.

$$e \neq \int \rho(\mathbf{r}', \tau)\,dV'.$$

Indeed, the value of the density in this integral depends on the argument
τ = t − |r − r'|/c and represents a complicated function of the coordinates r, r'
and the time. The delay time is different for each point in the volume V'. For
this reason the integrals in (26.1) cannot be calculated in a general form.

A further simplification arises in the case where instead of a different
delay time for each point of the system one can introduce one general delay
time for the entire system. Namely, writing the argument τ = t − |r − r'|/c in
the form

$$\tau = t - \frac{|\mathbf{r}-\mathbf{r}'|}{c} \approx t - \frac{r}{c} + \frac{\mathbf{r}'\cdot\mathbf{r}}{cr}, \tag{26.2}$$


we see that the total delay time |r − r'|/c is made up of two parts. The first
one, equal to r/c and called the delay time of the system, represents the time
needed for the propagation of the electromagnetic field from the origin O to
the point of observation N. The second part, equal to r'·r/cr and called the
proper retardation, also has a simple meaning: it is the time needed for the
propagation of the field to the limits of the system. In order of magnitude
r'·r/cr ~ L'/c, and for |r| ≫ L' the absolute value of the proper
retardation |r'|/c is small in comparison with |r|/c. However, this does not
mean that the charge density can be expanded in a series in the small
parameter r'·r/cr:


108 RADIATION THEORY Ch. 5 





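$$\begin{aligned}
\rho(\mathbf{r}', \tau) &= \rho\!\left(\mathbf{r}',\, t - \frac{r}{c} + \frac{\mathbf{r}'\cdot\mathbf{r}}{cr}\right) \approx \rho\!\left(\mathbf{r}',\, t - \frac{r}{c}\right) + \frac{\mathbf{r}'\cdot\mathbf{r}}{cr}\,\frac{\partial}{\partial t}\,\rho\!\left(\mathbf{r}',\, t - \frac{r}{c}\right) \\
&= \rho(\mathbf{r}', \tau_0) + \frac{\mathbf{r}'\cdot\mathbf{r}}{cr}\,\frac{\partial \rho(\mathbf{r}', \tau_0)}{\partial \tau_0}, \qquad \tau_0 = t - \frac{r}{c}.
\end{aligned} \tag{26.3}$$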


Indeed, if in a time equal to the proper delay time r'·r/cr the configuration
of the charges in the system manages to change appreciably, i.e. if in this time
the charges manage to be displaced appreciably in the system, then the charge
density at a moment t − r/c will differ substantially from the charge density at
a moment

$$t - \frac{r}{c} + \frac{\mathbf{r}'\cdot\mathbf{r}}{cr}.$$


In other words, the charge density ρ will be a rapidly varying function of its
argument, and it is inadmissible to make use of the equality (26.3). In order
that this equality may hold, it is necessary that in the time r'·r/cr, in the
course of which a field propagating with velocity c crosses the system, the
charges in the system moving with a velocity v will not be displaced
appreciably. During the time r'·r/cr the charges traverse a path of the order of

$$v\,\frac{\mathbf{r}'\cdot\mathbf{r}}{cr} \sim v\,\frac{L'}{c}.$$


If this path is small in comparison with the dimensions of the system, it can
be assumed that during the time of proper retardation the disposition of the
charges in the system does not change.

Thus, it can be assumed that for

$$v\,\frac{L'}{c} \ll L',$$

or at velocities of motion satisfying the inequality

$$v \ll c, \tag{26.4}$$

the change of the configuration in the time of proper retardation is small.







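Then ρ(r', τ) is a slowly varying function of its argument. This means that to
small changes in τ there correspond small changes in ρ, and use can be made
of the expansion of ρ in powers of the small retardation. Substituting (26.3)
into (26.1) and confining ourselves to the terms of the expansion containing
the lowest powers of 1/r, we find

$$\begin{aligned}
\varphi &\approx \int \left(\frac{1}{r} + \frac{\mathbf{r}\cdot\mathbf{r}'}{r^3}\right)\left(\rho(\mathbf{r}', \tau_0) + \frac{\mathbf{r}'\cdot\mathbf{r}}{cr}\,\frac{\partial \rho(\mathbf{r}', \tau_0)}{\partial \tau_0}\right) dV' \\
&\approx \int \frac{\rho(\mathbf{r}', \tau_0)}{r}\,dV' + \int \frac{\mathbf{r}\cdot\mathbf{r}'}{cr^2}\,\frac{\partial \rho(\mathbf{r}', \tau_0)}{\partial \tau_0}\,dV' \\
&= \frac{1}{r}\int \rho(\mathbf{r}', \tau_0)\,dV' + \frac{\mathbf{n}}{cr}\cdot\int \mathbf{r}'\,\frac{\partial \rho(\mathbf{r}', \tau_0)}{\partial \tau_0}\,dV',
\end{aligned} \tag{26.5}$$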


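where n = r/r. The term

$$\frac{\mathbf{r}\cdot\mathbf{r}'}{r^3}\,\rho(\mathbf{r}', \tau_0)$$

is small in comparison with the term

$$\frac{\mathbf{r}\cdot\mathbf{r}'}{cr^2}\,\frac{\partial \rho(\mathbf{r}', \tau_0)}{\partial \tau_0}$$

at a sufficiently large distance from the system.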

In formula (26.5) a very substantial simplification is made in comparison
with (26.1), since the charge density at all points of the system is taken at one
and the same moment

$$\tau_0 = t - \frac{r}{c}.$$

The first term in (26.5) has a simple meaning:

$$\rho(\mathbf{r}', \tau_0) = \rho\!\left(\mathbf{r}',\, t - \frac{r}{c}\right)$$

represents the charge density in the system at the moment τ0. The integral
∫ ρ(r', τ0) dV' gives the total charge of the system. For an electrically neutral
system it is equal to zero. In this case we have

$$\varphi(\mathbf{r}, t) = \frac{\mathbf{n}}{cr}\cdot\int \mathbf{r}'\,\frac{\partial \rho(\mathbf{r}', \tau_0)}{\partial \tau_0}\,dV'. \tag{26.6}$$


We rewrite the integral on the right-hand side of formula (26.6) by making
use of the continuity equation (5.3):

$$\int \mathbf{r}'\,\frac{\partial \rho(\mathbf{r}', \tau_0)}{\partial \tau_0}\,dV' = -\int \mathbf{r}'\,(\nabla'\cdot\mathbf{j})\,dV'.$$


The above integral is conveniently calculated in the coordinate representation.
Namely, we have

$$\int x'\,\frac{\partial j_{x'}}{\partial x'}\,dx' = \Big[x' j_{x'}\Big]_{x'_1}^{x'_2} - \int j_{x'}\,dx',$$

where x'_1 and x'_2 are the limits of the region of motion of the charges, at
which the current density reduces to zero, so that

$$\int x'\,\frac{\partial j_{x'}}{\partial x'}\,dx' = -\int j_{x'}\,dx'.$$

Correspondingly, in the vector form we obtain

$$\int \mathbf{r}'\,\frac{\partial \rho(\mathbf{r}', \tau_0)}{\partial \tau_0}\,dV' = \int \mathbf{j}\,dV'. \tag{26.7}$$


Substituting (26.7) into (26.6), we find the scalar potential as a function of
the current density in the system

$$\varphi(\mathbf{r}, t) = \frac{\mathbf{n}}{cr}\cdot\int \mathbf{j}(\mathbf{r}', \tau_0)\,dV'. \tag{26.8}$$


In an analogous way we can obtain the expression for the vector potential
by expanding the expression 1/|r − r'| in formula (23.23) in a series and
disregarding the proper retardation:

$$\mathbf{A}(\mathbf{r}, t) = \frac{1}{cr}\int \mathbf{j}(\mathbf{r}', \tau_0)\,dV'. \tag{26.9}$$


Comparing (26.8) and (26.9) we find that there is between φ and A the simple
relation:

$$\varphi = \mathbf{A}\cdot\mathbf{n}. \tag{26.10}$$


The integral ∫ r' ∂ρ(r', τ0)/∂τ0 dV' has a simple meaning. Namely, from the
definition of the dipole moment (15.14) we see that

$$\int \mathbf{j}\,dV' = \int \mathbf{r}'\,\frac{\partial \rho(\mathbf{r}', \tau_0)}{\partial \tau_0}\,dV' = \frac{\partial}{\partial \tau_0}\int \mathbf{r}'\,\rho(\mathbf{r}', \tau_0)\,dV' = \dot{\mathbf{d}}(\tau_0), \tag{26.11}$$

where ḋ(τ0) is the derivative of the dipole moment with respect to time,
taken at the instant τ0. Here we have used the fact that r' is an independent
integration variable which does not depend on τ0.


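By means of (26.11) the expressions (26.8) and (26.9) can be written in
the form

$$\varphi(\mathbf{r}, t) = \frac{\mathbf{n}\cdot\dot{\mathbf{d}}(\tau_0)}{cr}, \tag{26.12}$$

$$\mathbf{A}(\mathbf{r}, t) = \frac{\dot{\mathbf{d}}(\tau_0)}{cr}. \tag{26.13}$$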


We see that in the approximation when the proper retardation can be dis- 
regarded the field potentials at a large distance from the system are deter- 
mined by the value of the derivative of its dipole moment with respect to 
time. Hence such an approximation in calculating field potentials is called the 
dipole approximation. The condition of applicability of the dipole approxi- 
mation is the fulfillment of the inequality (26.4). 

In the dipole approximation the field potentials at a large distance from an
electrically neutral system decrease according to the law 1/r, whereas the
analogous electrostatic potential of an electrically neutral system of charges
at rest possessing a dipole moment varies according to the law 1/r².

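It is easy to verify that the Lorentz condition is fulfilled for the potentials
found in the dipole approximation. We have

$$\nabla\cdot\mathbf{A} = \frac{1}{c}\,\nabla\cdot\frac{\dot{\mathbf{d}}(\tau_0)}{r} = \frac{1}{cr}\,\nabla\cdot\dot{\mathbf{d}}(\tau_0) + \frac{1}{c}\left(\dot{\mathbf{d}}(\tau_0)\cdot\nabla\frac{1}{r}\right) \approx \frac{1}{cr}\,\nabla\cdot\dot{\mathbf{d}}(\tau_0). \tag{26.14}$$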


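In calculating (26.14), the second term in the sum, which is proportional to
1/r², is small in comparison with the first, which is proportional to 1/r, and
can be discarded. We see that in differentiating with respect to coordinates at
a large distance from the emitter the quantity 1/r can be considered to be
constant. Further, according to formula (1.39) we have

$$\nabla\cdot\dot{\mathbf{d}}(\tau_0) = \frac{\partial \dot{\mathbf{d}}(\tau_0)}{\partial \tau_0}\cdot\nabla\tau_0 = -\frac{\ddot{\mathbf{d}}\cdot\mathbf{n}}{c}.$$

Thus

$$\nabla\cdot\mathbf{A} = -\frac{\ddot{\mathbf{d}}\cdot\mathbf{n}}{c^2 r}.$$

On the other hand,

$$\frac{1}{c}\,\frac{\partial \varphi}{\partial t} = \frac{1}{c}\,\frac{\partial}{\partial t}\,\frac{\dot{\mathbf{d}}\cdot\mathbf{n}}{cr} = \frac{\ddot{\mathbf{d}}\cdot\mathbf{n}}{c^2 r},$$

hence

$$\nabla\cdot\mathbf{A} + \frac{1}{c}\,\frac{\partial \varphi}{\partial t} = 0,$$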


as was to be expected. 

The results obtained have a simple and very important meaning: when 
charges in a system are moving (i.e. when the dipole moment of the system 
changes in time) an electromagnetic field arises in the surrounding space. The 
potentials of this field decrease relatively slowly (according to the law 1/r) 
with increasing distance from the system and depend on time. 

A system of non-uniformly moving charges is an emitter of radiation. 

In the following paragraphs we shall consider in more detail the radiation 
field and the properties of systems emitting radiation. 


§27. The electromagnetic field of dipole radiation 
at a large distance from the emitter 


Knowing the potential distribution, one can find the values of the magnetic
and electric field. We have

$$\mathbf{H} = \nabla\times\mathbf{A} = \frac{1}{c}\,\nabla\times\frac{\dot{\mathbf{d}}(\tau_0)}{r}.$$

In calculating the curl at a large distance from the emitter the calculation is
to be carried out in the same way as for the divergence in formula (26.14): in
differentiating with respect to coordinates the factor 1/r should be assumed
to be constant. Then, according to formula (1.40), we find

$$\mathbf{H} = \frac{1}{cr}\,\nabla\times\dot{\mathbf{d}}(\tau_0) = \frac{1}{cr}\,\nabla\tau_0\times\frac{\partial \dot{\mathbf{d}}(\tau_0)}{\partial \tau_0} = \frac{1}{c^2 r}\,\ddot{\mathbf{d}}\times\mathbf{n}. \tag{27.1}$$


We have omitted writing d explicitly as a function of τ0; however, here and in
all subsequent relations of this section we understand that d is a function of
the argument τ0 = t − r/c.

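For any function f of the retarded argument t − r/c we have

$$\nabla f\!\left(t - \frac{r}{c}\right) = -\frac{\mathbf{n}}{c}\,\frac{\partial f}{\partial t},$$

where it is again assumed that the factor 1/r is constant in differentiating
with respect to coordinates. Making use of (26.10), we then find

$$\begin{aligned}
\mathbf{E} &= -\frac{1}{c}\,\frac{\partial \mathbf{A}}{\partial t} - \nabla\varphi = -\frac{1}{c}\,\dot{\mathbf{A}} + \frac{\mathbf{n}}{c}\,\dot{\varphi} = -\frac{1}{c}\,\dot{\mathbf{A}} + \frac{1}{c}\,\mathbf{n}\,(\mathbf{n}\cdot\dot{\mathbf{A}}) \\
&= \frac{1}{c}\left\{\mathbf{n}\,(\mathbf{n}\cdot\dot{\mathbf{A}}) - \dot{\mathbf{A}}\right\} = \frac{1}{c}\,(\dot{\mathbf{A}}\times\mathbf{n})\times\mathbf{n} = \frac{1}{c^2 r}\,(\ddot{\mathbf{d}}\times\mathbf{n})\times\mathbf{n}.
\end{aligned} \tag{27.2}$$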
c c c¢r 


Comparing (27.2) with (27.1) we see that the vectors E and H are connected
by the relation

$$\mathbf{E} = \mathbf{H}\times\mathbf{n}. \tag{27.3}$$

The strengths of the electric and magnetic field depend on coordinates and
time according to the law

$$|\mathbf{H}| = |\mathbf{E}| = \frac{|\ddot{\mathbf{d}}(t - r/c)\times\mathbf{n}|}{c^2 r}.$$

As we have seen in §23, the above formula is an expression for a spherical
wave. The amplitude of the wave decreases at a large distance from the
emitter according to the law ~ 1/r. In this case the vectors of the electric and
magnetic field are equal to each other in absolute value, and are perpendicular
to each other and to the radius vector r.

The region at a large distance from the emitter, in which the electromag- 
netic field is described by spherical waves, is called the radiation zone. Some- 
what later we shall define more precisely the notion of the radiation zone. 

We introduce the spherical system of coordinates r, θ, φ (fig. 1.10) with the
polar axis directed along the vector d̈. The direction of the vector H is
determined by the vector [n × d̈], directed along the tangent to the latitude line
on the surface of the sphere and oriented in the direction of decreasing
azimuthal angle φ, so that

$$[\mathbf{n}\times\ddot{\mathbf{d}}]_r = 0, \qquad [\mathbf{n}\times\ddot{\mathbf{d}}]_\theta = 0, \qquad [\mathbf{n}\times\ddot{\mathbf{d}}]_\varphi = -\ddot{d}\,\sin\theta.$$

Hence, since H = −[n × d̈]/c²r by (27.1), the vector H in spherical coordinates
has the following components:

$$H_r = 0, \qquad H_\theta = 0, \qquad H_\varphi = \frac{\ddot{d}}{c^2 r}\,\sin\theta. \tag{27.4}$$

The vector E is directed perpendicularly to the vectors H and n, along the
tangent to the line of longitude:

$$E_r = 0, \qquad E_\theta = \frac{\ddot{d}}{c^2 r}\,\sin\theta, \qquad E_\varphi = 0. \tag{27.5}$$
9 cr 


Formulae (27.4) and (27.5) show that the field strengths have the largest
values at θ = π/2 (in the equatorial plane) and decrease down to zero as the
distance from the polar axis decreases.

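We calculate the Poynting vector of the emitting system:

$$\begin{aligned}
\boldsymbol{\sigma} &= \frac{c}{4\pi}\,\mathbf{E}\times\mathbf{H} = \frac{c}{4\pi}\,H^2\,\mathbf{n} = \frac{c}{8\pi}\,(E^2 + H^2)\,\mathbf{n} = c\,u\,\mathbf{n} \\
&= \frac{1}{4\pi c^3 r^2}\,[\ddot{\mathbf{d}}\times\mathbf{n}]^2\,\mathbf{n} = \frac{\ddot{d}^{\,2}\sin^2\theta}{4\pi c^3 r^2}\,\mathbf{n},
\end{aligned} \tag{27.6}$$

where u = (E² + H²)/8π is the energy density of the field. The Poynting vector
turns out to be directed along the radius vector and is equal in absolute value to

$$\sigma = \frac{\ddot{d}^{\,2}\sin^2\theta}{4\pi c^3 r^2}. \tag{27.7}$$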

The fact that the Poynting vector is different from zero and is always
directed away from the emitting system has an obvious meaning: there is a
flux of electromagnetic energy emitted by the system into the surrounding
space. Formula (27.7) determines the flux density of the emitted energy as a
function of the orientation in space (the angle θ) and the distance from the
emitting system. The presence of the energy flux justifies the terms
“radiation” and “emitter”, which we introduced earlier.

We stress that in an actual observation of radiation in a certain direction n,
only the component of the second derivative of the dipole moment (d̈) lying in
the plane perpendicular to n is effective, as is seen from (27.6).

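The energy flux through the vector surface element dΣ enclosing a solid angle
dΩ is the power radiated in the solid angle dΩ. For the power radiated dI we
have

$$\begin{aligned}
dI &= \boldsymbol{\sigma}\cdot d\boldsymbol{\Sigma} = \sigma\,d\Sigma = \sigma r^2\,d\Omega = \frac{c}{4\pi}\,H^2 r^2\,d\Omega = \frac{\ddot{d}^{\,2}\sin^2\theta}{4\pi c^3}\,d\Omega \\
&= \frac{\ddot{d}^{\,2}}{4\pi c^3}\,\sin^3\theta\,d\theta\,d\varphi = \frac{[\mathbf{n}\times\ddot{\mathbf{d}}]^2}{4\pi c^3}\,d\Omega.
\end{aligned} \tag{27.8}$$

The total flux of energy emitted by the system, usually called the total
radiated power, is equal to

$$I = -\frac{dE}{dt} = \oint \boldsymbol{\sigma}\cdot d\boldsymbol{\Sigma} = \int dI = \frac{\ddot{d}^{\,2}}{4\pi c^3}\int_0^{\pi}\sin^3\theta\,d\theta\int_0^{2\pi}d\varphi = \frac{2}{3}\,\frac{\ddot{d}^{\,2}}{c^3}. \tag{27.9}$$

Here (−dE/dt) is the decrease in the energy of the emitting system per
second.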

The power radiated in the dipole approximation is determined only by the
value of d̈(t − r/c). In other words, at a time t the value of I at a given point
depends on the value of d̈ at the preceding time t − r/c. But in other respects
the power radiated does not depend on the distance from the emitting
system, as is to be expected on the basis of the energy conservation law: the
energy flux passing per unit time through any closed surface surrounding the
emitting system has one and the same value.
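
The numerical factor 2/3 in the total radiated power can be checked by integrating the angular distribution sin²θ over the full solid angle. The sketch below uses a plain midpoint rule; the function names and the number of grid points are illustrative assumptions, not part of the text.

```python
import math


def angular_factor(n_theta=2000):
    """Midpoint-rule estimate of (1/4π) ∫ sin²θ dΩ over the whole sphere.

    With dΩ = sin θ dθ dφ the φ-integration gives 2π, so the result
    should approach (2π/4π) ∫ sin³θ dθ = 2/3.
    """
    d_theta = math.pi / n_theta
    total = 0.0
    for i in range(n_theta):
        theta = (i + 0.5) * d_theta
        total += math.sin(theta) ** 3 * d_theta
    return total * 2.0 * math.pi / (4.0 * math.pi)


def total_radiated_power(d_ddot, c=2.99792458e10):
    """Total dipole power for |d̈| = d_ddot (Gaussian units, erg/s)."""
    return angular_factor() * d_ddot ** 2 / c ** 3
```

The result does not involve r, in agreement with the statement that the same energy flux passes through every closed surface surrounding the emitter.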

In conclusion we point to the relation holding for the density of the
energy and momentum of the radiation. By virtue of formulae (27.6) and
(13.11) we can write

$$\mathbf{g} = \frac{1}{c^2}\,\boldsymbol{\sigma} = \frac{u}{c}\,\mathbf{n}. \tag{27.10}$$


We shall often encounter this important relation later. 
We see that in emitting radiation the emitting system loses both energy 
and momentum, which become the energy and momentum of the radiation. 


§ 28. Dipole radiation of simple systems 


As an example of the use of dipole radiation formulae we shall consider
several simple systems.
For a single charge, accelerated by a force F, we can write

$$\mathbf{d} = e\,\mathbf{r}, \qquad \ddot{\mathbf{d}} = e\,\ddot{\mathbf{r}} = \frac{e}{m}\,\mathbf{F}, \tag{28.1}$$

where r̈ = w is the acceleration of the charge. Hence

$$\sigma = \frac{c}{4\pi}\,H^2 = \frac{e^2 w^2 \sin^2\theta}{4\pi c^3 r^2}. \tag{28.2}$$


Energy flux is absent in the direction of the acceleration vector (θ = 0) and
has its largest value in the direction perpendicular to the acceleration vector
(θ = π/2). The total energy emitted per unit time in the solid angle dΩ (total
power of radiation in the angle dΩ) is equal to

$$dI = \frac{e^2 w^2}{4\pi c^3}\,\sin^2\theta\,d\Omega \tag{28.3}$$





and is proportional to the square of the acceleration. The total radiated power
emitted by a single charge in all directions is equal to (see (27.9))

$$I = \frac{2}{3}\,\frac{e^2 w^2}{c^3}. \tag{28.4}$$

It is determined by the value of the square of the acceleration and the square
of the charge.


We see that every charge moving with an acceleration emits energy in the 
form of electromagnetic waves. 

The application of (28.2) to a single charge needs some explanation. In 
deriving formula (27.6) the expansion was carried out with respect to a small 
parameter — the proper time of retardation — which loses its meaning in the 
case of a system consisting of one charge. 

However, in §25 we found the expressions for the field potentials of an
arbitrarily moving point charge. If one substitutes in (25.7)

$$e\,\mathbf{v}_0 = \int \rho\,\mathbf{v}_0\,dV = \int \mathbf{j}\,dV = \dot{\mathbf{d}},$$

and considers the case of motion with a velocity which is small in comparison
with the velocity of light, so that Λ(τ) ≈ r, then the expression for the vector
potential of the point charge is the same as formula (26.13).

Thus, at a large distance from a slowly moving charge its field will be the 
same as that of a system of charges in the dipole approximation. 

Let us also calculate the loss of momentum of the emitting particle. By
virtue of formulae (13.11) and (27.10) we see that the rate of loss of
momentum of the charge while emitting is

$$-\frac{d\mathbf{p}}{dt} = c\oint \mathbf{g}\,d\Sigma = \frac{1}{c}\oint \sigma\,\mathbf{n}\,d\Sigma = 0. \tag{28.5}$$

The meaning of this result lies in the fact that the emission by the charge at
angles θ and π − θ is the same. The momenta in opposite directions are mutually
compensated. It should be stressed that formulae (28.4) and (28.5) refer to a
charge which was at rest at the moment of emission. It is only for such a
charge that the relation (28.1) holds.

Let us consider several simple examples of the calculation of radiation 
from a moving point charge. 

Let, for instance, the charge move in a uniform magnetic field. For
simplicity we assume that the initial velocity of the charge v0 is perpendicular to
the vector H. The charge moving in the magnetic field has the acceleration

$$\mathbf{w} = \frac{\mathbf{F}}{m} = \frac{e}{mc}\,[\mathbf{v}\times\mathbf{H}]$$

and, correspondingly, continually emits electromagnetic waves. The total
intensity of the radiation is equal to

$$I = \frac{2}{3}\,\frac{e^4}{m^2 c^5}\,[\mathbf{v}\times\mathbf{H}]^2. \tag{28.6}$$

If the energy loss is assumed to be small, then the velocity of the charge can
be considered as approximately constant: v ≈ v0. In this case
|v × H| ≈ |v0 × H| = v0 H. Hence

$$I \approx \frac{2}{3}\,\frac{e^4 v_0^2 H^2}{m^2 c^5} = \frac{4}{3}\,\frac{e^4 H^2}{m^3 c^5}\cdot\frac{m v_0^2}{2}.$$
The energy emitted is inversely proportional to c⁵ and is very small. However,
it increases with charged particle energy and the effect of emission becomes
substantial at very high particle energies, for example, for cosmic-ray particles
in the Earth’s magnetic field, or for fast electrons moving in the magnetic
fields of contemporary betatrons. Calculations have shown that it is the
energy losses due to radiation in a magnetic field which are the main losses
limiting the energies attainable by particles in a betatron. However, it should be
borne in mind that formula (28.6) is applicable only at velocities v ≪ c. The
case v ~ c is discussed in § 26 of Part II.
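
As a rough numerical illustration of formula (28.6), the sketch below evaluates the radiated power of a slow charge in a uniform magnetic field in Gaussian units and compares it with the general expression (28.4) for the acceleration w = evH/mc. The constants are rounded electron values and the function names are illustrative assumptions; the formulas hold only for v ≪ c.

```python
E_CHARGE = 4.80320e-10   # electron charge, esu
M_E = 9.10938e-28        # electron mass, g
C_LIGHT = 2.99792458e10  # speed of light, cm/s


def larmor_power(e, w, c=C_LIGHT):
    """Total power (2/3) e^2 w^2 / c^3 radiated by a charge with acceleration w."""
    return 2.0 / 3.0 * e ** 2 * w ** 2 / c ** 3


def power_in_magnetic_field(e, m, v, H, c=C_LIGHT):
    """Power radiated by a charge with velocity v perpendicular to a field H."""
    return 2.0 / 3.0 * e ** 4 * v ** 2 * H ** 2 / (m ** 2 * c ** 5)


def power_from_kinetic_energy(e, m, kinetic_energy, H, c=C_LIGHT):
    """Same power expressed through the kinetic energy m v^2 / 2."""
    return 4.0 / 3.0 * e ** 4 * H ** 2 / (m ** 3 * c ** 5) * kinetic_energy
```

Written through the kinetic energy, the power is seen to grow linearly with the energy of the particle at fixed field strength.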

As another example, let us consider the radiation by a charge oscillating
according to the harmonic law:

$$\mathbf{r} = \mathbf{r}_0 \cos(\omega_0 t + \alpha). \tag{28.7}$$

The acceleration of the charge is equal to

$$\ddot{\mathbf{r}} = \mathbf{w} = -\omega_0^2\,\mathbf{r}_0 \cos(\omega_0 t + \alpha), \tag{28.8}$$

so that the corresponding dipole moment is expressed by d = d0 cos(ω0 t + α),
where d0 = e r0. Hence the radiation power emitted in the solid angle dΩ is
equal to

$$dI = \frac{e^2 \omega_0^4}{4\pi c^3}\,[\mathbf{n}\times\mathbf{r}_0]^2 \cos^2(\omega_0 t + \alpha)\,d\Omega.$$





The mean (over one period T0 = 2π/ω0) intensity of radiation in the angle dΩ is

$$\overline{dI} = \frac{1}{T_0}\int_0^{T_0} dI\,dt = \frac{e^2 \omega_0^4}{8\pi c^3}\,[\mathbf{n}\times\mathbf{r}_0]^2\,d\Omega. \tag{28.9}$$

Formula (28.9) determines, in particular, the angular distribution of the
radiation emitted. The total intensity of the radiation emitted by the
oscillator is, according to (27.9), given by the formula

$$\bar{I} = \int \overline{dI} = \frac{e^2 \omega_0^4 r_0^2}{3c^3}. \tag{28.10}$$

The oscillator emits electromagnetic waves whose frequency is the same as its
natural frequency ω0. The intensity of radiation is proportional to the square
of the amplitude of motion and the fourth power of the frequency. The
particular importance of this example is in the following.

At the beginning of the development of electron theory the well-known 
Thomson atomic model was proposed. It was assumed that the electron was 
located in the centre of a sphere formed by continuously distributed positive 
charge. The radiation by an atom in Thomson’s model was associated with 
small oscillations of the electron about its equilibrium position in the centre 
of the atom. Thus, the atom as a radiating system was reduced to an emitting 
oscillator, and formula (28.10) gave the intensity of atomic radiation. In 1911 
Rutherford’s experiments showed Thomson’s model to be incorrect, and it 
was abandoned. However, it turned out that the harmonic oscillator as a 
model of a radiating atomic system led in a number of cases to quite correct 
results which found experimental confirmation. The most important of these
is the existence of definite frequencies ω0 of radiation, characteristic of a
given atom. Hence the oscillator remained in classical physics as a model of
a radiating atomic system, although it could not be understood within the
framework of classical theory why such a model, so far from reality, could 
give correctly the most important properties of atomic emitters. The situation 
was elucidated by the appearance of the quantum theory of radiation. In 
quantum mechanics we shall see that the quantum theory of radiation leads, 
in a number of cases, to relations which are formally the same as the expres- 
sions obtained for the classical model of the emitter. This results, roughly 
speaking, from the following. A number of properties of atomic emitters is 
determined not by some concrete law of the motion of the emitting particles, 
but simply by the periodicity of the process. On the other hand, the circular
periodic motion of an electron with a constant angular velocity corresponds
to the oscillation of a plane oscillator

$$x = a\cos(\omega_0 t + \alpha), \qquad y = a\sin(\omega_0 t + \alpha).$$

Hence the model of the oscillator oscillating with the frequency ω0 yields
some characteristic features of the atomic emitter. We shall take into account 
this fact and, later on, discuss in detail the properties of the oscillator as a 
classical model of an atomic radiating system. On the other hand, we shall see 
in a number of examples that classical electrodynamics is not directly appli- 
cable to intraatomic processes and that it leads to relations which are quanti- 
tatively, and even qualitatively, in contradiction with experimental data. 

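Let us now consider a system consisting of two particles with charges e1
and e2 and masses m1 and m2. For such a system

$$\ddot{\mathbf{d}} = e_1\ddot{\mathbf{r}}_1 + e_2\ddot{\mathbf{r}}_2 = e_1\mathbf{w}_1 + e_2\mathbf{w}_2.$$

If the system of two particles is closed, the accelerations can be written in the
form

$$\mathbf{w}_1 = \frac{\mathbf{F}}{m_1}, \qquad \mathbf{w}_2 = -\frac{\mathbf{F}}{m_2},$$

where F is the force of interaction between the particles. Hence

$$\ddot{\mathbf{d}} = \left(\frac{e_1}{m_1} - \frac{e_2}{m_2}\right)\mathbf{F}$$

and the power of radiation in the angle dΩ is

$$dI = \frac{1}{4\pi c^3}\left(\frac{e_1}{m_1} - \frac{e_2}{m_2}\right)^2 [\mathbf{F}\times\mathbf{n}]^2\,d\Omega. \tag{28.11}$$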







The most important consequence of formula (28.11) is the statement that 
a closed system consisting of identical particles, or a closed system of differ- 
ent particles which have the same ratio e/m, cannot emit radiation in the 
dipole approximation. In order to find the radiation of such systems it is 
necessary to take into account effects of higher order (see §32). 
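
The vanishing of the dipole radiation for equal ratios e/m can be seen directly by evaluating the angular distribution (28.11). The following is a minimal sketch in arbitrary consistent units; the function name and argument layout are illustrative assumptions.

```python
import math


def cross(a, b):
    """Vector product of two 3-component tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dipole_power_per_solid_angle(e1, m1, e2, m2, force, n, c=2.99792458e10):
    """dI/dΩ for a closed two-particle system.

    force : interaction force acting on particle 1 (3-tuple)
    n     : unit vector towards the point of observation (3-tuple)
    """
    k = e1 / m1 - e2 / m2                     # vanishes for equal e/m
    d_ddot = tuple(k * f for f in force)      # second derivative of the dipole moment
    d_cross_n = cross(d_ddot, n)
    return sum(x * x for x in d_cross_n) / (4.0 * math.pi * c ** 3)
```

For two identical particles, or for any pair with e1/m1 = e2/m2, the factor k is zero and the function returns zero for every direction n.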


§ 29. Radiation reaction 


We have seen in preceding paragraphs that a single charge, moving with an 
acceleration, loses energy by radiation. In making the energy balance of a 
particle moving under the action of external forces it is necessary to take into 
account radiation losses. 

We have also seen that the radiation field possesses not only energy but 
also momentum. Owing to this the emission of radiation is accompanied by a 
reaction force of the field on the particle. This action of the radiation field on 
the motion of the particle is called radiation reaction. 

The balance of forces taking into account the radiation effect must be
written in the form

$$m\mathbf{w} = \mathbf{F} + \mathbf{F}_s = \int \rho\left(\mathbf{E} + \frac{[\mathbf{v}\times\mathbf{H}]}{c}\right) dV + \mathbf{F}_s.$$

The first term represents the external force acting on the particle, and F_s
represents the radiation reaction force, also called the Lorentz deceleration
force.

To calculate the radiation reaction force F_s we can, in principle, proceed
in the following way. We assume that the radiating charge is distributed in
space. Dividing it into elements δe and δe' we can calculate the action of the
field emitted by an element δe' on an element δe. Summing then over all
elements δe' and δe we find the total self-force sought. The calculation
described appears to be rather cumbersome *. Moreover, it can be carried out
only in the case of a model whose validity is, from the standpoint of
contemporary quantum theory, rather weak. Therefore we shall concentrate on a
different line of reasoning leading to the same expression for the force F_s.

We assume that the Lorentz deceleration force F_s is small in comparison
with external forces. The meaning of such an assumption will be discussed in
detail below, when we find F_s. If this assumption is fulfilled, then in the first

* W.Heitler, Quantum theory of radiation (Clarendon Press, Oxford, 1954). 




approximation the charge performs its motion under the action of external 
field forces F. During this motion it radiates energy as given by formula 
(27.9). 

In addition, we assume that the charge performs a periodic motion or,
more generally, that at a certain time t1 it comes back to the initial state of
motion which it was in at the initial time t0. We balance the energy for a
system consisting of a charge performing such a motion and an external
electromagnetic field. It is obvious that, if the charge emitted nothing, then upon
its coming back into the initial state the total work

$$W = \int_{t_0}^{t_1} \mathbf{F}\cdot\mathbf{v}\,dt$$

done on it by the external field would be equal to zero. Also the change
ΔE_ext. field in the energy of the external field would be equal to zero.
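If in the next approximation we take into account that the total force
acting on the charge is made up of the forces F and F_s, then the energy
balance can be written in the form

$$\int_{t_0}^{t_1} (\mathbf{F} + \mathbf{F}_s)\cdot\mathbf{v}\,dt = -\Delta E - \Delta E_{\text{ext. field}},$$

where ΔE is the energy emitted by the charge during the time (t1 − t0).
Taking into account that

$$-\Delta E_{\text{ext. field}} = \int_{t_0}^{t_1} \mathbf{F}\cdot\mathbf{v}\,dt = 0,$$

it can be written that

$$\int_{t_0}^{t_1} \mathbf{F}_s\cdot\mathbf{v}\,dt = -\Delta E = -\frac{2e^2}{3c^3}\int_{t_0}^{t_1} w^2\,dt. \tag{29.1}$$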


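Integrating the right-hand side by parts, we obtain

$$\int_{t_0}^{t_1} \mathbf{F}_s\cdot\mathbf{v}\,dt = -\frac{2e^2}{3c^3}\Big[\mathbf{w}\cdot\mathbf{v}\Big]_{t_0}^{t_1} + \frac{2e^2}{3c^3}\int_{t_0}^{t_1} \dot{\mathbf{w}}\cdot\mathbf{v}\,dt.$$

Since at time t1 the state of motion is the same as that at time t0, so that
v(t0) = v(t1), w(t0) = w(t1), we have

$$\mathbf{F}_s = \frac{2e^2}{3c^3}\,\ddot{\mathbf{v}} = \frac{2e^2}{3c^3}\,\dot{\mathbf{w}}. \tag{29.2}$$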


The radiation reaction force turns out to be dependent on the derivative
of the acceleration of the particle. According to the assumption made that
the radiation reaction force F_s is small in comparison with the external force
acting on the particle, the w in (29.1) should be understood to be the
acceleration of the particle in the external force field. If it were not so, i.e. if, for
example, the reverse inequality F_s ≫ F were fulfilled, then the equation of
motion would have the form

$$m\mathbf{w} = \mathbf{F}_s = \frac{2e^2}{3c^3}\,\dot{\mathbf{w}}.$$

Its solution is

$$\mathbf{w} = \mathbf{w}_0 \exp\left(\frac{3mc^3}{2e^2}\,t\right).$$

This last formula shows that under the action of the radiation reaction alone
the acceleration increases exponentially in time: the particle is
self-accelerated. Such a self-acceleration contradicts the laws of classical
mechanics as well as all experimental data.

Thus, the assumption that |F_s| ≫ |F| leads to a physically senseless
result. On the contrary, for |F_s| ≪ |F| the quantity w can, with a
sufficient degree of accuracy, be considered as equal to the acceleration acquired
by the particle in the external field under the action of the Lorentz force F.
Assuming the latter to be a periodic function of time with a frequency ω we
can, obviously, write the following expression for the amplitude of F_s:

$$|\mathbf{F}_s| = \frac{2e^2}{3c^3}\,|\dot{\mathbf{w}}| = \frac{2e^2\omega}{3c^3}\,|\mathbf{w}| = \frac{2e^2\omega}{3mc^3}\,|\mathbf{F}|. \tag{29.3}$$

Hence the condition of applicability of expression (29.3) for the radiation
reaction, |F_s| ≪ |F|, assumes the following form:

$$\frac{e^2\omega}{mc^3} \ll 1. \tag{29.4}$$


Violation of the inequality (29.4) means that the radiation reaction is not
small, and the latter leads to a physically incorrect result. Thus, the
inequality (29.4) is of fundamental importance for the applicability of radiation
theory and the laws of classical field theory in general. The classical field
theory only leads to reasonable results which agree with experimental data
for the frequencies

$$\omega \ll \frac{mc^3}{e^2}. \tag{29.5}$$


If m is understood to be the mass of the elementary charge, an electron,
then the condition (29.5) turns out to be fulfilled at all optical and X-ray
frequencies and even for not too high-energy γ-rays. However, as will be
explained in more detail in §17 of Part II, in the case of hard γ-rays the laws
of classical electrodynamics turn out to be inapplicable and quantum effects
play the basic role.

It is of interest to rewrite the inequality (29.5), introducing the
wavelength λ = 2πc/ω into it. Then instead of (29.5) we have

$$\lambda \gg \frac{e^2}{mc^2} = r_0. \tag{29.6}$$

For reasons which will be explained in §13 of Part II, the quantity
r0 ≈ 2.8 × 10⁻¹³ cm is called the classical radius of the electron.
The inequality (29.6) can be interpreted as follows. For electromagnetic
phenomena on the scale ~ r0 the radiation reaction is not small in comparison
with other forces. In this case the relations of classical field theory turn out
to be inapplicable. Thus, in classical field theory there is an intrinsic limit of
applicability; it is suitable for the consideration of phenomena taking place in
a region of space greater than the order of the classical radius of the electron.
In what follows we shall see that the actual region of applicability of classical
field theory does not extend to such small dimensions. It turns out that
quantum effects, which impose a limit upon the applicability of classical concepts
to microparticles, begin to play a role at distances ~ Λ = ħ/mc, where ħ is the
Planck constant equal to 1.05 × 10⁻²⁷ erg·sec. The quantity Λ, equal to about
3.9 × 10⁻¹¹ cm for the electron, is called the Compton wavelength. We shall
encounter this quantity in §17 of Part II, where the so-called Compton effect
will be considered.
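
The orders of magnitude quoted above are easily reproduced. The sketch below evaluates e²/mc², the limiting frequency mc³/e² and the length ħ/mc for the electron in Gaussian units; the constants are rounded reference values and the function names are illustrative assumptions.

```python
E_CHARGE = 4.80320e-10   # electron charge, esu
M_E = 9.10938e-28        # electron mass, g
C_LIGHT = 2.99792458e10  # speed of light, cm/s
HBAR = 1.05457e-27       # Planck constant divided by 2π, erg·sec


def classical_radius(e=E_CHARGE, m=M_E, c=C_LIGHT):
    """r0 = e^2 / (m c^2), in cm."""
    return e ** 2 / (m * c ** 2)


def limiting_frequency(e=E_CHARGE, m=M_E, c=C_LIGHT):
    """Frequency m c^3 / e^2 entering the condition (29.5), in 1/sec."""
    return m * c ** 3 / e ** 2


def compton_length(hbar=HBAR, m=M_E, c=C_LIGHT):
    """Length hbar / (m c), in cm."""
    return hbar / (m * c)
```

The ratio of the two lengths is e²/ħc, about 1/137, so the quantum limit is reached at distances roughly a hundred times larger than the classical radius.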







In quantum mechanics we shall discuss in detail the question of the range 
of applicability of the relations of classical theory, and give the proof of the 
statement presented. 


§30. Line width of emitted radiation 


The radiation reaction has a fundamental effect on the properties of the
field emitted. We shall show that an emitter which in the absence of the
reaction force would emit monochromatic electromagnetic waves with a
frequency ω0 in reality emits a range of frequencies in the region of ω0.

In other words, because of the damping effect the monochromatic
radiation becomes radiation with a continuous spectrum of waves of all possible
frequencies.

We shall present the proof of this statement for the simplest model of an
emitting system, the linear harmonic oscillator. Let a charged particle
move under the action of a quasi-elastic force (−kx) along the x-axis, so that
its equation of motion has the form

$$m\ddot{x} = -kx + F_s,$$

or

$$\ddot{x} = -\omega_0^2 x + \frac{2e^2}{3mc^3}\,\dot{w},$$

where ω0 = √(k/m) is the frequency of the oscillator in the absence of the
force F_s.
Considering F_s to be small in comparison with the quasi-elastic force
(−kx), the acceleration can be assumed to be equal to the acceleration of the
harmonic oscillator without radiation reaction, i.e.

$$w = -\omega_0^2 x, \qquad \dot{w} = -\omega_0^2 \dot{x},$$

and it can be written that

$$\ddot{x} + \gamma\dot{x} + \omega_0^2 x = 0, \tag{30.1}$$

where

$$\gamma = \frac{2}{3}\,\frac{e^2\omega_0^2}{mc^3}. \tag{30.2}$$

126 RADIATION THEORY Ch. 5 


The expression

$$x = x_0\, e^{-\frac{1}{2}\gamma t}\, e^{i\omega_0 t} \qquad (30.3)$$

serves as the solution of eq. (30.1) for $\gamma \ll \omega_0$. In this case it is to be
understood that in (30.3) and subsequent formulae containing complex expressions
one has to take the real part. The solution (30.3) corresponds to the initial
conditions

$$x(0) = x_0, \qquad \dot{x}(0) = 0.$$


Formula (30.3) shows that, because of the radiation reaction determined
by the quantity $\gamma$, the oscillations are damped. The damping coefficient $\frac{1}{2}\gamma$
is analogous to the damping coefficient of the mechanical oscillator in the
presence of a frictional force. This justifies the second name of the force $F_r$,
the Lorentz friction force.
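
To give a feeling for the magnitudes involved, the following minimal sketch
evaluates $\gamma$ from (30.2) for an electron emitting at an optical frequency and
checks numerically that the damped oscillation (30.3) satisfies (30.1) up to
terms of order $\gamma^2$. The numerical constants (rounded Gaussian values), the
choice of a 500 nm line, and the function names are assumptions of this
illustration and are not taken from the text.

```python
import math

# Gaussian (CGS) constants, rounded; assumed values for this illustration.
E_CHARGE = 4.803e-10    # electron charge, esu
M_ELECTRON = 9.109e-28  # electron mass, g
C_LIGHT = 2.998e10      # speed of light, cm/s


def damping_constant(omega0, e=E_CHARGE, m=M_ELECTRON, c=C_LIGHT):
    """gamma of eq. (30.2): 2 e^2 omega0^2 / (3 m c^3)."""
    return 2.0 * e**2 * omega0**2 / (3.0 * m * c**3)


def damped_x(t, x0, omega0, gamma):
    """Real part of the solution (30.3)."""
    return x0 * math.exp(-0.5 * gamma * t) * math.cos(omega0 * t)


def residual_30_1(t, x0, omega0, gamma, h=1e-3):
    """Central-difference value of x'' + gamma*x' + omega0^2*x at time t."""
    xm = damped_x(t - h, x0, omega0, gamma)
    x = damped_x(t, x0, omega0, gamma)
    xp = damped_x(t + h, x0, omega0, gamma)
    second = (xp - 2.0 * x + xm) / h**2
    first = (xp - xm) / (2.0 * h)
    return second + gamma * first + omega0**2 * x


# Hypothetical example: a spectral line with wavelength 500 nm = 5e-5 cm.
omega_opt = 2.0 * math.pi * C_LIGHT / 5.0e-5
gamma_opt = damping_constant(omega_opt)
print(f"gamma = {gamma_opt:.3e} 1/s, gamma/omega0 = {gamma_opt / omega_opt:.2e}")

# Dimensionless check of (30.1) with omega0 = 1 and gamma = 0.01.
print("residual of (30.1):", residual_30_1(1.0, 1.0, 1.0, 0.01))
```

The residual is not exactly zero because (30.3) is a solution only in the
approximation $\gamma \ll \omega_0$; it is of the order of $\frac{1}{4}\gamma^2 x$.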

To find the radiation of a damped oscillator we write its acceleration in
the form

$$w = \ddot{x} = A\, e^{-\frac{1}{2}\gamma t}\, e^{i\omega_0 t},$$

where $A$ is a constant (in the same approximation $\gamma \ll \omega_0$, $A \approx -x_0\omega_0^2$).

The acceleration of a damped oscillator is not a periodic function of time. 
Hence electromagnetic waves emitted by such an oscillator have no definite 
frequency. On the contrary, all frequencies $0 < \omega < \infty$ will be present in the
radiation. This means that a damped oscillator emits a continuous spectrum
of frequencies.

In what follows we shall be interested in the energy distribution in this
spectrum, i.e. in the fraction of the total energy emitted by the oscillator per
frequency interval $\omega$, $\omega + d\omega$. This function $I(\omega)$, called the Lorentz spectral
distribution function, is related to the total energy $I_0$ emitted by the
oscillator:

$$I_0 = \int_0^\infty I(\omega)\, d\omega. \qquad (30.4)$$

The total energy emitted by the oscillator is

$$I_0 = \int_0^\infty I\, dt = \frac{2e^2}{3c^3}\int_0^\infty w^2\, dt = \frac{2e^2}{3c^3}\int_{-\infty}^{\infty} w^2\, dt. \qquad (30.5)$$







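Here we extend the integration range to the region of negative times, since at
$t < 0$ the oscillator was at rest and the integrand is identically equal to zero.
We expand the acceleration in a Fourier integral:

$$w(t) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} W(\omega)\, e^{i\omega t}\, d\omega,$$

where the Fourier component $W(\omega)$ is equal to

$$W(\omega) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} w(t)\, e^{-i\omega t}\, dt = \frac{1}{\sqrt{2\pi}}\int_0^{\infty} w(t)\, e^{-i\omega t}\, dt = \frac{A}{\sqrt{2\pi}\,\left[\frac{1}{2}\gamma - i(\omega_0 - \omega)\right]}.$$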


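According to formula (II.9) we have

$$\int_{-\infty}^{\infty} w^2\, dt = \int_{-\infty}^{\infty} |W(\omega)|^2\, d\omega = \frac{A^2}{2\pi}\int_{-\infty}^{\infty}\frac{d\omega}{(\omega_0-\omega)^2 + \frac{1}{4}\gamma^2} = \frac{A^2}{\gamma}. \qquad (30.6)$$

Substituting formula (30.6) into (30.5), we find

$$I_0 = \frac{e^2}{3\pi c^3}\, A^2 \int_{-\infty}^{\infty}\frac{d\omega}{(\omega_0-\omega)^2+\frac{1}{4}\gamma^2} = \frac{2e^2}{3c^3}\,\frac{A^2}{\gamma}, \qquad (30.7)$$

whence

$$A^2 = \frac{3c^3\gamma}{2e^2}\, I_0. \qquad (30.8)$$

On the other hand, comparing (30.7) and (30.4) and taking into account
that the spectral distribution is determined only for essentially positive values
of the frequencies, we find

$$I(\omega) = \frac{I_0}{2\pi}\,\frac{\gamma}{(\omega-\omega_0)^2+\frac{1}{4}\gamma^2}. \qquad (30.9)$$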






Fig. 1.11 


The spectral distribution functions for different values of $\omega_0/2\gamma$ are shown
in fig. 1.11. They have a sharp maximum for $\omega \approx \omega_0$, i.e. for the frequency
which would be emitted by the oscillator in the absence of damping,

$$I(\omega_0) = \frac{2I_0}{\pi\gamma}.$$

For $\omega = \omega_0 \pm \frac{1}{2}\gamma$ the emitted intensity is equal to

$$I(\omega_0 \pm \tfrac{1}{2}\gamma) = \frac{I_0}{\pi\gamma} = \tfrac{1}{2}\, I(\omega_0),$$

i.e. is lower by a factor of two than the intensity in the maximum. For this
reason the quantity $\frac{1}{2}\gamma$ is called the half-width of the emitted line. According
to (30.2), there corresponds to the line half-width $\frac{1}{2}\gamma$ the wavelength interval


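
$$|\Delta\lambda| = 2\pi\left|\Delta\frac{c}{\omega}\right| = \frac{2\pi c\,\Delta\omega}{\omega_0^2} = \frac{2\pi c}{\omega_0^2}\,\frac{\gamma}{2} = \frac{\pi c}{\omega_0^2}\cdot\frac{2}{3}\frac{e^2\omega_0^2}{c^3 m} = \frac{2\pi}{3}\frac{e^2}{mc^2},$$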


which does not depend on the wavelength and whose order of magnitude is 
equal to that of the classical radius of the electron. 
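
The properties of the Lorentz distribution (30.9) quoted above can be checked
numerically. The sketch below, under the assumption of arbitrary test values
$\omega_0 = 100$, $\gamma = 1$ and $I_0 = 1$, verifies the normalization (30.4), the peak
value $2I_0/\pi\gamma$ and the half-maximum at $\omega_0 + \frac{1}{2}\gamma$, and evaluates the
wavelength interval $\frac{2\pi}{3}e^2/mc^2$ for the electron. The function names and the
rounded Gaussian constants are assumptions of this illustration.

```python
import math


def lorentz_profile(omega, omega0, gamma, total=1.0):
    """Spectral distribution (30.9) with total emitted energy `total`."""
    return total * gamma / (2.0 * math.pi) / ((omega - omega0) ** 2 + 0.25 * gamma ** 2)


def midpoint_integral(f, a, b, n=200_000):
    """Composite midpoint rule for the integral of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))


def wavelength_half_width(e=4.803e-10, m=9.109e-28, c=2.998e10):
    """Wavelength interval (2*pi/3) e^2 / (m c^2) in cm (assumed CGS constants)."""
    return 2.0 * math.pi * e**2 / (3.0 * m * c**2)


omega0, gamma = 100.0, 1.0  # arbitrary test values with gamma << omega0
peak = lorentz_profile(omega0, omega0, gamma)
half = lorentz_profile(omega0 + 0.5 * gamma, omega0, gamma)
# The upper limit 2000 stands in for infinity; the far tails carry little energy.
area = midpoint_integral(lambda w: lorentz_profile(w, omega0, gamma), 0.0, 2000.0)

print("peak * pi * gamma / 2 =", peak * math.pi * gamma / 2.0)
print("I(omega0 + gamma/2) / I(omega0) =", half / peak)
print("integral over positive frequencies =", area)
print("wavelength half-width, cm =", wavelength_half_width())
```

The integral over positive frequencies falls slightly short of unity because
the tails of the distribution are cut off; this is the approximation already
made in passing from (30.7) to (30.9).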


§31. Quadrupole and magnetic dipole radiation 


We have already seen above that in certain cases a system of charges can- 
not emit radiation in the dipole approximation. This does not mean, of 
course, that such a system cannot emit radiation at all. 

If there is no radiation in the dipole approximation, one has to look for 
higher terms of the expansion in powers of the proper retardation in the sys- 
tem, which will determine a radiation of higher order — quadrupole radiation, 
octupole radiation and so on. We shall restrict ourselves to finding the radia- 
tion of the next order beyond the dipole approximation. 

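We write the vector potential of the emitting system, at a large distance
from the system, in a form analogous to (26.5):

$$\mathbf{A}(\mathbf{r},t) = \frac{1}{cr}\int\left\{\mathbf{j}(\mathbf{r}',\tau_0) + \frac{\partial\mathbf{j}}{\partial\tau_0}\,\frac{(\mathbf{n}\cdot\mathbf{r}')}{c}\right\}dV' = \frac{1}{cr}\int\mathbf{j}(\mathbf{r}',\tau_0)\,dV' + \frac{1}{c^2 r}\int\frac{\partial\mathbf{j}}{\partial\tau_0}(\mathbf{n}\cdot\mathbf{r}')\,dV' = \mathbf{A}_1+\mathbf{A}_2. \qquad (31.1)$$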


According to (26.9), the first term of (31.1) describes the dipole radiation. 
Hence we shall be interested only in the second term: 


$$\mathbf{A}_2 = \frac{1}{c^2 r}\int\frac{\partial\mathbf{j}}{\partial\tau_0}(\mathbf{n}\cdot\mathbf{r}')\,dV' = \frac{1}{c^2 r}\frac{\partial}{\partial\tau_0}\int\mathbf{j}\,(\mathbf{n}\cdot\mathbf{r}')\,dV'. \qquad (31.2)$$

Here we have made use of the fact that the constant vector n and the
integration variable r' do not depend on time, and we have changed the order of the
differentiation and integration. The value of the integral is taken at time $\tau_0$.
We transform the integrand, symmetrizing it by means of formula (1.6).


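We then obtain

$$\mathbf{A}_2 = \frac{1}{2c^2 r}\frac{\partial}{\partial\tau_0}\int(\mathbf{r}'\times\mathbf{j})\times\mathbf{n}\,dV' + \frac{1}{2c^2 r}\frac{\partial}{\partial\tau_0}\int\{\mathbf{j}(\mathbf{r}'\cdot\mathbf{n}) + \mathbf{r}'(\mathbf{j}\cdot\mathbf{n})\}\,dV'. \qquad (31.3)$$

From the definition of the magnetic moment (22.1) it follows that the first
integral can be written in the form

$$\frac{1}{2c}\int(\mathbf{r}'\times\mathbf{j})\times\mathbf{n}\,dV' = \mathbf{M}\times\mathbf{n}.$$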


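In order to find the meaning of the second integral, it is convenient to go over
from integration to summation *, projecting the vector expression onto a
particular axis $\alpha$ in the Cartesian coordinate frame:

$$\int\{\mathbf{j}(\mathbf{r}'\cdot\mathbf{n}) + \mathbf{r}'(\mathbf{j}\cdot\mathbf{n})\}_\alpha\,dV' = \sum_i e_i\{\mathbf{v}_i(\mathbf{r}_i\cdot\mathbf{n}) + \mathbf{r}_i(\mathbf{v}_i\cdot\mathbf{n})\}_\alpha = \frac{\partial}{\partial\tau_0}\sum_i e_i\{\mathbf{r}_i(\mathbf{r}_i\cdot\mathbf{n})\}_\alpha = \frac{1}{3}\frac{\partial}{\partial\tau_0}\{D_{\alpha\beta}n_\beta\},$$

where $D_{\alpha\beta}$ is the quadrupole moment of the system, determined by formula
(16.3). The summation is carried out with respect to the index $\beta$ ($\beta = x, y, z$).
Going over to the vector expression, we can write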


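* We draw attention to the transition from integration to summation. For a simpler
expression it can be written in more detail:

$$\int\frac{\partial\mathbf{j}}{\partial\tau}\,r'_\beta\,dV' = \frac{\partial}{\partial\tau}\int\mathbf{j}\,r'_\beta\,dV' = \frac{\partial}{\partial\tau}\int r'_\beta\sum_i e_i\mathbf{v}_i\,\delta(\mathbf{r}'-\mathbf{r}_i)\,dV' = \frac{\partial}{\partial\tau}\sum_i e_i\mathbf{v}_i\, r_{i\beta} = \sum_i e_i\left(\frac{d\mathbf{v}_i}{d\tau}\,r_{i\beta} + \mathbf{v}_i\frac{dr_{i\beta}}{d\tau}\right).$$

The integration variable r' is a constant in the differentiation. Only the coordinate of the
ith charge $\mathbf{r}_i$ depends on time. Hence the order of the differentiation with respect to $\tau$
and integration with respect to r' can be changed. However, after going over to the
summation it is necessary to differentiate $\mathbf{r}_i$.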








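$$\int\{\mathbf{j}(\mathbf{r}'\cdot\mathbf{n}) + \mathbf{r}'(\mathbf{j}\cdot\mathbf{n})\}\,dV' = \frac{1}{3}\dot{\mathbf{D}}, \qquad (31.4)$$

where the vector D is defined by

$$D_\alpha = D_{\alpha\beta}\,n_\beta. \qquad (31.5)$$

Then, finally, we find for the vector potential

$$\mathbf{A} = \frac{\dot{\mathbf{d}}}{cr} + \frac{\dot{\mathbf{M}}\times\mathbf{n}}{cr} + \frac{\ddot{\mathbf{D}}}{6c^2 r} = \mathbf{A}_{\rm dip} + \mathbf{A}_{\rm magn.\,dip} + \mathbf{A}_{\rm quadr}, \qquad (31.6)$$

where the dot denotes the differentiation with respect to the argument $\tau_0$ on
which the vectors d, M and D depend.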

In the expression (31.6) the first term describes dipole radiation. The 
second term is determined by the derivative with respect to time of the mag- 
netic moment and, naturally, is called magnetic dipole radiation. The last 
term contains the second derivative of the quadrupole moment. It determines 
the quadrupole radiation. 

We estimate the order of magnitude of the terms of (31.6):

$$\dot{d} \sim eL'\omega, \qquad \dot{M} \sim \frac{e}{c}\,vL'\omega, \qquad \ddot{D} \sim eL'^2\omega^2.$$

Hence

$$\frac{|\dot{\mathbf{M}}\times\mathbf{n}|}{|\dot{\mathbf{d}}|} \sim \frac{evL'\omega}{ecL'\omega} = \frac{v}{c}, \qquad \frac{1}{c}\frac{|\ddot{D}_{\alpha\beta}n_\beta|}{|\dot{\mathbf{d}}|} \sim \frac{e\omega^2L'^2}{ec\omega L'} = \frac{\omega L'}{c} \sim \frac{L'}{\lambda},$$

where L' is the characteristic dimension of the system, and $\lambda$ is the
wavelength of the radiation.

Since according to our assumptions $v \ll c$ and $L' \ll \lambda$, the terms
corresponding to magnetic dipole radiation and quadrupole radiation are very
small in comparison with the first term describing dipole radiation. This
means that magnetic dipole radiation and quadrupole radiation are significant
only for systems for which the electric dipole radiation is absent. In general it
cannot be said which of these two terms then gives the main contribution to
the radiation.




We shall calculate separately the intensity of the magnetic dipole radiation 
and the quadrupole radiation. 

According to formulae (27.8) and (27.1), the intensity of the magnetic
dipole radiation is

$$dI = \frac{c}{4\pi}H^2r^2\,d\Omega = \frac{1}{4\pi c}[\dot{\mathbf{A}}\times\mathbf{n}]^2r^2\,d\Omega = \frac{1}{4\pi c^3}[(\ddot{\mathbf{M}}\times\mathbf{n})\times\mathbf{n}]^2\,d\Omega = \frac{\ddot{M}^2}{4\pi c^3}\sin^3\theta\,d\theta\,d\varphi. \qquad (31.7)$$

The intensity of the magnetic dipole radiation has exactly the same form
as the intensity of the electric dipole radiation (27.8), but in (31.7) we have
M instead of d.

The total intensity of the magnetic dipole radiation is obtained by
integrating (31.7) over all angles:

$$I = \frac{2}{3c^3}(\ddot{\mathbf{M}})^2. \qquad (31.8)$$

The intensity of the magnetic dipole radiation is smaller than that of the
electric dipole radiation by the ratio $(v/c)^2$.
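
The step from (31.7) to (31.8) is an integration of $\sin^2\theta$ over the full solid
angle, which gives $8\pi/3$. A minimal numerical sketch of this integration is
given below; the function names and the test values of $\ddot{M}$ and c are
assumptions chosen only for illustration.

```python
import math


def sphere_integral(f, n_theta=400, n_phi=16):
    """Midpoint-rule integral of f(theta, phi) over the unit sphere."""
    d_theta = math.pi / n_theta
    d_phi = 2.0 * math.pi / n_phi
    total = 0.0
    for i in range(n_theta):
        theta = (i + 0.5) * d_theta
        for j in range(n_phi):
            phi = (j + 0.5) * d_phi
            total += f(theta, phi) * math.sin(theta) * d_theta * d_phi
    return total


def magnetic_dipole_power_numeric(m_ddot, c):
    """Angular integral of (31.7): (M''^2 / 4 pi c^3) sin^2(theta) per unit solid angle."""
    return sphere_integral(
        lambda theta, phi: m_ddot**2 * math.sin(theta) ** 2 / (4.0 * math.pi * c**3)
    )


def magnetic_dipole_power_closed(m_ddot, c):
    """Closed form (31.8): 2 M''^2 / (3 c^3)."""
    return 2.0 * m_ddot**2 / (3.0 * c**3)


# Arbitrary test values (not physical constants).
numeric = magnetic_dipole_power_numeric(3.0, 2.0)
closed = magnetic_dipole_power_closed(3.0, 2.0)
print(numeric, closed)
```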

It should be stressed that magnetic dipole radiation is absent for systems 
whose magnetic moment is proportional to the angular momentum. Accord- 
ing to what was said in §22, this holds for systems with a constant value of 
the ratio e/m, and also for systems of two arbitrary particles. Because of this, 
magnetic dipole radiation is absent in collisions between two particles. 

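We now go on to the calculation of the intensity of the quadrupole
radiation. Substituting the value of $\mathbf{A}_{\rm quadr}$ into the general formula (27.6) and
reproducing the calculations of §27, we find

$$dI_{\rm quadr} = \frac{c}{4\pi}H^2r^2\,d\Omega = \frac{1}{4\pi c}[\dot{\mathbf{A}}_{\rm quadr}\times\mathbf{n}]^2r^2\,d\Omega = \frac{1}{4\pi c}\,\frac{1}{36c^4}[\dddot{\mathbf{D}}\times\mathbf{n}]^2\,d\Omega. \qquad (31.9)$$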


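We transform the square of the vector product according to formula (1.5)
taking into account the definition (31.5):

$$[\dddot{\mathbf{D}}\times\mathbf{n}]^2 = (\dddot{\mathbf{D}})^2 - (\mathbf{n}\cdot\dddot{\mathbf{D}})^2 = \dddot{D}_{\alpha\beta}\dddot{D}_{\alpha\gamma}\,n_\beta n_\gamma - \dddot{D}_{\alpha\beta}\dddot{D}_{\gamma\delta}\,n_\alpha n_\beta n_\gamma n_\delta. \qquad (31.10)$$

Summation over repeated indices is implied.

The angular dependence of $dI_{\rm quadr}$ turns out to be very complex. However,
the total radiation per unit time can be calculated relatively simply.
Integrating (31.9), taking into account (31.10), we obtain

$$I_{\rm quadr} = \frac{1}{4\pi c}\,\frac{1}{36c^4}\left[\dddot{D}_{\alpha\beta}\dddot{D}_{\alpha\gamma}\int n_\beta n_\gamma\,d\Omega - \dddot{D}_{\alpha\beta}\dddot{D}_{\gamma\delta}\int n_\alpha n_\beta n_\gamma n_\delta\,d\Omega\right]. \qquad (31.11)$$

The calculation gives:

$$\int n_\alpha n_\beta\,d\Omega = \frac{4}{3}\pi\,\delta_{\alpha\beta}, \qquad (31.12)$$

$$\int n_\alpha n_\beta n_\gamma n_\delta\,d\Omega = \frac{4\pi}{15}\left(\delta_{\alpha\beta}\delta_{\gamma\delta} + \delta_{\alpha\gamma}\delta_{\beta\delta} + \delta_{\alpha\delta}\delta_{\beta\gamma}\right). \qquad (31.13)$$

Hence for $I_{\rm quadr}$ we find

$$I_{\rm quadr} = \frac{1}{36c^5}\left\{\frac{1}{3}(\dddot{D}_{\alpha\beta})^2 - \frac{1}{15}\left[(\dddot{D}_{\alpha\alpha})^2 + 2(\dddot{D}_{\alpha\beta})^2\right]\right\} = \frac{1}{180c^5}(\dddot{D}_{\alpha\beta})^2, \qquad (31.14)$$

since, according to (16.6), the sum of the diagonal elements $D_{\alpha\alpha}$ is always
equal to zero. When the quadrupole moment varies periodically, the
quadrupole radiation turns out to be proportional to $\omega^6$. The factor $1/c^5$ in (31.14)
makes its intensity very small. Nevertheless, there are important examples of
systems for which quadrupole radiation plays the basic role. These are, first
of all, some atomic nuclei emitting no electric dipole radiation (this is due to
the law of conservation of angular momentum; for a justification of this
assertion see Part V, §106).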

A closed system consisting of particles with the same values of e/m could 
serve as another example (see eq. (28.11)). 
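
The angular integration leading from (31.9) to (31.14) can be checked by
direct quadrature. In the sketch below the third time derivative of the
quadrupole tensor is replaced by an arbitrary symmetric traceless test matrix
`T_EXAMPLE`, and c is set equal to unity; both choices, as well as the
function names, are assumptions of this illustration. The vector
$D_\alpha = D_{\alpha\beta}n_\beta$ of (31.5) is formed for each direction n, and $[\mathbf{D}\times\mathbf{n}]^2$
is integrated over the sphere.

```python
import math


def cross(a, b):
    """Vector product of two 3-component tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def quadrupole_power_numeric(T, c=1.0, n_theta=400, n_phi=16):
    """Integrate (31.9) over all directions, with T standing for d^3 D_ab / dt^3."""
    d_theta = math.pi / n_theta
    d_phi = 2.0 * math.pi / n_phi
    total = 0.0
    for i in range(n_theta):
        theta = (i + 0.5) * d_theta
        s, ct = math.sin(theta), math.cos(theta)
        for j in range(n_phi):
            phi = (j + 0.5) * d_phi
            n = (s * math.cos(phi), s * math.sin(phi), ct)
            # Vector D_a = T_ab n_b, as in definition (31.5).
            D = tuple(sum(T[a][b] * n[b] for b in range(3)) for a in range(3))
            Dxn = cross(D, n)
            total += sum(x * x for x in Dxn) * s * d_theta * d_phi
    return total / (4.0 * math.pi * c * 36.0 * c**4)


def quadrupole_power_closed(T, c=1.0):
    """Closed form (31.14): T_ab T_ab / (180 c^5), valid for traceless symmetric T."""
    return sum(T[a][b] ** 2 for a in range(3) for b in range(3)) / (180.0 * c**5)


# Arbitrary symmetric test tensor with zero trace (1 - 2 + 1 = 0).
T_EXAMPLE = ((1.0, 0.5, -0.3),
             (0.5, -2.0, 0.7),
             (-0.3, 0.7, 1.0))

numeric_q = quadrupole_power_numeric(T_EXAMPLE)
closed_q = quadrupole_power_closed(T_EXAMPLE)
print(numeric_q, closed_q)
```

The agreement relies on the vanishing trace: for a tensor with non-zero
trace the term $(\dddot{D}_{\alpha\alpha})^2$ in (31.14) would have to be retained.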


§32*. General case of electromagnetic radiation. The spectral decomposition
of fields. The radiation zone and induction zone. Effect of the proper retardation


We shall now consider a somewhat more general case of radiation emitted
by a system of moving charges.

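Without restricting the general character of the treatment it can be
assumed that the current density in the system and the charge density may be
expanded in Fourier integrals, so that they can be written

$$\mathbf{j}(\mathbf{r}',t) = \int \mathbf{j}(\mathbf{r}',\omega)\,e^{i\omega t}\,d\omega, \qquad (32.1)$$

$$\rho(\mathbf{r}',t) = \int \rho(\mathbf{r}',\omega)\,e^{i\omega t}\,d\omega. \qquad (32.2)$$

In these expansions we make use of the normalization (II.1) and assume
formally that the frequency may be negative as well as positive. In reality the
frequency $\omega$ is a quantity which is essentially positive, so that this
representation should be supplemented by the condition $\mathbf{j}^*(\mathbf{r}',\omega) = \mathbf{j}(\mathbf{r}',-\omega)$. The
quantity $\mathbf{j}(\mathbf{r}',t)$ will then have a real value. Then, expanding the field potential in a
Fourier integral

$$\mathbf{A}(\mathbf{r},t) = \int \mathbf{A}(\mathbf{r},\omega)\,e^{i\omega t}\,d\omega, \qquad (32.3)$$

and substituting the expansions (32.1)-(32.3) into eq. (23.23) for the vector
potential, we find

$$\int \mathbf{A}(\mathbf{r},\omega)\,e^{i\omega t}\,d\omega = \frac{1}{c}\iint \frac{\mathbf{j}(\mathbf{r}',\omega)}{|\mathbf{r}-\mathbf{r}'|}\exp\left[i\omega\left(t - \frac{|\mathbf{r}-\mathbf{r}'|}{c}\right)\right]d\omega\,dV'.$$

Whence, for the Fourier component of the vector potential, we obtain

$$\mathbf{A}(\mathbf{r},\omega) = \frac{1}{c}\int \frac{\mathbf{j}(\mathbf{r}',\omega)}{|\mathbf{r}-\mathbf{r}'|}\exp\left[-i\frac{\omega}{c}|\mathbf{r}-\mathbf{r}'|\right]dV'. \qquad (32.4)$$

Analogously,

$$\varphi(\mathbf{r},\omega) = \int \frac{\rho(\mathbf{r}',\omega)}{|\mathbf{r}-\mathbf{r}'|}\exp\left[-i\frac{\omega}{c}|\mathbf{r}-\mathbf{r}'|\right]dV'. \qquad (32.5)$$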
C; 


If the current density and charge density are given as functions of coordinates,
then from formulae (32.4) and (32.5) one can find the Fourier components
of the field potentials accurately, without disregarding the proper
retardation in the system.

Hence the expressions for the potentials themselves can be found directly
from the formula (II.2) of the Fourier transform.

Thus, it is in principle possible to find the field of an emitting system
without disregarding the proper retardation, and at any distance from the emitting
system. Below we shall present a simple example of such a calculation.

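If we are interested in the field at large distances from the emitting system,
then, assuming

$$|\mathbf{r}-\mathbf{r}'| \approx r - \frac{\mathbf{r}\cdot\mathbf{r}'}{r} = r - (\mathbf{n}\cdot\mathbf{r}'),$$

where $|\mathbf{r}| \gg |\mathbf{r}'|$, and introducing a quantity called the wave vector,

$$\mathbf{k} = \frac{\omega}{c}\,\mathbf{n} = \frac{2\pi}{\lambda}\,\mathbf{n} = k\mathbf{n}, \qquad (32.6)$$

where k is the wave number

$$k = \frac{\omega}{c} = \frac{2\pi}{\lambda} \qquad (32.7)$$

and n is a unit vector, we have

$$\mathbf{A}(\mathbf{r},\omega) = \frac{e^{-ikr}}{cr}\int \mathbf{j}(\mathbf{r}',\omega)\,e^{i\mathbf{k}\cdot\mathbf{r}'}\,dV', \qquad (32.8)$$

$$\varphi(\mathbf{r},\omega) = \frac{e^{-ikr}}{r}\int \rho(\mathbf{r}',\omega)\,e^{i\mathbf{k}\cdot\mathbf{r}'}\,dV', \qquad (32.9)$$

when in the denominator the approximation $|\mathbf{r}-\mathbf{r}'| \approx r$ is made. The
formulae of spectral decomposition are often applied to the determination of
the field of a single charge. In this case one can obtain a useful representation
of $\mathbf{A}(\mathbf{r},\omega)$ by substituting for the Fourier component of the current density
the current density itself according to the formula

$$\mathbf{j}(\mathbf{r}',\omega) = \frac{1}{2\pi}\int_{-\infty}^{+\infty}\mathbf{j}(\mathbf{r}',t)\,e^{-i\omega t}\,dt,$$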


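and assuming

$$\mathbf{j}(\mathbf{r}',t) = e\mathbf{v}\,\delta(\mathbf{r}'-\mathbf{R}_0(t)),$$

where $\mathbf{R}_0(t)$ is the instantaneous position of the charge at time t, and
$\mathbf{v} = d\mathbf{R}_0/dt$. This gives

$$\mathbf{A}(\mathbf{r},\omega) = \frac{e^{-ikr}}{2\pi cr}\int_{-\infty}^{+\infty} e\mathbf{v}(t)\,e^{-i[\omega t - \mathbf{k}\cdot\mathbf{R}_0(t)]}\,dt. \qquad (32.10)$$

The above formula allows one to find the Fourier components of the vector
potential from the trajectory of the particle, i.e. the dependence of $\mathbf{R}_0$ on time.
The quantity r represents the absolute value of r, the distance from the
origin to the point of observation at time t. If the Fourier components of the
vector potential at a large distance from the system of charges are known,
then the Fourier components of the field can be found from formulae (27.1)
and (27.3). Namely, we have

$$\mathbf{H}(\mathbf{r},t) = \int\mathbf{H}(\mathbf{r},\omega)\,e^{i\omega t}\,d\omega = \nabla\times\mathbf{A}(\mathbf{r},t) = \frac{1}{c}\,\dot{\mathbf{A}}\times\mathbf{n} = \frac{1}{c}\int i\omega\,[\mathbf{A}(\mathbf{r},\omega)\times\mathbf{n}]\,e^{i\omega t}\,d\omega.$$

Whence

$$\mathbf{H}(\mathbf{r},\omega) = i\,[\mathbf{A}(\mathbf{r},\omega)\times\mathbf{k}]. \qquad (32.11)$$

Analogously,

$$\mathbf{E}(\mathbf{r},\omega) = \frac{i}{k}\,[(\mathbf{A}(\mathbf{r},\omega)\times\mathbf{k})\times\mathbf{k}]. \qquad (32.12)$$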


In general a radiating system emits all possible frequencies or, as is usually
said, a frequency spectrum $-\infty < \omega < \infty$.

It is often of interest to find the weight of different frequencies in a
spectrum, i.e. the relative fraction of energy for a given frequency (or, more
precisely, for a frequency lying in an interval $\omega$, $\omega + d\omega$). We write the total
energy radiated into a solid angle $d\Omega$ in a time dt in the form

$$\Delta E = \int I\,dt = \frac{c}{4\pi}\int H^2r^2\,d\Omega\,dt = d\Omega\int J_\omega\,d\omega\,dt, \qquad (32.13)$$

where $J_\omega\,d\omega\,d\Omega$ is the power radiated into a solid angle $d\Omega$ in a frequency
interval $d\omega$.

Expanding H in a Fourier integral and making use of the Parseval equality
(II.9), we have

$$\int_{-\infty}^{\infty} |\mathbf{H}(\mathbf{r},t)|^2\,dt = 4\pi\int_0^{\infty} |\mathbf{H}(\mathbf{r},\omega)|^2\,d\omega.$$

Then for the power (energy per unit time) of the radiation in a solid angle $d\Omega$
in a frequency interval $d\omega$ we obtain the expression

$$J_\omega\,d\Omega\,d\omega = c\,|\mathbf{H}(\mathbf{r},\omega)|^2r^2\,d\Omega\,d\omega = \frac{1}{c}\left|\mathbf{k}\times\int\mathbf{j}(\mathbf{r}',\omega)\,e^{i\mathbf{k}\cdot\mathbf{r}'}\,dV'\right|^2 d\Omega\,d\omega. \qquad (32.14)$$

The actual spectral distribution of the emitted energy is, naturally,
determined by the law of motion of the charges $\mathbf{j}(\mathbf{r}',t)$.

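The above formulae are simplified if the current density and charge density
vary at a single frequency $\omega$ according to the simple harmonic law:

$$\mathbf{j}(\mathbf{r}',t) = \mathbf{j}_0(\mathbf{r}')\,e^{i\omega t}, \qquad \rho(\mathbf{r}',t) = \rho_0(\mathbf{r}')\,e^{i\omega t}. \qquad (32.15)$$

In this case the system, naturally, emits only one frequency $\omega$, and the vector
potential is given using the general formula (24.16) with dr' = dV'

$$\mathbf{A}(\mathbf{r},t) = \frac{1}{c}\int\frac{\mathbf{j}_0(\mathbf{r}')}{|\mathbf{r}-\mathbf{r}'|}\exp\left\{i\omega\left[t-\frac{|\mathbf{r}-\mathbf{r}'|}{c}\right]\right\}dV' = \mathbf{A}_0(\mathbf{r})\,e^{i\omega t}, \qquad (32.16)$$

where the amplitude of the potential $\mathbf{A}_0(\mathbf{r})$ is, obviously, equal to

$$\mathbf{A}_0(\mathbf{r}) = \frac{1}{c}\int\frac{\mathbf{j}_0(\mathbf{r}')}{|\mathbf{r}-\mathbf{r}'|}\exp\{-ik\,|\mathbf{r}-\mathbf{r}'|\}\,dV'. \qquad (32.17)$$

At a large distance from the emitting system,

$$\mathbf{A}(\mathbf{r},t) = \frac{e^{-i(kr-\omega t)}}{cr}\int\mathbf{j}_0(\mathbf{r}')\,e^{i\mathbf{k}\cdot\mathbf{r}'}\,dV'. \qquad (32.18)$$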


The quantity $\mathbf{k}\cdot\mathbf{r}'$ obviously determines the retardation inside the system
(the proper retardation).

Now consider the case when the proper retardation can be disregarded.
For this it is necessary that the inequality

$$\mathbf{k}\cdot\mathbf{r}' \ll 1$$

or

$$L' \ll \lambda \qquad (32.19)$$

be satisfied. We see that the condition for disregarding the proper retardation
is the smallness of the geometrical dimensions of the emitting system in
comparison with the wavelength of the radiation.

Assuming that the condition (32.19) is fulfilled, we have, by virtue of
(26.11),

$$\mathbf{A}(\mathbf{r},t) = \frac{e^{-i(kr-\omega t)}}{cr}\int\mathbf{j}_0(\mathbf{r}')\,dV' = \frac{\dot{\mathbf{d}}(\tau_0)}{cr}, \qquad (32.20)$$

which is in agreement with (26.13).

Let us also find the scalar potential, assuming, however, that the distance
to the point of observation is not large in comparison with the wavelength. In
order not to repeat the calculations of §26, we write the expression for the
scalar potential making use of the Lorentz condition:

$$\varphi = -c\int\nabla\cdot\mathbf{A}\,dt = -c\,\nabla\cdot\frac{\mathbf{d}(\tau_0)}{cr} = \frac{\mathbf{n}\cdot\mathbf{d}(\tau_0)}{r}\left(ik+\frac{1}{r}\right). \qquad (32.21)$$

We see that two limiting cases can be realized:

$$kr \sim r/\lambda \gg 1,$$

$$kr \sim r/\lambda \ll 1.$$

In the first case, when the distance from the emitter is large in comparison
with its dimensions, we obtain





$$\varphi(\mathbf{r},t) = ik\,\frac{\mathbf{n}\cdot\mathbf{d}(\tau_0)}{r}, \qquad (32.22)$$

which is in agreement with (26.12). It should be recalled that in formula
(32.22) the real part of the corresponding expression is implied, and the
factor i plays no role. Thus, in the range

$$r \gg \lambda \gg L'$$

all the formulae of §26 are valid. This is the region of large distances or the
radiation zone.

In the range of distances

$$r \ll \lambda$$

we can write

$$\varphi(\mathbf{r},t) \approx \frac{\mathbf{n}\cdot\mathbf{d}(\tau_0)}{r^2} = \frac{\mathbf{n}\cdot\mathbf{d}_0\,e^{i\omega\tau_0}}{r^2}. \qquad (32.23)$$


In this region, called the neighbouring or induction zone, the scalar potential
$\varphi$ is the same as the potential of a dipole electrostatic field, whose value
varies according to a harmonic law. The vector potential A in this zone is
small in comparison with the scalar potential by the ratio $r/\lambda$. This means
that in the induction zone the field has, basically, the character of the electric
field. The strength of the electric field varies in the entire induction zone in
phase with the variation of the dipole moment of the system of charges, and
the field does not have a wave character.

The magnetic field is smaller than the electric field by the ratio $|\mathbf{H}|/|\mathbf{E}| \approx r/\lambda$,
in contrast to the radiation zone, where the relation (27.3) is fulfilled. We
see that in emitting long waves ($\lambda \gg L'$) the field of the emitting system has
an induction character in the region $r \ll \lambda$, and a wave character for $r \gg \lambda$.
The angular distribution of the electric field in the two regions is different. In
the induction zone the energy flux averaged over the time, characterized by
the Poynting vector, is equal to zero. Thus, the induction zone gives no
contribution to the radiation.

We also write the expressions for the fields in monochromatic waves at a 
large distance from the emitting system: 





$$\mathbf{H} = \frac{\omega^2}{c^2r}\,\mathbf{n}\times\mathbf{d}_0\,e^{i(\omega t-kr)}, \qquad (32.24)$$

$$\mathbf{E} = \frac{\omega^2}{c^2r}\,(\mathbf{n}\times\mathbf{d}_0)\times\mathbf{n}\,e^{i(\omega t-kr)}, \qquad (32.25)$$

and in the vicinity of the emitting system:

$$\mathbf{E} = \frac{3\mathbf{n}(\mathbf{d}_0\cdot\mathbf{n})-\mathbf{d}_0}{r^3}\,e^{i\omega t}, \qquad \mathbf{H} = 0. \qquad (32.26)$$

In the radiation zone the Poynting vector is determined by formula (27.6).
Hence the emitted power averaged over the time is equal to

$$\bar{I} = \frac{\omega^4d_0^2}{3c^3} = \frac{c\,d_0^2}{3}\left(\frac{2\pi}{\lambda}\right)^4. \qquad (32.27)$$

It turns out to be inversely proportional to the fourth power of the
wavelength and directly proportional to the square of the amplitude of the dipole
moment $d_0$. Finally, we shall discuss an example of the calculation of the
field of the emitter taking into account the retardation in the system. We
consider a system in which the current density is expressed by the formula

$$\mathbf{j}(\mathbf{r}') = \mathbf{k}_z\sin\left(\tfrac{1}{2}kL - k|z|\right)\delta(x)\,\delta(y)\,e^{i\omega t}. \qquad (32.28)$$

Here $\mathbf{k}_z$ is the unit vector along the z-axis.

Formula (32.28) has a simple meaning. In an infinitely thin line of length
L, directed along the z-axis, a current having the character of a standing wave
is produced. At the ends of the line at $z = \pm\frac{1}{2}L$ the current density reduces
to zero. Such a system is called a linear radiator and is often used for emitting
radiowaves.

Substituting (32.28) into (32.18), we obtain for the vector A

$$\mathbf{A}(\mathbf{r},t) = \frac{\mathbf{k}_z}{cr}\int_{-\frac{1}{2}L}^{\frac{1}{2}L}\sin\left(\tfrac{1}{2}kL-k|z|\right)e^{ikz\cos\theta}\,e^{i(\omega t-kr)}\,dz,$$

where $\cos\theta = (\mathbf{n}\cdot\mathbf{k}_z)$. On integrating we obtain

$$\mathbf{A} = \left\{\frac{kL^2}{4}\,\frac{\mathbf{k}_z\,e^{i(\omega t-kr)}}{cr}\right\}F(\theta,k), \qquad (32.29)$$

where the factor $F(\theta,k)$ is equal to

$$F(\theta,k) = \frac{8\left[\cos\left(\tfrac{1}{2}kL\cos\theta\right)-\cos\tfrac{1}{2}kL\right]}{k^2L^2\sin^2\theta}. \qquad (32.30)$$

The factor in the brackets corresponds to the emission of waves whose
wavelength is $\lambda \gg \frac{1}{2}L$ (very long waves) and for which the proper retardation
can be disregarded. It is obtained directly from (32.20) upon substituting
formula (32.28), in which it is assumed that $\frac{1}{2}kL \to 0$.

The factor $F(\theta,k)$ characterizes the proper retardation in the system. It
tends to one as $\frac{1}{2}kL \to 0$. For a finite value of kL it corresponds to a
substantial change in the angular distribution of the radiation.

For $kL = \pi$ (i.e. $L = \frac{1}{2}\lambda$) the angular distribution of the radiation differs
relatively little from that of a simple dipole. For $kL = m\pi$, where m is the
integer number of half-wavelengths within the length of the linear emitter,
there arise several values of $\theta$ at which the radiation has a maximum
value, the most intense maxima being nearest the polar axis.

With increasing m these intense maxima progressively approach the axis of
the emitter. In the limit, as $m \to \infty$, the radiation turns out to be directed
along the axis.
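
The closed form (32.29), (32.30) can be compared with a direct numerical
evaluation of the integral over the current distribution (32.28). The sketch
below assumes unit current amplitude and arbitrary test values of k, L and $\theta$;
the function names are illustrative. Since the current is an even function of
z, only the cosine part of $e^{ikz\cos\theta}$ contributes to the integral.

```python
import math


def pattern_factor(theta, kL):
    """Retardation factor F(theta, k) of eq. (32.30), written in terms of kL."""
    numerator = math.cos(0.5 * kL * math.cos(theta)) - math.cos(0.5 * kL)
    return 8.0 * numerator / (kL**2 * math.sin(theta) ** 2)


def current_integral(theta, k, L, n=20_000):
    """Midpoint value of the integral of sin(kL/2 - k|z|) exp(i k z cos(theta)) dz.

    The integrand's imaginary part is odd in z and integrates to zero,
    so only the cosine part is summed.
    """
    h = L / n
    total = 0.0
    for i in range(n):
        z = -0.5 * L + (i + 0.5) * h
        total += math.sin(0.5 * k * L - k * abs(z)) * math.cos(k * z * math.cos(theta))
    return total * h


# Half-wave radiator: kL = pi (test values k = 1, L = pi, theta = 1 rad).
k_test, L_test, theta_test = 1.0, math.pi, 1.0
direct = current_integral(theta_test, k_test, L_test)
closed_form = 0.25 * k_test * L_test**2 * pattern_factor(theta_test, k_test * L_test)
print(direct, closed_form)

# Long-wave limit: F tends to one as kL -> 0.
print(pattern_factor(theta_test, 1e-3))
```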






















Electromagnetic Field in a Vacuum and Electromagnetic Wave Scattering


§33. The propagation of electromagnetic waves at a large distance from the emitter


In the preceding sections the phenomenon of emission of electromagnetic
waves has been studied. We can now analyse the mechanism of propagation
of electromagnetic waves in a space free of charges (in vacuum).

We have seen in the preceding chapter that radiating systems emit electro- 
magnetic waves for which the equal-phase surfaces are spheres of radius r. 

We now pass over in formulae (32.24) and (32.25) to the limit, assuming
the distance from the system of charges to be so large that one may disregard
the difference in the direction of the vector $\mathbf{r}_0$ and n, where $\mathbf{r}_0$ is the unit
vector directed from the origin to the point of observation, and n is, as
before, the unit vector directed from the emitter to the observation point.

We can then write

$$r = \mathbf{r}_0\cdot\mathbf{r} \approx \mathbf{n}\cdot\mathbf{r}$$

and make this substitution in the phase factor, after which formulae (32.24)
and (32.25) will take the form

$$\mathbf{H} = \frac{\omega^2}{c^2r}\,[\mathbf{n}\times\mathbf{d}_0]\exp\left\{i\omega\left(t-\frac{\mathbf{n}\cdot\mathbf{r}}{c}\right)\right\}, \qquad (33.1)$$

$$\mathbf{E} = \frac{\omega^2}{c^2r}\,[(\mathbf{n}\times\mathbf{d}_0)\times\mathbf{n}]\exp\left\{i\omega\left(t-\frac{\mathbf{n}\cdot\mathbf{r}}{c}\right)\right\}. \qquad (33.2)$$

In formulae (33.1) and (33.2) the distance from the point of observation to
the origin enters in the amplitude in the form of the factor 1/r. However, it is
clear that at an infinitely large distance from the origin the variation of the
function 1/r can be assumed to be very slow and we can set 1/r ≈ const. In
this case the variation of the fields with distance is determined solely by the
phase factor. Then (33.1) and (33.2) can be written in the form

$$\mathbf{E} = \mathbf{E}_0\exp\left\{i\omega\left(t-\frac{\mathbf{n}\cdot\mathbf{r}}{c}\right)\right\}, \qquad (33.3)$$

$$\mathbf{H} = \mathbf{H}_0\exp\left\{i\omega\left(t-\frac{\mathbf{n}\cdot\mathbf{r}}{c}\right)\right\}, \qquad (33.4)$$

where $\mathbf{E}_0$ and $\mathbf{H}_0$ are quantities which are constant in space and time.

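We introduce the vector $\mathbf{k} = \omega\mathbf{n}/c$, called the wave vector. In absolute
value $k = \omega/c = 2\pi/\lambda$, where $\lambda$ is the wavelength, and it is oriented in the
direction of propagation of the electromagnetic wave. Then

$$\mathbf{E} = \mathbf{E}_0\,e^{i(\omega t-\mathbf{k}\cdot\mathbf{r})}, \qquad (33.5)$$

$$\mathbf{H} = \mathbf{H}_0\,e^{i(\omega t-\mathbf{k}\cdot\mathbf{r})}. \qquad (33.6)$$

In formulae (33.5) and (33.6) it is implied that in the final expressions the
real part is to be taken. According to (27.3), the amplitudes $\mathbf{E}_0$ and $\mathbf{H}_0$ are
perpendicular to each other and to the vector k:

$$\mathbf{E}_0 = \mathbf{H}_0\times\mathbf{n}. \qquad (33.7)$$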


The vectors E and H represent plane monochromatic waves: the planes
$\mathbf{k}\cdot\mathbf{r}$ = const are equal-phase surfaces (surfaces of equal values of the strengths
E and H) (fig. 1.12).

Thus, we arrive at a natural result. At sufficiently large distances from the 
emitter spherical waves approximate to plane waves. This can be pictured in 
an obvious way as follows. If the emitting system is sufficiently distant, the 
radius of curvature of the spherical equal-phase surface of the electromagnetic 
wave is very large in comparison with the dimensions of that region of space 
in which the field is considered. Hence in the limits of this region the sphere 
can be assumed with a sufficient degree of accuracy to be a flat surface. 







Plane k-r = constant 


Fig. 1.12 


If the direction of propagation is chosen to be the x-axis, then formulae
(33.5) and (33.6) can be written in the form

$$\mathbf{E} = \mathbf{E}_0\,e^{i(\omega t-kx)}, \qquad (33.8)$$

$$\mathbf{H} = \mathbf{H}_0\,e^{i(\omega t-kx)}. \qquad (33.9)$$

Formulae (33.7)-(33.9) describe plane waves propagating in the positive
direction of the x-axis with a velocity

$$v = \omega/k = c. \qquad (33.10)$$

Clearly at time (t + 1) the phase factor at point (x + c) has the same value as
it had at time t at point x. Thus, v represents the velocity of propagation of
the equal-phase surface and for this reason it is called the phase velocity.
The propagation of plane waves, in contrast to spherical waves, is not
accompanied by a decrease in their amplitude.
We calculate the Poynting vector in a plane wave:

$$\boldsymbol{\sigma} = \frac{c}{4\pi}\,\mathbf{E}\times\mathbf{H} = \frac{c}{4\pi}\,(\mathbf{H}\times\mathbf{n})\times\mathbf{H} = \frac{c}{4\pi}\,H^2\,\mathbf{n} = c\,\frac{E^2+H^2}{8\pi}\,\mathbf{n} = cu\,\mathbf{n}. \qquad (33.11)$$

It turns out to be constant and equal to the energy flux moving with the
velocity of light. The momentum of a plane electromagnetic wave is equal to

$$\mathbf{g} = \boldsymbol{\sigma}/c^2 = \mathbf{n}u/c. \qquad (33.12)$$
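
The vector algebra behind (33.7) and (33.11) is easily verified for a concrete
case. In the sketch below the direction n, the field H and the value of c are
arbitrary test values, and the helper names are assumptions of this
illustration; the electric field is built from (33.7) and the Poynting vector
is compared with $cu\,\mathbf{n}$.

```python
import math


def cross(a, b):
    """Vector product of two 3-component tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dot(a, b):
    """Scalar product of two 3-component tuples."""
    return sum(x * y for x, y in zip(a, b))


def electric_from_magnetic(h, n):
    """Relation (33.7): E = H x n for a plane wave travelling along the unit vector n."""
    return cross(h, n)


def poynting(e, h, c):
    """Poynting vector (c / 4 pi) E x H in Gaussian units."""
    return tuple(c / (4.0 * math.pi) * x for x in cross(e, h))


def energy_density(e, h):
    """Energy density u = (E^2 + H^2) / 8 pi."""
    return (dot(e, e) + dot(h, h)) / (8.0 * math.pi)


# Test values: wave along x, magnetic field along z, c in arbitrary units.
n_dir = (1.0, 0.0, 0.0)
h_field = (0.0, 0.0, 2.0)
c_test = 3.0

e_field = electric_from_magnetic(h_field, n_dir)
sigma = poynting(e_field, h_field, c_test)
u = energy_density(e_field, h_field)
print("E =", e_field)
print("sigma =", sigma, " c*u*n =", tuple(c_test * u * x for x in n_dir))
```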


The expressions we have found for the field in a plane wave can also be ob- 
tained directly from the solution of the field equations in a region of space 
free of charges. 

Since the charge density and current density in the part of space consid- 
ered are equal to zero, the equations for the electromagnetic field potentials 
assume the following form: 

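$$\nabla^2\mathbf{A} - \frac{1}{c^2}\frac{\partial^2\mathbf{A}}{\partial t^2} = 0,$$

$$\nabla^2\varphi - \frac{1}{c^2}\frac{\partial^2\varphi}{\partial t^2} = 0,$$

$$\nabla\cdot\mathbf{A} + \frac{1}{c}\frac{\partial\varphi}{\partial t} = 0.$$

For simplicity we consider first the case where the field potentials depend
only on the x coordinate, so that the equations for the potentials have the
form

$$\frac{\partial^2\mathbf{A}(x,t)}{\partial x^2} - \frac{1}{c^2}\frac{\partial^2\mathbf{A}(x,t)}{\partial t^2} = 0, \qquad (33.13)$$

$$\frac{\partial^2\varphi(x,t)}{\partial x^2} - \frac{1}{c^2}\frac{\partial^2\varphi}{\partial t^2} = 0, \qquad (33.14)$$

$$\frac{\partial A_x(x,t)}{\partial x} + \frac{1}{c}\frac{\partial\varphi}{\partial t} = 0. \qquad (33.15)$$

Moreover, we assume that the field depends on time according to a simple
harmonic law. We then look for the solutions of eqs. (33.13)-(33.15) in the
form

$$\mathbf{A}(x,t) = \mathbf{A}_0(x)\,e^{i\omega t}, \qquad (33.16)$$

$$\varphi(x,t) = \varphi_0(x)\,e^{i\omega t}. \qquad (33.17)$$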


The substitution of (33.16) and (33.17) into (33.13) and (33.14) gives
for $\mathbf{A}_0$ and $\varphi_0$

$$\frac{d^2\mathbf{A}_0}{dx^2} + \frac{\omega^2}{c^2}\,\mathbf{A}_0 = 0,$$

$$\frac{d^2\varphi_0}{dx^2} + \frac{\omega^2}{c^2}\,\varphi_0 = 0.$$

Whence

$$\mathbf{A}_0 = \mathbf{l}\,e^{-ikx} + \mathbf{l}'\,e^{+ikx},$$

$$\varphi_0 = a\,e^{-ikx} + a'\,e^{+ikx},$$

where $k = \omega/c$, and $\mathbf{l}$, $\mathbf{l}'$, $a$, $a'$ are constants. In this case we find for the
potentials

$$\mathbf{A}(x,t) = \mathbf{l}\,e^{i(\omega t-kx)} + \mathbf{l}'\,e^{i(\omega t+kx)}, \qquad (33.18)$$

$$\varphi = a\,e^{i(\omega t-kx)} + a'\,e^{i(\omega t+kx)}. \qquad (33.19)$$

The first terms in (33.18) and (33.19) represent a monochromatic plane
wave propagating in the positive direction of the x-axis, while the second
terms represent the same wave propagating in the negative direction of the
x-axis. Which one of these waves is actually excited depends on the
arrangement of the emitting system. Without restricting the general character of the
treatment one can consider one of these waves, for example the first one.

The amplitudes $\mathbf{l}$ and $a$ are not arbitrary, but are related by the Lorentz
condition (33.15), which by substitution of (33.18) and (33.19) into it gives

$$a = l_x.$$

The field potentials are then written finally in the form

$$\mathbf{A} = \mathbf{l}\,e^{i(\omega t-kx)} = \mathbf{l}\,e^{i\psi}, \qquad (33.20)$$

$$\varphi = l_x\,e^{i(\omega t-kx)} = \mathbf{A}\cdot\mathbf{i} = (\mathbf{l}\cdot\mathbf{i})\,e^{i\psi}, \qquad (33.21)$$

where i is the unit vector along the x-axis, and $\psi = \omega t - kx$.
The field strengths in the plane wave have the form 








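$$\mathbf{E} = -\frac{1}{c}\frac{\partial\mathbf{A}}{\partial t} - \nabla\varphi = -\frac{1}{c}\frac{\partial\mathbf{A}}{\partial\psi}\frac{\partial\psi}{\partial t} - \frac{\partial\varphi}{\partial\psi}\nabla\psi = -\frac{\omega}{c}\,\dot{\mathbf{A}} - \dot{\varphi}\,\nabla\psi = -k\dot{\mathbf{A}} + k\dot{\varphi}\,\mathbf{i} = k\{\mathbf{i}(\dot{\mathbf{A}}\cdot\mathbf{i}) - \dot{\mathbf{A}}\} = k(\dot{\mathbf{A}}\times\mathbf{i})\times\mathbf{i} = \mathbf{E}_0\,e^{i\psi} = \mathbf{E}_0\,e^{i(\omega t-kx)}, \qquad (33.22)$$

$$\mathbf{H} = \nabla\times\mathbf{A} = (\nabla\psi)\times\dot{\mathbf{A}} = k\,\dot{\mathbf{A}}\times\mathbf{i} = \mathbf{H}_0\,e^{i(\omega t-kx)}, \qquad (33.23)$$

where differentiation with respect to the argument $\psi$ is denoted by a dot, and
$\mathbf{E}_0$ and $\mathbf{H}_0$ are the amplitudes of the field strengths whose moduli are
obviously $|\mathbf{E}_0| = |\mathbf{H}_0| = k|\mathbf{l}|$. The vectors $\mathbf{E}_0$ and $\mathbf{H}_0$ are perpendicular to
each other.

We write the expressions for the field components:

$$E_x = 0, \qquad H_x = 0,$$

$$E_y = -k\dot{A}_y = H_z, \qquad H_y = k\dot{A}_z = -E_z, \qquad (33.24)$$

$$E_z = -k\dot{A}_z = -H_y, \qquad H_z = -k\dot{A}_y = E_y.$$

In these formulae, as well as in the preceding ones, the real part of the
complex expressions written for the potentials and fields is implied.

In the general case, when the direction of propagation of waves does not
coincide with the x-axis, the unit vector i should be replaced by the unit
vector in the direction of propagation n (fig. 1.12), and the relations (33.22)
and (33.23) are the same as (33.5) and (33.6). The values of the wave
amplitudes $|\mathbf{E}_0| = |\mathbf{H}_0|$ remain completely arbitrary. They are connected with
the amplitude of the waves radiated by the emitter.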

It is useful to compare the expressions which we have obtained with anal- 
ogous formulae obtained for another potential gauge, which is often used in 
the literature *. 

As we have seen in §11, field potentials allow the gauge transformation
(11.1) and (11.2). We perform the transition from $\varphi$ to the new value $\varphi'$,
and choose the function $\psi = c\int\varphi\,dt$. Then $\varphi' = 0$, i.e. in the representation
considered the field is described only by the vector potential $\mathbf{A}'(x,t) =
\mathbf{A} + \nabla\psi$. Such a choice of $\psi$ is not always possible but can only be made in
vacuum, when $\rho = 0$ and the equation for $\varphi$ allows a zero solution.

* L. D. Landau and E. M. Lifshitz, The classical theory of fields (Pergamon Press,
London, 1962).

The Lorentz condition is then written in the form

$$\frac{\partial A_x}{\partial x} = 0.$$

The solution $A_x$ = const ≠ 0 does not differ from $A_x = 0$, since it leads to the
same values of field strengths. It is easy to see that for such a gauge we again
obtain for the field components the values (33.24).

If the wave process does not have a simple periodic character, i.e. is not
described by a monochromatic wave, one can easily obtain the general form
of solution of the equation for the potentials $\varphi$ and A. The solution is
obtained in a way analogous to that in §23.

Introducing the new variables

$$\xi = x - ct, \qquad \eta = x + ct,$$

we reduce the equations for the potentials to the form

$$\frac{\partial^2\mathbf{A}}{\partial\xi\,\partial\eta} = 0, \qquad \frac{\partial^2\varphi}{\partial\xi\,\partial\eta} = 0.$$

As the solution of the equation for the vector potential we can write

$$\mathbf{A} = \mathbf{A}(ct - x) + \mathbf{A}'(ct + x),$$

where A and A' are arbitrary functions of the arguments ct − x and ct + x,
respectively. The first of these represents a general expression for a plane
wave propagating with velocity c in the positive direction of the x-axis. The
second one represents a wave propagating in the negative direction of the
x-axis. An analogous expression is also obtained for the scalar potential.
The Lorentz condition leads to the equality

$$\varphi = \mathbf{A}\cdot\mathbf{i},$$

which is valid for an arbitrary form of the functions A and $\varphi$.
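
That any function of ct − x added to any function of ct + x satisfies the
one-dimensional wave equation can be illustrated by finite differences. In the
sketch below the two profile shapes (a Gaussian pulse and a sine), the test
point and the value c = 2 are arbitrary assumptions, as are the function
names.

```python
import math


def travelling_waves(x, t, c=1.0):
    """Example potential A(ct - x) + A'(ct + x) with assumed profile shapes."""
    forward = math.exp(-((c * t - x) ** 2))   # pulse moving towards +x
    backward = math.sin(c * t + x)            # wave moving towards -x
    return forward + backward


def wave_equation_residual(u, x, t, c=1.0, h=1e-3):
    """Central-difference value of u_xx - u_tt / c^2 at the point (x, t)."""
    u0 = u(x, t, c)
    u_xx = (u(x + h, t, c) - 2.0 * u0 + u(x - h, t, c)) / h**2
    u_tt = (u(x, t + h, c) - 2.0 * u0 + u(x, t - h, c)) / h**2
    return u_xx - u_tt / c**2


residual = wave_equation_residual(travelling_waves, 0.3, 0.7, c=2.0)
print("residual of the wave equation:", residual)

# For comparison, a function that is not of the form f(ct - x) + g(ct + x).
not_a_wave = lambda x, t, c: math.exp(-(x**2)) * math.cos(t)
print("residual for a non-solution:", wave_equation_residual(not_a_wave, 0.3, 0.7, c=2.0))
```

The first residual differs from zero only by the discretization and rounding
errors of the difference scheme, whereas the second is of order unity.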











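Finally, it should be noted that in a vacuum one can obtain wave equations
for the field vectors E and H directly. For this one has to take the curl of the
Maxwell equation

$$\nabla\times(\nabla\times\mathbf{H}) = \nabla(\nabla\cdot\mathbf{H}) - \nabla^2\mathbf{H} = -\nabla^2\mathbf{H} = \frac{1}{c}\,\nabla\times\frac{\partial\mathbf{E}}{\partial t},$$

so that

$$\nabla^2\mathbf{H} - \frac{1}{c^2}\frac{\partial^2\mathbf{H}}{\partial t^2} = 0.$$

The wave equation for E can be obtained analogously.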


§34. Plane wave polarization 
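Let us consider in somewhat more detail how the changes in the field
vectors of a plane monochromatic wave take place. To do this we pass over
from the complex to real expressions.
In formula (33.22) we write the complex amplitude $\mathbf{E}_0$ in the form

$$\mathbf{E}_0 = \mathbf{g}_1 + i\mathbf{g}_2,$$

where $\mathbf{g}_1$ and $\mathbf{g}_2$ are real vectors. We then have

$$\begin{aligned}
\mathbf{E} &= \mathrm{Re}\{(\mathbf{g}_1 + i\mathbf{g}_2)[\cos(\omega t - \mathbf{k}\cdot\mathbf{r}) + i\sin(\omega t - \mathbf{k}\cdot\mathbf{r})]\} \\
&= \mathrm{Re}\{[\mathbf{g}_1\cos(\omega t - \mathbf{k}\cdot\mathbf{r}) - \mathbf{g}_2\sin(\omega t - \mathbf{k}\cdot\mathbf{r})]
 + i[\mathbf{g}_1\sin(\omega t - \mathbf{k}\cdot\mathbf{r}) + \mathbf{g}_2\cos(\omega t - \mathbf{k}\cdot\mathbf{r})]\} \\
&= \mathbf{g}_1\cos(\omega t - \mathbf{k}\cdot\mathbf{r}) - \mathbf{g}_2\sin(\omega t - \mathbf{k}\cdot\mathbf{r}) .
\end{aligned} \tag{34.1}$$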
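We now transform from the vectors $\mathbf{g}_1$ and $\mathbf{g}_2$, oriented at an arbitrary
angle to each other, to mutually perpendicular vectors $\mathbf{E}_1$ and $\mathbf{E}_2$ (fig. 1.13).
Let

$$\mathbf{E}_1 = \mathbf{g}_1\cos\alpha + \mathbf{g}_2\sin\alpha, \tag{34.2}$$

$$\mathbf{E}_2 = \mathbf{g}_1\sin\alpha - \mathbf{g}_2\cos\alpha, \tag{34.3}$$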


Fig. 1.13


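where $\alpha$ is an unknown angle which is to be found. Taking the scalar product
of the expressions (34.2) and (34.3), we find

$$\mathbf{E}_1\cdot\mathbf{E}_2 = 0 = (g_1^2 - g_2^2)\sin\alpha\cos\alpha - \mathbf{g}_1\cdot\mathbf{g}_2\,(\cos^2\alpha - \sin^2\alpha),$$

whence

$$\tan 2\alpha = \frac{2\,\mathbf{g}_1\cdot\mathbf{g}_2}{g_1^2 - g_2^2} .$$

Writing

$$\mathbf{g}_1 = \mathbf{E}_1\cos\alpha + \mathbf{E}_2\sin\alpha, \qquad \mathbf{g}_2 = \mathbf{E}_1\sin\alpha - \mathbf{E}_2\cos\alpha,$$

and substituting into (34.1), we find

$$\mathbf{E} = \mathbf{E}_1\cos(\omega t - \mathbf{k}\cdot\mathbf{r} + \alpha) + \mathbf{E}_2\sin(\omega t - \mathbf{k}\cdot\mathbf{r} + \alpha) .$$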
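The transformation to perpendicular axes can be illustrated with a short Python sketch. The two vectors g1 and g2 below are arbitrary assumed values, and the helper names are hypothetical. The sketch computes the angle from the condition that the scalar product of (34.2) and (34.3) vanishes, builds the two new vectors, and checks both their orthogonality and the equality of the two representations of the field.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def combine(ca, a, cb, b):
    """Linear combination ca*a + cb*b of two vectors."""
    return tuple(ca * x + cb * y for x, y in zip(a, b))

# Assumed example vectors in the plane transverse to the propagation direction.
g1 = (1.0, 0.2)
g2 = (0.5, 0.8)

# tan(2*alpha) = 2 g1.g2 / (g1^2 - g2^2); atan2 also covers g1^2 == g2^2.
alpha = 0.5 * math.atan2(2.0 * dot(g1, g2), dot(g1, g1) - dot(g2, g2))

E1 = combine(math.cos(alpha), g1, math.sin(alpha), g2)    # eq. (34.2)
E2 = combine(math.sin(alpha), g1, -math.cos(alpha), g2)   # eq. (34.3)

def field_from_g(theta):
    """Eq. (34.1), with theta standing for omega*t - k.r."""
    return combine(math.cos(theta), g1, -math.sin(theta), g2)

def field_from_axes(theta):
    """E1 cos(theta + alpha) + E2 sin(theta + alpha)."""
    return combine(math.cos(theta + alpha), E1, math.sin(theta + alpha), E2)

orthogonality = dot(E1, E2)
mismatch = max(abs(u - v)
               for theta in (0.0, 0.7, 2.1, 4.0)
               for u, v in zip(field_from_g(theta), field_from_axes(theta)))
```

Using atan2 instead of a plain arctangent is a design choice of the sketch: it avoids a division by zero when the two vectors have equal length.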


Let the x-axis be chosen as the direction of propagation of the waves. If
the vector $\mathbf{E}_1$ is taken in the y-direction then the vector $\mathbf{E}_2$ will be in the
z-direction (in the positive or negative sense). Then

$$E_y = E_1\cos(\omega t - kx + \alpha), \tag{34.4}$$

$$E_z = E_2\sin(\omega t - kx + \alpha). \tag{34.5}$$







The quantities $E_1$ and $E_2$ are called amplitudes, while the quantity
$\varphi = \omega t - kx + \alpha$ is called the phase of the wave.

From formulae (34.4) and (34.5) it is easy to eliminate the phase by writing

$$\frac{E_y^2}{E_1^2} + \frac{E_z^2}{E_2^2} = 1. \tag{34.6}$$

The expression (34.6) relates the values of the components of the vector $\mathbf{E}$ in
the plane wave. It is obvious that in a given plane x = const, the vector $\mathbf{E}$
rotates in the (yz) plane in such a way that its end describes an ellipse
(fig. 1.14).








Fig. 1.14 


Formula (34.6) represents the equation of this ellipse. If, in particular, the
amplitudes $E_1$ and $E_2$ are equal in absolute value, the vector $\mathbf{E}$ rotates on a
circle.
Since the propagation of the electromagnetic wave takes place in the direction
$\mathbf{n}$, one can picture the change of the vector $\mathbf{E}$ in space and time in an
obvious way as the motion of its end on an elliptical (or circular) helix drawn
about the line $\mathbf{n}$. The pitch of the helix is equal to the wavelength $\lambda = 2\pi/k$.

By virtue of eqs. (33.24) the following expressions can be written for the
components of the magnetic field strength:

$$H_y = -E_2\sin(\omega t - kx + \alpha), \tag{34.7}$$

$$H_z = E_1\cos(\omega t - kx + \alpha). \tag{34.8}$$




Waves in which the vectors E and H rotate on an ellipse are called ellipti- 
cally polarized, and those in which the vectors rotate on a circle are called 
circularly polarized. 

The direction of rotation of the vector $\mathbf{E}$ is determined by the phase $\alpha$. If
the rotation is clockwise for an observer looking in the direction of propagation
of the wave, then such a wave is said to have left-handed polarization or
positive helicity (so for example for a= +i provided that g, and g, are 


directed along the positive axes). 
If one of the vectors $\mathbf{E}_1$ or $\mathbf{E}_2$ is equal to zero, the change of $\mathbf{E}$ and $\mathbf{H}$
takes place in mutually perpendicular directions. Such waves are called plane
polarized waves. According to historical tradition, the direction in which the
vector $\mathbf{H}$ oscillates is called the polarization direction. Thus, for example, a
wave polarized in the z-direction has a component $H_z$ different from zero,
and a component $E_y$ of the same magnitude, oscillating in the y-direction.


§35. Interference and the formation of wave packets 


The monochromatic plane wave considered above is only an
idealization of real electromagnetic waves. On the one hand, the monochromatic
plane wave, which is a process strictly periodic in space and time, should
obviously have an infinitely large extension in space and an infinitely long
duration. On the other hand, there are no strictly monochromatic emitters.
As we have seen in §30, the effect of damping shows itself in the emission of
frequencies differing from the natural frequency $\omega_0$ (even if close to it).
Hence to describe real wave processes it is necessary to consider the result of
superposition and interference of different monochromatic plane waves.

Let us consider the superposition of an infinitely large number of monochromatic
plane waves whose frequencies are continuously distributed in a
narrow interval $\omega_0 - \Delta\omega < \omega < \omega_0 + \Delta\omega$, where $\omega_0$, called the carrier
frequency, satisfies the condition $\omega_0 \gg \Delta\omega$.

We consider the amplitude of all the waves to be constant. For the strength
of the electric (or magnetic) field we can write

$$E = \int_{\omega_0-\Delta\omega}^{\omega_0+\Delta\omega} E_0\, e^{i(\omega t - kx)}\, d\omega = E_0 \int_{\omega_0-\Delta\omega}^{\omega_0+\Delta\omega} e^{i(\omega t - kx)}\, d\omega .$$


To keep the results general, since we shall need them later, we assume
that the wave number $k$ is a function of the frequency $\omega$ not necessarily
corresponding to the relation $k = \omega/c$ valid for electromagnetic waves in vacuum.





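Setting

$$\omega = \omega_0 + (\omega - \omega_0), \qquad k = k_0 + \left(\frac{dk}{d\omega}\right)_0 (\omega - \omega_0),$$

we find

$$\begin{aligned}
E &= E_0\, e^{i(\omega_0 t - k_0 x)} \int_{\omega_0-\Delta\omega}^{\omega_0+\Delta\omega} \exp\left\{ i(\omega - \omega_0)\left[ t - \left(\frac{dk}{d\omega}\right)_0 x \right] \right\} d\omega \\
&= E_0\, e^{i(\omega_0 t - k_0 x)} \int_{-\Delta\omega}^{\Delta\omega} \exp\left\{ iu\left[ t - \left(\frac{dk}{d\omega}\right)_0 x \right] \right\} du \\
&= 2E_0\, e^{i(\omega_0 t - k_0 x)}\, \frac{\sin\{\Delta\omega\,[t - (dk/d\omega)_0\, x]\}}{t - (dk/d\omega)_0\, x},
\end{aligned} \tag{35.1}$$

where the integration variable $u = \omega - \omega_0$ is introduced. We arrive in principle
at the following result: the superposition of waves with a spectrum of frequencies
lying in a narrow interval $2\Delta\omega$ about the carrier frequency $\omega_0$
leads to the appearance of a wave with frequency $\omega_0$ and wave number $k_0$
but with a modulated amplitude

$$A = \frac{2E_0 \sin[(\Delta\omega/v_g)(x - v_g t)]}{(1/v_g)(x - v_g t)}, \tag{35.2}$$

where

$$v_g = \left(\frac{d\omega}{dk}\right)_0 . \tag{35.3}$$

The modulated amplitude has a very sharp principal maximum (fig. 1.15,
where the dependence of $A$ on $(\Delta\omega/v_g)(x - v_g t)$ is given) at the point

$$x_m = v_g t,$$

where it is equal to

$$A_{\max} = 2\Delta\omega\, E_0 .$$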


Fig. 1.15


In both directions away from the position of the maximum the value of the
modulated amplitude decreases, and at points where

$$\frac{\Delta\omega}{v_g}\,(x - v_g t) = \pm\pi$$


the modulated amplitude reduces to zero. In addition to the principal maxi- 
mum, the modulated amplitude has a number of subsidiary maxima in which, 
however, the amplitudes are very small in comparison with that in the prin- 
cipal maximum, and the heights of which decrease rapidly with increasing 
argument (see fig. 1.15). In practice, it can be assumed that the electromagnetic 
field is excited only near the principal maximum, while elsewhere in space 
the superposition of the waves leads to their complete mutual cancellation. 
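The closed form (35.1) can be compared with a direct numerical superposition. The Python sketch below is an illustration under stated assumptions: the carrier frequency, the band half-width, the carrier wave number and the value of (dk/dω)_0 are arbitrary example numbers, and the dispersion law is taken to be exactly linear. The band integral is evaluated with the midpoint rule and compared with the carrier multiplied by the modulated amplitude.

```python
import cmath
import math

E0 = 1.0
OMEGA0 = 50.0      # carrier frequency (assumed)
DELTA = 1.0        # half-width of the band, much smaller than OMEGA0 (assumed)
K0 = 50.0          # carrier wave number (assumed)
DK_DOMEGA = 1.25   # (dk/d omega)_0 = 1 / v_g (assumed)

def k_of(omega):
    """Linearised dispersion law k = k0 + (dk/d omega)_0 (omega - omega0)."""
    return K0 + DK_DOMEGA * (omega - OMEGA0)

def packet_numeric(x, t, steps=20000):
    """Midpoint-rule integral of E0 exp(i(omega t - k x)) over the band."""
    h = 2.0 * DELTA / steps
    total = 0j
    for n in range(steps):
        omega = OMEGA0 - DELTA + (n + 0.5) * h
        total += cmath.exp(1j * (omega * t - k_of(omega) * x))
    return E0 * total * h

def packet_closed(x, t):
    """Eq. (35.1): carrier times 2 E0 sin(DELTA*tau)/tau, tau = t - x/v_g."""
    tau = t - DK_DOMEGA * x
    if tau == 0.0:
        envelope = 2.0 * E0 * DELTA          # value at the principal maximum
    else:
        envelope = 2.0 * E0 * math.sin(DELTA * tau) / tau
    return envelope * cmath.exp(1j * (OMEGA0 * t - K0 * x))

SAMPLE_POINTS = [(0.0, 0.0), (1.0, 2.0), (3.0, 1.0), (-2.0, 4.0)]
worst_error = max(abs(packet_numeric(x, t) - packet_closed(x, t))
                  for x, t in SAMPLE_POINTS)
```

At x = 0, t = 0 both functions return 2ΔωE_0, the height of the principal maximum; replacing `k_of` by a nonlinear law would show how the closed form ceases to be exact.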

The group of waves forms a wave packet. The wave packet moves in space
with a velocity $v_g$, maintaining a limited extension in space. Hence the quantity
$v_g$ is called the group velocity of motion of the packet, as distinct from
the phase velocity

$$v_{ph} = \frac{\omega}{k},$$

with which the equal-phase surface of a single monochromatic wave moves in
space. It is obvious that the energy of the wave packet travels with its amplitude,
i.e. with a velocity $v_g$. The dimensions of the wave packet are determined
by the relation


$$\Delta\omega \left| \Delta t - \frac{\Delta x}{v_g} \right| = 2\pi. \tag{35.4}$$





For a more obvious explanation of the meaning of the equality (35.4) let
us find the spatial dimensions of the packet. At a fixed moment $t$ the field
differs from zero between the points $x_1$ and $x_2$ which are a distance

$$\Delta x = x_2 - x_1 = \frac{2\pi}{\Delta\omega\,(dk/d\omega)_0} = \frac{2\pi}{\Delta k} \tag{35.5}$$

apart. Outside this region the field has a value close to zero.
If now a certain location x = const is considered, then the duration of the
time interval during which the field of the wave packet differs from zero is
equal to

$$\Delta t \approx 2\pi/\Delta\omega . \tag{35.6}$$

After the lapse of time $\Delta t$ the field will reduce to zero at the given location.
Thus, the wave packet has a limited spatial extension and duration, satisfying
the conditions

$$\Delta t\,\Delta\omega \sim 2\pi, \qquad \Delta x\,\Delta k \sim 2\pi. \tag{35.7}$$


Now the statement of the problem can be modified. Suppose we want to
obtain a wave field different from zero in a certain region of space $\Delta x$. To
obtain such a field from monochromatic waves it is necessary to form a wave
packet.

If the dimensions of the packet are given by our condition, then according
to (35.7) the superposition of monochromatic waves with wave numbers
lying in an interval $\Delta k$ must be carried out. The narrower the wave packet,
i.e. the smaller its spatial dimensions, the larger $\Delta k$, i.e. the larger the interval
of the wavelengths which must take part in the formation of the packet.

A wave packet existing for a limited time at a certain location can be considered
in a completely analogous way. The shorter the required duration $\Delta t$
of the packet, the larger the frequency interval $\Delta\omega$ of those monochromatic
waves which must form it.
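As a numerical illustration of the relations (35.7), the following Python sketch estimates the spread of frequencies and wave numbers needed for a pulse of given duration in vacuum. The pulse duration is an assumed example value, the function names are hypothetical, and the results are order-of-magnitude estimates only, as the relations themselves are.

```python
import math

C_LIGHT = 2.998e10  # speed of light in cm/s

def min_frequency_spread(duration):
    """Delta omega ~ 2*pi / Delta t, from (35.7); order of magnitude only."""
    return 2.0 * math.pi / duration

def min_wavenumber_spread(length):
    """Delta k ~ 2*pi / Delta x, from (35.7); order of magnitude only."""
    return 2.0 * math.pi / length

pulse_duration = 1.0e-9                    # s, assumed example value
pulse_length = C_LIGHT * pulse_duration    # cm, extent of the packet in vacuum

delta_omega = min_frequency_spread(pulse_duration)   # rad/s
delta_k = min_wavenumber_spread(pulse_length)        # 1/cm
```

For a nanosecond pulse the sketch gives a frequency spread of about 6 × 10⁹ rad/s, and the two spreads are related by Δω = cΔk, as they must be in vacuum.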

The relations found for waves propagating along the x-axis can easily be
generalized to the case of waves propagating in an arbitrary direction in space.
One then obtains the relations

$$\Delta x\,\Delta k_x \sim 2\pi, \quad \Delta y\,\Delta k_y \sim 2\pi, \quad \Delta z\,\Delta k_z \sim 2\pi, \quad \Delta t\,\Delta\omega \sim 2\pi. \tag{35.8}$$

The results obtained are of great theoretical and practical significance.




A monochromatic wave, completely homogeneous and infinitely extended
in space and time, cannot, as was pointed out before, be realized in practice.
However, by making use of an emitter with sufficiently weak damping, radiating
for a sufficiently long time, one can produce in space waves whose properties
are sufficiently close to those of monochromatic waves. It is clear that
monochromatic waves of infinite extent cannot be used for the transmission
of signals.

We understand by signals electromagnetic perturbations which in principle 
can be detected by means of suitable devices and which can provide informa- 
tion about physical events. Thus, for example, the fact of emission of radia- 
tion, beginning at a certain instant of time, is a signal. The detection of a 
system of charges producing the scattering of electromagnetic waves is anoth- 
er example of a signal. 

From the above it is clear that electromagnetic waves can be used for
producing signals only in the case where wave packets are formed by them.
The relations (35.8) are used for the analysis of the properties required of the
signal. Let, for example, a recording device need for its operation a signal
whose duration is not shorter than a certain value $\Delta t$. Then the signal will be
recorded only in the case where it represents a wave packet formed of monochromatic
waves with frequencies distributed within an interval $\Delta\omega \lesssim 2\pi/\Delta t$.

We shall not dwell on other examples of the application of the relations 
(35.8). They play an important role in quantum mechanics. 

In conclusion we stress that the expression which we obtained for the 
wave packet is only of an approximate character. It is valid if: 

1) the amplitudes of all monochromatic waves forming the packet have 
one and the same value; 

2) in the expansion of $k(\omega)$ in a series in $\Delta\omega$ one can restrict oneself to
the first term.

The first restriction has no great significance, and finding wave packets 
formed by waves with different amplitudes does not present any particular 
difficulty. 

The second requirement is fulfilled automatically for electromagnetic
waves in a vacuum:

$$k = \omega/c .$$

However, as we shall see in quantum mechanics, in cases where the require- 
ment 2) is not fulfilled, important consequences follow from taking into 
account further terms of the expansion: the form of the packet, which, in 







the first approximation, is constant in time and space, turns out to be variable 
when further terms are taken into account. The wave packet progressively 
deforms and diffuses. 


§36. Scattering of the electromagnetic waves by a free charge 
and by a bound electron 


Let us now consider the effects arising when an electromagnetic wave is 
incident on a system of charged particles. 

The electromagnetic field of the wave acts on the particles of the system
with the Lorentz force

$$\mathbf{F} = e\left(\mathbf{E} + \frac{1}{c}\,\mathbf{v}\times\mathbf{H}\right) \approx e\mathbf{E} .$$

Since in the electromagnetic wave $|\mathbf{H}| = |\mathbf{E}|$, the magnetic part of the
Lorentz force is smaller than the electric part by the ratio $v/c$. Each particle
of the system acquires an acceleration $\mathbf{w} = \mathbf{F}/m$, and becomes in its turn an
emitter. Since each particle emits spherical electromagnetic waves, the direction
of emission of these waves is not dependent upon the direction of propagation
of the incident wave. Hence it is said that the initial wave is scattered.
The frequencies of the emitted waves are the same as that of the incident
wave. The scattering taking place without a change in the frequency is called
coherent scattering. In addition to coherent scattering, in some cases there
may occur scattering with a change in frequency, i.e. incoherent scattering
(see Part V).

The scattering of electromagnetic waves by a bound charge can be considered
using the example of the linear harmonic oscillator. Let a plane polarized,
monochromatic plane wave be incident on the oscillator. The equation of
motion will have the form

$$\ddot{\mathbf{r}} + \omega_0^2\,\mathbf{r} = \frac{2e^2}{3mc^3}\,\dddot{\mathbf{r}} + \frac{e}{m}\,\mathbf{E}_0\, e^{i\omega t} .$$

Assuming the damping to be weak, we can write $\dddot{\mathbf{r}} = -\omega_0^2\,\dot{\mathbf{r}}$, so that

$$\ddot{\mathbf{r}} + \omega_0^2\,\mathbf{r} + \gamma\,\dot{\mathbf{r}} = \frac{e}{m}\,\mathbf{E}_0\, e^{i\omega t}, \tag{36.1}$$

where $\gamma$, as in §30, is equal to $2e^2\omega_0^2/3mc^3$. The particular solution of eq.
(36.1) which interests us is

$$\mathbf{r} = \frac{(e/m)\,\mathbf{E}_0\, e^{i\omega t}}{\omega_0^2 - \omega^2 + i\omega\gamma} . \tag{36.2}$$

Formula (36.2) describes the motion of the oscillator under the action of an
external force.
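That (36.2) satisfies (36.1) can be verified numerically for one Cartesian component. The Python sketch below is an illustration only: the values of e/m, E_0, ω_0 and γ are arbitrary assumptions in arbitrary units, and the derivatives are estimated by central differences rather than taken analytically.

```python
import cmath

E_OVER_M = 1.0   # e/m in arbitrary units (assumed)
E0 = 1.0         # field amplitude (assumed)
OMEGA0 = 3.0     # natural frequency (assumed)
GAMMA = 0.2      # damping constant (assumed)

def displacement(omega, t):
    """Eq. (36.2) for a single Cartesian component."""
    return (E_OVER_M * E0 * cmath.exp(1j * omega * t)
            / (OMEGA0**2 - omega**2 + 1j * omega * GAMMA))

def residual(omega, t, h=1e-4):
    """Left-hand side minus right-hand side of eq. (36.1)."""
    r_minus = displacement(omega, t - h)
    r_mid = displacement(omega, t)
    r_plus = displacement(omega, t + h)
    r_dot = (r_plus - r_minus) / (2.0 * h)
    r_ddot = (r_plus - 2.0 * r_mid + r_minus) / h**2
    force = E_OVER_M * E0 * cmath.exp(1j * omega * t)
    return r_ddot + OMEGA0**2 * r_mid + GAMMA * r_dot - force

worst_residual = max(abs(residual(omega, t))
                     for omega in (1.0, 3.0, 5.0)
                     for t in (0.0, 0.8, 2.5))
```

The three test frequencies lie below, at and above the natural frequency, so the check covers the resonance as well as both wings.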
For the power of the radiation scattered into a solid angle $d\Omega$ we find,
according to formula (27.8),

$$dI = \frac{e^2}{4\pi c^3}\,[\ddot{\mathbf{r}}\times\mathbf{n}]^2\, d\Omega
= \frac{e^4\omega^4 E_0^2 \sin^2\vartheta}{4\pi m^2 c^3\,[(\omega_0^2 - \omega^2)^2 + \omega^2\gamma^2]}\,\cos^2(\omega t - \delta)\, d\Omega . \tag{36.3}$$

The scattered radiation has the same frequency as the incident radiation, but
is shifted in phase by an amount

$$\delta = \arctan\frac{\omega\gamma}{\omega_0^2 - \omega^2} .$$

Introducing the incident intensity $I_0 = cE_0^2/8\pi$ and averaging over the period
$T = 2\pi/\omega$, we find

$$d\bar{I} = \frac{1}{T}\int_0^T dI\, dt = I_0\,\frac{r_0^2\,\omega^4 \sin^2\vartheta\, d\Omega}{(\omega_0^2 - \omega^2)^2 + \omega^2\gamma^2}, \tag{36.4}$$

where $r_0$ is the classical electron radius (29.6). Here $\vartheta$ stands for the angle
between the direction of observation $\mathbf{n}$ and the direction of the vector of
polarization $\mathbf{E}_0$ (fig. 1.16).

The process of scattering of electromagnetic waves is characterized by the
differential cross section. By definition the differential cross section for scattering
into a solid angle $d\Omega$ is the ratio of the intensity of the radiation scattered
into the angle $d\Omega$ to the intensity of the incident radiation:

$$d\sigma = \frac{d\bar{I}}{I_0} . \tag{36.5}$$

Fig. 1.16

From (36.4) we find for the differential scattering cross section

$$d\sigma = \frac{r_0^2\,\omega^4 \sin^2\vartheta\, d\Omega}{(\omega_0^2 - \omega^2)^2 + \omega^2\gamma^2} . \tag{36.6}$$

As is seen from formula (36.5) the differential scattering cross section has the
dimensions of cm² (it is this which accounts for the term “cross section”).
Formula (36.5) gives the cross section for the scattering of plane polarized
light. The angle $\vartheta$ and the polar angles $\theta$ and $\varphi$ are mutually connected by a
relation which is easily established from fig. 1.16. In this drawing the polar axis
z is directed along the wave vector $\mathbf{k}$ of the incident wave, while the axis x is
directed along its vector of polarization. Projecting the unit vector in the
direction of observation $\mathbf{n}$ onto the axis x, we find

$$\cos\vartheta = \sin\theta\cos\varphi,$$

or

$$\sin^2\vartheta = 1 - \sin^2\theta\cos^2\varphi. \tag{36.7}$$

In practice it is often important to know the scattering cross section for
unpolarized radiation. In order to find it, it is necessary to average the cross
section (36.5) over all possible polarizations, i.e. over all possible orientations
of the vector $\mathbf{E}_0$ in the plane (xy). This means that it is necessary to average
the expression (36.7) over all possible values of the azimuthal angle $\varphi$, i.e. to
assume

$$\overline{\sin^2\vartheta} = 1 - \sin^2\theta\;\overline{\cos^2\varphi} = 1 - \tfrac{1}{2}\sin^2\theta = \tfrac{1}{2} + \tfrac{1}{2}\cos^2\theta .$$
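The averaging over the azimuthal angle can be reproduced numerically. The following Python sketch is an illustration with hypothetical function names; it samples the azimuth uniformly over a full turn (the number of samples is an assumed value) and compares the mean of expression (36.7) with the closed form ½ + ½cos²θ.

```python
import math

def sin2_to_polarisation(theta, phi):
    """Eq. (36.7): sin^2 of the angle between n and the polarization vector."""
    return 1.0 - math.sin(theta)**2 * math.cos(phi)**2

def unpolarised_average(theta, samples=360):
    """Mean of (36.7) over uniformly spaced azimuthal angles."""
    total = sum(sin2_to_polarisation(theta, 2.0 * math.pi * n / samples)
                for n in range(samples))
    return total / samples

def closed_form(theta):
    """The averaged value 1/2 + (1/2) cos^2(theta)."""
    return 0.5 + 0.5 * math.cos(theta)**2

max_deviation = max(abs(unpolarised_average(theta) - closed_form(theta))
                    for theta in (0.0, 0.4, 0.5 * math.pi, 2.0, math.pi))
```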





We then obtain

$$d\sigma = \frac{r_0^2\,\omega^4}{(\omega_0^2 - \omega^2)^2 + \omega^2\gamma^2}\;\frac{1 + \cos^2\theta}{2}\; d\Omega . \tag{36.8}$$

The angular dependence in formula (36.8) shows that the strongest scattering
takes place in the direction of the incident radiation ($\theta = 0$) and in the opposite
direction ($\theta = \pi$).

Integrating (36.8) over the entire solid angle we obtain the total scattering
cross section for unpolarized radiation

$$\sigma = \frac{8\pi}{3}\,\frac{r_0^2\,\omega^4}{(\omega_0^2 - \omega^2)^2 + \omega^2\gamma^2} . \tag{36.9}$$

This expression is called the dispersion formula of classical electrodynamics.
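The numerical factor in the total cross section comes from integrating the angular factor (1 + cos²θ)/2 over the full solid angle. The Python sketch below is a simple illustration of that step; the midpoint rule and the number of steps are assumptions of the example.

```python
import math

def angular_factor(theta):
    """Angular dependence (1 + cos^2 theta) / 2 of eq. (36.8)."""
    return 0.5 * (1.0 + math.cos(theta)**2)

def integrate_over_solid_angle(steps=2000):
    """Midpoint rule for the integral over d(Omega) = 2*pi*sin(theta) d(theta)."""
    h = math.pi / steps
    total = 0.0
    for n in range(steps):
        theta = (n + 0.5) * h
        total += angular_factor(theta) * 2.0 * math.pi * math.sin(theta) * h
    return total

solid_angle_integral = integrate_over_solid_angle()
```

The result agrees with 8π/3 to within the discretization error of the quadrature.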

The dependence of the total cross section on frequency is expressed by a
curve similar to that shown in fig. 1.11, with a strongly pronounced maximum
for $\omega \approx \omega_0$, i.e. for the frequency of the incident radiation close to the
natural frequency of the oscillator. Assuming in formula (36.9) that $\omega \approx \omega_0$,
we obtain near the resonance

$$\sigma \approx \frac{2\pi}{3}\,\frac{r_0^2\,\omega_0^2}{(\omega_0 - \omega)^2 + (\gamma/2)^2} . \tag{36.10}$$

The quantity $\gamma$ characterizes the width of the resonance region. In particular,
for exact resonance $\omega = \omega_0$, the cross section in the maximum is equal to

$$\sigma_{\max} = \frac{8\pi}{3}\,\frac{r_0^2\,\omega_0^2}{\gamma^2} . \tag{36.11}$$

Since $\gamma \ll \omega_0$, the cross section for the resonance frequency attains very
large values. This phenomenon plays an important role in the optics of material
media. It is called resonance fluorescence *.

The dispersion formula, obtained in the example of the scattering of radiation
by a harmonic oscillator, has in reality a very general character. It is of
the same form as the corresponding formula for the scattering of light by
atoms obtained in quantum mechanics (Part V). We shall see that in quantum
mechanics the region of applicability of the dispersion formula is not restricted
to the scattering of light, but also extends to a number of other systems.

* Comparing (36.10) with formula (III.4’) we see that for $\gamma \to 0$ the cross section
behaves as a $\delta$-function.

Let us consider the two relations obtained from (36.9) in the limiting cases
of low and high frequencies. For low frequencies $\omega \ll \omega_0$

$$\sigma \approx \frac{8\pi}{3}\, r_0^2\, \frac{\omega^4}{\omega_0^4}, \tag{36.12}$$

i.e. the cross section is proportional to the fourth power of the frequency or
inversely proportional to the fourth power of the wavelength of the incident
radiation.

This law of scattering is of a very general character. It is applicable when
the wavelength of the light scattered is large in comparison with the dimensions
of the scattering object. This case is often called Rayleigh scattering.

For high frequencies $\omega \gg \omega_0$ formula (36.9) is again simplified:

$$\sigma \approx \frac{8\pi}{3}\, r_0^2 = \frac{8\pi e^4}{3m^2c^4} . \tag{36.13}$$

This is called the Thomson formula.

The cross section turns out to be constant, depending neither on the frequency
of the radiation scattered nor on the properties of the oscillator. This
fact has a simple meaning. At high frequencies the force acting on the charge
due to the field is very large in comparison with the quasi-elastic force. The
electron scatters as a free particle. According to the Thomson formula, the
cross section $\sigma$ is a universal constant determined by the classical radius of the
electron. It is clear that the Thomson formula describes the scattering by
electrons belonging to any system, for example atoms, if one can disregard
the forces binding the electrons in the atoms and consider the electrons to be
free. The scattering of radiation by heavy nuclei can be disregarded, since the
cross section is inversely proportional to the square of the mass of the scatterer.
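The limiting cases of the dispersion formula (36.9) can be examined numerically. The Python sketch below is an illustration under stated assumptions: the natural frequency and the damping constant are arbitrary dimensionless example values, and the classical electron radius is entered as the known value 2.818 × 10⁻¹³ cm. The sketch evaluates (36.9) far below resonance, far above it and at exact resonance, and computes the Thomson cross section in cm².

```python
import math

R0_CM = 2.818e-13  # classical electron radius e^2/(m c^2) in cm

def sigma_total(omega, omega0, gamma, r0=R0_CM):
    """Dispersion formula (36.9), in the units of r0 squared."""
    return ((8.0 * math.pi / 3.0) * r0**2 * omega**4
            / ((omega0**2 - omega**2)**2 + omega**2 * gamma**2))

SIGMA_THOMSON = (8.0 * math.pi / 3.0) * R0_CM**2   # eq. (36.13), in cm^2

OMEGA0 = 1.0     # natural frequency, arbitrary units (assumed)
GAMMA = 1.0e-3   # damping constant, much smaller than OMEGA0 (assumed)

# Low frequencies: compare with (36.12), sigma_T * (omega/omega0)^4.
rayleigh_ratio = sigma_total(0.01, OMEGA0, GAMMA) / (SIGMA_THOMSON * 0.01**4)
# High frequencies: compare with the Thomson value (36.13).
thomson_ratio = sigma_total(100.0, OMEGA0, GAMMA) / SIGMA_THOMSON
# Exact resonance: (36.9) reduces to sigma_T * omega0^2 / gamma^2.
resonance_ratio = (sigma_total(OMEGA0, OMEGA0, GAMMA)
                   / (SIGMA_THOMSON * OMEGA0**2 / GAMMA**2))
```

With these numbers the Thomson cross section comes out close to 6.65 × 10⁻²⁵ cm², and the resonant cross section exceeds it by the factor ω_0²/γ² = 10⁶.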

The universal character of formula (36.13) makes it one of the fundamental
relations of classical electrodynamics. It has been put to a thorough experimental
test, the results of which are shown in fig. I.17.

Fig. I.17

We see that the ratio $\sigma_{exp}/\sigma_{Thom} = 1$ only for wavelengths larger than
about 2 Å.

For smaller wavelengths the classical treatment of scattering processes
turns out to be inapplicable. Here we encounter the fact mentioned in §28.
Although the limit of applicability of classical electrodynamics, inherent in
the theory, occurs at the scale of wavelengths $\lambda \sim r_0 \sim 10^{-5}$ Å, actually it
already appears at a scale $10^5$ times larger. This is associated, as we have
already stressed, with the manifestation of quantum effects. The trend of the
cross section in fig. I.17 is in excellent agreement with the predictions of the
quantum theory of radiation (Part V, the Klein-Nishina formula).


§37. Absorption of radiation 


In addition to scattering, the radiation interacting with matter undergoes 
absorption. In classical electrodynamics this effect can be calculated for the 
oscillator model. We shall restrict ourselves to the case where the frequency 
of the incident radiation is close to the resonance frequency, when the ab- 
sorption is greatest. However, the frequency of the radiation cannot then be 
assumed to be exactly equal to the natural frequency of the oscillator. From 
what follows it can be seen that in such a case the calculation would not lead 
to a well defined result. Hence it should be assumed that the incident radia- 
tion has a continuous frequency distribution. 

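We can expand the incident field in a Fourier integral, writing

$$\mathbf{E}(t) = \int_{-\infty}^{+\infty} \mathbf{E}(\omega)\, e^{i\omega t}\, d\omega .$$

Substituting this into the equation of motion of the oscillator (36.1) and expanding
the displacement of the latter in a Fourier integral,

$$\mathbf{r}(t) = \int_{-\infty}^{+\infty} \mathbf{r}(\omega)\, e^{i\omega t}\, d\omega,$$

it is easy to find that

$$\mathbf{r}(\omega) = \frac{e}{m}\,\frac{\mathbf{E}(\omega)}{\omega_0^2 - \omega^2 + i\omega\gamma} . \tag{37.1}$$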

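The energy loss is obviously equal to the total work done by the field on the
oscillator. The latter can be calculated from the formula

$$-\Delta\mathcal{E} = W = \int \mathbf{F}\cdot\mathbf{v}\, dt = e\int_{-\infty}^{+\infty} \mathbf{E}(t)\cdot\dot{\mathbf{r}}\, dt . \tag{37.2}$$

Proceeding in a way analogous to the derivation of the Parseval formula (II.9) we
find, using $\mathbf{E}^*(\omega) = \mathbf{E}(-\omega)$, $\mathbf{r}^*(\omega) = \mathbf{r}(-\omega)$ and eq. (37.1):

$$\begin{aligned}
W &= e\int_{-\infty}^{+\infty} dt \left[ \int_{-\infty}^{+\infty} i\omega\,\mathbf{r}(\omega)\, e^{i\omega t}\, d\omega \cdot \int_{-\infty}^{+\infty} \mathbf{E}(\omega')\, e^{i\omega' t}\, d\omega' \right] \\
&= 2\pi e \int_0^{\infty} i\omega\,[\mathbf{E}^*(\omega)\cdot\mathbf{r}(\omega) - \mathbf{E}(\omega)\cdot\mathbf{r}^*(\omega)]\, d\omega \\
&= \frac{4\pi e^2}{m} \int_0^{\infty} \frac{\gamma\,\omega^2\,|\mathbf{E}(\omega)|^2}{(\omega_0^2 - \omega^2)^2 + \omega^2\gamma^2}\, d\omega
\approx \frac{\pi e^2}{m} \int_0^{\infty} \frac{\gamma\,|\mathbf{E}(\omega)|^2\, d\omega}{(\omega_0 - \omega)^2 + (\tfrac{1}{2}\gamma)^2} .
\end{aligned} \tag{37.3}$$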

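The integrand has a sharp maximum in the region of resonance. We assume
that the spectral distribution of the radiation absorbed in the region of resonance
varies slowly in comparison with the resonance factor. Then the integral
can be calculated approximately. Substituting $\omega = \omega_0 + \tfrac{1}{2}\gamma x$ we get

$$\int_0^{\infty} \frac{\gamma\,|\mathbf{E}(\omega)|^2\, d\omega}{(\omega_0 - \omega)^2 + (\tfrac{1}{2}\gamma)^2}
= \int_{-2\omega_0/\gamma}^{\infty} \frac{\gamma\,|\mathbf{E}(\omega_0 + \tfrac{1}{2}\gamma x)|^2\, \tfrac{1}{2}\gamma\, dx}{(\tfrac{1}{2}\gamma x)^2 + (\tfrac{1}{2}\gamma)^2}
\approx 2\int_{-\infty}^{+\infty} |\mathbf{E}(\omega_0 + \tfrac{1}{2}\gamma x)|^2\, \frac{dx}{x^2 + 1}
\approx 2\pi\,|\mathbf{E}(\omega_0)|^2 . \tag{37.4}$$

We have replaced the lower limit of the integration by $-\infty$, since $\gamma \ll \omega_0$.
Substituting (37.4) into (37.3), we find

$$W = \frac{2\pi^2 e^2}{m}\,|\mathbf{E}(\omega_0)|^2 = 2\pi^2 c^2 r_0\,|\mathbf{E}(\omega_0)|^2 . \tag{37.5}$$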


The energy absorbed turns out to be independent of the physical properties
of the absorbing system, and depends only on the position of the resonance
frequency $\omega_0$. Hence the expression obtained is of a very general character
as in the case of the dispersion formula. Very similar expressions for absorption
are also obtained in the quantum theory of radiation (see Part V).
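The approximation made in evaluating the resonance integral can be tested numerically. In the Python sketch below the spectrum |E(ω)|² is an assumed Gaussian that varies slowly on the scale of γ; its width, the value of γ and the quadrature are all assumptions of the example. The integral over the Lorentzian factor is computed with the substitution ω = ω_0 + ½γ tan u, which turns the sharp resonance into a smooth integrand, and is compared with 2π|E(ω_0)|².

```python
import math

OMEGA0 = 1.0     # resonance frequency, arbitrary units (assumed)
GAMMA = 1.0e-3   # damping constant, much smaller than OMEGA0 (assumed)
WIDTH = 0.2      # width of the assumed spectrum, much larger than GAMMA

def spectrum(omega):
    """Assumed slowly varying |E(omega)|^2, equal to 1 at resonance."""
    return math.exp(-((omega - OMEGA0) / WIDTH)**2)

def resonance_integral(steps=50000):
    """Integral of gamma |E|^2 / ((omega0 - omega)^2 + (gamma/2)^2), omega > 0.

    With omega = omega0 + (gamma/2) tan(u) the Lorentzian factor times
    d(omega) becomes 2 du, so the midpoint rule in u resolves the peak.
    """
    u_min = math.atan(-2.0 * OMEGA0 / GAMMA)   # corresponds to omega = 0
    u_max = 0.5 * math.pi                      # corresponds to omega -> infinity
    h = (u_max - u_min) / steps
    total = 0.0
    for n in range(steps):
        u = u_min + (n + 0.5) * h
        total += 2.0 * spectrum(OMEGA0 + 0.5 * GAMMA * math.tan(u)) * h
    return total

integral = resonance_integral()
relative_deviation = abs(integral / (2.0 * math.pi * spectrum(OMEGA0)) - 1.0)
```

For these values the deviation from 2π|E(ω_0)|² is a fraction of a percent, of the order of the ratio of γ to the width of the spectrum, which is the accuracy one expects from the approximation.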


§38*. Canonical form of the field equations 


In passing over to the quantum theory of the electromagnetic field it is 
convenient to give to the electromagnetic field equations a form very similar 
to that of the equations of mechanics. It turns out that the electromagnetic 
field can be compared with a certain mechanical system, and that D’Alem- 
bert’s equations can take the form of Hamilton’s equations describing the 
motion of this mechanical system. We shall confine ourselves to the case of 
the electromagnetic field in vacuum. 

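Making use of the Coulomb potential gauge described in §11, for which
$\varphi = 0$, we write the field equations in the form

$$\nabla^2\mathbf{A} - \frac{1}{c^2}\frac{\partial^2\mathbf{A}}{\partial t^2} = 0, \tag{38.1}$$

$$\nabla\cdot\mathbf{A} = 0. \tag{38.2}$$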


The region in which these field equations are valid is the space which is free 
of charges and currents. At the limit of this region a boundary condition 
must be given. Such a condition may take, for example, the expression for A 
in the form of a retarded potential. However, we shall not be interested in the 
amplitude of electromagnetic waves, and shall confine ourselves to the con- 
sideration of the field in vacuum at a large distance from currents and 
charges. We divide the entire region of space which is free of charge into a set 
of cubes (we shall call them the normalization cubes) of edge L, and consider 
the field inside one of the cubes. Then the boundary condition must be given 
at the surface of the normalization cube. 







As we know, the electromagnetic field in a vacuum represents a set of 
travelling waves. The solution in the form of travelling waves is obtained if it 
is assumed that the vector potential A and its derivatives have equal values at 
opposite faces of the cube. This is equivalent to the requirement: A is a 
periodic function of the variables x, y, z with a period L: 


A(x, y,z)= A(x + L, y + L,z + L). (38.3) 


Having obtained the solution of eqs. (38.1) and (38.2) in a normalization 
cube with edge L with the boundary condition (38.3), we can write the solu- 
tion in the entire space by simply repeating the solution in the initial normali- 
zation cube. The final results will not depend on the choice of L. 

We look for the solution of (38.1) and (38.2) in a normalization cube in
the form of a set of expressions of the form $q_i(t)\,\mathbf{A}_i(\mathbf{r})$, where $q_i(t)$ depends
only on time, and $\mathbf{A}_i(\mathbf{r})$ depends only on coordinates.

Each of the vector functions $\mathbf{A}_i(\mathbf{r})$ represents a wave in the normalization
cube. Since the volume of the latter is finite, it can contain a denumerably
infinite number of standing or travelling waves. Thus, we set

$$\mathbf{A}(\mathbf{r}, t) = \sum_i q_i(t)\,\mathbf{A}_i(\mathbf{r}), \tag{38.4}$$

where the index $i$ runs over an infinite but discrete number of values. This
means that the vector potential of the field in the normalization cube can be
expanded in a Fourier series. The number of terms in the sum (38.4), i.e. the
number of waves in the normalization cube, is infinitely large.

Substituting (38.4) into (38.1), we find

$$\sum_i \left( q_i(t)\,\nabla^2\mathbf{A}_i(\mathbf{r}) - \frac{1}{c^2}\,\ddot{q}_i(t)\,\mathbf{A}_i(\mathbf{r}) \right) = 0. \tag{38.5}$$

Since all waves forming the superposition (38.4) are independent, the equality
(38.5) must hold for each of the waves, i.e.

$$q_i(t)\,\nabla^2\mathbf{A}_i(\mathbf{r}) - \frac{1}{c^2}\,\ddot{q}_i(t)\,\mathbf{A}_i(\mathbf{r}) = 0. \tag{38.6}$$

We rewrite (38.6) in the form

$$\frac{\ddot{q}_i(t)}{q_i(t)} = c^2\,\frac{[\nabla^2\mathbf{A}_i(\mathbf{r})]_x}{[\mathbf{A}_i(\mathbf{r})]_x} . \tag{38.7}$$


Since $[\mathbf{A}_i(\mathbf{r})]_x$ and $q_i(t)$ are functions of different variables, the equality
(38.7) can hold only if its right-hand side and left-hand side are separately
equal to a constant, called the separation constant. For reasons which will be
clear from what follows, this constant must be an essentially negative quantity.
Denoting it by $(-\omega_i^2)$, we find

$$\nabla^2\mathbf{A}_i(\mathbf{r}) + \frac{\omega_i^2}{c^2}\,\mathbf{A}_i(\mathbf{r}) = 0, \tag{38.8}$$

$$\ddot{q}_i(t) + \omega_i^2\, q_i(t) = 0. \tag{38.9}$$

Expressions of the type

$$\mathbf{A}_i \sim \mathbf{e}_j \sin \mathbf{k}_i\cdot\mathbf{r}, \qquad \mathbf{A}_i \sim \mathbf{e}_j \cos \mathbf{k}_i\cdot\mathbf{r} \tag{38.10}$$

serve as the solutions of eq. (38.8) satisfying the conditions of periodicity
(38.3).
The components of the vector $\mathbf{k}_i$ must assume a discrete number of values

$$k_{ix} = \frac{2\pi n_{i1}}{L}, \qquad k_{iy} = \frac{2\pi n_{i2}}{L}, \qquad k_{iz} = \frac{2\pi n_{i3}}{L}, \tag{38.11}$$

where $n_{i1}$, $n_{i2}$ and $n_{i3}$ are positive integers. The whole set of such values of the
components of $\mathbf{k}_i$ ensures the presence of nodes or antinodes at the faces of
the normalization cube. The absolute value of the vector $\mathbf{k}_i$ is by definition
equal to $|\mathbf{k}_i| = \omega_i/c$. The vector $\mathbf{e}_j$ is the unit polarization vector, which can
assume two values ($j = 1, 2$) for a given $\mathbf{k}_i$.

Instead of sines and cosines in formula (38.10) we could take their arbitrary
linear combination.

Eq. (38.2) leads to the requirement

$$\mathbf{e}_j\cdot\mathbf{k}_i = 0, \tag{38.12}$$

meaning (see §33) that the waves are transverse.
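The discrete set of wave vectors and the transversality condition can be made concrete with a short Python sketch. It is an illustration only: the edge L, the units in which c = 1, the cut-off on the integers and all function names are assumptions of the example. The sketch lists the wave vectors allowed by (38.11), computes the corresponding frequencies, and constructs for each wave vector two unit polarization vectors satisfying (38.12).

```python
import itertools
import math

def allowed_wave_vectors(edge, n_max):
    """All k = (2*pi/L)(n1, n2, n3) with 1 <= n <= n_max, as in (38.11)."""
    step = 2.0 * math.pi / edge
    return [(step * n1, step * n2, step * n3)
            for n1, n2, n3 in itertools.product(range(1, n_max + 1), repeat=3)]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def norm(v):
    return math.sqrt(dot(v, v))

def normalised(v):
    length = norm(v)
    return tuple(x / length for x in v)

def polarisation_vectors(k):
    """Two orthogonal unit vectors e_1, e_2 with e_j . k = 0, eq. (38.12)."""
    # Pick a helper axis that is certainly not parallel to k.
    helper = (1.0, 0.0, 0.0) if abs(k[0]) < 0.9 * norm(k) else (0.0, 1.0, 0.0)
    e1 = normalised(cross(k, helper))
    e2 = normalised(cross(k, e1))
    return e1, e2

EDGE = 1.0   # edge L of the normalization cube (assumed)
C = 1.0      # speed of light in the assumed units

modes = allowed_wave_vectors(EDGE, 3)
frequencies = sorted(C * norm(k) for k in modes)   # omega_i = c |k_i|
```

The choice of the pair of polarization vectors is not unique; any rotation of the pair about the wave vector would serve equally well.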

The whole set of functions $\mathbf{A}_i$ is given completely by eqs. (38.1) and
(38.2) and the boundary condition (38.3). It is the same for any
field in the normalization cube.

For the actual determination of the vector potential at a given point of
space at a definite instant of time it is necessary to give the set of all time
amplitudes $q_i(t)$. This means that the state of the field is characterized by
defining an infinite set of amplitudes $q_i(t)$.

The latter are determined by eq. (38.9), which is the same as the equation
of motion of the linear harmonic oscillator. The quantity $\omega_i$ represents the
frequency of this oscillator. The value of $q_i(t)$ determines the state of the ith
oscillator at any moment $t$. The definition of the whole set of the variable
amplitudes $q_i(t)$ is equivalent to the definition of the state of a whole set of
an infinitely large number of oscillators with frequencies $\omega_i$. If the state of all
the oscillators at a given moment $t$ is known, then the field is also known at
this moment. Thus, the electromagnetic field can formally be replaced by a
mechanical system with an infinitely large number of degrees of freedom, namely a
set of an infinitely large number of oscillators usually called field oscillators.
The whole set of states of the field oscillators $q_i(t)$ characterizes the state of
this mechanical system and, at the same time, the state of the field.

The quantities $q_i(t)$ can be considered as the set of coordinates of a mechanical
system whose equations of motion can be written in the form of
Hamilton’s equations. One can ascribe to this system the Hamiltonian function
(which is the same as its energy):

$$H = \tfrac{1}{2}\sum_i (p_i^2 + \omega_i^2 q_i^2), \tag{38.13}$$

where $p_i$ is the momentum conjugate to the coordinate $q_i$. The mass of all
field oscillators is assumed to be equal to unity.

Indeed, Hamilton’s equations for the ith degree of freedom (i.e. the ith
oscillator) read:

$$\dot{q}_i = \partial H/\partial p_i = p_i, \qquad \dot{p}_i = -\partial H/\partial q_i = -\omega_i^2 q_i,$$

whence the eq. (38.9) for the quantity $q_i$ follows directly.
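Hamilton's equations for a single field oscillator can be integrated numerically and compared with the harmonic solution of (38.9). The Python sketch below is an illustration: the frequency, the initial values and the time step are assumed numbers, and the leapfrog scheme is a choice made for the example because it keeps the energy nearly constant.

```python
import math

def hamiltonian(q, p, omega):
    """One term of (38.13): (p^2 + omega^2 q^2) / 2."""
    return 0.5 * (p * p + omega * omega * q * q)

def evolve(q, p, omega, dt, steps):
    """Leapfrog integration of dq/dt = p, dp/dt = -omega^2 q."""
    for _ in range(steps):
        p -= 0.5 * dt * omega**2 * q   # half step in momentum
        q += dt * p                    # full step in coordinate
        p -= 0.5 * dt * omega**2 * q   # second half step in momentum
    return q, p

OMEGA = 2.0          # oscillator frequency (assumed)
Q0, P0 = 1.0, 0.5    # initial coordinate and momentum (assumed)
DT, STEPS = 1.0e-3, 1000

q_end, p_end = evolve(Q0, P0, OMEGA, DT, STEPS)
t_end = DT * STEPS

# Solution of (38.9) with the same initial data.
q_exact = Q0 * math.cos(OMEGA * t_end) + (P0 / OMEGA) * math.sin(OMEGA * t_end)
energy_drift = abs(hamiltonian(q_end, p_end, OMEGA) - hamiltonian(Q0, P0, OMEGA))
```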


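A particular solution of the field eqs. (38.1) and (38.2) can be written
in the form

$$\mathbf{A}_i \sim C_{1i}\,\mathbf{e}_j \cos\omega_i t\, \sin\mathbf{k}_i\cdot\mathbf{r} + C_{2i}\,\mathbf{e}_j \sin\omega_i t\, \cos\mathbf{k}_i\cdot\mathbf{r}. \tag{38.14}$$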

If one chooses properly the normalization of the linear combination of
solutions of the type (38.14), from which the general solution (38.4) of the
field equations (38.1) and (38.2) is formed, one can make the energy of the
field in the normalization cube equal to the energy of the system of field
oscillators (38.13). For such a normalization of the vector potential we can
assume that the system of field oscillators is completely equivalent to the
electromagnetic field. The state of the set of oscillators is uniquely related to
the state of the field.

The necessary linear combination of solutions of the type (38.14) has the
following form:

$$\begin{aligned}
\mathbf{A} &= \sqrt{\frac{4\pi c^2}{L^3}} \sum_i \sum_{j=1}^{2} \mathbf{e}_j\, \frac{1}{\omega_i} \left( -\dot{q}_i(t) \sin\mathbf{k}_i\cdot\mathbf{r} + \omega_i q_i \cos\mathbf{k}_i\cdot\mathbf{r} \right) \\
&= \sqrt{\frac{4\pi c^2}{L^3}} \sum_i \sum_{j=1}^{2} \mathbf{e}_j\, \frac{1}{\omega_i} \left( -p_i \sin\mathbf{k}_i\cdot\mathbf{r} + \omega_i q_i \cos\mathbf{k}_i\cdot\mathbf{r} \right).
\end{aligned} \tag{38.15}$$

Indeed, we shall verify the fact that the energy of the field in the normalization
cube satisfies the requirement

$$\int \frac{E^2 + H^2}{8\pi}\, dV = \frac{1}{2} \sum_i \sum_{j=1}^{2} (p_i^2 + \omega_i^2 q_i^2). \tag{38.16}$$

Oscillators with different polarizations are considered to be different. The
total number of field oscillators is obtained by summing over i and j. This
amounts to doubling the result.

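From (38.15) it follows that

$$\begin{aligned}
\mathbf{E} = -\frac{1}{c}\frac{\partial\mathbf{A}}{\partial t}
&= -\sqrt{\frac{4\pi}{L^3}} \sum_i \sum_{j=1}^{2} \mathbf{e}_j\, \frac{1}{\omega_i} \left( -\ddot{q}_i \sin\mathbf{k}_i\cdot\mathbf{r} + \omega_i \dot{q}_i \cos\mathbf{k}_i\cdot\mathbf{r} \right) \\
&= -\sqrt{\frac{4\pi}{L^3}} \sum_i \sum_{j=1}^{2} \mathbf{e}_j\, \frac{1}{\omega_i} \left( \omega_i^2 q_i \sin\mathbf{k}_i\cdot\mathbf{r} + \omega_i p_i \cos\mathbf{k}_i\cdot\mathbf{r} \right) \\
&= -\sqrt{\frac{4\pi}{L^3}} \sum_i \sum_{j=1}^{2} \mathbf{e}_j \left( \omega_i q_i \sin\mathbf{k}_i\cdot\mathbf{r} + p_i \cos\mathbf{k}_i\cdot\mathbf{r} \right)
= \sum_i \sum_j \mathbf{E}_{ij},
\end{aligned} \tag{38.17}$$

$$\begin{aligned}
\mathbf{H} = \nabla\times\mathbf{A}
&= \sqrt{\frac{4\pi c^2}{L^3}} \sum_i \sum_{j=1}^{2} \frac{1}{\omega_i} \left\{ -p_i\, \nabla\times(\mathbf{e}_j \sin\mathbf{k}_i\cdot\mathbf{r}) + \omega_i q_i\, \nabla\times(\mathbf{e}_j \cos\mathbf{k}_i\cdot\mathbf{r}) \right\} \\
&= -\sqrt{\frac{4\pi c^2}{L^3}} \sum_i \sum_{j=1}^{2} \frac{1}{\omega_i} \left( p_i\, \mathbf{k}_i\times\mathbf{e}_j \cos\mathbf{k}_i\cdot\mathbf{r} + \omega_i q_i\, \mathbf{k}_i\times\mathbf{e}_j \sin\mathbf{k}_i\cdot\mathbf{r} \right) \\
&= -\sqrt{\frac{4\pi}{L^3}} \sum_i \sum_{j=1}^{2} \frac{\mathbf{k}_i\times\mathbf{e}_j}{k_i} \left( p_i \cos\mathbf{k}_i\cdot\mathbf{r} + \omega_i q_i \sin\mathbf{k}_i\cdot\mathbf{r} \right)
= \sum_i \sum_j \frac{\mathbf{k}_i\times\mathbf{E}_{ij}}{k_i} .
\end{aligned} \tag{38.18}$$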
i i 





Let us work out the expression for the integrals (1/87) f E2 dV and 
(1/87) f H2 dV. From (38.17) it follows that 


1 2 reas | 3 $ 7 2 
JE W= 5 SÈD Deensk, r+p,cosk,-r)} dV 
Gy 
In calculating the integrals we make use of the obvious relations 
fein kpr sinky “rdV= {sin kpr cos k»-rdV= 
= foosk,-r cos kpr dV = 
= [sin kyr cos k,rdV=0. 
J sin? k,-1 dV = focos? k,-rdV=2V=423 
i i 2 DA 


Hence it is easy to obtain the expression 
1 2ay=1 Dy} DD 
mde d= 4 27 20; torai) 
iJ 


Quite analogously one obtains from (38.18): 


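$$\frac{1}{8\pi}\int H^2\,dV = \frac{1}{8\pi}\,\frac{4\pi c^2}{L^3}\int \Big\{\sum_i \sum_{j=1}^{2} \frac{1}{\omega_i}\left(p_i\, \mathbf{k}_i\times\mathbf{e}_j \cos \mathbf{k}_i\cdot\mathbf{r} + \omega_i q_i\, \mathbf{k}_i\times\mathbf{e}_j \sin \mathbf{k}_i\cdot\mathbf{r}\right)\Big\}^2 dV =$$

$$= \frac{1}{4}\sum_i \sum_{j=1}^{2}\left(p_i^2 + \omega_i^2 q_i^2\right).$$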


Adding the two expressions, we arrive at the equality (38.16). 

For the normalization method which we have chosen (i.e. the selection
of the factors before $\sin \mathbf{k}_i\cdot\mathbf{r}$ and $\cos \mathbf{k}_i\cdot\mathbf{r}$) the field in the
normalization volume can be compared with a set of field oscillators whose
energy is the same as that of the field. The state of the oscillators, i.e. the
set of the quantities $q_i(t)$ at a moment $t$, determines the value of the vector
potential $\mathbf{A}(\mathbf{r}, t)$.

Thus, the electromagnetic field in a finite volume is formally equivalent to 
a mechanical system with an infinitely large but denumerable number of 
degrees of freedom — a set of field oscillators — while the field equations are 
formally equivalent to Hamilton’s equations of motion of the field oscilla- 
tors. The Hamiltonian function of the equivalent system of oscillators is often 
called simply the Hamiltonian field function, while the expansion of the vec- 
tor potential (38.15) is called the expansion of the field into oscillators. It 
should be stressed that within the framework of classical electrodynamics the 
expansion of the field into oscillators is a mathematical device. Field oscilla- 
tors cannot be associated with the oscillations of any real particles. However, 
this expansion plays a most important role in the quantum theory of the 
electromagnetic field (see Part V). 

In the quantum theory of the electromagnetic field the expansion (38.15) 
is often expressed in terms of exponential functions. Later, in Part V, we 
shall need such a representation. We write the expression for the vector 
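potential $\mathbf{A}(\mathbf{r}, t)$ in the form

$$\mathbf{A}(\mathbf{r}, t) = \sum_\lambda \left(b_\lambda \mathbf{A}_\lambda + b_\lambda^* \mathbf{A}_\lambda^*\right). \tag{38.19}$$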


Here the following notations are introduced: 
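$$\mathbf{A}_\lambda = \mathbf{e}_\lambda \sqrt{\frac{4\pi c^2}{L^3}}\; e^{i\mathbf{k}_\lambda\cdot\mathbf{r}}, \qquad \mathbf{A}_\lambda^* = \mathbf{e}_\lambda \sqrt{\frac{4\pi c^2}{L^3}}\; e^{-i\mathbf{k}_\lambda\cdot\mathbf{r}}.$$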


The index $\lambda$, over which the summation in (38.19) is carried out, replaces the
indices $i$ and $j$ in (38.15), i.e. it runs over a double (in comparison with $i$)
series of values, corresponding to the two directions of the polarization vector
($\mathbf{e}_1$ and $\mathbf{e}_2$). The term $b_\lambda \mathbf{A}_\lambda$ characterizes the wave propagating (travelling) in
the positive direction of the vector $\mathbf{k}_\lambda$. The second term $b_\lambda^* \mathbf{A}_\lambda^*$ represents the
wave propagating in the opposite direction ($-\mathbf{k}_\lambda$). Thus, the waves having the
wave vectors ($\mathbf{k}_\lambda$) and ($-\mathbf{k}_\lambda$) are considered to be different waves. The
components of the vectors ($\mathbf{k}_\lambda$) and ($-\mathbf{k}_\lambda$) assume the values given by (38.11),
but with both positive and negative values of the integers $n_{i1}$, $n_{i2}$, $n_{i3}$. This
means that to one standing wave there correspond two travelling waves, one in
the positive and one in the negative direction.

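Substituting $\mathbf{A}_\lambda$ and $\mathbf{A}_\lambda^*$ into (38.19), we rewrite this formula in the form

$$\mathbf{A}(\mathbf{r}, t) = \sqrt{\frac{4\pi c^2}{L^3}} \sum_\lambda \mathbf{e}_\lambda \left(b_\lambda e^{i\mathbf{k}_\lambda\cdot\mathbf{r}} + b_\lambda^* e^{-i\mathbf{k}_\lambda\cdot\mathbf{r}}\right).$$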


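Comparing the above expression with (38.15) and equating coefficients of
$e^{i\mathbf{k}_\lambda\cdot\mathbf{r}}$ and $e^{-i\mathbf{k}_\lambda\cdot\mathbf{r}}$, we find

$$b_\lambda = \frac{1}{2\omega_\lambda}\left(i p_\lambda + \omega_\lambda q_\lambda\right), \qquad b_\lambda^* = \frac{1}{2\omega_\lambda}\left(-i p_\lambda + \omega_\lambda q_\lambda\right). \tag{38.20}$$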


In this notation the field energy is written in the form 


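$$\mathscr{E} = \frac{1}{2}\sum_i \sum_{j=1}^{2}\left(p_i^2 + \omega_i^2 q_i^2\right) = \frac{1}{2}\sum_\lambda \left(p_\lambda^2 + \omega_\lambda^2 q_\lambda^2\right) = 2\sum_\lambda b_\lambda b_\lambda^* \omega_\lambda^2. \tag{38.21}$$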
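The relation between the real variables $(p_\lambda, q_\lambda)$ and the complex amplitude $b_\lambda$ can be checked numerically. The following is only a minimal sketch: it assumes the form $b_\lambda = (ip_\lambda + \omega_\lambda q_\lambda)/2\omega_\lambda$ for (38.20), drops the common factor $\sqrt{4\pi c^2/L^3}$, and uses function names that are purely illustrative.

```python
import cmath
import math


def mode_amplitude(p: float, q: float, omega: float) -> complex:
    """Complex amplitude b = (i p + omega q) / (2 omega), assumed form of (38.20)."""
    return (1j * p + omega * q) / (2.0 * omega)


def oscillator_energy(p: float, q: float, omega: float) -> float:
    """Energy of one field oscillator, (p^2 + omega^2 q^2) / 2."""
    return 0.5 * (p * p + omega * omega * q * q)


def energy_from_amplitude(b: complex, omega: float) -> float:
    """The same energy written as 2 b b* omega^2, cf. (38.21)."""
    return 2.0 * (b * b.conjugate()).real * omega * omega


def standing_wave_factor(p: float, q: float, omega: float, phase: float) -> float:
    """(1/omega) (-p sin(phase) + omega q cos(phase)), with phase = k.r."""
    return (-p * math.sin(phase) + omega * q * math.cos(phase)) / omega


def travelling_wave_factor(p: float, q: float, omega: float, phase: float) -> float:
    """b exp(i phase) + b* exp(-i phase); real by construction."""
    b = mode_amplitude(p, q, omega)
    return (b * cmath.exp(1j * phase) + b.conjugate() * cmath.exp(-1j * phase)).real
```

With these definitions the two ways of writing the energy, and the two ways of writing the spatial dependence of one mode, agree for arbitrary values of $p$, $q$, $\omega$ and $\mathbf{k}\cdot\mathbf{r}$.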


Later we shall also need to calculate the number of field oscillators with
given frequency and given polarization. It is obviously equal to the number of
travelling waves in the volume $V = L^3$. In order to find the number of
oscillators it is convenient to make use of a simple geometrical construction.
We choose $n_1$, $n_2$ and $n_3$ in formula (38.11) to be the coordinates in an
imaginary space of the numbers $(n_1, n_2, n_3)$, omitting the suffixes $i$ for clarity.

Fig. I.18 presents a part of this space. To each possible set of values of
$n_1$, $n_2$ and $n_3$ in this drawing there corresponds a point. We introduce the quantity

$$n^2 = n_1^2 + n_2^2 + n_3^2.$$










Fig. 1.18 


If the numbers $n_1$, $n_2$, $n_3$ are sufficiently large, then the points representing
them lie very close to each other and fill the entire space in an almost
continuous way. The quantity $n$, as a function of $n_1$, $n_2$, $n_3$, will vary almost
continuously, and in fig. I.18 will be represented by a radius vector.

Since to each set of three numbers $n_1$, $n_2$, $n_3$ there corresponds a definite
value of $k$ given by

$$k = |\mathbf{k}| = \frac{2\pi}{L}\sqrt{n_1^2 + n_2^2 + n_3^2} = n\,\frac{2\pi}{L},$$


the number of waves with a $k$ lying in the interval between $k$ and $k + dk$ is
equal to the number of the values of $n$ lying in the interval between $n$ and
$n + dn$. The latter is equal to the number of representative points within a
spherical layer lying between spherical surfaces with radii $n$ and $n + dn$. For this
number we have, obviously, the value

$$g(n)\,dn = 4\pi n^2\,dn.$$


Thus, the number of travelling waves or the number of field oscillators
with $|\mathbf{k}|$ lying in the interval $k$, $k + dk$ and with a given polarization in a
volume $V = L^3$ is equal to

$$g(k)\,dk = 4\pi n^2\,dn = \frac{4\pi k^2\,dk\,L^3}{(2\pi)^3}.$$





The number of oscillators with a given polarization and a frequency lying in
the interval $\omega$, $\omega + d\omega$ is

$$g(\omega)\,d\omega = \frac{4\pi \omega^2\,d\omega\,L^3}{(2\pi c)^3}. \tag{38.22}$$
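The continuum estimate $g(n)\,dn = 4\pi n^2\,dn$ can be compared with a direct count of the integer points $(n_1, n_2, n_3)$ in a spherical layer. The sketch below is illustrative only; the function names and the choice of shell are assumptions made here, not part of the text.

```python
import math


def count_lattice_points_in_shell(n_lo: float, n_hi: float) -> int:
    """Count integer triples (n1, n2, n3) with n_lo <= sqrt(n1^2+n2^2+n3^2) < n_hi."""
    limit = int(math.ceil(n_hi))
    count = 0
    for n1 in range(-limit, limit + 1):
        for n2 in range(-limit, limit + 1):
            for n3 in range(-limit, limit + 1):
                n = math.sqrt(n1 * n1 + n2 * n2 + n3 * n3)
                if n_lo <= n < n_hi:
                    count += 1
    return count


def shell_estimate(n: float, dn: float) -> float:
    """Continuum estimate g(n) dn = 4 pi n^2 dn."""
    return 4.0 * math.pi * n * n * dn


def modes_per_wavenumber(k: float, dk: float, L: float) -> float:
    """g(k) dk = 4 pi k^2 dk L^3 / (2 pi)^3 for one polarization."""
    return 4.0 * math.pi * k * k * dk * L ** 3 / (2.0 * math.pi) ** 3


def modes_per_frequency(omega: float, d_omega: float, L: float, c: float) -> float:
    """g(omega) d omega = 4 pi omega^2 d omega L^3 / (2 pi c)^3, formula (38.22)."""
    return 4.0 * math.pi * omega * omega * d_omega * L ** 3 / (2.0 * math.pi * c) ** 3
```

For a layer between $n = 20$ and $n = 21$ the direct count and the estimate evaluated at the middle of the layer differ by only a few per cent, and the difference shrinks as $n$ grows, which is the sense in which the points fill the space "almost continuously".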


Sometimes one has to make use of the formula for the number of field
oscillators with given frequency, given polarization and a direction of the
vector $\mathbf{k}$ lying within a solid angle $d\Omega$. The number of such oscillators is equal to

$$g(\omega)\,d\omega\,d\Omega = \frac{\omega^2\,d\omega\,L^3\,d\Omega}{(2\pi c)^3}. \tag{38.23}$$


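We shall need the expression for the momentum of radiation in a volume
$L^3$. Writing it on the basis of (13.11), (38.17) and (38.18) in the form

$$\mathbf{G} = \frac{1}{4\pi c}\int \mathbf{E}\times\mathbf{H}\,dV = \frac{1}{4\pi c}\int \sum_\lambda \mathbf{E}_\lambda\times\frac{\mathbf{k}_\lambda\times\mathbf{E}_\lambda}{k_\lambda}\,dV = \sum_\lambda \frac{\mathbf{k}_\lambda}{c k_\lambda}\,\frac{1}{4\pi}\int E_\lambda^2\,dV = \sum_\lambda \frac{\mathbf{k}_\lambda}{c k_\lambda}\,\varepsilon_\lambda,$$

we can ascribe to each oscillator an energy

$$\varepsilon_\lambda = 2 b_\lambda b_\lambda^* \omega_\lambda^2 = \tfrac{1}{2}\left(p_\lambda^2 + \omega_\lambda^2 q_\lambda^2\right), \tag{38.24}$$

and a momentum

$$\mathbf{p}_\lambda = \frac{\mathbf{k}_\lambda \varepsilon_\lambda}{c k_\lambda} = \mathbf{g}_\lambda, \tag{38.25}$$

so that

$$\mathscr{E} = \sum_\lambda \varepsilon_\lambda, \qquad \mathbf{G} = \sum_\lambda \mathbf{p}_\lambda.$$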





The Motion of Particles in Electromagnetic Fields


§39. The motion of charged particles in constant electric and magnetic fields 


One of the branches of electrodynamics which is important from the prac- 
tical point of view is the theory of motion of charged particles in electric and 
magnetic fields. The theory of the motion of charged particles in electro- 
magnetic fields is at the basis of electronics, accelerator technology, electron 
and proton microscopy, mass-spectrography, investigation of reactions in 
plasma, and experimental facilities for the investigation of thermonuclear 
phenomena. It is very important for a number of other fields of physics — 
astrophysics, physics of cosmic rays and so on. 

In this book we shall restrict ourselves to the consideration of the simplest 
problems. We shall assume that the field in which a particle is placed has a 
strength which is very large compared with that of the particle itself. In other 
words, we consider the particle whose motion interests us as a test particle 
which does not distort the given external field. 

We begin with the motion of a charged particle in a uniform electric field
constant in time. The equations of motion have the form

$$m\frac{d^2\mathbf{r}}{dt^2} = e\mathbf{E}.$$

If at the initial instant $t = 0$ the charged particle was at rest, then, taking the
direction of the field along the x-axis, we have

$$m\ddot{x} = eE, \qquad m\ddot{y} = 0;$$

it can be shown that

$$v = \dot{x} = \pm\sqrt{2eV/m + v_0^2} = \pm\sqrt{2eV/m}, \tag{39.1}$$


where $V = (\varphi_2 - \varphi_1)$ is the accelerating (or decelerating) potential difference
traversed by the particle. Here it is assumed that $v_0 = 0$.

If the particle at the initial instant was at the origin and had a velocity $v_0$
directed at an angle $\theta$ to the y-axis, then the double integration of the
equations of motion gives:

$$\dot{x} = \frac{etE}{m} + v_0 \sin\theta, \qquad x = \frac{et^2E}{2m} + (v_0 \sin\theta)\,t,$$

$$\dot{y} = v_0 \cos\theta, \qquad y = (v_0 \cos\theta)\,t.$$


Eliminating $t$ from the expressions written for $y$ and $x$, we find the equation
of the trajectory:

$$x = (\tan\theta)\,y + \frac{eEy^2}{2m\,(v_0 \cos\theta)^2}. \tag{39.2}$$


As was to be expected, the particle moves in a parabola. 
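The elimination of $t$ can be verified numerically: points generated from the parametric solution should lie on the parabola (39.2). This is a small sketch under the same assumptions as the text (field along x, particle starting at the origin, $\theta$ measured from the y-axis); the function names are illustrative.

```python
import math


def position(t: float, e: float, E: float, m: float, v0: float, theta: float):
    """Parametric solution x(t), y(t) for a uniform field E along the x-axis."""
    x = e * E * t * t / (2.0 * m) + v0 * math.sin(theta) * t
    y = v0 * math.cos(theta) * t
    return x, y


def trajectory_x(y: float, e: float, E: float, m: float, v0: float, theta: float) -> float:
    """Trajectory (39.2): x as a function of y after eliminating the time."""
    return math.tan(theta) * y + e * E * y * y / (2.0 * m * (v0 * math.cos(theta)) ** 2)
```

Evaluating `position` at several times and feeding the resulting `y` into `trajectory_x` reproduces the corresponding `x`, for any consistent set of units.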


Let us now consider the motion of a particle in a constant uniform magnetic
field. We orientate the z-axis along the direction of the field. The equations
of motion have the form

$$m\ddot{\mathbf{r}} = \frac{e}{c}\,\mathbf{v}\times\mathbf{H}, \tag{39.3}$$

or, taking the components,

$$\ddot{x} = \frac{e}{mc}\,\dot{y}H, \tag{39.4}$$

$$\ddot{y} = -\frac{e}{mc}\,\dot{x}H, \tag{39.5}$$

$$\ddot{z} = 0. \tag{39.6}$$


Eq. (39.6) means that the magnetic field directed along the z-axis has no
effect on the component of the motion of a particle in this direction. We look
for the solution of eqs. (39.4) and (39.5) in the form

$$\dot{x} = A\cos(\omega_c t + \alpha), \qquad \dot{y} = B\sin(\omega_c t + \alpha).$$

We then have

$$-A\omega_c = \frac{eH}{mc}\,B, \qquad B\omega_c = -\frac{eH}{mc}\,A,$$

from which we find

$$\omega_c = eH/mc, \qquad A = -B. \tag{39.7}$$


Thus, we can write 





$$\dot{x} = A\cos(\omega_c t + \alpha) = A\cos\left(\frac{eH}{mc}\,t + \alpha\right),$$

$$\dot{y} = -A\sin(\omega_c t + \alpha) = -A\sin\left(\frac{eH}{mc}\,t + \alpha\right). \tag{39.8}$$

It is obvious that

$$A = \sqrt{\dot{x}^2 + \dot{y}^2} = v_\perp^{(0)},$$

where $v_\perp^{(0)}$ is the initial velocity in the plane (xy).
Integrating the expressions obtained once more, we have

$$x = x_0 + \frac{v_\perp^{(0)}}{\omega_c}\sin(\omega_c t + \alpha), \qquad y = y_0 + \frac{v_\perp^{(0)}}{\omega_c}\cos(\omega_c t + \alpha).$$

Eliminating the time from the above relations, we find that the particle moves
in a circle

$$(x - x_0)^2 + (y - y_0)^2 = \left(\frac{v_\perp^{(0)}}{\omega_c}\right)^2 = R_c^2.$$


The frequency of rotation of the particle, given by formula (39.7), is called
the cyclotron frequency. The cyclotron frequency, equal to double the
Larmor frequency, does not depend on the initial velocity of the particle and
is determined by the ratio $e/m$. The radius of the circle

$$R_c = \frac{v_\perp^{(0)}}{\omega_c} = \frac{mc\,v_\perp^{(0)}}{eH} \tag{39.9}$$

has the following simple meaning: on a circle of radius $R_c$ the centrifugal
force and the Lorentz force are balanced,

$$\frac{m\,(v_\perp^{(0)})^2}{R_c} = \frac{e}{c}\,v_\perp^{(0)} H.$$
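Formulas (39.7) and (39.9) are easy to evaluate. The sketch below works in Gaussian units, as the text does; the numerical constants for the electron are rounded standard values supplied here for illustration, and the function names are assumptions.

```python
import math

# Rounded Gaussian (CGS) values, assumed here for illustration only.
E_CHARGE = 4.803e-10    # elementary charge, esu
M_ELECTRON = 9.109e-28  # electron mass, g
C_LIGHT = 2.998e10      # speed of light, cm/s


def cyclotron_frequency(e: float, H: float, m: float, c: float) -> float:
    """omega_c = e H / (m c), formula (39.7)."""
    return e * H / (m * c)


def cyclotron_radius(v_perp: float, e: float, H: float, m: float, c: float) -> float:
    """R_c = m c v_perp / (e H), formula (39.9)."""
    return m * c * v_perp / (e * H)
```

For an electron in a field of $10^4$ gauss this gives a cyclotron frequency of roughly $1.76\times10^{11}$ rad/s, independent of the velocity, while the radius scales linearly with $v_\perp^{(0)}$.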


If at the initial moment the particle had a velocity component $v_\parallel^{(0)}$ along
the z-axis in addition to its velocity in the plane (xy), then it would move
uniformly along the direction of the magnetic field. The superposition of the
two motions, the uniform one along the z-axis and the rotation in the plane
(xy), leads to a helical trajectory of the particle in the longitudinal field. The
turns of the helix lie on a cylindrical surface of radius $R_c$ with its axis parallel
to the z-axis.

For the motion in a constant magnetic field H the following conservation 
laws hold: 

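1. The total energy of the particle is conserved:

$$\varepsilon = \tfrac{1}{2}\,m\left\{(v_\perp^{(0)})^2 + (v_\parallel^{(0)})^2\right\} = \text{const}.$$

2. The projection of the angular momentum onto the z-axis is conserved,
i.e.

$$L_z = mR_c v_\perp^{(0)} = mR_c^2\,\omega_c = \text{const}.$$

From formula (22.4) it follows that the magnetic moment produced by the
particle moving in a circle is also conserved:

$$\mu = \frac{e}{2mc}\,L_z = \frac{eR_c^2\,\omega_c}{2c} = \frac{m\,(v_\perp^{(0)})^2}{2H} = \frac{\varepsilon_\perp}{H} = \text{const}, \tag{39.10}$$

where $\varepsilon_\perp = \tfrac{1}{2}mv_\perp^2$ is the kinetic energy of motion in the plane (xy). We shall
make use of this important result in the next section.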

A constant uniform magnetic field possesses the property of focussing. Let
a beam of particles emerge in different directions from a certain point with
different initial velocities lying in the plane (xy). Since the cyclotron
frequency does not depend on the initial velocity, the particles, having
performed one revolution, will again come together at one point after the lapse
of a time interval $T_c = 2\pi/\omega_c$.

If one now considers a beam of particles emerging in different directions,
but having the same magnitude $|\mathbf{v}_0|$ of the initial velocity, then it can
be seen that during a time $T_c$ they will all traverse one turn of a helix. Its
pitch, equal to $l = v_\parallel^{(0)} T_c = v_0 T_c \cos\alpha$, will in each case be different. Here $\alpha$
is the angle between the direction of the initial velocity and the z-axis. Hence
the particles emerging from the initial point will not come together again at
one point. If, however, the angle $\alpha$ is small, so that $\cos\alpha \approx 1$, then the pitch of
the helix turns out to be the same for all the particles, and the beam is
focussed.





Finally, let us consider the general case of the motion of a particle in
electric and magnetic fields which are uniform and constant in time. We write the
equation of motion in this case in the form

$$m\frac{d\mathbf{v}}{dt} = e\mathbf{E} + \frac{e}{c}\,\mathbf{v}\times\mathbf{H}.$$


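We introduce a new unknown quantity $\mathbf{V}$, determined by the relation

$$\mathbf{V} = \mathbf{v} - \frac{c\,\mathbf{E}\times\mathbf{H}}{H^2}. \tag{39.11}$$

Substituting $\mathbf{V}$ into the equation of motion, we find

$$\frac{d\mathbf{V}}{dt} = \frac{e}{m}\left(\mathbf{E} + \frac{1}{c}\,\mathbf{V}\times\mathbf{H} + \frac{(\mathbf{E}\times\mathbf{H})\times\mathbf{H}}{H^2}\right).$$

Evaluating the vector product

$$(\mathbf{E}\times\mathbf{H})\times\mathbf{H} = \mathbf{H}(\mathbf{H}\cdot\mathbf{E}) - \mathbf{E}H^2,$$

we obtain

$$\frac{d\mathbf{V}}{dt} = \frac{e}{m}\,\frac{\mathbf{H}(\mathbf{H}\cdot\mathbf{E})}{H^2} + \frac{e}{mc}\,\mathbf{V}\times\mathbf{H}. \tag{39.12}$$

If the electric field is perpendicular to the magnetic field, so that $\mathbf{H}\cdot\mathbf{E} = 0$,
then

$$\frac{d\mathbf{V}}{dt} = \frac{e}{mc}\,\mathbf{V}\times\mathbf{H}.$$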

The above equation is the same as (39.3). Consequently, $\mathbf{V}$ represents the
velocity of motion of a particle in a circle in the plane perpendicular to the
magnetic field $\mathbf{H}$ with the cyclotron frequency. The components of the
velocity $\mathbf{V}$ are given by formulae analogous to (39.8), in which we assume
that $\alpha = 0$.
In this case the total velocity of the particle is equal to

$$\mathbf{v} = \mathbf{V} + \frac{c\,\mathbf{E}\times\mathbf{H}}{H^2},$$

or, in components,

$$v_x = V_x + cE_y/H = v_\perp^{(0)}\cos\omega_c t + cE_y/H, \tag{39.13}$$

$$v_y = V_y = -v_\perp^{(0)}\sin\omega_c t, \tag{39.14}$$

where $v_\perp^{(0)}$ is the initial value of the velocity in the plane perpendicular to the
direction of the magnetic field $\mathbf{H}$. For the constant $\alpha$ equal to zero, the
corresponding initial velocity of the particle is directed along the x-axis. From
(39.13) it follows that $v_x$ remains small in comparison with the velocity of
light $c$, if the inequality $E_y \ll H$ holds.
The component of the velocity of the particle

$$\mathbf{v}_D = \frac{c\,\mathbf{E}\times\mathbf{H}}{H^2} \tag{39.15}$$

is directed perpendicularly to both of the fields. Its absolute value is equal to

$$|\mathbf{v}_D| = cE/H \tag{39.16}$$

and depends neither on the charge nor on the mass of the particle.

The motion of a particle in the direction of $\mathbf{v}_D$ is called "drift".

An obvious interpretation of the phenomenon of drift can be obtained
from the following reasoning. Let a positively charged particle move on a
circle in the plane (xy) perpendicular to the direction of the magnetic field $\mathbf{H}$,
chosen to be the z-axis (fig. I.19). The magnetic field is directed upward
perpendicular to the plane of the drawing. Let, in addition, the electric field
be directed along the y-axis. Then the electric field will accelerate the particle
as it moves on the left semicircle, and decelerate it as it moves on the right
semicircle. The circular trajectory will be distorted. The particle traverses the
upper part of the circle with a larger velocity than the lower part. The
magnetic field will bend the trajectory of the particle more in the lower part of
the circle than in the upper one. Hence the projection of the path traversed
by the particle onto the x-axis will be smaller in the lower part of the circle
than in the upper one. As a result, after each revolution there arises a certain
shift of the particle along the x-axis in its positive direction (in fig. I.19 from
the left to the right), and the particle begins to move in the positive direction
of the x-axis.

Fig. I.19. Panel titles: "Rotation of a particle in the uniform magnetic field";
"Drift caused by the uniform magnetic field". Labels: positive charged particle,
negative charged particle.

Similar reasoning for a negative particle leads to the same direction of 
drift. 

Integrating the expressions (39.13) and (39.14) once more, we find the
equations of the trajectory of the particle in parametric form:

$$x = \frac{v_\perp^{(0)}}{\omega_c}\sin\omega_c t + \frac{cEt}{H} + x_0, \tag{39.17}$$

$$y = \frac{v_\perp^{(0)}}{\omega_c}\cos\omega_c t + y_0. \tag{39.18}$$

The curve described by the particle is called a trochoid. The fixed parameters
of the trochoid depend on the initial conditions. If it is assumed that at $t = 0$
the charged particle was at the origin, then $x_0 = 0$, $y_0 = -v_\perp^{(0)}/\omega_c$. In this
case the form of the curve is determined only by the value of the initial
velocity $v_\perp^{(0)}$. For $|v_\perp^{(0)}| > cE/H$ the upper curve of fig. I.20 is obtained,
while for $|v_\perp^{(0)}| < cE/H$ the middle curve is obtained. The lower curve, a
cycloid, corresponds to the case $v_\perp^{(0)} = -cE/H$.
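The drift velocity (39.16) can be checked by integrating the equation of motion directly and averaging over a whole number of revolutions. The following sketch uses a standard fourth-order Runge-Kutta step and dimensionless units ($|e|/m = c = 1$ by default); the integrator, the parameter values and the function name are assumptions made for illustration, not part of the text.

```python
import math


def simulate_crossed_fields(E_y, H, v_perp0, q_over_m=1.0, c=1.0,
                            periods=5, steps_per_period=2000):
    """Integrate dv/dt = (e/m)(E + v x H / c) with E along y and H along z.

    Returns (mean velocity along x, final y) after a whole number of periods.
    """
    omega_c = abs(q_over_m) * H / c
    period = 2.0 * math.pi / omega_c
    dt = period / steps_per_period

    def deriv(state):
        x, y, vx, vy = state
        ax = q_over_m * (vy * H / c)          # x-component of (e/m) v x H / c
        ay = q_over_m * (E_y - vx * H / c)    # y-component, including E_y
        return (vx, vy, ax, ay)

    def rk4_step(state):
        k1 = deriv(state)
        k2 = deriv(tuple(s + 0.5 * dt * k for s, k in zip(state, k1)))
        k3 = deriv(tuple(s + 0.5 * dt * k for s, k in zip(state, k2)))
        k4 = deriv(tuple(s + dt * k for s, k in zip(state, k3)))
        return tuple(s + dt * (a + 2.0 * b + 2.0 * g + d) / 6.0
                     for s, a, b, g, d in zip(state, k1, k2, k3, k4))

    # Initial velocity along x, as in (39.13) with alpha = 0.
    state = (0.0, 0.0, v_perp0 + c * E_y / H, 0.0)
    n_steps = periods * steps_per_period
    for _ in range(n_steps):
        state = rk4_step(state)
    return state[0] / (n_steps * dt), state[1]
```

With $E_y = 0.1$ and $H = 1$ the mean velocity along x comes out equal to $cE/H = 0.1$ to within the integration error, and reversing the sign of `q_over_m` leaves it unchanged, in line with the statement that the drift depends neither on the charge nor on the mass.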
If $\mathbf{E}$ is not perpendicular to $\mathbf{H}$, then eq. (39.12) can be projected onto the
plane perpendicular to $\mathbf{H}$ and onto the z-axis. Then we find:

$$\frac{d\mathbf{V}_\perp}{dt} = \frac{e}{mc}\,\mathbf{V}_\perp\times\mathbf{H}, \qquad \frac{dV_z}{dt} = \frac{e}{m}\,E_\parallel,$$

where $E_\parallel$ is the component of the electric field parallel to the magnetic field.
In this case one has to write $E_\perp$ instead of $E$ in the drift velocity. The
uniformly accelerated motion along the magnetic field under the action of the
force $eE_\parallel$ is superposed on the drift of the particle.


Fig. I.20


§40. The motion of charged particles in slowly varying magnetic fields 


Now we turn to the very important case of the motion of
charged particles in magnetic fields varying in space and time. In the general
case of varying fields the integration of the equations of motion turns out to
be a very difficult problem. Hence we restrict ourselves to the case of fields
varying slowly in time and space.

Consider the case where the magnetic field varies slowly in time, remaining
uniform in space. Let a particle rotate with the cyclotron frequency $\omega_c$ in the
plane perpendicular to the magnetic field. We assume that the change in the
field per revolution is sufficiently small, i.e. that

$$T_c\left|\frac{\partial\mathbf{H}}{\partial t}\right| \ll |\mathbf{H}|, \tag{40.1}$$

where $T_c = 2\pi/\omega_c$.
For a time-varying field

$$\oint \mathbf{E}\cdot d\mathbf{l} = -\frac{1}{c}\int \frac{\partial\mathbf{H}}{\partial t}\cdot d\mathbf{S}, \tag{40.2}$$


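where the trajectory of the particle can be taken as the integration path.
Multiplying (40.2) by the charge of the particle and assuming that during one
revolution $\partial\mathbf{H}/\partial t$ remains constant, we can write:

$$e\oint \mathbf{E}\cdot d\mathbf{l} = -\frac{e}{c}\,\frac{\partial\mathbf{H}}{\partial t}\cdot\int d\mathbf{S} = -\frac{e}{c}\,\frac{\partial H}{\partial t}\,\pi R_c^2. \tag{40.3}$$

The integral on the left of (40.3) represents the work done on the charge per
revolution. It is equal to the increase of the kinetic energy of motion in the
plane (xy), which we shall denote by $\Delta\varepsilon_\perp$. Thus

$$\Delta\varepsilon_\perp = -\frac{e}{c}\,\pi R_c^2\,\frac{\partial H}{\partial t} = \frac{|e|}{c}\,\pi R_c^2\,\frac{\partial H}{\partial t}. \tag{40.4}$$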


The minus sign means that a particle with a negative charge moves in a direc- 
tion opposite to the positive direction of the integration path. 

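Let us find the derivative $d\varepsilon_\perp/dt$. On the basis of (40.4) and (39.10) we
have

$$\frac{d\varepsilon_\perp}{dt} = \frac{\Delta\varepsilon_\perp}{T_c} = \frac{|e|R_c^2\,\omega_c}{2c}\,\frac{\partial H}{\partial t} = \frac{\varepsilon_\perp}{H}\,\frac{\partial H}{\partial t} = \mu\,\frac{\partial H}{\partial t}. \tag{40.5}$$

According to definition (39.10),

$$\frac{d\varepsilon_\perp}{dt} = \mu\,\frac{\partial H}{\partial t} + H\,\frac{\partial\mu}{\partial t}. \tag{40.6}$$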


Comparing eqs. (40.5) and (40.6) we see that, when the magnetic field varies
slowly, the magnetic moment of the particle remains constant:

$$\mu = \text{const}. \tag{40.7}$$
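The approximate constancy of $\mu = \varepsilon_\perp/H$ can be illustrated by integrating the motion in a slowly growing uniform field together with the induced electric field. The sketch below is an illustration under explicit assumptions: dimensionless units $e = m = c = 1$, a linear ramp $H(t) = 1 + at$, the induced field $\mathbf{E} = (\dot H/2c)(y, -x, 0)$ obtained from (40.2) for an axially symmetric configuration, and a fourth-order Runge-Kutta integrator. None of these choices comes from the text.

```python
import math


def magnetic_moment_history(ramp_rate=0.005, t_end=200.0, dt=0.01):
    """Return the values of mu = v_perp^2 / (2 H) along the orbit (e = m = c = 1)."""

    def field(t):
        return 1.0 + ramp_rate * t

    def deriv(t, state):
        x, y, vx, vy = state
        H = field(t)
        # Induced electric field, curl E = -(1/c) dH/dt, symmetric about the z-axis.
        Ex = 0.5 * ramp_rate * y
        Ey = -0.5 * ramp_rate * x
        return (vx, vy, Ex + vy * H, Ey - vx * H)

    # Circular orbit of unit radius centred on the axis at t = 0.
    state = (0.0, 1.0, 1.0, 0.0)
    t = 0.0
    mu_values = []
    for _ in range(int(round(t_end / dt))):
        k1 = deriv(t, state)
        k2 = deriv(t + 0.5 * dt, tuple(s + 0.5 * dt * k for s, k in zip(state, k1)))
        k3 = deriv(t + 0.5 * dt, tuple(s + 0.5 * dt * k for s, k in zip(state, k2)))
        k4 = deriv(t + dt, tuple(s + dt * k for s, k in zip(state, k3)))
        state = tuple(s + dt * (a + 2.0 * b + 2.0 * g + d) / 6.0
                      for s, a, b, g, d in zip(state, k1, k2, k3, k4))
        t += dt
        mu_values.append(0.5 * (state[2] ** 2 + state[3] ** 2) / field(t))
    return mu_values
```

In this run the field doubles while the change per revolution stays at a few per cent, so condition (40.1) holds; the kinetic energy then roughly doubles as well, and $\mu$ stays close to its initial value of 0.5.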


We can obtain the same result by considering the motion of a particle in a 
stationary magnetic field which varies slowly from point to point. 


Fig. I.21. Magnetic lines of force.


Let a magnetic field be symmetric about the z-axis and let its strength
increase with increasing $z$. The field lines converge, as shown in fig. I.21. We
shall assume that in a distance $\sim R_c$ the variation of the magnetic field is
small, i.e.

$$R_c\left|\frac{\partial H}{\partial z}\right| \ll H. \tag{40.8}$$

Since the magnetic field varies along the z-axis, the radial component $H_r$ of
the field also differs from zero. From the continuity equation in cylindrical
coordinates

$$\nabla\cdot\mathbf{H} = \frac{1}{r}\frac{\partial}{\partial r}(rH_r) + \frac{\partial H_z}{\partial z} = 0,$$

we have

$$\frac{\partial}{\partial r}(rH_r) = -r\,\frac{\partial H_z}{\partial z}.$$

Integrating, and disregarding the dependence of $\partial H_z/\partial z$ on the coordinate $r$
on a circle of radius $r \sim R_c$, we find

$$H_r = -\frac{r}{2}\,\frac{\partial H_z}{\partial z}.$$

Since, if the condition (40.8) is fulfilled, the component $H_r$ is small in
comparison with the component $H_z$ for all $r < R_c$, it can be assumed that
$|\mathbf{H}| \approx H_z$, i.e. that the field is directed at a small angle to the z-axis, and

$$H_r = -\frac{r}{2}\,\frac{\partial H}{\partial z}. \tag{40.9}$$





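If the component of the magnetic field $H_r \neq 0$, then a particle moving in a
circle in the (xy) plane with a cyclotron frequency is acted upon by a force
in the direction of the z-axis. The particle will drift in this direction. Then the
component $v_\parallel$ satisfies the equation of motion

$$m\frac{dv_\parallel}{dt} = \frac{e}{c}\,v_\perp H_r = -\frac{e}{c}\,v_\perp\,\frac{R_c}{2}\,\frac{\partial H}{\partial z} = -\frac{\varepsilon_\perp}{H}\,\frac{\partial H}{\partial z} = -\mu\,\frac{\partial H}{\partial z}. \tag{40.10}$$

Whence in the usual way, multiplying (40.10) by $v_\parallel$, we find

$$\frac{d}{dt}\left(\tfrac{1}{2}mv_\parallel^2\right) = -\mu\,v_\parallel\,\frac{\partial H}{\partial z} = -\mu\,\frac{\partial H}{\partial t},$$

where $\partial H/\partial t = v_\parallel\,\partial H/\partial z$ is the rate of change of the field at the position of
the moving particle. Since the total energy of the particle is conserved, we have

$$\frac{d}{dt}\left(\tfrac{1}{2}mv_\parallel^2 + \tfrac{1}{2}mv_\perp^2\right) = 0,$$

from which we obtain

$$\frac{d\varepsilon_\perp}{dt} = \frac{d}{dt}\left(\tfrac{1}{2}mv_\perp^2\right) = \mu\,\frac{\partial H}{\partial t}. \tag{40.11}$$

Comparing (40.11) with (40.6), we again arrive at the conclusion that the
magnetic moment of the particle is conserved:

$$\mu = \text{const}.$$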

The conservation of the magnetic moment in a slowly varying non-uniform
magnetic field leads to very important consequences. Since

$$v_\perp = \sqrt{\frac{2\mu H}{m}}, \qquad R_c = \frac{cmv_\perp}{eH} = \frac{c}{e}\sqrt{\frac{2m\mu}{H}} \sim \frac{1}{\sqrt{H}},$$

the radii of the circles on which the particle moves decrease in the direction
of increasing values of $z$ (see fig. I.21).

Let $\theta_0$ be the angle formed by the velocity vector of the particle with the
z-axis at a point $z_0$, and let $\theta$ be the same angle at an arbitrary point. Then at
the point $z = z_0$

$$\varepsilon_\perp = \tfrac{1}{2}mv_\perp^2 = \tfrac{1}{2}mv_0^2\sin^2\theta_0 = \mu H(z_0),$$

where $H(z_0)$ is the value of the magnetic field at the point $z_0$. At a certain
point $z$ the field strength is equal to $H(z)$, and

$$\varepsilon_\perp = \mu H(z) = \tfrac{1}{2}mv_0^2\sin^2\theta.$$

Hence we have

$$\sin\theta = \sin\theta_0\,\sqrt{\frac{H(z)}{H(z_0)}}.$$

As the particle moves along the z-axis and the field strength increases the
angle increases. At a point $z^*$, where

$$\frac{H(z^*)}{H(z_0)} = \frac{1}{\sin^2\theta_0},$$

$\sin\theta = 1$ and $v_\perp = v_0$. This means that the velocity component $v_\parallel$ of the
particle reduces to zero. The particle cannot move farther than the point $z^*$, but
is then reflected into the region $z < z^*$. The region $z > z^*$, which is
impenetrable for particles with an initial velocity $v_\perp = v_0\sin\theta_0$, is called a magnetic
mirror.
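The reflection condition follows directly from $\sin\theta = \sin\theta_0\sqrt{H(z)/H(z_0)}$. The sketch below expresses it as three small helper functions; the names and the example numbers are illustrative assumptions.

```python
import math


def sin_pitch_angle(sin_theta0: float, H0: float, H: float) -> float:
    """sin(theta) at field strength H, given sin(theta0) at field strength H0."""
    return sin_theta0 * math.sqrt(H / H0)


def mirror_field_ratio(theta0: float) -> float:
    """Ratio H(z*) / H(z0) at which the particle is turned back: 1 / sin^2(theta0)."""
    return 1.0 / math.sin(theta0) ** 2


def is_reflected(theta0: float, H0: float, H_max: float) -> bool:
    """True if a field rising to H_max is strong enough to reflect the particle."""
    return H_max / H0 >= mirror_field_ratio(theta0)
```

A particle starting at 30 degrees to the field line is reflected where the field has grown fourfold; a particle starting closer to the axis needs a stronger field, so a mirror of given strength lets through the particles with sufficiently small $\theta_0$.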

The reflection of particles from the magnetic mirror plays a fundamental
role in various electronic devices. E. Fermi put forward the idea of the
acceleration of particles in cosmic rays as a result of reflection from magnetic
mirrors. The role of the latter can be played by clouds of interstellar matter.
If it is assumed that in the clouds of interstellar matter the magnetic field
strength is larger than in the space separating them, then all particles confined
between the clouds will be reflected from them as from magnetic mirrors.
Assume that the clouds are moving towards each other with velocities $v$.
Charged particles, colliding with the moving clouds and reflecting from them,
change their velocity by $2v$ for each reflection. Calculations show that in
cosmic conditions the velocities of the particles could reach enormous values.

In conclusion it should be stressed that all of the results obtained refer
only to the motion of particles with velocities small in comparison with the
velocity of light $c$. The motion with velocities comparable with the velocity
of light $c$ will be considered in Part II.







§41. The Lagrangian and Hamiltonian for a particle 
moving in an electromagnetic field 


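The equations of motion of a particle in an electromagnetic field

$$m\frac{d\mathbf{v}}{dt} = e\left(\mathbf{E} + \frac{1}{c}\,\mathbf{v}\times\mathbf{H}\right) \tag{41.1}$$

can be written in the form of Lagrange's equations, if the Lagrangian is
introduced by the relation

$$L = \tfrac{1}{2}mv^2 - e\varphi + \frac{e}{c}\,\mathbf{A}\cdot\mathbf{v}. \tag{41.2}$$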


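We derive the Lagrange equations. The generalized momentum is

$$\mathbf{P} = \frac{\partial L}{\partial\mathbf{v}} = m\mathbf{v} + \frac{e}{c}\,\mathbf{A} = \mathbf{p} + \frac{e}{c}\,\mathbf{A}. \tag{41.3}$$

Correspondingly, the generalized force is

$$\frac{\partial L}{\partial\mathbf{r}} = -e\nabla\varphi + \frac{e}{c}\,\nabla(\mathbf{A}\cdot\mathbf{v}) = -e\nabla\varphi + \frac{e}{c}\left\{(\mathbf{v}\cdot\nabla)\mathbf{A} + \mathbf{v}\times(\nabla\times\mathbf{A})\right\}.$$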

In calculating the partial derivative with respect to coordinates, v was assumed 
to be constant. 
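Substituting these expressions into the Lagrange equations, we find

$$\frac{d}{dt}\left(\mathbf{p} + \frac{e}{c}\,\mathbf{A}\right) = -e\nabla\varphi + \frac{e}{c}\,(\mathbf{v}\cdot\nabla)\mathbf{A} + \frac{e}{c}\,\mathbf{v}\times(\nabla\times\mathbf{A}),$$

or

$$\frac{d\mathbf{p}}{dt} = -\frac{e}{c}\,\frac{d\mathbf{A}}{dt} + \frac{e}{c}\,(\mathbf{v}\cdot\nabla)\mathbf{A} + \frac{e}{c}\,\mathbf{v}\times(\nabla\times\mathbf{A}) - e\nabla\varphi.$$

However, according to (I.18) the total derivative with respect to time will be

$$\frac{d\mathbf{A}}{dt} = \frac{\partial\mathbf{A}}{\partial t} + (\mathbf{v}\cdot\nabla)\mathbf{A},$$

whence

$$\frac{d\mathbf{p}}{dt} = e\left\{-\frac{1}{c}\,\frac{\partial\mathbf{A}}{\partial t} - \nabla\varphi\right\} + \frac{e}{c}\,\mathbf{v}\times(\nabla\times\mathbf{A}),$$

or

$$\frac{d\mathbf{p}}{dt} = e\left(\mathbf{E} + \frac{1}{c}\,\mathbf{v}\times\mathbf{H}\right),$$

which is the same as (41.1).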
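We can also write the Hamiltonian for a particle:

$$\mathscr{H} = \mathbf{P}\cdot\mathbf{v} - L = \frac{\mathbf{P}}{m}\cdot\left(\mathbf{P} - \frac{e}{c}\,\mathbf{A}\right) - \frac{1}{2m}\left(\mathbf{P} - \frac{e}{c}\,\mathbf{A}\right)^2 + e\varphi - \frac{e\mathbf{A}}{mc}\cdot\left(\mathbf{P} - \frac{e}{c}\,\mathbf{A}\right) =$$

$$= \frac{1}{2m}\left(\mathbf{P} - \frac{e}{c}\,\mathbf{A}\right)^2 + e\varphi. \tag{41.4}$$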
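The algebra leading to (41.4) can be spot-checked with numbers: for any velocity, vector potential and scalar potential, $\mathbf{P}\cdot\mathbf{v} - L$ should coincide with $(\mathbf{P} - e\mathbf{A}/c)^2/2m + e\varphi$. The sketch below does this with plain tuples; the function names and the sample values are illustrative assumptions.

```python
import math


def dot(a, b):
    """Scalar product of two 3-vectors given as tuples."""
    return sum(ai * bi for ai, bi in zip(a, b))


def lagrangian(m, e, c, v, A, phi):
    """L = m v^2 / 2 - e phi + (e/c) A.v, formula (41.2)."""
    return 0.5 * m * dot(v, v) - e * phi + (e / c) * dot(A, v)


def generalized_momentum(m, e, c, v, A):
    """P = m v + (e/c) A, formula (41.3)."""
    return tuple(m * vi + (e / c) * Ai for vi, Ai in zip(v, A))


def hamiltonian(m, e, c, P, A, phi):
    """H = (P - (e/c) A)^2 / (2 m) + e phi, formula (41.4)."""
    kinetic = tuple(Pi - (e / c) * Ai for Pi, Ai in zip(P, A))
    return dot(kinetic, kinetic) / (2.0 * m) + e * phi
```

The check is purely algebraic, so the values of the potentials at the position of the particle can be chosen arbitrarily.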


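In the case of a system of independent particles the Lagrangian and the
Hamiltonian can be written in the form

$$L = \sum_i \tfrac{1}{2}m_i v_i^2 - \sum_i e_i\varphi_i + \sum_i \frac{e_i}{c}\,\mathbf{A}_i\cdot\mathbf{v}_i \tag{41.5}$$

and, correspondingly,

$$\mathscr{H} = \sum_i \frac{1}{2m_i}\left(\mathbf{P}_i - \frac{e_i}{c}\,\mathbf{A}_i\right)^2 + \sum_i e_i\varphi_i, \tag{41.6}$$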


where $\mathbf{A}_i$ and $\varphi_i$ are the values of field potentials at the position of the $i$th
particle. The summation is carried out over all particles of the system.

Later we shall need these expressions for the Lagrangian and the
Hamiltonian.

It should be noted that, if one introduces the vector potential for a
constant uniform magnetic field given by formula (19.16) into the expression for
the generalized momentum $\mathbf{P}_i$, then

$$\mathbf{P}_i = \mathbf{p}_i + \frac{e_i}{2c}\,\mathbf{H}\times\mathbf{r}_i, \tag{41.7}$$

and, consequently,

$$\mathscr{H} = \sum_i \frac{1}{2m_i}\left(\mathbf{P}_i - \frac{e_i}{2c}\,\mathbf{H}\times\mathbf{r}_i\right)^2 + \sum_i e_i\varphi_i. \tag{41.8}$$


§42. The motion of a system of two charged particles 
and the radiation from them 


Up to now we have considered the motion of particles in external fields. 
Now we shall discuss the problem of the motion of charged particles in the 
field produced by other particles. 

Let us now consider the problem of the motion of two interacting charged 
particles *. 

This problem can be solved by making use of the method of successive 
approximations. Namely, assuming the energy losses of the particles due to 
radiation to be small, the trajectories of the particles can be calculated in the 
first approximation. Knowing them, one can then find the radiation of the 
system. 

Let the charged particles have masses $m_1$ and $m_2$, and charges $e_1$ and $e_2$.
The potential energy of the system can be written in the form

$$U = e_1\varphi(r),$$

* In what follows we shall confine ourselves to a brief exposition of the problem,
since it appears to be a particular case of the two-body problem considered in detail in
courses in classical mechanics. A detailed exposition of this and other problems touched
upon in this paragraph can be found in: L.D.Landau and E.M.Lifshitz, Mechanics
(Pergamon, Oxford, 1960), and H.Goldstein, Classical mechanics (Addison-Wesley, Reading,
Mass., 1950).


where $\varphi(r)$ is the potential of the field produced by the charge $e_2$ located at a
distance $r$ from the charge $e_1$.
The Lagrangian of the system can be written

$$L = \tfrac{1}{2}m_1v_1^2 + \tfrac{1}{2}m_2v_2^2 - e_1\varphi(r), \tag{42.1}$$

where $\mathbf{v}_1 = \dot{\mathbf{r}}_1$, $\mathbf{v}_2 = \dot{\mathbf{r}}_2$, while $\mathbf{r}_1$ and $\mathbf{r}_2$ are radius-vectors drawn to the
particles from an arbitrary origin.
Since the potential energy depends only on the distance between the
charges, i.e. $U = U(|\mathbf{r}|)$, where

$$\mathbf{r} = \mathbf{r}_1 - \mathbf{r}_2, \tag{42.2}$$

it is convenient to pass over to the centre-of-mass system.

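That is to say, we place the origin at the centre of mass, i.e. the spatial
point whose radius-vector with respect to an arbitrary system of coordinates
is expressed by the formula

$$\mathbf{R} = \frac{m_1\mathbf{r}_1 + m_2\mathbf{r}_2}{m_1 + m_2}. \tag{42.3}$$

From (42.2) and (42.3) we find

$$\mathbf{r}_1 = \frac{m_2}{m_1 + m_2}\,\mathbf{r} + \mathbf{R}, \qquad \mathbf{r}_2 = -\frac{m_1}{m_1 + m_2}\,\mathbf{r} + \mathbf{R},$$

and, correspondingly, the velocities of the particles are

$$\mathbf{v}_1 = \frac{m_2}{m_1 + m_2}\,\dot{\mathbf{r}} + \dot{\mathbf{R}} = \frac{m_2}{m_1 + m_2}\,\mathbf{v}_0 + \dot{\mathbf{R}},$$

$$\mathbf{v}_2 = -\frac{m_1}{m_1 + m_2}\,\dot{\mathbf{r}} + \dot{\mathbf{R}} = -\frac{m_1}{m_1 + m_2}\,\mathbf{v}_0 + \dot{\mathbf{R}}, \tag{42.4}$$

where $\mathbf{v}_0 = \dot{\mathbf{r}}$ is the relative velocity of the particles, and $\dot{\mathbf{R}}$ is the velocity of
the centre of mass.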
Inserting these values of the velocities into the Lagrangian, we find

$$L = \tfrac{1}{2}M(\dot{\mathbf{R}})^2 + \tfrac{1}{2}\mu(\mathbf{v}_0)^2 - U(|\mathbf{r}|), \tag{42.5}$$

where $M = m_1 + m_2$ is the mass of the system, while the quantity

$$\mu = \frac{m_1m_2}{m_1 + m_2} \tag{42.6}$$

is called the reduced mass.
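The splitting of the kinetic energy in (42.5) can be confirmed numerically for arbitrary velocities. The following is a small sketch; vectors are plain tuples and all names are illustrative assumptions.

```python
import math


def reduced_mass(m1: float, m2: float) -> float:
    """mu = m1 m2 / (m1 + m2), formula (42.6)."""
    return m1 * m2 / (m1 + m2)


def centre_of_mass_velocity(m1, m2, v1, v2):
    """Velocity of the centre of mass, the time derivative of (42.3)."""
    return tuple((m1 * a + m2 * b) / (m1 + m2) for a, b in zip(v1, v2))


def particle_velocities(m1, m2, v_rel, V_cm):
    """Velocities of both particles from the relative and centre-of-mass velocities, (42.4)."""
    total = m1 + m2
    v1 = tuple(m2 / total * r + V for r, V in zip(v_rel, V_cm))
    v2 = tuple(-m1 / total * r + V for r, V in zip(v_rel, V_cm))
    return v1, v2


def kinetic_energy_lab(m1, m2, v1, v2):
    """m1 v1^2 / 2 + m2 v2^2 / 2."""
    return 0.5 * m1 * sum(a * a for a in v1) + 0.5 * m2 * sum(b * b for b in v2)


def kinetic_energy_split(m1, m2, v_rel, V_cm):
    """M V_cm^2 / 2 + mu v_rel^2 / 2, the kinetic part of (42.5)."""
    return (0.5 * (m1 + m2) * sum(V * V for V in V_cm)
            + 0.5 * reduced_mass(m1, m2) * sum(r * r for r in v_rel))
```

The cross terms between $\dot{\mathbf{R}}$ and $\mathbf{v}_0$ cancel identically, which is why the two motions separate.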


The coordinates of the centre of mass R are cyclic. Hence the correspond- 
ing generalized momentum is conserved: 


$$\frac{\partial L}{\partial\dot{\mathbf{R}}} = (m_1 + m_2)\,\dot{\mathbf{R}} = \text{const}. \tag{42.7}$$


The centre of mass moves with a constant velocity (in particular, it can remain 
at rest). 

In order to investigate the relative motion of the charges we introduce 
coordinates whose origin coincides with the centre of mass. Since the poten- 
tial energy of the interaction depends only on the distance r, the field has 
spherical symmetry. Therefore the coordinate corresponding to arbitrary 
rotation of the system is cyclic. This means that the angular momentum con- 
servation law holds: 


$$\mathbf{L} = \mathbf{r}\times\mathbf{p} = \text{const}. \tag{42.8}$$

Taking the scalar product of this expression and the radius-vector $\mathbf{r}$, we have

$$\mathbf{L}\cdot\mathbf{r} = 0,$$

so that the motion takes place in the plane perpendicular to the vector $\mathbf{L}$.

Choosing the direction of the vector $\mathbf{L}$ to be the z-axis, and the plane in
which the motion takes place to be the plane $z = 0$, we can introduce the
polar coordinates $r$, $\psi$ and rewrite the Lagrangian referring to the relative
motion in the form

$$L_{\text{rel}} = \tfrac{1}{2}\mu\left(\dot{r}^2 + r^2\dot{\psi}^2\right) - e_1\varphi(r). \tag{42.9}$$

The Lagrangian $L_{\text{rel}}$ is formally the same as the Lagrangian of one particle
having mass $\mu$ and moving in the external force field $e_1\varphi$. The coordinate $\psi$ is
cyclic, and the generalized momentum

$$p_\psi = \mu r^2\dot{\psi} = \text{const} \tag{42.10}$$

corresponds to it. Calculating $L_z$ in polar coordinates, we have, obviously,

$$L_z = [\mathbf{r}\times\mathbf{p}]_z = \mu(x\dot{y} - y\dot{x}) = \mu r^2\dot{\psi}, \tag{42.11}$$

so that the equality

$$p_\psi = L_z = \text{const} = L$$

expresses the angular momentum conservation law.
Since the Lagrangian does not depend on time, the energy conservation
law holds:

$$E = \tfrac{1}{2}\mu\left(\dot{r}^2 + r^2\dot{\psi}^2\right) + e_1\varphi(r) = \text{const}. \tag{42.12}$$

Expressing $\dot{\psi}$ from (42.11) in terms of $L_z$, we find

$$E = \tfrac{1}{2}\mu\dot{r}^2 + \frac{L^2}{2\mu r^2} + e_1\varphi = \tfrac{1}{2}\mu\dot{r}^2 + V(r), \tag{42.12'}$$

where the quantity $V(r)$, called the effective potential energy, is equal to

$$V(r) = e_1\varphi(r) + \frac{L^2}{2\mu r^2}. \tag{42.13}$$

The energy (42.12') is formally the same as the energy of a particle moving
in one dimension in a field with potential energy $V(r)$.

The character of the relative motion of the charged particles is defined by
the form of the function $\varphi(r)$.

Here we shall consider the case of oppositely charged particles. Then

$$\varphi = -e_2/r, \tag{42.14}$$

where $e_2$ is the charge of the second particle.

The variation of the energy as a function of $r$ in this case is shown in fig.
I.22. The dashed lines show the curves $-e_1e_2/r$ and $L^2/2\mu r^2$. The straight
line $E = 0$ is given by a dotted line. The shape of the curve $V(r)$ and the
position of the minimum depend on the value of the angular momentum $L$.


Fig. I.22 (vertical axis: energy)


Rewriting the formula (42.12') in the form

$$\dot{r}^2 = \frac{2}{\mu}\left(E - V(r)\right), \tag{42.15}$$

we see that the allowed region of the relative motion of the particles depends
on the relation between $E$ and $V(r)$. The regions in which $V(r) > E$ are
forbidden, whereas those in which $E > V(r)$ are allowed. If the roots of the
equation

$$E = V(r) \tag{42.16}$$

exist, they determine the radii $r_0$ at which the radial velocity reduces to zero.
If $E > 0$, then the allowed region of motion extends from the infinitely
distant region $r \to \infty$ up to the radius corresponding to the minimum distance
between the particles and determined by the crossing of the straight line $E$
with the curve $V(r)$.

If $E < 0$, then there are two radii corresponding to the smallest and largest
distance between the charges. The motion is performed between these points.


In order to find the trajectory, the time should be eliminated from (42.12')
and (42.11).
We then have

$$d\psi = \frac{L}{\mu r^2}\,dt = \frac{L}{\mu r^2}\,\frac{dr}{\sqrt{\dfrac{2}{\mu}\left(E - e_1\varphi\right) - \dfrac{L^2}{\mu^2r^2}}}.$$

By integration we find

$$\psi = \int \frac{dr}{r^2\sqrt{\dfrac{2\mu}{L^2}\left(E - e_1\varphi\right) - \dfrac{1}{r^2}}}. \tag{42.17}$$

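In particular, for the motion in the attractive Coulomb field the substitution
for $\varphi(r)$ from (42.14) gives

$$\psi = \int \frac{dr}{r^2\sqrt{\dfrac{2\mu}{L^2}\left(E + \dfrac{e_1e_2}{r}\right) - \dfrac{1}{r^2}}} = -\int \frac{du}{\sqrt{\dfrac{2\mu E}{L^2} + \dfrac{2e_1e_2\mu}{L^2}\,u - u^2}} =$$

$$= \arccos\frac{\dfrac{L^2u}{\mu e_1e_2} - 1}{\sqrt{1 + \dfrac{2EL^2}{\mu(e_1e_2)^2}}} - \psi_0,$$

where

$$u = 1/r.$$

We find for the equation of the trajectory:

$$\frac{L^2}{\mu e_1e_2}\,\frac{1}{r} = 1 + \sqrt{1 + \frac{2EL^2}{\mu(e_1e_2)^2}}\,\cos(\psi + \psi_0). \tag{42.18}$$

Comparing this formula with the general equation of conic sections

$$p/r = 1 + \epsilon\cos(\psi + \psi_0),$$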


where e is the eccentricity, we see that the motion of the charges takes place 
over a conic section with eccentricity 











§42 SYSTEM OF TWO CHARGED PARTICLES 195 


2 
i 2EL 


e=VJ 1 ; 
ulejez)? 


(42.19) 


The parameter of the conic section is $p = L^2/\mu e_1e_2$. Depending on the sign of
E we have

$E > 0$,  $\varepsilon > 1$:  hyperbola
$E = 0$,  $\varepsilon = 1$:  parabola
$E < 0$,  $\varepsilon < 1$:  ellipse
$E = -\mu(e_1e_2)^2/2L^2$,  $\varepsilon = 0$:  circle.


The solution (42.18) determines the motion of each of the particles. Namely, 
the particles move on trajectories representing conic sections with the foci 
located at the centre of mass of the system. 

This solution does not differ from the corresponding solution of the 
Kepler problem on the motion of planets. The values of the quantities charac- 
teristic of conic sections are given in the books on mechanics cited. 
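
The classification by the sign of E can be tried out numerically. The following is only a minimal sketch of formula (42.19); the function names `eccentricity` and `orbit_type`, the tolerance and the illustrative units are assumptions made here and do not come from the text.

```python
import math


def eccentricity(E, L, mu, e1e2):
    """Eccentricity from (42.19): sqrt(1 + 2 E L^2 / (mu (e1 e2)^2))."""
    arg = 1.0 + 2.0 * E * L**2 / (mu * e1e2**2)
    # Energies below the circular-orbit value are not physical; clamping at
    # zero also keeps round-off from making the argument slightly negative.
    return math.sqrt(max(arg, 0.0))


def orbit_type(E, L, mu, e1e2, tol=1e-12):
    """Name the conic section that corresponds to the eccentricity."""
    eps = eccentricity(E, L, mu, e1e2)
    if eps < tol:
        return "circle"
    if abs(eps - 1.0) < tol:
        return "parabola"
    return "hyperbola" if eps > 1.0 else "ellipse"


if __name__ == "__main__":
    mu, e1e2, L = 1.0, 1.0, 1.0              # arbitrary illustrative units
    E_circle = -mu * e1e2**2 / (2.0 * L**2)  # energy of the circular orbit
    for E in (0.3, 0.0, -0.2, E_circle):
        print(E, orbit_type(E, L, mu, e1e2))
```

The tolerance is needed only because the parabola and the circle correspond to single values of E, which floating-point input rarely hits exactly.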

Let us now find the radiation from a system of two charges. Here we con- 
fine ourselves to the case of periodic motion, i.e. to the case of attraction for 
E<0. The energy emitted during a period T is, according to (27.9) and 
(28.11), equal to 





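$$-(\Delta E)_T = \frac{2}{3c^3}\int_0^T \ddot{\mathbf d}^{\,2}\,dt = \frac{2}{3c^3}\left(\frac{e_1}{m_1} - \frac{e_2}{m_2}\right)^2 \int_0^T F^2\,dt,$$

where

$$\mathbf F = \frac{e_1e_2}{r^3}\,\mathbf r.$$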


Instead of integrating over a period, we can integrate over the angle $\psi$ by
means of (42.11). Expressing r in terms of $\psi$ according to formula (42.18),
we obtain




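$$-(\Delta E)_T = \frac{2}{3c^3}\left(\frac{e_1}{m_1} - \frac{e_2}{m_2}\right)^2 (e_1e_2)^2 \int_0^T \frac{dt}{r^4}
= \frac{2}{3c^3}\left(\frac{e_1}{m_1} - \frac{e_2}{m_2}\right)^2 (e_1e_2)^2\,\frac{\mu}{L}\int_0^{2\pi}\frac{d\psi}{r^2} =$$

$$= \frac{2}{3c^3}\left(\frac{e_1}{m_1} - \frac{e_2}{m_2}\right)^2 \frac{\mu^3 (e_1e_2)^4}{L^5}\int_0^{2\pi}\bigl[1 + \varepsilon\cos(\psi + \psi_0)\bigr]^2 d\psi =$$

$$= \frac{2\pi}{3c^3}\left(\frac{e_1}{m_1} - \frac{e_2}{m_2}\right)^2 \frac{\mu^3 (e_1e_2)^4}{L^5}\left(3 + \frac{2EL^2}{\mu(e_1e_2)^2}\right). \tag{42.20}$$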


Thus, a charge moving in a closed orbit continuously emits energy. In
particular, for motion in a circular orbit $\varepsilon = 0$, $L^2 = -\mu(e_1e_2)^2/2E$, and formula
(42.20) reduces to the simpler form:

$$-(\Delta E)_T = \frac{16\pi\sqrt{2\mu}}{3c^3 e_1e_2}\left(\frac{e_1}{m_1} - \frac{e_2}{m_2}\right)^2 |E|^{5/2}. \tag{42.21}$$


The energy loss due to radiation leads to the transformation of the circular 
orbit into a spiral. 

Numerical calculation shows that a system consisting of an electron rotating
about a proton emits its energy during a time $T \sim 10^{-10}$ sec for linear
dimensions of the orbit $\sim 10^{-8}$ cm.
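
The order of magnitude quoted above can be checked with a short estimate. The sketch below rests on assumptions that are not spelled out in the text: the proton is treated as infinitely heavy, the orbit is taken to remain nearly circular while it shrinks (so that $E = -e^2/2r$ and the radiated power is $2e^6/3c^3m^2r^4$), and the motion is non-relativistic. Equating $dE/dt$ to minus this power and integrating from the initial radius $r_0$ to zero gives $t = r_0^3 m^2 c^3 / 4e^4$. The function name `collapse_time` and the rounded Gaussian constants are illustrative choices.

```python
# Rounded constants in Gaussian (CGS) units; assumed values for illustration.
E_CHARGE = 4.803e-10     # electron charge, esu
M_ELECTRON = 9.109e-28   # electron mass, g
C_LIGHT = 2.998e10       # velocity of light, cm/s


def collapse_time(r0):
    """Time for a nearly circular orbit of radius r0 (cm) to shrink to zero.

    Follows from dr/dt = -4 e^4 / (3 c^3 m^2 r^2), i.e.
    t = r0^3 m^2 c^3 / (4 e^4).
    """
    return r0**3 * M_ELECTRON**2 * C_LIGHT**3 / (4.0 * E_CHARGE**4)


if __name__ == "__main__":
    print(collapse_time(1.0e-8))  # of the order of 1e-10 sec
```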

We see that a planetary atomic model contradicts the laws of classical elec- 
trodynamics. Later, in expounding quantum mechanics, the cause of this con- 
tradiction will be explained. It turns out that the instability of the planetary 
atomic model is an illustration of the general statement on the inapplicability 
of the laws of classical electrodynamics to the consideration of intra-atomic 
phenomena. 


§43. The scattering of particles and associated bremsstrahlung 


We shall now consider processes in which the particles are in relative 
motion with either a repulsive or attractive interaction, but having an energy 
E > 0. From general considerations associated with the shape of the curve
V(r) (see fig. I.22) it is clear that in both cases the motion will take place in
an open orbit. 







For simplicity we assume that one of the charged particles is at rest with 
respect to the laboratory system of coordinates. We call it the target. The 
other, incident particle, which is scattered, moves relative to the first one. At 
a sufficiently large distance, when the interaction between the particles can 
be disregarded, the motion of the incident particle is rectilinear. We assume 
that the velocity $v_0$ of this motion is given. When the particles approach each
other the incident particle is deflected from its rectilinear motion, and the 
target particle, which was initially at rest, acquires momentum and is set in 
motion. 

The particles are said to have undergone collision with each other, as a 
result of which they are scattered. 

For reasons which will be clear from what follows, the investigation of the 
process of scattering leads to most important information about the nature of 
the interaction between the particles. 

The study of scattering processes appears at present to be the basic experi- 
mental method of nuclear physics. The investigation of the interaction of 
particles, for example, fast electrons or protons with nuclei, is usually carried 
out in the following way. A beam of particles having well defined properties 
and a known velocity is incident upon a sample of matter containing particles 
of another kind. From observations on the beam of scattered particles one 
can draw conclusions about the nature of the interaction which led to the 
scattering. 

In such an experimental set-up, scattering has the character of a bulk pro- 
cess. The behaviour of a beam, usually containing an enormous number of 
particles, is observed. However, the process is based on the individual inter- 
action between the particle to be scattered and the target particle. Hence the 
process of scattering must be characterized by a quantity which depends 
neither on the properties of the incident beam nor on the properties of the 
target material, for example, its density, but which is determined solely by 
the interaction between a particle which is to be scattered and a target par- 
ticle. 

We characterize the incident particle beam by its intensity or the density
of the flux of particles $I_0 = nv_0$, where n is the number of particles per unit
volume of the beam, and $v_0$ is their velocity. $I_0$ is obviously equal to the
number of particles traversing 1 cm² of the cross section of the beam per sec.

We choose the position of the target particle to be the origin. Let dN
scattered particles per unit time enter the solid angle $d\Omega$ about the scattering
centre. We define the basic quantity characterizing the process of scattering,
the differential cross section for the scattering $d\sigma$, as the ratio

$$d\sigma = dN/I_0. \tag{43.1}$$


The cross section obviously has the dimensions of area. The number of
particles scattered from the beam by the target volume V into the solid angle
$d\Omega$ during the time dt is equal to

$$dN_{\rm tot} = d\sigma\, I_0\, pV\, dt, \tag{43.2}$$

where p is the number density of the target particles (pV is the total number
of target particles).

If the scattering occurs without a change in the energy of the scattered 
particles, then, multiplying the numerator and the denominator of (43.1) by 
the energy e of the particle, we can write the differential cross section in the 
form 


$$d\sigma = dI/I_0, \tag{43.3}$$


where dI is the flux of energy carried by the particles per unit time in the
solid angle $d\Omega$, and $I_0$ is the intensity of the energy flux in the incident
beam. Formula (43.3) is the same as the definition of cross section given in 
§36. 

In addition to the differential cross section, scattering is also characterized 
by the total cross section 


$$\sigma = \int d\sigma, \tag{43.4}$$


where the integration is carried out over all possible values of solid angle. 

The quantity $d\sigma$, expressed in terms of directly measurable quantities, can
be connected with parameters characterizing the individual collision process. 

Consider an individual act of collision between two particles. We confine 
ourselves to the case where the internal energy of the particles remains un- 
changed. Such collisions are called elastic. It should not be thought that in 
elastic collisions the energy of the scattered particle does not change. The 
target particle receives from the scattered one a certain momentum and 
energy, whose value depends on the mass ratio of the particles. 

In inelastic collisions, for example collision of an electron with an ion,
there occurs an additional transfer of energy to the ion, whose internal state
changes. Such processes are more complicated and will be considered in 
Part V. 





Fig. I.23


At first we shall refer the process of scattering of two particles to the
centre-of-mass system. According to the results of the preceding section,
the problem of the relative motion of particles in the centre-of-mass system
reduces to the problem of the motion of one particle with reduced mass $\mu$
relative to the motionless force centre located at the centre of mass.

At a large distance from the centre of mass let the incident particle move
rectilinearly with a velocity $v_0$. Its energy and angular momentum are
respectively equal to

$$E = \frac{\mu v_0^2}{2}, \qquad L = \mu v_0\rho = \rho\sqrt{2\mu E}, \tag{43.5}$$

where $\rho$ is the distance between the force centre and the straight line along
which the particle would pass by if there were no interaction. The quantity $\rho$
is called the impact parameter.

For given values of E and $\rho$ the trajectory is completely determined. We
characterize the process of scattering by the scattering (deflection) angle $\theta$,
representing the angle between the directions of motion of the particle at a
large distance from the centre before and after the scattering (see fig. I.23). It
is obvious that the angle $\theta$ is the supplement of the angle $\psi_0$ between the
asymptotes of the trajectory (fig. I.24). Because of the symmetry of the force
field and the distribution of the trajectories relative to the axis of the beam,
the number of scattered particles and the cross section for scattering depend
only on the angle $\theta$ and not on the azimuthal angle.

The solid angle $d\Omega$ can therefore be written in the form


$$d\Omega = 2\pi\sin\theta\,d\theta.$$





Fig. I.24


The number of particles scattered into the solid angle $d\Omega$ is correspondingly
equal to

$$dN = 2\pi I_0\,\sigma(\theta)\sin\theta\,d\theta = I_0\,d\sigma, \tag{43.6}$$

where

$$d\sigma = 2\pi\,\sigma(\theta)\sin\theta\,d\theta.$$


Since the trajectory of the scattered particle is uniquely determined by the
energy and the impact parameter (E and $\rho$), there corresponds to each scattering
angle a trajectory with a definite value of the impact parameter. Hence it
follows that the number of particles scattered at a given angle $\theta$ is equal to
the number of particles having a given value of the impact parameter at an
infinitely large distance from the centre.

In other words, all particles having a value of the impact parameter lying in
the interval $\rho$, $\rho + d\rho$ are scattered into the solid angle $d\Omega$. Hence the number
of scattered particles can also be written in the form


$$-dN = I_0\,2\pi\rho\,d\rho, \tag{43.7}$$

where $2\pi\rho\,d\rho$ is the area of the ring shown in fig. I.23 on the left.
Comparing (43.6) and (43.7) we find

$$\sigma(\theta) = -\frac{\rho\,d\rho}{\sin\theta\,d\theta} = \frac{\rho}{\sin\theta}\left|\frac{d\rho}{d\theta}\right|. \tag{43.8}$$


Since to large values of $\rho$ there correspond small deflections $\theta$, and since the
cross section, by definition, must have a positive value, we have written the
absolute value of the derivative. Integrating formula (43.8) on the left with
respect to scattering angles in the interval from $\theta$ to $\pi$, and on the right with
respect to the corresponding values of $\rho$, i.e. in the interval from $\rho(\theta)$ to zero,
we obtain the important relation

$$\tfrac{1}{2}\rho^2(\theta) = \int_\theta^\pi \sigma(\theta)\sin\theta\,d\theta. \tag{43.9}$$


In order to calculate the differential cross section it is necessary to
establish a relation between the impact parameter $\rho$ and the scattering angle $\theta$. For
this it is sufficient to calculate the trajectory and find the dependence of the
angle $\psi_0$ between the asymptotes on the parameter $\rho$.

General formula (42.17) gives

$$\psi_0 = \int\frac{L\,dr}{r^2\sqrt{2\mu\bigl(E - U(r)\bigr) - \dfrac{L^2}{r^2}}}
= \int\frac{dr}{r^2\sqrt{\dfrac{2\mu E}{L^2} - \dfrac{2\mu U(r)}{L^2} - \dfrac{1}{r^2}}}
= \int\frac{dr}{r^2\sqrt{\dfrac{1}{\rho^2}\left(1 - \dfrac{U(r)}{E}\right) - \dfrac{1}{r^2}}}. \tag{43.10}$$

The limits of the integral are determined from the following considerations.
The angle $\psi_0$ represents the change in the angle $\psi$ when the particle describes
all its trajectory. The trajectory has two branches: the motion of the particle
from infinity to the point of closest approach $r_0$, and its motion away from
the point $r_0$ to infinity (see fig. I.24). The trajectory of the particle is always
symmetric relative to the point of closest approach $r_0$. This follows from the
reversibility of the process of scattering: the particle moving in the opposite
direction must move on the same asymptote on which the particle moves
forward (see fig. I.24). Hence the integral over the trajectory can be written
in the form of the sum of two equal integrals taken in the range from $r_0$ to
infinity:

$$\psi_0 = 2\int_{r_0}^\infty \frac{dr}{r^2\sqrt{\dfrac{1}{\rho^2}\left(1 - \dfrac{U(r)}{E}\right) - \dfrac{1}{r^2}}}, \tag{43.11}$$


where $r_0$ is the root of the equation

$$E = V(r_0) = U(r_0) + \frac{E\rho^2}{r_0^2}. \tag{43.12}$$

Since the scattering angle is $\theta = \pi - \psi_0$, formulae (43.11) and (43.12) connect
the value of $\theta$ sought with the parameters of the collision (i.e. the quantities
E and $\rho$) for an arbitrary form of the potential U depending only on the
distance r.
Let us consider, in particular, the case of Coulomb repulsion:

$$U = |e_1e_2|/r.$$


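Expressions (43.12) and (43.11) then take the forms

$$E = \frac{|e_1e_2|}{r_0} + \frac{\rho^2E}{r_0^2}, \tag{43.13}$$

or

$$\frac{1}{r_0} = -\frac{|e_1e_2|}{2E\rho^2} + \sqrt{\left(\frac{|e_1e_2|}{2E\rho^2}\right)^2 + \frac{1}{\rho^2}}, \tag{43.14}$$

and

$$\psi_0 = 2\int_{r_0}^\infty \frac{dr}{r^2\sqrt{\dfrac{1}{\rho^2} - \dfrac{|e_1e_2|}{\rho^2Er} - \dfrac{1}{r^2}}}. \tag{43.15}$$

Introducing the new variable z = 1/r, we obtain

$$\psi_0 = 2\int_0^{z_0}\frac{dz}{\sqrt{\dfrac{1}{\rho^2} - \dfrac{|e_1e_2|}{\rho^2E}\,z - z^2}}
= -2\arccos\frac{z + \dfrac{|e_1e_2|}{2\rho^2E}}{\sqrt{\left(\dfrac{|e_1e_2|}{2\rho^2E}\right)^2 + \dfrac{1}{\rho^2}}}\;\Bigg|_0^{z_0}
= 2\arccos\frac{\dfrac{|e_1e_2|}{2\rho^2E}}{\sqrt{\left(\dfrac{|e_1e_2|}{2\rho^2E}\right)^2 + \dfrac{1}{\rho^2}}}. \tag{43.16}$$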


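In the calculation we substituted the value $z_0 = 1/r_0$ from expression (43.14).
Further we find

$$\cos\tfrac{1}{2}\psi_0 = \frac{\dfrac{|e_1e_2|}{2\rho^2E}}{\sqrt{\left(\dfrac{|e_1e_2|}{2\rho^2E}\right)^2 + \dfrac{1}{\rho^2}}},$$

or

$$\rho = \frac{|e_1e_2|}{2E}\tan\tfrac{1}{2}\psi_0 = \frac{|e_1e_2|}{2E}\tan\tfrac{1}{2}(\pi - \theta) = \frac{|e_1e_2|}{2E}\cot\tfrac{1}{2}\theta. \tag{43.17}$$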


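Formula (43.17) gives the relation sought between $\rho$ and $\theta$. Substituting the
value of $\rho$ into (43.8), we have

$$\sigma = \frac{\rho}{\sin\theta}\left|\frac{d\rho}{d\theta}\right| = \frac{1}{16}\,\frac{(e_1e_2)^2}{E^2}\,\frac{1}{\sin^4\tfrac{1}{2}\theta}. \tag{43.18}$$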


Formula (43.18) gives the differential cross section for the scattering of
particles which repel each other according to Coulomb's law. Since $\sigma \sim (e_1e_2)^2$,
it is clear that an identical result is obtained in the case of attraction (for
E > 0). The expression found for the cross section is called the Rutherford
formula. It was obtained in connection with experiments on the scattering of
$\alpha$-particles by atoms which allowed Rutherford to establish the nuclear
structure of atoms.
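
The step from (43.17) to (43.18) can be verified numerically. The sketch below (function names and step size are choices made here, not the book's) evaluates $\sigma = (\rho/\sin\theta)\,|d\rho/d\theta|$ from (43.8) with a central-difference derivative of $\rho(\theta)$ and compares it with the closed Rutherford expression.

```python
import math


def impact_parameter(theta, E, e1e2):
    """rho(theta) from (43.17): |e1 e2| / (2E) * cot(theta/2)."""
    return abs(e1e2) / (2.0 * E) / math.tan(0.5 * theta)


def sigma_from_derivative(theta, E, e1e2, h=1e-6):
    """Cross section from (43.8), with d(rho)/d(theta) by central difference."""
    rho = impact_parameter(theta, E, e1e2)
    drho = (impact_parameter(theta + h, E, e1e2)
            - impact_parameter(theta - h, E, e1e2)) / (2.0 * h)
    return rho / math.sin(theta) * abs(drho)


def sigma_rutherford(theta, E, e1e2):
    """Closed form (43.18): (e1 e2)^2 / (16 E^2 sin^4(theta/2))."""
    return e1e2**2 / (16.0 * E**2 * math.sin(0.5 * theta)**4)


if __name__ == "__main__":
    for theta in (0.3, 1.0, 2.5):
        print(theta,
              sigma_from_derivative(theta, 2.0, 1.5),
              sigma_rutherford(theta, 2.0, 1.5))
```

Changing the sign of `e1e2` leaves both functions unchanged, which mirrors the remark that attraction and repulsion give the same cross section.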

Formula (43.11) can be inverted and the form of the function U(r) can be 
found from the results of experiment. For this it is necessary to consider 
(43.11) as an integral equation with respect to the unknown function U(r). 


Under certain (very general) assumptions about the form of the function U(r)
eq. (43.11) can be solved *.
We write $\theta$ in the form

$$\theta = \pi - \psi_0 = \pi - 2\int_{r_0}^\infty \frac{\rho\,dr}{r\sqrt{\left(1 - \dfrac{U(r)}{E}\right)r^2 - \rho^2}}. \tag{43.19}$$


As for U(r) we shall assume that it decreases with increasing r as $r \to \infty$. We
introduce a new function


* We follow the work of O.B.Firsov, Zhur. Exp. i Teoret. Fiz. 24 (1953) 279. 


$$F(r) = \left(1 - \frac{U(r)}{E}\right) r^2, \tag{43.20}$$


and go over from integration with respect to r to integration with respect to 
F. Since 


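$$2\,\frac{dr}{r} = d\ln r^2 = \frac{d\ln r^2}{dF}\,dF,$$

and, using (43.12),

$$F(r_0) = \left(1 - \frac{U(r_0)}{E}\right) r_0^2 = \rho^2, \qquad F(\infty) = \infty, \tag{43.21}$$

we have

$$\theta = \pi - \int_{F(r_0)}^{F(\infty)} \rho\,\frac{d\ln r^2}{dF}\,\frac{dF}{\sqrt{F - \rho^2}}. \tag{43.22}$$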
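Writing

$$\frac{d\ln r^2}{dF} = -\frac{d}{dF}\ln\left(\frac{F}{r^2}\right) + \frac{d\ln F}{dF},$$

we find, substituting for the bounds $F(r_0)$ and $F(\infty)$ the values (43.21),

$$\theta = \pi + \rho\int_{\rho^2}^\infty \frac{d}{dF}\ln\left(\frac{F}{r^2}\right)\frac{dF}{\sqrt{F - \rho^2}} - \rho\int_{\rho^2}^\infty \frac{dF}{F\sqrt{F - \rho^2}}.$$

The last integral is easy to calculate, because

$$\int_{\rho^2}^\infty \frac{dF}{F\sqrt{F - \rho^2}} = \frac{2}{\rho}\arctan\frac{\sqrt{F - \rho^2}}{\rho}\,\Bigg|_{\rho^2}^\infty = \frac{\pi}{\rho}.$$

Hence

$$\theta = \rho\int_{\rho^2}^\infty \frac{d}{dF'}\ln\left(\frac{F'}{r'^2}\right)\frac{dF'}{\sqrt{F' - \rho^2}}. \tag{43.23}$$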


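We multiply both sides of (43.23) by the factor $1/\sqrt{\rho^2 - F}$ and integrate
with respect to the parameter $\rho$ in the range from $\rho = \sqrt F$ to $\rho \to \infty$. Then we
have

$$\int_{\sqrt F}^\infty \frac{\theta(\rho)\,d\rho}{\sqrt{\rho^2 - F}} = \int_{\sqrt F}^\infty \frac{\rho\,d\rho}{\sqrt{\rho^2 - F}} \int_{\rho^2}^\infty \frac{d}{dF'}\ln\left(\frac{F'}{r'^2}\right)\frac{dF'}{\sqrt{F' - \rho^2}}.$$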


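We change the order of integration with respect to $\rho$ and F', writing

$$\int_{\sqrt F}^\infty \frac{\theta(\rho)\,d\rho}{\sqrt{\rho^2 - F}} = \int_F^\infty dF'\,\frac{d}{dF'}\ln\left(\frac{F'}{r'^2}\right) \int_{\sqrt F}^{\sqrt{F'}} \frac{\rho\,d\rho}{\sqrt{-\rho^4 + \rho^2(F + F') - FF'}}.$$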


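Further,

$$\int_{\rho = \sqrt F}^{\rho = \sqrt{F'}} \frac{\tfrac{1}{2}\,d\rho^2}{\sqrt{-\rho^4 + \rho^2(F + F') - FF'}}
= \frac{1}{2}\arcsin\frac{2\rho^2 - (F + F')}{\sqrt{(F + F')^2 - 4FF'}}\,\Bigg|_{\rho = \sqrt F}^{\rho = \sqrt{F'}} = \tfrac{1}{2}\pi.$$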





The fact that, as a result of integration with respect to p, a factor independent 
of F' is obtained allows us to reduce the integration with respect to F’ to an 
elementary transformation. In order to obtain a more explicit result we 
return, before making the substitution of the limits, to the variable r, writing 


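$$\int_{F(r)}^{F(\infty)} \frac{d}{dF'}\ln\left(\frac{F'}{r'^2}\right) dF' = \int_r^\infty \frac{d}{dr'}\ln\left(\frac{F(r')}{r'^2}\right) dr'
= \ln\left(1 - \frac{U(r')}{E}\right)\Bigg|_r^\infty = -\ln\left(1 - \frac{U(r)}{E}\right) = -\ln\frac{F}{r^2}.$$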


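According to the above assumption we put $U \to 0$ as $r \to \infty$.
As a result we arrive at the equality

$$\ln\frac{\sqrt F}{r} = -\frac{1}{\pi}\int_{\sqrt F}^\infty \frac{\theta(\rho)\,d\rho}{\sqrt{\rho^2 - F}}, \tag{43.24}$$

or

$$\frac{\sqrt F}{r} = \exp\left[-\frac{1}{\pi}\int_{\sqrt F}^\infty \frac{\theta(\rho)\,d\rho}{\sqrt{\rho^2 - F}}\right]. \tag{43.25}$$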


The last expression contains integration only with respect to $\rho$. If the
dependence $\sigma(\theta)$ is known, then by virtue of (43.9) the functional dependence
$\theta(\rho)$ is also given. Hence formula (43.25) gives the function F(r) in implicit
form.

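A particularly simple result is obtained for the scattering of particles with
a large energy, for which $U(r) \ll E$ for all values of r. Then formula (43.25)
can be rewritten, making use of the equalities

$$\frac{\sqrt F}{r} = \sqrt{1 - \frac{U(r)}{E}} \approx 1 - \frac{U(r)}{2E}, \qquad
\exp\left[-\frac{1}{\pi}\int_{\sqrt F}^\infty \frac{\theta(\rho)\,d\rho}{\sqrt{\rho^2 - F}}\right] \approx 1 - \frac{1}{\pi}\int_r^\infty \frac{\theta(\rho)\,d\rho}{\sqrt{\rho^2 - r^2}},$$

and we find the potential energy sought:

$$U(r) \approx \frac{2E}{\pi}\int_r^\infty \frac{\theta(\rho)\,d\rho}{\sqrt{\rho^2 - r^2}}. \tag{43.26}$$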
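
As a consistency check of (43.26), one may insert the small-angle Coulomb deflection $\theta(\rho) \approx |e_1e_2|/E\rho$, which follows from (43.17) for $\theta \ll 1$, and confirm that the Coulomb potential $|e_1e_2|/r$ is recovered. The sketch below is an illustration under assumptions made here: the substitution $\rho = r\cosh t$ (which turns $d\rho/\sqrt{\rho^2 - r^2}$ into $dt$ and removes the endpoint singularity), the midpoint rule, the cut-off `t_max` and the function name `potential_high_energy` are not taken from the text.

```python
import math


def potential_high_energy(theta_of_rho, r, E, t_max=40.0, n=4000):
    """U(r) from (43.26) for a given deflection function theta(rho).

    With rho = r*cosh(t) the integral over rho from r to infinity becomes
    the integral of theta(r*cosh(t)) over t from 0 to infinity; it is cut
    off at t_max and evaluated by the midpoint rule.
    """
    h = t_max / n
    total = 0.0
    for i in range(n):
        t = (i + 0.5) * h
        total += theta_of_rho(r * math.cosh(t))
    return 2.0 * E / math.pi * total * h


if __name__ == "__main__":
    E, k = 100.0, 1.0                      # k stands for |e1 e2|
    theta = lambda rho: k / (E * rho)      # small-angle Coulomb deflection
    for r in (0.5, 1.0, 2.0):
        print(r, potential_high_energy(theta, r, E), k / r)
```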


The above calculations of the scattering cross section have been carried out in
the centre-of-mass system. For their practical use it is necessary to make a
transition to the laboratory system.
The importance of this transition is seen from the general picture of the
process of scattering from the point of view of the two coordinate systems. In








Fig. I.25    Fig. I.26


the centre-of-mass system each of the particles, the incident and the target
one, moves on a trajectory determined by formula (43.10). From the fact
that the total momentum of the system of two particles, referred to the
centre-of-mass system, is equal to zero it follows that the two particles move
in opposite directions with equal momenta (fig. I.25). On the contrary, from
the standpoint of the laboratory system the incident and target particles are
not equivalent. Before collision the target particle was at rest. After collision
it is set in motion (fig. I.26).

Our problem is to establish a relation between the angles $\theta_{\rm c.m.}$ in the
centre-of-mass system and $\theta_l$ in the laboratory system. For this we put in the
same diagram the velocities of the scattered particle after collision in the
centre-of-mass ($\mathbf v'_{\rm c.m.}$) and laboratory ($\mathbf v'_l$) systems. It is obvious that equalities
(42.4) and (42.7) hold. The vector R coincides with the direction of motion
of the incident particle. Hence the angles sought for are respectively equal to
(fig. I.27):

$$\theta_{\rm c.m.} = \angle(\mathbf v'_{\rm c.m.}, \mathbf R), \qquad \theta_l = \angle(\mathbf v'_l, \mathbf R), \tag{43.27}$$


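where the prime denotes that the values of the velocity are taken after
collision. For brevity we omitted the index 1 occurring in (42.4).
From fig. I.27 it is clear that

$$\tan\theta_l = \frac{v'_{\rm c.m.}\sin\theta_{\rm c.m.}}{v'_{\rm c.m.}\cos\theta_{\rm c.m.} + |\mathbf R|}. \tag{43.28}$$

According to (42.4) and (42.7) we can write

$$\mathbf R = \frac{\mu}{m_2}\,\mathbf v_0, \quad \text{since } \mathbf v_1 = \mathbf v_0,\ \mathbf v_2 = 0,$$

$$v'_{\rm c.m.} = \frac{\mu}{m_1}\,v_0, \quad \text{since } \mathbf v_{\rm c.m.} = \mathbf v_0 - \mathbf R.$$

Substituting into (43.28), we have

$$\tan\theta_l = \frac{\sin\theta_{\rm c.m.}}{\cos\theta_{\rm c.m.} + \dfrac{m_1}{m_2}}. \tag{43.29}$$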

Formula (43.29) establishes * the relation between the angles $\theta_{\rm c.m.}$ and $\theta_l$.

It should be noted that, if the mass of the target particle is $m_2 \gg m_1$,
then $\theta_l \approx \theta_{\rm c.m.}$. The meaning of this result is obvious: a very heavy particle
obtains no momentum from the incident particle and remains at rest, so that the
centre-of-mass system coincides with the laboratory system.
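
The two limiting cases of (43.29) are easy to see numerically. In the sketch below the function name `lab_angle` is a choice made here; `math.atan2` is used instead of a plain arctangent so that the angle stays in the correct quadrant when the denominator of (43.29) becomes negative.

```python
import math


def lab_angle(theta_cm, m1, m2):
    """Laboratory scattering angle from (43.29) for an elastic collision.

    theta_cm is the centre-of-mass angle (0..pi), m1 the mass of the
    incident particle, m2 the mass of the target initially at rest.
    """
    return math.atan2(math.sin(theta_cm), math.cos(theta_cm) + m1 / m2)


if __name__ == "__main__":
    print(lab_angle(1.0, 1.0, 1.0))    # equal masses: half the c.m. angle
    print(lab_angle(1.0, 1.0, 1.0e9))  # very heavy target: nearly the c.m. angle
```

For equal masses the formula reduces to $\tan\theta_l = \tan\tfrac{1}{2}\theta_{\rm c.m.}$, so the laboratory angle never exceeds $\tfrac{1}{2}\pi$.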

Knowing the relation between the scattering angles in the laboratory and
centre-of-mass systems, we can express the corresponding cross sections in
terms of each other. Namely, writing the number of particles scattered into
the solid angle $d\Omega_l$ and $d\Omega_{\rm c.m.}$ in the two coordinate systems

$$dN = 2\pi I_0\,\sigma_l(\theta_l)\sin\theta_l\,d\theta_l,$$

$$dN = 2\pi I_0\,\sigma_{\rm c.m.}(\theta_{\rm c.m.})\sin\theta_{\rm c.m.}\,d\theta_{\rm c.m.},$$

we find the relation sought for between the cross sections in the laboratory
($\sigma_l$) and centre-of-mass ($\sigma_{\rm c.m.}$) systems:


* It should be stressed that this refers only to elastic collisions, because in such
collisions the relative velocity before and after the collision has the value $v_0$.





$$\sigma_l(\theta_l) = \sigma_{\rm c.m.}(\theta_{\rm c.m.})\,\frac{\sin\theta_{\rm c.m.}\,d\theta_{\rm c.m.}}{\sin\theta_l\,d\theta_l}. \tag{43.30}$$


In contrast to the differential cross section, the total cross section is the same 
in all coordinate systems: 


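$$\sigma = 2\pi\int\sigma_l(\theta_l)\sin\theta_l\,d\theta_l = 2\pi\int\sigma_{\rm c.m.}(\theta_{\rm c.m.})\sin\theta_{\rm c.m.}\,d\theta_{\rm c.m.}. \tag{43.31}$$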
We shall also dwell on the calculation of the energy lost by the scattered
particle. From fig. I.27 we can write, on the basis of the cosine theorem, that

$$(v'_{\rm c.m.})^2 = (v'_l)^2 + R^2 - 2R\,v'_l\cos\theta_l. \tag{43.32}$$

Substituting the values of R and $v'_{\rm c.m.}$ and denoting by $E_l$ and $E'_l$ the energy
of the scattered particle respectively before and after collision in the
laboratory system, we find

$$\frac{E'_l}{E_l} - \frac{2\mu}{m_2}\sqrt{\frac{E'_l}{E_l}}\,\cos\theta_l = \frac{m_2 - m_1}{m_2 + m_1}. \tag{43.33}$$


Eq. (43.33) determines the dependence of $E'_l$ on the scattering angle $\theta_l$ and
the mass of the particles. The largest energy loss takes place when $m_1 = m_2$
and $\theta_l = \tfrac{1}{2}\pi$ (in this case $\theta_{\rm c.m.} = \pi$). In this case $E'_l = 0$ and the energy is
completely transferred to the target particle.
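
Equation (43.33) is a quadratic in $\sqrt{E'_l/E_l}$ and can be solved directly. The sketch below is restricted, by a choice made here, to $m_1 \le m_2$, where the right-hand side is non-negative and only one root is non-negative; the function name `energy_ratio` is likewise an assumption.

```python
import math


def energy_ratio(theta_l, m1, m2):
    """E'_l / E_l from (43.33) for laboratory angle theta_l, assuming m1 <= m2.

    With s = sqrt(E'_l / E_l), a = mu/m2 = m1/(m1 + m2) and
    b = (m2 - m1)/(m2 + m1), the equation reads s^2 - 2 a cos(theta_l) s - b = 0.
    """
    if m1 > m2:
        raise ValueError("this sketch covers only m1 <= m2")
    a = m1 / (m1 + m2)
    b = (m2 - m1) / (m2 + m1)
    c = math.cos(theta_l)
    s = a * c + math.sqrt(a * a * c * c + b)   # the non-negative root
    return s * s


if __name__ == "__main__":
    print(energy_ratio(math.pi / 3, 1.0, 1.0))   # equal masses: cos^2(theta_l)
    print(energy_ratio(math.pi / 2, 1.0, 1.0))   # all energy given to the target
    print(energy_ratio(1.0, 1.0, 1.0e9))         # heavy target: almost no loss
```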

We shall not dwell on other details of the theory of scattering which are 
covered in the books cited. 

Now we turn to the calculation of the radiation arising when a beam of 
charged particles passes by charges at rest. This phenomenon, called brems- 
strahlung, gives rise to X-rays (with a continuous spectrum) and plays an 
important role in the deceleration of high-energy particles moving in matter. 

Since it is just high-energy particles that are of basic interest, we shall 
restrict ourselves to this case. 

When the incident particle has a sufficiently large energy its scattering 
angle is small, if the improbable processes of central collision are excluded. 
Considering the trajectory to be almost rectilinear, one can simplify the 
corresponding formulae. 

Let us first of all consider the radiation emitted by one particle. Locating 
the origin at the scattering centre, one can write the following expression for 
the components of forces acting on the incident particle: 


$$F_x = \frac{e_1e_2}{r^3}\,x, \qquad F_y = \frac{e_1e_2}{r^3}\,y. \tag{43.34}$$


Since the motion takes place in the (xy) plane, the component $F_z$ of the
force is absent.

Assuming that the velocity of motion of the particle is constant and is not 
changed by the scattering, and that the deflection from the rectilinear trajec- 
tory is small *, one can put in the formulae for the force components 


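$$x \approx vt, \qquad y \approx \rho, \qquad dt \approx dx/v. \tag{43.35}$$

The energy emitted by the particle into the solid angle $d\Omega$ in passing by
the scattering centre is found by integrating the formula (28.11) (in which it
should be assumed that $m_2 \to \infty$) with respect to the transit time. Taking into
account (43.35), we have

$$-\Delta E\,d\Omega = \frac{e_1^2}{4\pi c^3 m_1^2 v}\,d\Omega\int_{-\infty}^{+\infty}[\mathbf F\times\mathbf n]^2\,dx.$$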


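Developing the expression $[\mathbf F\times\mathbf n]^2$, we have

$$[\mathbf F\times\mathbf n]^2 = F^2 - (\mathbf F\cdot\mathbf n)^2 = F_x^2 + F_y^2 - (F_xn_x + F_yn_y)^2
= F_x^2(1 - n_x^2) + F_y^2(1 - n_y^2) - 2F_xF_yn_xn_y,$$

where $n_x$ and $n_y$ are the components of the unit vector n in the direction of
the solid angle $d\Omega$.
We now have to evaluate three integrals. The first of them is

$$\int_{-\infty}^{+\infty}F_x^2\,dx = e_1^2e_2^2\int_{-\infty}^{+\infty}\frac{x^2\,dx}{r^6} = 2e_1^2e_2^2\int_0^\infty\frac{x^2\,dx}{(x^2 + \rho^2)^3}.$$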

* L.D.Landau and E.M.Lifshitz, Mechanics (Pergamon Press, Oxford, 1960). 


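Putting $x/\rho = \tan\psi$, we find

$$\int_{-\infty}^{+\infty}F_x^2\,dx = 2e_1^2e_2^2\int_0^\infty\frac{x^2\,dx}{(x^2 + \rho^2)^3}
= \frac{2e_1^2e_2^2}{\rho^3}\int_0^{\pi/2}\frac{\tan^2\psi\,d\psi}{(\tan^2\psi + 1)^3\cos^2\psi} =$$

$$= \frac{2e_1^2e_2^2}{\rho^3}\int_0^{\pi/2}\sin^2\psi\cos^2\psi\,d\psi
= \frac{e_1^2e_2^2}{2\rho^3}\int_0^{\pi/2}\sin^2 2\psi\,d\psi = \frac{\pi e_1^2e_2^2}{8\rho^3}.$$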

The second and third integrals are calculated in an analogous way: 


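$$\int_{-\infty}^{+\infty}F_y^2\,dx = 2e_1^2e_2^2\rho^2\int_0^\infty\frac{dx}{(x^2 + \rho^2)^3} = \frac{3\pi e_1^2e_2^2}{8\rho^3},$$

$$\int_{-\infty}^{+\infty}F_xF_y\,dx = 0,$$

since $F_xF_y$ is an odd function of x. Finally,

$$-\Delta E\,d\Omega = \frac{e_1^4e_2^2}{32c^3vm_1^2}\,\frac{1}{\rho^3}\,\bigl\{(1 - n_x^2) + 3(1 - n_y^2)\bigr\}\,d\Omega =$$

$$= \frac{e_1^4e_2^2}{32c^3vm_1^2}\,\frac{1}{\rho^3}\,\bigl\{4 - \sin^2\theta\cos^2\psi - 3\sin^2\theta\sin^2\psi\bigr\}\sin\theta\,d\theta\,d\psi. \tag{43.36}$$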


The energy emitted is inversely proportional to the velocity and third
power of the impact parameter $\rho$, and depends in a rather complicated way
on the angles. A simple calculation gives for the total radiated energy

$$-\Delta E_{\rm tot} = -\int\Delta E\,d\Omega = \frac{\pi}{3}\,\frac{e_1^4e_2^2}{c^3m_1^2v}\,\frac{1}{\rho^3}. \tag{43.37}$$
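
The elementary integrals that lead from the force components to (43.37) can be checked numerically. The sketch below is illustrative only: the helper names, the midpoint rule and the grid sizes are choices made here. It uses the substitution $x = \rho\tan\psi$, under which the integrands of $F_x^2$ and $F_y^2$ along the straight path become $(e_1e_2)^2\sin^2\psi\cos^2\psi/\rho^3$ and $(e_1e_2)^2\cos^4\psi/\rho^3$, and it integrates the angular factor of (43.36) over the full solid angle, where the value $32\pi/3$ is expected.

```python
import math


def midpoint(f, a, b, n=2000):
    """Composite midpoint rule for the integral of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))


def force_integrals(rho, e1e2=1.0):
    """Integrals of F_x^2 and F_y^2 over x from -infinity to +infinity."""
    k = e1e2**2 / rho**3
    half_pi = 0.5 * math.pi
    ix = midpoint(lambda s: k * math.sin(s)**2 * math.cos(s)**2, -half_pi, half_pi)
    iy = midpoint(lambda s: k * math.cos(s)**4, -half_pi, half_pi)
    return ix, iy


def angular_factor(n=200):
    """Integral of {4 - sin^2(th)cos^2(ph) - 3 sin^2(th)sin^2(ph)} over solid angle."""
    def over_azimuth(th):
        g = lambda ph: 4.0 - math.sin(th)**2 * (math.cos(ph)**2 + 3.0 * math.sin(ph)**2)
        return math.sin(th) * midpoint(g, 0.0, 2.0 * math.pi, n)
    return midpoint(over_azimuth, 0.0, math.pi, n)


if __name__ == "__main__":
    ix, iy = force_integrals(rho=1.0)
    print(ix, math.pi / 8.0)
    print(iy, 3.0 * math.pi / 8.0)
    print(angular_factor(), 32.0 * math.pi / 3.0)
```

Multiplying $32\pi/3$ by the prefactor $e_1^4e_2^2/32c^3vm_1^2\rho^3$ of (43.36) reproduces the coefficient $\pi/3$ of (43.37).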


In practice the calculation of the energy loss of a fast charged particle per
unit path length in matter is of basic interest. The target centres deflecting
the particle are nuclei with a charge $e_2 = Ze$, where Z is the atomic number
of the element and e is the charge of an electron (scattering by electrons can
be disregarded). In scattering on each nucleus the particle emits energy
determined by formula (43.37). Multiplying (43.37) by the number of nuclei per
unit length in a cylindrical shell of radius $\rho$, $\rho + d\rho$ and integrating over all
impact parameters, we find for the energy loss

$$-\frac{dE}{dx} = -\int_{\rho_{\min}}^\infty \Delta E_{\rm tot}\,N\,2\pi\rho\,d\rho
= \frac{2\pi^2e_1^4Z^2e^2N}{3c^3m_1^2v}\int_{\rho_{\min}}^\infty\frac{d\rho}{\rho^2}
= \frac{2\pi^2Z^2e^2e_1^4N}{3c^3m_1^2v}\,\frac{1}{\rho_{\min}}, \tag{43.38}$$

where N is the number of nuclei per cm³ and $\rho_{\min}$ is a certain minimum value
of the impact parameter.

If there were no minimum limiting distance, formula (43.38) would lead 
to a senseless result: the energy loss of the particle due to radiation along its 
path in matter would be infinitely large. However, in reality it turns out that 
the laws of classical physics become inapplicable at small distances from the 
nucleus. The calculation of the minimum value of the impact parameter can 
be carried out only on the basis of quantum mechanics (see also §17 of 
Part II). 


PART II 


THEORY OF RELATIVITY 








General Principles of the Theory of Relativity


§1. The creation and significance of the theory of relativity 


The development by the end of the 19th and the beginning of the 20th 
century of the theory of the electromagnetic field, and the improvement of 
experimental methods of investigating electromagnetic processes allowed in- 
vestigators to carry out a persistent search for direct proofs of the existence 
of the hypothetical ether. Within the scope of this book we cannot describe 
the history of this search, which did not lead to the discovery of the ether 
but to the development of a new view of the physical world and to the com- 
plete rejection of the concepts of space and time which had been established 
in physics by the end of the last century. 

These classical concepts were closely associated with the successes of 
classical mechanics. The principles of the classical view of physics can be
expressed briefly in the following words: 

1. A physical phenomenon can be considered to be thoroughly understood 
only when a mechanical model of it has been constructed. 

2. The only possible form of physical law is a dynamical law of classical
mechanics. As is known, in classical mechanics it is assumed that the specifi- 
cation of the forces which are acting and the initial conditions completely 
determines the motion of any mechanical system. Thus, the initial state 
determines completely the behaviour of the system at any subsequent instant. 
It is this statement which is contained in the idea of a dynamical law. 




3. All physical processes take place in space and time. The properties of 
space and time are established in classical mechanics. Any physical theory 
must be constructed according to mechanics. 

It was assumed that the properties of space reduce to: 

1) the equivalence of all directions (isotropy); 

2) the equivalence of all points of space (homogeneity); 

3) the Euclidean nature of space. 

It was assumed that, although the motion of physical bodies always takes 
place in space, the bodies in no way affect the properties of space. 

In classical mechanics it was also considered possible to introduce a unique 
universal time flowing uniformly and equally, independent of the state of 
motion of physical bodies. 

The creation of the theory of relativity by Einstein in 1905 led to a radical
revision of ideas on the properties of space and time and the character of the 
electromagnetic field, and to the denial of the necessity and possibility of 
constructing mechanical models for all physical phenomena. The theory of 
relativity played a paramount role in the further development of contempo- 
rary physics, in particular, atomic and nuclear physics. This role consisted not 
only in making use of the important relations of the theory of relativity. The 
theory was the first to show that classical concepts, obtained from every-day 
experience, which seem to be obvious, turn out to be inadequate in going to 
new fields of investigation. Hence it can be correctly stated that the appear- 
ance of the theory of relativity signified the beginning of the development of 
a new, non-classical physics. 


§ 2. Galilean transformations 


In order to characterize the motion of bodies in space it is necessary to 
make use of a system of physical bodies between which an interaction, for 
example an electromagnetic interaction, exists. In addition it is necessary to 
make use of a method for the measurement of time. A method of measuring 
time, called a clock, is provided by any periodic process. Then, knowing the
velocity of light and the time required for the light to travel from one body 
to another, one can determine the distances between the bodies. A set of 
bodies provided with clocks and located at distances determined in such a 

way is called a reference frame. Only when one makes use of a reference
frame can one speak about a definite law of motion of a body in space. If the
position of a body is referred to a reference frame at every instant of time,


then the set of all positions of the body in space forms a trajectory, and the 
sequence of the points of the trajectory represents a law of motion. 

Any set of bodies, moving according to arbitrary laws, can be chosen as a
reference frame. However, in the following we shall be interested in so-called
inertial frames. By inertial reference frames we shall understand those frames
in which Newton's law of inertia holds. In other words, in inertial reference
frames the motion of bodies which are not acted upon by external forces is
uniform and rectilinear. The special role of inertial reference frames is
associated with the fact that in them the motion has its simplest form. In
non-inertial reference frames, for example in a rotating reference frame, even the
simplest rectilinear and uniform motion is described by very complicated
relations.

Our problem is the comparison of the laws of motion of a body in different
reference frames. If a certain physical law does not change in the transition
from one reference frame to another, then we say that it is invariant
under this transformation.

Long ago it had been established that mechanical phenomena are equiva- 
lent in all inertial reference frames. In other words, the laws of classical 
mechanics are invariant under transition from one inertial reference frame to 
others. 

Let us consider two reference frames K and K’ moving relative to each 
other. We shall call K’ the moving frame, and K the frame at rest. The relative 
character of such a terminology will be particularly clear from what follows. 

It is easy to obtain relations between the velocity and position of a moving 
body with respect to the two inertial reference frames. We direct the x-axis 
and x'-axis along the velocity vector v of the relative motion. Then the rela- 
tive motion will take place only along the positive direction of the x-axis. 
Moreover, we match the origins of the two frames at the initial instant t= 0 
(fig. II.1). 







It should be noted, according to the above, that in order to find the laws 
of transformation from reference frame K to K’, the law of inertial motion of 
a given body must have the same form in the two reference frames. That is to 
say, in both frames the acceleration of such a body is the same and equal to
zero, i.e.

$$\ddot{x} = \ddot{x}' = 0, \quad \ddot{y} = \ddot{y}' = 0, \quad \ddot{z} = \ddot{z}' = 0.$$

On integrating we find

$$\dot{x} = \dot{x}' + v, \quad \dot{y} = \dot{y}', \quad \dot{z} = \dot{z}'. \tag{2.1}$$

The second integration gives

$$x = x' + vt, \quad y = y', \quad z = z'. \tag{2.2}$$


Here we have tacitly assumed that time has an absolute character and is the 
same in all reference frames. For completeness we should then write 


$$t = t'. \tag{2.3}$$


Formulae (2.2) and (2.3) are the Galilean law of transformation, while 
formula (2.1) is called the law of addition of velocities of classical mechanics. 
Of course, formulae (2.1) and (2.2) can easily be written also in vector form, 
without specifying the choice of orientation of the coordinate axes, as fol- 
lows: 


$$\dot{\mathbf{r}} = \dot{\mathbf{r}}' + \mathbf{v}, \quad \mathbf{r} = \mathbf{r}' + \mathbf{v}t. \tag{2.4}$$


The invariance of the laws of classical mechanics in the transition from one 
inertial reference frame to another is expressed mathematically by the fact 
that they are invariant under a Galilean transformation. This means that if in 
Newton’s equations

$$m\,\ddot{\mathbf{r}} = \mathbf{F}$$

one makes the substitution x → x', y → y', z → z', i.e. passes from the frame K
to the frame K’, they will remain the same, if the law of transformation of 
coordinates and time is the Galilean law (2.2)–(2.3). Indeed, since the equa-
tions of motion involve only accelerations, under Galilean transformations we
have

$$m\,\ddot{\mathbf{r}}' = \mathbf{F},$$

which is the same as the equations of motion in the reference frame K.

It should be emphasized that the reference frames K and K’ are completely 
equivalent. We could consider the transition from the reference frame K' to K 
with equal success. 
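
The invariance of the acceleration under the Galilean law (2.2), (2.3) can also be checked numerically. The following Python fragment is a minimal sketch added for illustration and is not part of the original text; the names `galilean` and `second_difference`, the sample trajectory and the numerical values of v and a are all assumptions.

```python
# Hypothetical illustration: the acceleration of a body is the same in K and K'.

def galilean(x, t, v):
    """Galilean law (2.2), (2.3): from (x, t) in K to (x', t') in K'."""
    return x - v * t, t

def second_difference(f, t, h=1e-3):
    """Central finite-difference estimate of the second derivative of f at t."""
    return (f(t + h) - 2.0 * f(t) + f(t - h)) / h**2

v = 5.0   # assumed velocity of K' relative to K
a = 2.0   # assumed constant acceleration of the body in K

def x_in_K(t):
    # sample trajectory in K: uniformly accelerated motion along the x-axis
    return 1.0 + 3.0 * t + 0.5 * a * t**2

def x_in_K_prime(t):
    # the same trajectory described in K' (the time is unchanged, t' = t)
    return galilean(x_in_K(t), t, v)[0]

acc_K = second_difference(x_in_K, 1.0)
acc_K_prime = second_difference(x_in_K_prime, 1.0)
print(acc_K, acc_K_prime)
```

Within the accuracy of the finite-difference estimate the two printed accelerations agree, whatever value is assumed for v.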

Thus, a uniform and rectilinear motion of the reference frame has no 
effect on mechanical processes taking place among a system of material 
points. This statement is called the Galilean principle of relativity. It should 
be noted that the term “the Galilean principle of relativity” was introduced 
in association with the creation of the theory of relativity. The term “rela-
tivity” emphasizes the complete equivalence of inertial reference frames. The
terms “at rest” and “a uniform and rectilinear motion” are of relative charac-
ter. In classical mechanics only a relative motion has meaning. On the other
hand, the ideas of absolute rest and absolute motion have no real significance. 
The principle of relativity in mechanics is usually formulated by the words 
“the uniform and rectilinear motion of a system of material points has no 
effect upon the inner motion of the system”. The principle of relativity in 
classical mechanics (the Galilean principle) is restricted to inertial reference 
frames. 

The Galilean principle of relativity is based on the concepts of classical 
physics about space and time. This principle, as well as formula (2.4) for the 
addition of velocities which results from it, is confirmed by such a vast 
amount of experimental evidence, in particular that associated with the
phenomena of the world around us, that it is adopted as being self-evident. 


§3. Attempts to determine an absolute velocity 


Shortly after the formulation of the Maxwell-Lorentz theory of the elec- 
tromagnetic field the problem arose of its generalization to the case of moving 
bodies. 

There is, however, a profound difference between the equations of classical 
mechanics and those of electrodynamics. 

Maxwell’s equations involve a characteristic velocity, the velocity of pro-
pagation of electromagnetic waves in a vacuum (the velocity of light). Hence
they are not invariant under Galilean transformations. One can easily verify
this by the direct substitution of the velocity c by the sum (c +v). 
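
A worked check, added here as an illustration and not taken from the original text: the choice of a spherical wave front emitted from the common origin at t = 0 as the test case is an assumption made for concreteness. Applying the Galilean formulae (2.2), (2.3) to the equation of the front written in the frame K gives

```latex
x^2 + y^2 + z^2 - c^2 t^2 = 0
\quad\longrightarrow\quad
(x' + vt')^2 + y'^2 + z'^2 - c^2 t'^2 = 0 ,
\qquad\text{so that on the } x'\text{-axis}\quad
x' = (c - v)\,t' \quad\text{or}\quad x' = -(c + v)\,t' .
```

In the frame K' the front would therefore no longer be a sphere centred at the origin: along the x'-axis light would propagate with the velocity c - v in one direction and c + v in the other.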

The question naturally arose as to the reference frame with respect to 
which the velocity of light is to be measured. The Lorentz classical electro- 
dynamics appeared to give an unambiguous answer to this question: it is 
measured with respect to a hypothetical medium called the universal ether. 

The ether was endowed with the properties of an all-pervading, homogene- 
ous and isotropic medium, motionless and filling all space. In the Lorentz 
theory the existence of an absolute isolated reference frame was assumed. To 
move meant to move with respect to the ether, and the velocity of motion 
with respect to the ether was the absolute velocity. 

Thus in the Lorentz theory, as distinct from classical mechanics, a decisive 
attempt to renounce the principle of relativity was made. The fact that the 
Maxwell-Lorentz equations, in contrast to the Newtonian ones, turned out 
not to be invariant under Galilean transformations, appeared to be a direct 
consequence of the renunciation of the principle of relativity. 

It is clear that the basic problem confronting electrodynamics at the end 
of the 19th century was the problem of the experimental determination of 
an absolute velocity and of obtaining direct proof of the existence of the 
ether *. 

We cannot here go into the history of the search for the ether, which can 
serve as an example of the inventiveness and persistence of many investiga- 
tors. 

We shall consider only the basic ideas of two possible experiments. Let a 
source and detector of electromagnetic waves be mounted on a body moving
with a velocity v relative to the motionless ether. If the source-detector
direction coincides with the direction of motion of the body with respect to
the ether v, then light will traverse the source-detector distance $l$ in a time
$T_1 = l/(c - v)$. By measuring the time $T_1$ one can find the velocity v with
respect to the ether. However, since c is very large, and the velocities attain-
able at the end of the last century were small, such a measurement was
beyond the feasible accuracy of experiment. It was possible, however, to
compare the time $T_1$ with the time $T_2$ during which light traverses the same


* See, for example, R.Becker, Electromagnetic fields and interactions (Blackie, Lon- 
don, 1964); W.Panofsky and M.Phillips, Classical electricity and magnetism (Addison- 
Wesley, 1964); and, in particular, L.I.Mandelshtam, Sobr. soch. (Collected papers), Part 
V (Izdatelstvo Akademii Nauk SSSR, 1950). In these books the reader can acquaint 
himself with the history of the problem, as well as with the detailed methods of carrying 
out the experiments. 





distance $l$ in a direction perpendicular to the velocity v. During the time $T_2$
the detector traverses a path $vT_2$ with respect to the ether, so that the total
path traversed by the light from the source to the detector is equal to
$\sqrt{l^2 + v^2 T_2^2}$. Correspondingly, for the time $T_2$ we have

$$T_2 = \frac{\sqrt{l^2 + v^2 T_2^2}}{c},$$

or

$$T_2 = \frac{l/c}{\sqrt{1 - v^2/c^2}}.$$


By making the rays travelling from the source to the detector along the 
direction of the velocity v interfere with those in the perpendicular direction, 
it was possible to determine the difference between $T_1$ and $T_2$ and thus the
velocity v with a high degree of accuracy. 
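
To give an idea of the size of the expected effect, the two times can be evaluated for the orbital motion of the Earth. The fragment below is only a sketch under stated assumptions: it uses the one-way expressions for the times along and across the motion given in this section, a rounded value of c, an orbital velocity of about 30 km/sec, and an arbitrarily chosen distance l of 10 m; none of these numbers is taken from the original text.

```python
import math

c = 3.0e8   # velocity of light in m/sec (rounded)
v = 3.0e4   # orbital velocity of the Earth in m/sec (approximate, assumed)
l = 10.0    # source-detector distance in m (hypothetical)

# time along the direction of motion, T1 = l / (c - v)
T1 = l / (c - v)
# time in the perpendicular direction, T2 = (l / c) / sqrt(1 - v^2 / c^2)
T2 = (l / c) / math.sqrt(1.0 - (v / c) ** 2)

difference = T1 - T2
relative_difference = difference / (l / c)   # expected to be close to v / c

print(difference, relative_difference)
```

With these assumed values the relative difference is of the order of v/c, i.e. about one part in ten thousand, which explains why a direct timing was out of reach and an interference method was needed.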

In 1881 such an experiment was carried out by Michelson, who made use 
of the velocity of the orbital motion of the Earth as the velocity of the 
source. 

He made the rays which traversed the path from the source to the detec- 
tor in the direction of motion of Earth interfere with those in the direction 
perpendicular to it. It should be noted that nowadays the accuracy of meas- 
urements by electronic techniques allows one to measure the difference 
$(T_1 - T_2)$ directly, without having recourse to interference.

To the surprise of Michelson’s contemporaries, no difference between the 
times $T_1$ and $T_2$ could be detected. It turned out that $T_1$ and $T_2$ were equal
to each other to a very high degree of accuracy. 

The following direct experiment may serve as another basic experiment. 
Let a source of light be moving, and a detector be at rest with respect to the 
ether. Then the dependence of the velocity of light on the velocity of the 
source can be found directly. In 1912 De Sitter proposed choosing the radia- 
tion of the so-called binary stars as radiation coming to the Earth from the 
moving source. Binary stars represent two stars close to each other, rotating
about a common centre of mass. By observing the velocity of light emitted 
as the star was moving towards the Earth and when it was moving in the 
opposite direction (after half of the period of revolution) one could deter-
mine the velocity of the star with respect to the ether. However, here also no
effect of the motion of the source on the value of the velocity of light was
discovered. 


A number of attempts were made to explain the negative result of this and 
many other similar experiments (for example, a change in the law of inter-
action between charges with a change in the value of their absolute velocity;
see §20). However, all these attempts turned out to be unsuccessful. The 
solution of the problem was given only in the Einstein theory of relativity. 


§4. Postulates of the Einstein theory of relativity 


The negative result of Michelson’s experiment led Einstein to revise the 
basic concepts of classical physics and, above all, the notions on the proper- 
ties of space and time. 

As a result he created the theory of relativity, which is also called the 
special theory of relativity. 

The theory is based on two principles or postulates: 

1) Einstein’s principle of relativity,

2) the principle of the existence of a limiting velocity of propagation of
interactions.

According to Einstein’s principle of relativity, a uniform and rectilinear
motion of bodies has no effect on processes taking place in them. In other
words, all laws of nature are the same in all inertial reference frames. If in a
certain inertial reference frame an arbitrary law of nature is expressed in the 
form of an equation in which a physical quantity is a function of coordinates 
and time, then, performing the transformation of the coordinates and time to 
another inertial reference frame, we must obtain the same functional depen-
dence of the physical quantity on the new coordinates and time. This state-
ment is briefly formulated by the words: “the laws of nature are invariant
under a transformation from one inertial reference frame to another”. It is 
clear that Einstein’s principle of relativity is a generalization of the Galilean 
principle of relativity. The latter established the relativity of inertial motion
and the impossibility of introducing the notions of absolute motion and 
absolute rest within the framework of classical mechanics. The negative 
result of Michelson’s experiment, as was first realized by Einstein, pointed 
out that the ideas of absolute motion and rest have no meaning in the theory 
of the electromagnetic field. 

However, there is a profound difference between the Galilean and Ein- 
stein’s principle of relativity. In the latter the transformation from one 
inertial reference frame to another is not associated with formulae for the 
transformation of coordinates and the law of addition of velocities of classical 
mechanics. Indeed, as an example, Maxwell's equations do not satisfy these
transformations. Hence, in Einstein’s theory, a new law of transformation of
coordinates and time had to be found for going from one inertial reference 
frame to another. The second postulate of the theory of relativity stating that 
any interactions between bodies propagate in vacuum with a universal finite 
velocity equal to the velocity of light in vacuum, c = 3 X 1010 cm/sec, and 
independent of the motion and state of the bodies serves this purpose. It is 
obvious that this postulate expresses directly the result of Michelson’s ex- 
periment. 

The second postulate of the theory of relativity was closely associated with 
the development of electrodynamics. It demonstrated clearly the inadequacy 
of the theory of action at a distance of classical mechanics. In electrodynam- 
ics it was established that there exists a finite velocity of propagation of 
electromagnetic interactions, numerically equal to the velocity of light in 
vacuum. Theoretical studies carried out later on by Einstein in association 
with the development of the so-called general theory of relativity showed 
also that the gravitational interaction has the nature of waves propagating in 
vacuum with the velocity of light. It is beyond doubt that the specific inter- 
action between nuclear particles has the character of a short-range action. 

It cannot be excluded that further development of physics may lead to 
the discovery of new forms of interaction. However, the principle of the limit- 
ing velocity of propagation of interactions expresses the hypothesis that the 
velocity of propagation of interactions in vacuum has a universal character 
and is associated directly with the properties of space and time and not with 
the physical nature of the interactions. 

The existence of the limiting velocity of propagation of interactions indi-
cates that there is a certain connection between intervals of space and time.
This connection will be demonstrated more clearly in analysing the conclu-
sions of the theory of relativity. At the same time, the existence of the limit-
ing velocity automatically implies the restriction of the velocity of motion of
material bodies to the value c. If any particles could move with a velocity 
higher than that of light, then these particles could bring about an interaction 
between bodies with a velocity higher than the limiting velocity. Thus, 
Einstein’s second postulate restricts the value of all velocities of motion 
possible in nature and the velocity of propagation of interactions to the
value c. 

The principle of the existence of a limiting velocity of propagation of 
interactions is closely associated with Einstein’s principle of relativity. Indeed, 
it is easily seen that, if the velocity of propagation of interactions depended 
on the velocity of particles or on the nature of the interaction (i.e. if it were 
different for the electromagnetic and gravitational interaction), then the
principle of relativity would be violated. For example, if the velocity of light
depended on the velocity of a rectilinear and uniform motion of the source 
of light, then the latter could be determined experimentally. 

The propagation of interactions is, in the theory of relativity, often called 
the propagation of signals. By a signal one understands any interaction be-
tween bodies at a finite distance from each other and in a state of rest or
relative motion. The principle of the existence of a finite velocity of propaga- 
tion of interaction is called the principle of existence of a finite velocity of 
propagation of signals. The overall content of the theory of relativity follows 
from its two postulates. In particular, formulae for the transformation of 
coordinates and time, in place of the Galilean transformation formulae, result 
directly from the basic postulates of Einstein. At present both postulates of 
the theory of relativity are confirmed by a whole set of experimental data 
obtained in investigating atomic and nuclear processes, motions of fast par- 
ticles in accelerators, and other devices. 

In what follows we shall present a number of examples illustrating the last 
statement. 


§5. The Lorentz transformation 


Proceeding from the postulates of Einstein’s theory of relativity formu- 
lated above one can find the law of transformation relating space coordinates 
and time in two reference frames in uniform rectilinear motion relative to 
each other. 

Let x, y, z, t and x', y', z', t' be the coordinates and time in inertial refer-
ence frames K and K’, and v the velocity of their relative motion.

There are no grounds for assuming that the time in the frame K’ coincides 
with the time in the frame K, as was implicitly assumed in classical physics. 

To simplify the calculations we shall choose the direction of velocity to be 
the direction of the x-axis and x’-axis, as is shown in fig. II.1. 

We assume that at a certain instant t' at the point with coordinates (x', y',
z') there occurs a physical process, which we shall call for brevity an event.
Our problem is to find the “coordinates” of this event in the reference frame 
K, i.e. to find the values (x, y, z, t) characterizing the same physical process 
in the reference frame K. 

To establish the analytical relationship between the values (x, y, z, t) and
(x', y', z', t') we consider the propagation of a spherical electromagnetic wave
in the two reference frames.

We choose as the time origin, t = 0, the instant at which the origin of the
reference frame K’ coincides with the origin of the reference frame K. Let the
spherical electromagnetic wave begin to propagate at the instant t = 0 from
the origin. In the frame K the equation of the wave front has the form 


$$x^2 + y^2 + z^2 - c^2 t^2 = 0. \tag{5.1}$$


Since, according to Einstein’s principle of relativity, the law and velocity of 
propagation of a wave must be the same in all inertial reference frames, in 
addition to (5.1) we can with equal justification write the equation of the 
spherical wave in the reference frame K’: 


$$(x')^2 + (y')^2 + (z')^2 - c^2 (t')^2 = x^2 + y^2 + z^2 - c^2 t^2 = 0. \tag{5.2}$$


The formulae for the transformation of coordinates and time must: 1) not 
violate the relations (5.1) and (5.2), and 2) be linear. The requirement of 
linearity is associated with the homogeneity of space, which means that there 
are no points which stand out in particular for their properties. 

First of all we note that, since the motion of the reference frame K’ is 
along the x-axis, the transformation of the coordinates (y, z) must have the 
form 


$$y' = y, \quad z' = z. \tag{5.3}$$


The law of transformation of x’ in terms of x can be written from the follow-
ing considerations: if at the instant t = 0 the origins of the reference frames K
and K’ coincided, then the coordinate of the point x’ = 0 in the reference
frame K is written as follows: x = vt. Consequently, in the most general case
one can write that 


$$x' = \alpha(v)(x - vt), \tag{5.4}$$

where the coefficient $\alpha(v)$ depends only on the velocity of the relative
motion. 

Making no arbitrary assumptions about the coincidence of time in the two
reference frames, we can write t' in the form of a linear homogeneous func-
tion of x and t:


$$t' = \beta t + \gamma x. \tag{5.5}$$


The coefficients $\beta$ and $\gamma$ can, generally speaking, depend on the velocity v. If
it turned out that $\gamma = 0$ and $\beta = 1$, then we would come back to the Galilean
transformation.

In order to determine the coefficients $\alpha$, $\beta$ and $\gamma$, corresponding to the
requirements of Einstein’s principle of relativity, we have to substitute (5.4) 
and (5.5) into (5.2). This gives 


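$$\alpha^2 (x - vt)^2 + y^2 + z^2 - c^2(\beta t + \gamma x)^2 = x^2 + y^2 + z^2 - c^2 t^2.$$

In order for this identity to be satisfied it is necessary to equate the coeffi-
cients of $x^2$, $t^2$ and $xt$:

$$\alpha^2 - c^2\gamma^2 = 1,$$

$$c^2\beta^2 - \alpha^2 v^2 = c^2,$$

$$\alpha^2 v + c^2 \beta\gamma = 0.$$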


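From these three equations we find the unknown values of $\alpha$, $\beta$ and $\gamma$:

$$\alpha = \beta = \frac{1}{\sqrt{1 - v^2/c^2}}, \qquad \gamma = -\frac{v/c^2}{\sqrt{1 - v^2/c^2}}.$$
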
We have chosen the positive sign of the square root. 
Substituting the values of $\alpha$, $\beta$ and $\gamma$ into the formulae for the transforma-
tion of coordinates (5.4) and (5.5) and taking into account (5.3), we find


$$x' = \frac{x - vt}{\sqrt{1 - v^2/c^2}}, \tag{5.6a}$$

$$y' = y, \tag{5.6b}$$

$$z' = z, \tag{5.6c}$$

$$t' = \frac{t - vx/c^2}{\sqrt{1 - v^2/c^2}}. \tag{5.6d}$$


Formulae (5.6a)—(5.6d) are called the Lorentz transformation formulae. 





According to Einstein’s principle of relativity these transformations replace 
the Galilean transformations. 
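
The properties required of the transformation can be verified numerically. The following Python fragment is a minimal sketch and not part of the original text: the function names, the use of units in which c = 1, the value v = 0.6c and the sample event are assumptions made for illustration. It checks that the expression on the left-hand side of (5.1) has the same value in both frames, that changing the sign of v undoes the transformation, and that the coefficients implied by (5.6a) and (5.6d) satisfy the three conditions obtained by equating coefficients.

```python
import math

C = 1.0  # assumption: units in which the velocity of light equals 1

def lorentz(x, y, z, t, v, c=C):
    """Formulae (5.6a)-(5.6d): coordinates and time of an event, K -> K'."""
    root = math.sqrt(1.0 - (v / c) ** 2)
    return (x - v * t) / root, y, z, (t - v * x / c**2) / root

def lorentz_inverse(xp, yp, zp, tp, v, c=C):
    """Transformation K' -> K, obtained from (5.6) by replacing v with -v."""
    return lorentz(xp, yp, zp, tp, -v, c)

def wave_front(x, y, z, t, c=C):
    """Left-hand side of (5.1): x^2 + y^2 + z^2 - c^2 t^2."""
    return x**2 + y**2 + z**2 - (c * t) ** 2

v = 0.6 * C                     # hypothetical relative velocity
event = (3.0, 1.0, -2.0, 5.0)   # arbitrary event (x, y, z, t) in K

event_prime = lorentz(*event, v)
event_back = lorentz_inverse(*event_prime, v)

# coefficients of (5.4), (5.5) read off from (5.6a) and (5.6d)
alpha = beta = 1.0 / math.sqrt(1.0 - (v / C) ** 2)
gamma = -(v / C**2) * alpha
conditions = (
    alpha**2 - C**2 * gamma**2 - 1.0,         # coefficient of x^2
    C**2 * beta**2 - alpha**2 * v**2 - C**2,  # coefficient of t^2
    alpha**2 * v + C**2 * beta * gamma,       # coefficient of xt
)

print(wave_front(*event), wave_front(*event_prime))
print(event_back)
print(conditions)
```

The two printed values of the wave-front expression coincide, the event is recovered after the inverse transformation, and the three residuals vanish to within rounding error.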

Before going on to a discussion of the consequences of the Lorentz trans- 
formation we shall write formulae for the inverse transformation from the 
frame K’ to K. On the basis of Einstein’s principle of relativity the reference 
frames K and K’ are completely equivalent. We could repeat all the previous 
reasoning, taking the initial frame to be K’ and not K. However, in this case 
the velocity of the relative motion is equal to (—v) and not to v. Hence we 
obtain: 


$$x = \frac{x' + vt'}{\sqrt{1 - v^2/c^2}}, \tag{5.7a}$$

$$y = y', \tag{5.7b}$$

$$z = z', \tag{5.7c}$$

$$t = \frac{t' + vx'/c^2}{\sqrt{1 - v^2/c^2}}. \tag{5.7d}$$


The same result is found if the eqs. (5.6a)—(5.6d) are solved with respect to 
non-primed quantities. 

The importance of the consequences following from the Lorentz transfor-
mation leads us to stress once more that their derivation is based only on Ein-
stein’s principle of relativity, the principle of the constancy of the limiting
velocity of propagation of interactions, and the assumption of the uniformity
of all points of space and time. These propositions are at present confirmed 
by a vast amount of experimental material and their validity is beyond any 
doubt. 

A remarkable feature of the Lorentz transformation is the fact that, at 
relative velocities of motion small in comparison with the velocity of light, it 
goes over into the Galilean transformation. Indeed, at $v \ll c$ one can dis-
regard values of the second order of smallness containing $(v/c)^2$ and write

$$x \approx x' + vt, \quad t \approx t',$$


which are the same as formulae (2.2) and (2.3). 

Thus, in the limiting case $v \ll c$ the transformation laws of the theory of
relativity and classical mechanics are the same. This means that the theory of
relativity does not reject the Galilean transformation as incorrect but includes
it in a valid law, the Lorentz transformation, as a particular case holding
at velocities of motion which are small in comparison with the velocity of 
light. 

In what follows we shall see that this reflects the general relation between 
the theory of relativity and classical physics. The laws and relations of the 
theory of relativity go over into the laws of classical physics in the limiting 
case of velocities small in comparison with the velocity of light c. 
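
The passage to the classical limit can be illustrated by comparing the two transformation laws for several velocities. The fragment below is a sketch under stated assumptions and not part of the original text: the function names, the sample event and the three velocities (an everyday velocity, the orbital velocity of the Earth, and half the velocity of light) are chosen only for illustration.

```python
import math

c = 3.0e8  # velocity of light in m/sec (rounded)

def lorentz_to_K(xp, tp, v):
    """Formulae (5.7a) and (5.7d): x and t in K from x' and t' in K'."""
    root = math.sqrt(1.0 - (v / c) ** 2)
    return (xp + v * tp) / root, (tp + v * xp / c**2) / root

def galilean_to_K(xp, tp, v):
    """Formulae (2.2) and (2.3): x = x' + vt with t = t'."""
    return xp + v * tp, tp

def relative_deviation(xp, tp, v):
    """Relative difference between the Lorentz and Galilean values of x and t."""
    xl, tl = lorentz_to_K(xp, tp, v)
    xg, tg = galilean_to_K(xp, tp, v)
    return abs(xl - xg) / abs(xg), abs(tl - tg) / abs(tg)

xp, tp = 100.0, 10.0              # hypothetical event in K' (m, sec)
for v in (30.0, 3.0e4, 1.5e8):    # assumed velocities in m/sec
    print(v / c, relative_deviation(xp, tp, v))
```

For the two smaller velocities the deviations are far below anything measurable by ordinary means, whereas at half the velocity of light they exceed ten per cent.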


§6. Consequences of the Lorentz transformation. 
Space and time intervals 


The Lorentz transformation leads to conclusions radically contradicting 
the usual notions of the properties of space and time which arose from every-
day experience and which we formulated in §2. Indeed, from the Lorentz
transformation it follows directly that concepts of space and time intervals 
are relative. In other words, the ideas “the size of a body” or “the time lapse 
between two physical events” have no absolute character and are different in 
different reference frames. 

Consider first of all the concept of spatial extension (length). Let there be 
a body at rest in a certain reference frame K’. We shall henceforth call this 
body the scale. The scale is not acted upon by any forces which could deform 
it and change its size. We denote by $L_0$ the length of the scale in the direction
of motion (x’-axis). This length, measured in that reference frame in which 
the scale is at rest (frame K’), will be called the proper length of the scale. By 
means of a Lorentz transformation we shall find the length of the scale in the 
reference frame K, i.e. the length of the scale moving with a velocity v relative 
to the frame K. In the reference frame K’ let the coordinates of the beginning
and end of the scale be respectively $x'_1$ and $x'_2$. We shall find these coordinates
in the reference frame K. 

Since the scale moves relative to the reference frame K, it is necessary, in 
order to measure its size, to fix the coordinates of its beginning and end at 
the same instant, measured in the reference frame K. For the realization of 
this measurement it would be possible to fix at an instant t the position of
the beginning and end of the scale by means of a light signal coming from the 
reference frame K’. 

At a certain instant t let the beginning and end of the scale have the coor-
dinates $x_1$ and $x_2$ in the reference frame K. By means of formula (5.6a) we
find 





$$x'_2 = \frac{x_2 - vt}{\sqrt{1 - v^2/c^2}}, \qquad x'_1 = \frac{x_1 - vt}{\sqrt{1 - v^2/c^2}},$$


or, denoting the difference between the coordinates of the beginning and end 
of the scale (the length of the scale in the reference frame K) by L, we obtain 


$$L = L_0 \sqrt{1 - \frac{v^2}{c^2}}. \tag{6.1}$$


We see that the length of a scale moving with a velocity v relative to
the reference frame K turns out to be smaller than its proper length by the
factor $\sqrt{1 - v^2/c^2}$. This contraction of the size of a body is often called the
Lorentz contraction. Since the dimensions of the scale in the direction per-
pendicular to the velocity remain unchanged, the volume of the scale turns
out to be connected with its proper volume by the formula


$$V = V_0 \sqrt{1 - \frac{v^2}{c^2}}. \tag{6.2}$$


Thus, the length and volume of a scale which does not undergo the action 
of external forces turn out to be relative quantities. In other words, the 
statement that the distance between two points of space is equal to L has no 
meaning without specifying to which reference frame this quantity is referred. 
The distance between two points depends on the motion of the reference 
frame. 
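
The measurement procedure described above (fixing both ends of the scale at the same instant of the frame K) can be reproduced in a few lines. The fragment is a sketch for illustration only and is not part of the original text: the function names, the units with c = 1 and the numerical values are assumptions.

```python
import math

def contracted_length(L0, v, c=1.0):
    """Formula (6.1): length of a scale of proper length L0 moving with velocity v."""
    return L0 * math.sqrt(1.0 - (v / c) ** 2)

def position_in_K(x_prime, t, v, c=1.0):
    """Formula (5.6a) solved for x: position in K, at the instant t of K,
    of a point at rest in K' with coordinate x_prime."""
    return x_prime * math.sqrt(1.0 - (v / c) ** 2) + v * t

x1_prime, x2_prime = 0.0, 1.0   # ends of the scale in K' (proper length 1, assumed)
v = 0.8                         # hypothetical velocity in units of c
t = 2.5                         # one and the same instant of K for both ends

L = position_in_K(x2_prime, t, v) - position_in_K(x1_prime, t, v)
print(L, contracted_length(x2_prime - x1_prime, v))
```

The difference of the two coordinates taken at the same instant t does not depend on the choice of t and agrees with formula (6.1).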

In classical physics the absolute character of the notion of the length of a 
scale was considered as something self-evident. Here lies the fundamental 
difference between views on the properties of space in the theory of relativity 
and classical physics. 

It should be borne in mind that the two reference frames K and K' are 
completely equivalent. Hence, if the scale is at rest in the reference frame K, 
then its length in the reference frame K’ will be smaller than in the frame K 
in the same ratio. There is a complete reciprocity between the two reference
frames. One can verify this by a direct calculation using the Lorentz transfor-
mation formulae (5.7a)–(5.7d).

It is easy to show that the negative result of Michelson’s experiment
follows automatically from the existence of the Lorentz contraction. Indeed, 
when light passes along the direction of motion of the Earth and in the oppo- 
site direction, from the point of view of an observer at rest (to which all the 
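reasoning of §3 refers) the length $l$ must be reduced by the factor $\sqrt{1 - v^2/c^2}$. From
the point of view of the observer at rest the time $T_1$ needed for a ray of light
to traverse the total path is equal to

$$T_1 = \frac{l\sqrt{1 - v^2/c^2}}{c - v} + \frac{l\sqrt{1 - v^2/c^2}}{c + v} = \frac{2l/c}{\sqrt{1 - v^2/c^2}},$$

which coincides with the time needed for light to traverse the same distance $l$
there and back in the perpendicular direction.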
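
The equality of the two round-trip times can be confirmed numerically. The fragment below is a sketch added for illustration and is not part of the original text; the arm length l and the rounded values of c and of the orbital velocity of the Earth are assumptions.

```python
import math

c = 3.0e8   # velocity of light in m/sec (rounded)
v = 3.0e4   # orbital velocity of the Earth in m/sec (approximate, assumed)
l = 10.0    # proper length of the path in m (hypothetical)

root = math.sqrt(1.0 - (v / c) ** 2)

# round trip along the motion if the length were not contracted
T_parallel_uncontracted = l / (c - v) + l / (c + v)
# round trip along the motion with the Lorentz-contracted length l * root
T_parallel = l * root / (c - v) + l * root / (c + v)
# round trip in the perpendicular direction (twice the one-way time of §3)
T_perpendicular = 2.0 * (l / c) / root

print(T_parallel_uncontracted - T_perpendicular)
print(T_parallel - T_perpendicular)
```

Without the contraction the time along the motion exceeds the perpendicular time by a quantity of the second order in v/c; with the contraction the two times agree to within rounding error.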





It should be stressed that the contraction of length — the compression of a 
body in the direction of motion — is of a purely kinematical character. No 
internal tensions causing a deformation arise in the body. In this sense one 
can speak about a “rigid” or, more precisely, non-deformable body in the 
theory of relativity. On the other hand, the notion of an absolutely rigid 
body is incompatible with the inferences of the theory of relativity. Indeed,
if one assumes the existence of an absolutely rigid body, i.e. a body with 
invariable distances between all the particles constituting it, then such a body 
could be used for the transmission of an interaction with a velocity as large 
as one likes. A blow delivered to one of its ends would be transmitted to the 
other end with infinitely large velocity. Hence, from the point of view of the 
theory of relativity, the existence of absolutely rigid bodies cannot be as- 
sumed, even as an idealization. 

The concept of time also undergoes a very fundamental modification in 
the theory of relativity. 

At a point x’ in the reference frame K’, let a certain physical process take
place during a time interval $\Delta t_0 = t'_2 - t'_1$, where $t'_1$ and $t'_2$ are the times of the
beginning and end of the process. Then in the reference frame K for the
instants $t_1$ and $t_2$ we can write





$$t_1 = \frac{t'_1 + vx'/c^2}{\sqrt{1 - v^2/c^2}}, \qquad t_2 = \frac{t'_2 + vx'/c^2}{\sqrt{1 - v^2/c^2}}.$$


Subtracting we find the time interval from the beginning to the end of the 
process in the reference frame K: 


$$\Delta t = t_2 - t_1 = \frac{\Delta t_0}{\sqrt{1 - v^2/c^2}}. \tag{6.3}$$


The time $\Delta t_0$, measured in the reference frame moving together with the
body in which the process takes place, is called the proper time. 

Formula (6.3) shows that the proper time $\Delta t_0$ between two physical
events is smaller, by the factor $\sqrt{1 - v^2/c^2}$, than the time lapse between these events
in the reference frame K. 
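
Formula (6.3) can be reproduced by transforming the beginning and the end of the process separately. The fragment below is a minimal sketch and not part of the original text: the function name `to_K`, the units with c = 1 and the numerical values are assumptions chosen for illustration.

```python
import math

def to_K(x_prime, t_prime, v, c=1.0):
    """Formulae (5.7a) and (5.7d): x and t in K from x' and t' in K'."""
    root = math.sqrt(1.0 - (v / c) ** 2)
    return (x_prime + v * t_prime) / root, (t_prime + v * x_prime / c**2) / root

v = 0.6                          # hypothetical velocity in units of c
x_prime = 4.0                    # the process takes place at one point of K'
t1_prime, t2_prime = 1.0, 3.0    # beginning and end of the process in K'

_, t1 = to_K(x_prime, t1_prime, v)
_, t2 = to_K(x_prime, t2_prime, v)

dt0 = t2_prime - t1_prime        # proper time
dt = t2 - t1                     # time lapse in K
print(dt0, dt, dt0 / math.sqrt(1.0 - v**2))
```

The term containing x' is the same for both instants and cancels in the difference, so that the result does not depend on where in K' the process takes place.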

In the theory of relativity it is customary to speak about the comparison
of the rates of clocks in different inertial reference frames. A clock is under-
stood to be an arbitrary periodic process. Then it can be said that the time
shown by the clock depends on the velocity of its motion. A clock moving 
relative to a certain reference frame runs more slowly from the point of view 
of this frame than a clock at rest in it (the latter clock being identical with 
the moving one).

Thus, as distinct from Newtonian physics, the course of time turns out to
be dependent on the state of motion. There is no universal time, and the 
notion of the time interval between two physical events turns out to be rela- 
tive. Here it should again be stressed that there is a complete reciprocity 
between the reference frames K and K’. The above reasoning could be re- 
versed. If a physical process takes place at the point x in the reference frame 
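K and lasts $\Delta t$, then in the frame K’ it will last

$$\Delta t' = \frac{t_2 - vx/c^2}{\sqrt{1 - v^2/c^2}} - \frac{t_1 - vx/c^2}{\sqrt{1 - v^2/c^2}} = \frac{\Delta t}{\sqrt{1 - v^2/c^2}},$$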


as is seen from the Lorentz transformations (5.6a)—(5.6d). 
Formula (6.3) for the change in rate of the moving clock has been experi-
mentally confirmed in several ways. The most obvious of them is the follow-
ing. In cosmic rays the decay of the positive and negative muon ($\mu^{+}$- and
$\mu^{-}$-meson, with a mass of 207 electron masses) into a positron (electron) and
two neutrinos is observed. The decay has been observed for $\mu$-mesons deceler-
ated almost to rest as well as when they are moving with a velocity close to
that of light. The lifetimes of a moving meson and a meson at rest are con- 
nected by the relativistic relation 


$$\tau_{\rm mov} = \frac{\tau_{\rm rest}}{\sqrt{1 - v^2/c^2}}.$$

Since v is close to the velocity of light, $\tau_{\rm mov}$ must be considerably larger than
$\tau_{\rm rest}$.

A number of experimental methods permit the determination of the value 
of $\tau_{\rm rest}$, which turns out to be $2 \times 10^{-6}$ sec. If the lifetime of mesons were
not dependent on the velocity, they would traverse a path equal to $v\tau_{\rm rest} \approx$
600 m (for $v \approx c$). In reality, as shown by measurements, the mesons decay
after traversing a path of about 20 km. To such a range there corresponds the 


lifetime

$$\tau_{\rm mov} \approx \frac{20\ \text{km}}{c} \approx 7 \times 10^{-5}\ \text{sec} \approx 35\,\tau_{\rm rest}.$$





The relativistic change of the lifetime in this case turns out to be a very large 
effect. 
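
The arithmetic of this estimate is easily repeated. The fragment below is a sketch for illustration and not part of the original text; it uses the rounded values quoted above (a lifetime at rest of 2 × 10⁻⁶ sec and a path of about 20 km), takes v ≈ c for the path estimate, and the variable names are assumptions.

```python
import math

c = 3.0e8                # velocity of light in m/sec (rounded)
tau_rest = 2.0e-6        # lifetime of the meson at rest in sec, as quoted above
path_observed = 20.0e3   # observed path in m, as quoted above

# path that would be traversed if the lifetime did not depend on velocity
path_without_dilation = c * tau_rest

# lifetime in the laboratory frame needed to cover the observed path
tau_moving = path_observed / c
ratio = tau_moving / tau_rest

# velocity implied by the relativistic relation: 1 / sqrt(1 - v^2/c^2) = ratio
beta = math.sqrt(1.0 - 1.0 / ratio**2)

print(path_without_dilation, tau_moving, ratio, beta)
```

With the unrounded quotient the ratio of the lifetimes comes out somewhat above thirty, and the implied velocity differs from c by less than one part in a thousand.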

Two questions naturally arise. First, why before the creation of the theory
of relativity were all available experimental data in agreement with the New-
tonian ideas of the absolute character of the length of the body and a unique
universal time? Second, is the contraction of the length of moving bodies and
the slowing down of the running of a moving clock real or apparent?

The answer to the first question is very simple. Before experiments de- 
signed to observe the motion of the Earth relative to the ether physicists did 
not encounter processes taking place with objects which moved with a veloci- 
ty comparable to the velocity of light c. In other words, the velocities of all 
bodies observed in physics before the discovery of the electron were small in 
comparison with the velocity of light. 

For velocities which are small compared with the velocity of light one can, 
with a sufficient degree of accuracy, make use of the earlier notions of space 
and time. Moreover, at the time of Einstein’s creation of the theory of rela-
tivity Michelson’s experiment was the sole undeniable indication of the inade-
quacy of classical physics. As will be seen from what follows, in the course of
the past 50 years the situation has changed radically. The theory of relativity 
has become one of the bases of contemporary theoretical physics and a num- 
ber of branches of experimental physics. Numerous experimental confirma- 
tions of the theory of relativity will, in part, be considered below. In particu- 
lar, atomic and especially nuclear physics, as a rule, investigate the processes 
and behaviour of moving particles whose velocities are very close to the 
velocity of light. All the basic relations of the theory of relativity are widely 
used in nuclear physics for purely practical calculations. Some of these will 
be given below. 

In considering the second question it should be emphasized that the rather 
widely used terms “an apparent contraction of the scale” and “an apparent 
change of the running of a clock” are inconvenient. Usually the authors wish 
to stress by the term “apparent” the purely kinematical character of the con- 
traction. At the same time, the contraction of the scale and the slowing down 
of the running of the clock are a real and objective fact which is by no means 
connected with any illusions of the observer. It goes without saying that all 
values of the length of a given scale or time interval obtained in different 
reference frames are equivalent. They are all “correct”. The difficulty in 
understanding these statements is associated solely with our habit of consider- 
ing the notions of length and time interval as absolute notions, whereas in 
reality they are relative ones. Hence it is also senseless to ask which length of 
the scale is the true one and which is the apparent one, as it is senseless to say
“in reality a given body is moving (or is at rest)”. The concepts of length and 
time interval are as relative as are the concepts of motion and rest. 

The true character of the contraction of a moving scale can be illustrated 
by the following example. Let there be two charged bodies A and B at rest in 
two separate reference frames. They are each spherical in the reference frame 
in which they are at rest. If one of the bodies, for example A, moves together 
with the reference frame K, then from the point of view of the reference 
frame K’ its longitudinal (with respect to the direction of motion) dimension 
undergoes a contraction, whereas its transverse dimensions remain unchanged. 
The interaction between the bodies A and B then corresponds to the interaction 
between a charged sphere B and a charged ellipsoid A. In this case, from the 
point of view of the reference frame K’, the body B is a sphere and the body 
A is an ellipsoid, whereas from the point of view of the reference frame K the 
body A remains a sphere and the body B transforms into an ellipsoid. However,
the magnitude of the A-B interaction will be the same in the two
reference frames. 


An example of the calculation of the interaction between rapidly moving
electric charges will be considered in §20.


§7. Einstein’s law of addition of velocities and angular transformations 


An important consequence of the Lorentz transformation is the relativistic 
law of addition of velocities, which in the theory of relativity replaces the 
Galilean law of addition of velocities. 

It is found most simply by writing the Lorentz transformation formulae 
for the differentials of the space coordinates and time: 


$$ dx = \frac{dx' + v\,dt'}{\sqrt{1 - v^2/c^2}} , $$

$$ dy = dy' , $$

$$ dz = dz' , $$

$$ dt = \frac{dt' + \dfrac{v}{c^2}\,dx'}{\sqrt{1 - v^2/c^2}} . $$


Let x', y', z' be the coordinates of a material point moving in the reference
frame K'. The components of the velocity of the point in the reference frame
K' will be

$$ u'_x = \frac{dx'}{dt'} , \qquad u'_y = \frac{dy'}{dt'} , \qquad u'_z = \frac{dz'}{dt'} , $$

and in the reference frame K

$$ u_x = \frac{dx}{dt} , \qquad u_y = \frac{dy}{dt} , \qquad u_z = \frac{dz}{dt} . $$


Dividing the differentials of the coordinates by the differential of time, we 
find 


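$$ u_x = \frac{dx}{dt} = \frac{u'_x + v}{1 + \dfrac{u'_x v}{c^2}} , \qquad (7.1) $$

$$ u_y = \frac{dy}{dt} = \frac{u'_y \sqrt{1 - v^2/c^2}}{1 + \dfrac{u'_x v}{c^2}} , \qquad (7.2) $$

$$ u_z = \frac{dz}{dt} = \frac{u'_z \sqrt{1 - v^2/c^2}}{1 + \dfrac{u'_x v}{c^2}} . \qquad (7.3) $$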


Formulae (7.1)—(7.3) are called Einstein’s law of addition of velocities. They 
replace the formulae of addition of velocities (2.1) of classical mechanics. 

At v << c Einstein’s law of addition of velocities goes over directly into
(2.1), since in this case

$$ \frac{u'_x v}{c^2} \ll 1 , \qquad \frac{v^2}{c^2} \ll 1 . $$


As was to be expected, it follows from formulae (7.1) and (7.2) that the
velocity of light c is the limiting velocity. If, for example, a particle in the
reference frame K' moves along the x-axis with a velocity $u' = u'_x = c$, then in
the motionless reference frame K its velocity is

$$ u = u_x = \frac{c + v}{1 + \dfrac{cv}{c^2}} = c . $$





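If a particle moves in the reference frame K’ with a velocity smaller than
the velocity of light, for example

$$ u' = u'_x = c - \alpha \qquad (\alpha > 0) , $$

while the reference frame K’ moves relative to K with a velocity $v = c - \beta$
($\beta > 0$), then the velocity of the particle relative to the motionless frame K is
equal to

$$ u = \frac{(c - \alpha) + (c - \beta)}{1 + \dfrac{(c - \alpha)(c - \beta)}{c^2}} = c\,\frac{2c - \alpha - \beta}{2c - \alpha - \beta + \dfrac{\alpha\beta}{c}} < c . $$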







Thus, the sum of two velocities, each of which is smaller than the velocity 
of light c, will always be smaller than the velocity of light. The sum of two 
velocities, one of which is equal to c and the other smaller than c, is equal to 
the velocity of light. 
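The statements above can be tried out numerically. The following Python sketch implements the addition law in its standard form for a frame K' moving along the x-axis of K; the function names and the numerical value of c are assumptions introduced only for this illustration.

```python
import math

C = 299_792_458.0  # velocity of light in m/s (assumed value for the example)


def add_velocities(u_prime, v, c=C):
    """Einstein's law of addition of velocities.

    u_prime: tuple (ux', uy', uz') of the particle velocity in K'.
    v:       velocity of K' relative to K along the common x-axis.
    Returns the tuple (ux, uy, uz) in K.
    """
    ux_p, uy_p, uz_p = u_prime
    denom = 1.0 + ux_p * v / c**2
    root = math.sqrt(1.0 - v**2 / c**2)
    return ((ux_p + v) / denom, uy_p * root / denom, uz_p * root / denom)


def speed(u):
    """Magnitude of a three-dimensional velocity."""
    return math.sqrt(sum(component**2 for component in u))


if __name__ == "__main__":
    # Two velocities of 0.9 c add up to less than c.
    print(speed(add_velocities((0.9 * C, 0.0, 0.0), 0.9 * C)) / C)
    # A signal moving with c in K' also moves with c in K.
    print(speed(add_velocities((C, 0.0, 0.0), 0.5 * C)) / C)
    # For everyday velocities the Galilean sum is recovered to high accuracy.
    print(add_velocities((10.0, 0.0, 0.0), 20.0)[0])
```

The transverse components are included so that the same function also covers motion that is not directed along the x-axis.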

From the law of addition of velocities it follows directly that the value of 
an angle is relative and changes in transition from one inertial reference frame 
to the other. Since 


$$ \tan\theta = \frac{u_y}{u_x} , $$

where θ is the angle formed by the velocity vector of the particle and the
x-axis, we find from (7.2) and (7.1) that

$$ \tan\theta = \frac{u' \sqrt{1 - v^2/c^2}\,\sin\theta'}{u' \cos\theta' + v} , \qquad (7.4) $$

where $u'_x = u' \cos\theta'$, $u'_y = u' \sin\theta'$.

The last formula expresses the law of angular transformations in the theory
of relativity. It connects the angles θ' and θ, formed by the velocity vector
with the x'-axis and x-axis respectively.
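As an illustration, the law of angular transformation can be evaluated numerically. The sketch below assumes the standard form tan θ = u' sqrt(1 - v²/c²) sin θ' / (u' cos θ' + v); the function name and the choice of units with c = 1 are assumptions made only for the example.

```python
import math


def transform_angle(theta_prime, u_prime, v, c=1.0):
    """Angle theta in K between the velocity and the x-axis.

    theta_prime: angle in K' (radians), u_prime: speed in K',
    v: velocity of K' relative to K along the x-axis.
    atan2 is used so that the correct quadrant is kept when the
    denominator u' cos(theta') + v becomes negative.
    """
    numerator = u_prime * math.sqrt(1.0 - v**2 / c**2) * math.sin(theta_prime)
    denominator = u_prime * math.cos(theta_prime) + v
    return math.atan2(numerator, denominator)


if __name__ == "__main__":
    # A light ray emitted at right angles to the motion in K'
    # is tilted forward when observed from K.
    theta = transform_angle(math.pi / 2, u_prime=1.0, v=0.6)
    print(math.degrees(theta))
```

For v = 0 the function returns θ = θ', as it must, and for v > 0 the direction of motion is tilted towards the x-axis.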

In conclusion it is emphasized that the velocity v should be a velocity with 
which a real body can move or a real process of interaction (signal) can 
propagate. One can imagine, without coming into conflict with the theory of 
relativity, processes having a velocity higher than the velocity c but which 
have a kinematical character and cannot transport bodies or realize inter- 
actions. 

Consider, for example, the velocity of motion of an imaginary point P, at 
which a line A intersects a line B as the line B rotates. If the angle α at which
A intersects B is arbitrarily small, and the length of the moving line is arbit- 
rarily large, the velocity of motion of the point P can also be arbitrarily large. 
However, the motion of the imaginary point of intersection of the lines is not 
accompanied by a transfer of energy and cannot serve as a means for the 
transmission of signals and interactions. 




§8. Simultaneity, short-range action and action at a distance 

Let two physical events take place simultaneously at a time t', at points $x'_1$
and $x'_2$ in an inertial reference frame K’. From the point of view of classical
physics two events which are simultaneous in one reference frame take place 
simultaneously in all other inertial reference frames. 

In the theory of relativity the situation is different. 

Consider the inertial reference frame K relative to which the frame K’
moves with a velocity v in the positive direction of the x-axis. In the reference
frame K the first event takes place at an instant

$$ t_1 = \frac{t' + \dfrac{x'_1 v}{c^2}}{\sqrt{1 - v^2/c^2}} . $$

The second event takes place at an instant

$$ t_2 = \frac{t' + \dfrac{x'_2 v}{c^2}}{\sqrt{1 - v^2/c^2}} . $$


Consequently, in the reference frame K the events do not take place simultaneously,
for there is a time lapse

$$ \Delta t = t_2 - t_1 = \frac{\dfrac{v}{c^2}\,(x'_2 - x'_1)}{\sqrt{1 - v^2/c^2}} . \qquad (8.1) $$


Moreover, depending on the sign of $(x'_2 - x'_1)$ the time interval Δt can be
positive or negative, i.e. in the reference frame K “the first” event can take
place earlier or later than “the second” one.

Thus, the concept of simultaneity turns out to be relative. 

The sole, but very important exception is the case when two events occur 
at the same place, i.e. at an instant t' at a point x’. Then, according to (8.1), 
in all inertial reference frames (for any v) Δt = 0, i.e. the two events take place
absolutely simultaneously. 
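The relativity of simultaneity is easy to illustrate numerically. The Python sketch below transforms the times of two events that are simultaneous in K' into the frame K and compares the difference with the time lapse Δt = (v/c²)(x'₂ - x'₁)/sqrt(1 - v²/c²). The function names and the units with c = 1 are assumptions of the example.

```python
import math


def time_in_K(t_prime, x_prime, v, c=1.0):
    """Time in K of an event that has coordinates (t', x') in K'."""
    return (t_prime + x_prime * v / c**2) / math.sqrt(1.0 - v**2 / c**2)


def time_lapse(x1_prime, x2_prime, v, c=1.0):
    """Delta t = t2 - t1 in K for two events simultaneous in K'."""
    return (v / c**2) * (x2_prime - x1_prime) / math.sqrt(1.0 - v**2 / c**2)


if __name__ == "__main__":
    v = 0.6
    # The "second" event lies ahead of the first along x': it occurs later in K.
    print(time_lapse(0.0, 1.0, v))
    # If it lies behind, the order of the events in K is reversed.
    print(time_lapse(1.0, 0.0, v))
    # Events at the same place remain simultaneous in every frame.
    print(time_lapse(2.0, 2.0, v))
```

The sign of the result changes with the sign of x'₂ - x'₁, and the lapse vanishes only when the two events coincide in space.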

The above illustrates particularly clearly that the theory of relativity is 
incompatible with the notion of action at a distance. Two events can be in a
mutual relationship as cause and effect only where they occur at the same
place simultaneously as is required by the concept of short-range action. If, 
on the contrary, the cause and effect could be spatially separated (and the 
interaction propagated with an infinitely large velocity), then there would 
always exist an infinitely large number of inertial reference frames in which 
the effect would precede the cause. 

It should be stressed that the relativity of simultaneity was built into the
theory of relativity from the very beginning (in the form of the principle of a
finite limiting velocity of propagation of interactions), so that the results
obtained here can be considered only as an obvious example of the internal
consistency of the theory.


§9. Absolute values in the theory of relativity. Intervals and proper time 


The theory of relativity revolutionized the teaching of classical physics
about the absolute character of space and time.

The relative character of space and time intervals seemed to be so para- 
doxical that the authors of a number of popular expositions of the theory of 
relativity, which appeared in particular around 1920, presented the ideologi- 
cal content of the theory by the scathing, but absolutely untrue aphorism: 
“The theory of relativity showed that all in the universe is relative”. 

In reality the situation is just the opposite. The task which the theory of 
relativity sets itself is to find absolute laws of nature which do not depend on 
the choice of the inertial reference frame *. 

Thus, the theory of relativity by no means denies the existence of absolute 
quantities and concepts. It states only that a number of concepts, considered 
as absolute in classical physics, for example the magnitudes of space and 
time intervals, are in reality relative. 

In this connection the opinion was often expressed that the term “theory 
of relativity” is unsuitable, since it does not reflect the content of this branch 
of physics. It was pointed out that the theory of relativity could with more 
justice be called “the theory of physical invariance”. However, it should be 
borne in mind that at the time the theory of relativity was first introduced its 
critical aspect — the establishment of the relativity of space and time inter- 
vals — appeared to be more fundamental and new. 


* In the general theory of relativity, which we cannot go into within the framework 
of this book, the problem of finding absolute laws of nature is extended to arbitrary ref- 
erence frames. 




The problem of finding absolute expressions for the laws of nature is
closely connected with the finding of invariant, absolute quantities. The first 
of such quantities is the universal velocity of propagation of interactions — 
the velocity of light c. The second, also very important invariant quantity, is 
the so-called interval. 

The notion of interval in the theory of relativity is a generalization of the 
usual notions of the interval (i.e. distance) between two points and the inter- 
val (i.e. lapse of time) between two events. Let there take place at a point in 
space with coordinates (x, y, z), at a time t, a physical phenomenon which we
shall call an event. At another point $(x_1, y_1, z_1)$, at an instant $t_1$, another event
takes place. Then the quantity

$$ s = \sqrt{c^2 (t_1 - t)^2 - (x_1 - x)^2 - (y_1 - y)^2 - (z_1 - z)^2} \qquad (9.1) $$

is called the interval between the two events.

The invariance of the interval with respect to the Lorentz transformation
can be checked by a direct calculation. In the moving reference frame K’ we
have

$$ s' = \sqrt{c^2 (t'_1 - t')^2 - (x'_1 - x')^2 - (y'_1 - y')^2 - (z'_1 - z')^2} . $$





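Obviously,

$$ (x'_1 - x')^2 = \frac{(x_1 - x)^2 + v^2 (t_1 - t)^2 - 2v (x_1 - x)(t_1 - t)}{1 - v^2/c^2} , $$

$$ c^2 (t'_1 - t')^2 = \frac{c^2 (t_1 - t)^2 - 2v (x_1 - x)(t_1 - t) + \dfrac{v^2}{c^2} (x_1 - x)^2}{1 - v^2/c^2} , $$

$$ (y'_1 - y')^2 = (y_1 - y)^2 , $$

$$ (z'_1 - z')^2 = (z_1 - z)^2 . $$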


The substitution of these expressions into s’ with some elementary calcula- 
tions gives 


$$ s' = s . $$


Thus, the statement: “two physical events are separated by an interval s” 
is of absolute character. It is valid in all inertial reference frames. 
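The invariance of the interval can also be verified numerically. In the sketch below two arbitrarily chosen events are transformed from K to K' by the Lorentz transformation along the x-axis, and the squared interval is computed in both frames. The event coordinates, the function names and the units with c = 1 are assumptions of the example.

```python
import math


def to_primed_frame(event, v, c=1.0):
    """Lorentz transformation of an event (t, x, y, z) from K to K'."""
    t, x, y, z = event
    root = math.sqrt(1.0 - v**2 / c**2)
    return ((t - v * x / c**2) / root, (x - v * t) / root, y, z)


def interval_squared(event_a, event_b, c=1.0):
    """s^2 = c^2 (dt)^2 - (dx)^2 - (dy)^2 - (dz)^2 between two events."""
    dt, dx, dy, dz = (b - a for a, b in zip(event_a, event_b))
    return c**2 * dt**2 - dx**2 - dy**2 - dz**2


if __name__ == "__main__":
    first = (0.0, 0.0, 0.0, 0.0)
    second = (3.0, 1.0, 2.0, 0.5)
    for v in (0.1, 0.5, 0.8, 0.99):
        s2 = interval_squared(first, second)
        s2_primed = interval_squared(to_primed_frame(first, v), to_primed_frame(second, v))
        print(v, s2, s2_primed)
```

Apart from rounding errors, the two values agree for every velocity v smaller than c.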




The interval between two events taking place at infinitesimally close points 
with an infinitesimally small time separation is often considered. In this case 
the interval between two events is 


$$ ds = \sqrt{c^2\,dt^2 - dx^2 - dy^2 - dz^2} . \qquad (9.2) $$


The value of the interval s can be real or imaginary, depending on the sign 
of the expression under the radical. 
Consider at first the case of a real interval:

$$ c^2 (\Delta t)^2 > (\Delta x)^2 + (\Delta y)^2 + (\Delta z)^2 . $$


In this case it is always possible to find some reference frame in which the 
two events take place at the same point. For this it is necessary that the con- 
dition 





$$ \sqrt{c^2 (\Delta t)^2 - (\Delta x)^2 - (\Delta y)^2 - (\Delta z)^2} = c\,\Delta t' $$


be fulfilled. In principle it can always be fulfilled for a positive value of the 
expression under the radical. Hence real intervals are called “time-like inter- 
vals”. In particular, it is obvious that if two events take place involving the 
same physical system, the interval between these events is time-like. Indeed, 
during the time Δt between two successive events the system can traverse a
path

$$ \sqrt{(\Delta x)^2 + (\Delta y)^2 + (\Delta z)^2} < c\,\Delta t , $$


since its velocity is always smaller than the velocity of light. As an example of 
a time-like interval one can cite the interval between two events representing 
successive indications of the same clock. 

An imaginary interval is called a “space-like interval”. If two events are
separated by a space-like interval, then one can always find a reference frame 
in which they take place at the same instant. For this it is necessary that the 
equality 





$$ \sqrt{c^2 (\Delta t)^2 - (\Delta x)^2 - (\Delta y)^2 - (\Delta z)^2} = i \sqrt{(\Delta x')^2 + (\Delta y')^2 + (\Delta z')^2} $$


be fulfilled. This equality always holds for negative values of the expression 
under the radical at the left. 
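The classification of intervals can be sketched in code. The example below is restricted, as an assumption made for simplicity, to events separated only along the x-axis; it classifies the interval and finds the velocity of the frame in which the events occur at the same point (time-like case) or at the same instant (space-like case). The function names and the "light-like" label for the boundary case s² = 0, which is not discussed in the text, are additions of this sketch.

```python
import math


def classify_interval(dt, dx, dy=0.0, dz=0.0, c=1.0, eps=1e-12):
    """Return 'time-like', 'space-like' or 'light-like' according to the sign of s^2."""
    s2 = c**2 * dt**2 - dx**2 - dy**2 - dz**2
    if s2 > eps:
        return "time-like"
    if s2 < -eps:
        return "space-like"
    return "light-like"


def colocating_velocity(dt, dx):
    """Velocity of the frame in which two time-like separated events share one point."""
    return dx / dt


def synchronizing_velocity(dt, dx, c=1.0):
    """Velocity of the frame in which two space-like separated events are simultaneous."""
    return c**2 * dt / dx


def separation_in_moving_frame(dt, dx, v, c=1.0):
    """(dt', dx') in a frame moving with velocity v along the x-axis."""
    root = math.sqrt(1.0 - v**2 / c**2)
    return ((dt - v * dx / c**2) / root, (dx - v * dt) / root)


if __name__ == "__main__":
    print(classify_interval(2.0, 1.0))
    print(separation_in_moving_frame(2.0, 1.0, colocating_velocity(2.0, 1.0)))
    print(classify_interval(1.0, 2.0))
    print(separation_in_moving_frame(1.0, 2.0, synchronizing_velocity(1.0, 2.0)))
```

In both cases the required velocity is smaller than c, which is why such a frame can always be found.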

We shall now return to the determination of the proper time and show 
that the proper time as well as the interval is an invariant, absolute quantity. 










Let the inertial frame K’ be given. At a certain point x’, y’, z’ let there take
place two successive events separated by a time interval $dt_0$. It should be
stressed that the time $t_0$ is measured by a clock which is at rest in the reference
frame K’ (or, as is said, by the proper clock of the reference frame K’),
and that the time $dt_0$ is the proper time which has elapsed between the two
events. Since both events occur at the same point, dx' = dy' = dz' = 0, and the
interval between the two events is by definition equal to

$$ ds = \sqrt{c^2 (dt_0)^2 - (dx')^2 - (dy')^2 - (dz')^2} = c\,dt_0 . $$

Thus, the proper time is connected with the interval by the relation

$$ dt_0 = c^{-1}\,ds \qquad (9.3) $$


and is invariant. 

The proper time can be expressed in terms of the time dt in an arbitrary
reference frame, i.e. in terms of the time measured by a clock moving with
respect to K’ with a velocity (-v), by substituting the expression for ds into
(9.3):

$$ dt_0 = \frac{1}{c} \sqrt{c^2\,dt^2 - dx^2 - dy^2 - dz^2} = dt \sqrt{1 - \frac{dx^2 + dy^2 + dz^2}{c^2\,dt^2}} = dt \sqrt{1 - \frac{v^2}{c^2}} . \qquad (9.4) $$
c2 


The finite interval of the proper time, $t_0$, is equal to

$$ t_0 = \int_0^t \sqrt{1 - \frac{v^2}{c^2}}\; dt . \qquad (9.5) $$


It should be emphasized that formula (9.5) is derived for the case of the 
motion of the clock together with an inertial reference frame, i.e. a motion 
with a constant velocity. 

Formula (9.5) is often applied to accelerated motion, considering v as a
function of time. However, it should be borne in mind that an accelerated
motion of a reference frame cannot be considered in the special theory of
relativity. Hence, in the case of accelerated motion, the quantity $t_0$ determined
by formula (9.5) does not have the meaning of proper time but is a
convenient quantity, invariant under Lorentz transformations.
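The integral t₀ = ∫ sqrt(1 - v²/c²) dt can be evaluated numerically. The sketch below uses a simple midpoint rule; the function name, the number of steps and the units with c = 1 are assumptions of the example. For a constant velocity it reproduces t sqrt(1 - v²/c²); a time-dependent v(t) can be passed in as well, in which case the result is the formal quantity discussed above.

```python
import math


def proper_time(v_of_t, t_end, c=1.0, steps=10_000):
    """Midpoint-rule estimate of the integral of sqrt(1 - v(t)^2/c^2) from 0 to t_end."""
    h = t_end / steps
    total = 0.0
    for k in range(steps):
        v = v_of_t((k + 0.5) * h)
        total += math.sqrt(1.0 - v**2 / c**2) * h
    return total


if __name__ == "__main__":
    # Constant velocity 0.6 c during 10 time units of the frame K.
    print(proper_time(lambda t: 0.6, 10.0))
    # Formal application to a velocity that grows linearly from 0 to 0.6 c.
    print(proper_time(lambda t: 0.06 * t, 10.0))
```

In the second call the result lies between the values for v = 0 and v = 0.6c, as one would expect from the form of the integrand.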


§10. The invariance of physical laws under Lorentz transformations. 
Four-dimensional formulation of the theory of relativity 


According to the principle of relativity all physical laws — the laws of 
mechanics, electrodynamics, statistical physics and so on — must be the same 
in all inertial reference frames. This means that all physical laws must be for- 
mulated in such a way that they may remain invariant under Lorentz trans- 
formations. In the following the relations which are invariant under Lorentz 
transformations will be called relativistically invariant or Lorentz-invariant 
relations. 

The equations of mechanics which are invariant under Galilean transforma- 
tions obviously do not satisfy the requirement of invariance under Lorentz 
transformations and, consequently, their form must be altered. On the contrary,
the laws of electrodynamics (i.e. Maxwell’s equations, as will be
shown later) were formulated from the very beginning in such a way that they
turned out to be relativistically invariant. From the point of view of the
theory of relativity, Maxwell’s equations are a model of “a correctly formulated”
physical law.

The realization of the general aims of the theory of relativity — i.e. finding 
relativistically invariant forms of physical laws — had a great effect on the 
overall further development of physics. Thus, for example, the progress made 
in the course of the last several decades in quantum mechanics and, in par- 
ticular, in the quantum theory of fields, has been closely associated with the 
fulfilment of the requirements of the theory of relativity. 

It should be noted that the requirement of the invariance of physical laws 
under certain transformations of reference frames is not a specific feature of 
the theory of relativity. It is well known that the requirement of invariance 
of physical laws under a rotation of the reference frame is associated with the 
isotropy of space. 

Indeed, every physical law is formulated in such a way that the quantities 
involved in it refer to a certain system of coordinate axes. It is clear that the 
content of a physical law cannot depend on the orientation of the coordinate 
axes in space. For example, the Newtonian equations 


$$ m\ddot{x} = F_x , \qquad m\ddot{y} = F_y , \qquad m\ddot{z} = F_z \qquad (10.1) $$




do not depend on the orientation in space of the system of coordinate axes 
(x, y, z) to which the components of forces and accelerations refer. The 
equations of motion remain unchanged under any rotation of the axes of 
this reference frame: under rotation, every one of the components of the 
acceleration and force transforms according to one and the same law, so that 
the equalities (10.1) are not violated. This property of the invariance of 
physical laws under rotations of the reference frame can be formulated more 
precisely in the following way. In classical physics all physical laws are for- 
mulated in the form of equalities of the type 


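$$ \alpha = \beta \qquad (10.2) $$

or

$$ \boldsymbol{\alpha} = \boldsymbol{\beta} , \qquad (10.3) $$

$$ \alpha_{ik} = \beta_{ik} . \qquad (10.4) $$

The first of these contains the relation between scalar quantities, which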
remain unchanged under rotation of the coordinate axes. The second one 
relates vector quantities, which change under rotation of the coordinate axes. 
That is to say, if for simplicity of notation one restricts oneself to the rotation
about the z-axis through an angle φ, then the x-component and y-component
of vectors change according to the well-known formulae of analytical geometry:

$$ x = x' \cos\varphi - y' \sin\varphi , $$
$$ y = x' \sin\varphi + y' \cos\varphi . \qquad (10.5) $$


Since, however, the components of any vectors, in particular those of the 
acceleration and force vectors, change according to this law, the equality 
(10.3) (or the equality (10.1) which is a particular case of it) is not violated.

The equality (10.4) shows in a general form that the quantities on its two
sides must be of the same tensor order.

Thus, every physical law must be formulated in such a way that it will 
contain only quantities of the same tensor order. In classical mechanics the 
laws of transformation of coordinates which must leave physical laws un- 
changed reduce to the following: 

1. Invariance under Galilean transformations.


2. Invariance under spatial translations and rotations of the coordinate
frame.


3. Invariance under the replacement of t by (t + τ'), expressing the uniformity
of the flow of time.

4. Invariance under the change of the sign of time t → -t, indicating the
reversibility of the laws of mechanics, which are symmetric with respect to 
the future and past. 

The theory of relativity puts forward, instead of condition 1, the more
general requirement of the invariance of physical laws under Lorentz trans- 
formations. We shall often call this invariance relativistic invariance. 

The conditions 2—4 are also preserved in the theory of relativity. 

At first sight it might seem that the diversity of physical laws and the 
physical quantities occurring in them rule out a general method of approach 
to the establishment of their relativistically invariant formulations. In practice, 
however, this is not so. 

In order to find a general method of working out relativistically invariant 
expressions we again turn to the expression for the interval ds. 

We introduce a completely formal quantity 


$$ \tau = ict , \qquad (10.6) $$

which will be called the fourth coordinate or the imaginary time. It goes
without saying that τ, as an imaginary quantity, has no direct physical meaning.
By means of the imaginary time the interval can be written in the form 


$$ -ds^2 = dx^2 + dy^2 + dz^2 + d\tau^2 . \qquad (10.7) $$


In this notation Lorentz transformations take on a new interpretation. 

We assume, without going into the physical content of our treatment for 
the present, that the quantities x, y, z, τ are orthogonal coordinates in a
certain imaginary four-dimensional space. 

The Lorentz transformation represents a linear transformation of the four
coordinates (x, y, z, τ) which leaves the quantity ds² unchanged. It is easy to
see that, from the geometrical point of view, the quantity ds² represents (up to
sign) the square of the distance between two points in a four-dimensional space.
Consequently, the Lorentz transformation is a linear transformation which leaves
the distance between two arbitrary points in this space unchanged. From
geometry it is known that there are only two such linear transformations:
parallel displacement and rotation. Parallel displacement represents a trivial
transformation which amounts to the change of the origin of the system of
the coordinates x, y, z, τ. Hence the only linear transformation leaving the
value of the interval unchanged is a rotation in the four-dimensional space
(x, y, z, τ).

In what follows we shall confirm this conclusion by a direct calculation. 

Such a geometrical interpretation of the Lorentz transformation, which is 
due to Minkowski, allows one to draw conclusions directly about the rela- 
tivistically invariant form of physical laws. 

Namely, in order that an expression may be relativistically invariant it 
must have the form 


$$ a = b , \qquad (10.8) $$

where a and b are scalars, or

$$ a_\alpha = b_\alpha , \qquad (10.9) $$

where $a_\alpha$ and $b_\alpha$ are four-dimensional vectors having four components
(α = x, y, z, τ) and, in general,

$$ a_{\alpha\beta\gamma\ldots} = b_{\alpha\beta\gamma\ldots} , \qquad (10.10) $$

where $a_{\alpha\beta\gamma\ldots}$ and $b_{\alpha\beta\gamma\ldots}$ are four-dimensional tensors of an arbitrary rank.
Under rotations of coordinate axes in the space (x, y, z, τ) all quantities involved
in the relativistically invariant expression transform according to one
and the same law, so that equalities of the type (10.8)—(10.10) are not 
violated. These conditions of invariance in the case of the Minkowski four- 
dimensional space represent a direct analogue of the conditions of invariance 
under a rotation of the reference frame in the real three-dimensional space. 

It should be stressed that the introduction of the idea of a four-dimensional
space with coordinates x, y, z, τ has a formal character. It is in no way
equivalent to the statement of the existence of a real space with four dimensions.

The time coordinate τ is a purely imaginary quantity, which underlines its
special character and basic difference from the space coordinates x, y, z.
Nevertheless, the introduction of the time coordinate τ has a profound
physical meaning. It points to the indissoluble connection between space and 
time, about which we have already spoken. 

Later we shall have to reduce a number of important physical laws and 
relations to a relativistically invariant form. For this it is necessary to find 
their four-dimensional generalization. How this is to be done will be seen by 
concrete examples. 




In the meantime we shall verify the fact that the rotation transformation 
in the four-dimensional space (x, y, z, τ) is identical with the Lorentz transformation.
For simplicity of notation we assume, as before, that the motion
of inertial reference frames is performed in the direction of matched axes x
and x’. In a four-dimensional interpretation this corresponds to a rotation in
the plane (x, τ), the orientation of the axes y, z remaining unchanged.

If the rotation angle is denoted by φ, then by analogy with (10.5) one can
write the relation between the initial coordinates (x, τ) and the transformed
coordinates (x’, τ’):

$$ x = x' \cos\varphi - \tau' \sin\varphi , \qquad (10.11) $$
$$ \tau = \tau' \cos\varphi + x' \sin\varphi . \qquad (10.12) $$


The rotation angle φ must, obviously, be different for different values of the
velocity v. We write the transformations (10.11) and (10.12) for the τ’-axis
of the reference frame K’, i.e. for points which have x’ = 0. Then, obviously,

$$ x = -\tau' \sin\varphi , $$
$$ \tau = \tau' \cos\varphi . $$

Dividing the upper expression by the lower one, we obtain

$$ \tan\varphi = -x/\tau , $$

or

$$ \tan\varphi = \frac{ix}{ct} = \frac{iv}{c} , \qquad (10.13) $$


where v is the velocity of uniform motion of the origin of the coordinate 
frame K’ (the point x’ = 0) with respect to the reference frame K. From the 
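equality (10.13) one can easily find the values of the quantities sin φ and
cos φ involved in formulae (10.11) and (10.12):

$$ \cos\varphi = \frac{1}{\sqrt{1 + \tan^2\varphi}} = \frac{1}{\sqrt{1 - v^2/c^2}} , $$

$$ \sin\varphi = \tan\varphi \cos\varphi = \frac{i\,v/c}{\sqrt{1 - v^2/c^2}} . $$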







Hence

$$ x = \frac{x' - i\,\dfrac{v}{c}\,\tau'}{\sqrt{1 - v^2/c^2}} , \qquad (10.14) $$

$$ \tau = \frac{\tau' + i\,\dfrac{v}{c}\,x'}{\sqrt{1 - v^2/c^2}} . \qquad (10.15) $$

Passing over from τ to a time t, we see that formulae (10.14) and (10.15) are
the same as the Lorentz transformation.
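The identity of the rotation through an imaginary angle with the Lorentz transformation can be checked with complex arithmetic. In the sketch below the angle is obtained from tan φ = iv/c by means of Python's `cmath.atan`; the function names and the units with c = 1 are assumptions of the example.

```python
import cmath
import math


def rotation_angle(v, c=1.0):
    """Imaginary rotation angle defined by tan(phi) = i v / c."""
    return cmath.atan(1j * v / c)


def rotate_x_tau(x_prime, t_prime, v, c=1.0):
    """Rotate (x', tau') with tau' = i c t' in the (x, tau) plane and return (x, t)."""
    phi = rotation_angle(v, c)
    tau_prime = 1j * c * t_prime
    x = x_prime * cmath.cos(phi) - tau_prime * cmath.sin(phi)
    tau = tau_prime * cmath.cos(phi) + x_prime * cmath.sin(phi)
    return x, tau / (1j * c)


def lorentz(x_prime, t_prime, v, c=1.0):
    """Lorentz transformation from K' to K written with real quantities."""
    root = math.sqrt(1.0 - v**2 / c**2)
    return (x_prime + v * t_prime) / root, (t_prime + v * x_prime / c**2) / root


if __name__ == "__main__":
    print(rotation_angle(0.6))            # a purely imaginary angle
    print(rotate_x_tau(2.0, 3.0, 0.6))    # complex numbers with vanishing imaginary parts
    print(lorentz(2.0, 3.0, 0.6))
```

The rotated coordinates come out with zero imaginary part (to rounding accuracy) and coincide with the real Lorentz-transformed values.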

It is not out of place to stress the conditional character of the graphical
representation of Lorentz transformations. The rotation angle in fig. II.2 is
an imaginary angle. Of course, we cannot picture a rotation through an
imaginary angle. The merits and shortcomings of the graphical representation
of Lorentz transformations are clearly seen from the following. Let a clock
be at rest at a certain point x’ in the reference frame K’. This physical event
at a moment $\tau'_1$ is represented by the first point, and at a moment $\tau'_2$ by the
second point on the axis τ’. The time interval Δτ’ is equal to the length of the
section from point 1 to point 2. In passing over to the reference frame K
(rotation through an angle φ) the segment Δτ’ turns into a segment Δτ on the
axis τ. We see clearly that it makes no sense to speak about which one of the
reference frames is more correct and which one of the time intervals Δτ and
Δτ’ is the true time interval between the two physical events. A shortcoming
of the geometrical consideration is the fact that the interrelation between
Δτ and Δτ’ in the drawing is the reverse of the true one: in the drawing Δτ is
smaller than Δτ’, whereas in reality it is larger in the ratio $1/\sqrt{1 - v^2/c^2}$.
The misrepresentation is due to the fact that we cannot show in the drawing
the imaginary value of the angle φ and replace it by a real angle.

Fig. II.2

The length transformation (fig. II.3) and the Einstein theorem of addition
of velocities allow an analogous geometrical interpretation. To the addition
of velocities $v_1$ and $v_2$ there correspond two successive rotations in the plane
(x, τ): to the rotation through an angle $\varphi_1$ there corresponds the transition
from the reference frame K to the reference frame K’ moving with the
velocity $v_1$ with respect to K; to the rotation through an angle $\varphi_2$ there
corresponds the transition from the reference frame K’ to a reference frame K”
moving with the velocity $v_2$ with respect to K’. Consequently, to the transition
from K to K”, i.e. to the addition of velocities $v_1$ and $v_2$, there corresponds
the rotation in the plane (x, τ) through an angle

$$ \varphi = \varphi_1 + \varphi_2 . $$

Velocities $v_1$ and $v_2$ are connected with rotation angles $\varphi_1$ and $\varphi_2$ by the
relation (10.13), which involves the tangent of the rotation angle. To an angle
φ there corresponds a tan φ given by

$$ \tan\varphi = \tan(\varphi_1 + \varphi_2) = \frac{\tan\varphi_1 + \tan\varphi_2}{1 - \tan\varphi_1 \tan\varphi_2} . $$

Substituting the values of tan φ, $\tan\varphi_1$ and $\tan\varphi_2$ from (10.13) into the
above relation, it is easy to arrive at the Einstein theorem of addition of
velocities.
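This substitution can be reproduced numerically: the rotation angles are added and the resulting velocity is read off from tan φ = iv/c. The sketch below compares the result with the addition law for collinear velocities, (v₁ + v₂)/(1 + v₁v₂/c²); the function names and the units with c = 1 are assumptions of the example.

```python
import cmath


def angle(v, c=1.0):
    """Imaginary rotation angle with tan(phi) = i v / c."""
    return cmath.atan(1j * v / c)


def compose_by_rotation(v1, v2, c=1.0):
    """Velocity corresponding to the sum of the two rotation angles."""
    phi = angle(v1, c) + angle(v2, c)
    # tan(phi) = i v / c, so v = c tan(phi) / i; the result is real up to rounding.
    return (c * cmath.tan(phi) / 1j).real


def einstein_sum(v1, v2, c=1.0):
    """Addition of two velocities directed along the same axis."""
    return (v1 + v2) / (1.0 + v1 * v2 / c**2)


if __name__ == "__main__":
    for v1, v2 in [(0.5, 0.5), (0.9, 0.9), (0.1, -0.3)]:
        print(v1, v2, compose_by_rotation(v1, v2), einstein_sum(v1, v2))
```

Since the angles simply add, successive transitions between frames moving along the same axis can be chained in any order with the same result.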


§11. Four-dimensional vectors and tensors. 
Four-dimensional velocity and acceleration 


We now pass on to four-dimensional vectors which, according to the 
results of the preceding paragraph, must figure in relativistically invariant 
expressions. 




First of all we introduce the four-dimensional radius vector $r_\alpha$ (α = 1, 2, 3,
4) whose projections onto mutually orthogonal coordinate axes are equal to
x, y, z, τ.

In the Lorentz transformation (a rotation in the four-dimensional space)
the components of the vector $r_\alpha$ transform according to the law

$$ r_\alpha = \gamma_{\alpha\beta}\, r'_\beta , $$

such that the square of the vector $r_\alpha$ remains invariant:

$$ r_\alpha^2 = x^2 + y^2 + z^2 + \tau^2 = r'^2_\alpha = \text{const} . $$
For this the coefficients of the Lorentz transformation must obviously satisfy
the requirement

$$ \gamma_{\alpha\beta}\, \gamma_{\alpha\gamma} = \delta_{\beta\gamma} , $$

often called the condition of orthogonality. Indeed, we have

$$ r_\alpha^2 = (\gamma_{\alpha\beta}\, r'_\beta)(\gamma_{\alpha\gamma}\, r'_\gamma) = \gamma_{\alpha\beta}\, \gamma_{\alpha\gamma}\, r'_\beta r'_\gamma = \delta_{\beta\gamma}\, r'_\beta r'_\gamma = r'^2_\beta . $$

This requirement imposes a necessary restriction upon the coefficients of the
transformation $\gamma_{\alpha\beta}$. In the particular case where the Lorentz transformation
corresponds to a rotation in the plane (x, τ) with unchanged values of y and
z, making use of (10.14) and (10.15) one can easily write for $\gamma_{\alpha\beta}$


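$$ \gamma_{\alpha\beta} = \begin{pmatrix}
\dfrac{1}{\sqrt{1 - v^2/c^2}} & 0 & 0 & \dfrac{-i\,v/c}{\sqrt{1 - v^2/c^2}} \\
0 & 1 & 0 & 0 \\
0 & 0 & 1 & 0 \\
\dfrac{i\,v/c}{\sqrt{1 - v^2/c^2}} & 0 & 0 & \dfrac{1}{\sqrt{1 - v^2/c^2}}
\end{pmatrix} \qquad (11.1) $$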
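The condition of orthogonality can be verified directly for the matrix of a rotation in the (x, τ)-plane. The sketch below builds the matrix with rows and columns ordered as (x, y, z, τ), following the standard form implied by (10.14) and (10.15), and evaluates the sums γ_αβ γ_αγ without complex conjugation. The function names and the units with c = 1 are assumptions of the example.

```python
import math


def gamma_matrix(v, c=1.0):
    """Coefficients of the rotation in the (x, tau)-plane, ordered (x, y, z, tau)."""
    root = math.sqrt(1.0 - v**2 / c**2)
    beta = v / c
    return [
        [1.0 / root, 0.0, 0.0, -1j * beta / root],
        [0.0, 1.0, 0.0, 0.0],
        [0.0, 0.0, 1.0, 0.0],
        [1j * beta / root, 0.0, 0.0, 1.0 / root],
    ]


def orthogonality_defect(g):
    """Largest deviation of sum_a g[a][b] * g[a][c] from the Kronecker delta."""
    worst = 0.0
    for b in range(4):
        for c_index in range(4):
            total = sum(g[a][b] * g[a][c_index] for a in range(4))
            target = 1.0 if b == c_index else 0.0
            worst = max(worst, abs(total - target))
    return worst


def transform(g, a_prime):
    """Components a_alpha = gamma_alpha_beta * a'_beta of a 4-vector."""
    return [sum(g[a][b] * a_prime[b] for b in range(4)) for a in range(4)]


if __name__ == "__main__":
    g = gamma_matrix(0.6)
    print(orthogonality_defect(g))
    # Radius 4-vector of the event x' = 2, t' = 3, i.e. tau' = 3i for c = 1.
    print(transform(g, [2.0, 0.0, 0.0, 3.0j]))
```

The products are formed without complex conjugation, in accordance with the definition of the square of a vector in the space (x, y, z, τ).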







We generalize the definition of the vector $r_\alpha$ to the case of an arbitrary
4-vector. By definition, a set of quantities (components) $a_x, a_y, a_z, a_\tau$ which
under a Lorentz transformation (i.e. a rotation of the axes in the four-
dimensional space) transform according to the same law as the components of
the radius vector $r_\alpha$, i.e. according to the law

$$ a_\alpha = \gamma_{\alpha\beta}\, a'_\beta , $$

is called a 4-vector $a_\alpha$. If one restricts oneself to rotations in the (x, τ)-plane,
then, making use of the definition of $\gamma_{\alpha\beta}$ (or the analogy with (10.14) and
(10.15)), one can write the transformation law in the form

$$ a_x = \frac{a'_x - i\,\dfrac{v}{c}\,a'_\tau}{\sqrt{1 - v^2/c^2}} , \qquad (11.2) $$

$$ a_y = a'_y , \qquad a_z = a'_z , \qquad (11.3) $$

$$ a_\tau = \frac{a'_\tau + i\,\dfrac{v}{c}\,a'_x}{\sqrt{1 - v^2/c^2}} . \qquad (11.4) $$


For four-dimensional vectors, as for three-dimensional ones, one can intro- 
duce the idea of the scalar product 


$$ c = a_\alpha b_\alpha = a_x b_x + a_y b_y + a_z b_z + a_\tau b_\tau , $$

where c is a scalar quantity. The vectors $a_\alpha$ and $b_\alpha$ are called orthogonal if
their scalar product is equal to zero. We do not consider other vectorial 
algebraic operations, since we shall not need them in what follows. 

An important characteristic of a 4-vector is the scalar corresponding to the
square of the vector *, i.e. to the scalar product

$$a_\alpha^2 = a_\alpha a_\alpha = a_x^2 + a_y^2 + a_z^2 + a_\tau^2 = \text{invar} . \qquad (11.5)$$

* Here and in what follows we make use of an abbreviated notation for summation.
Repeated indices imply summation, for example

$$a_\alpha b_\alpha \equiv \sum_\alpha a_\alpha b_\alpha , \qquad a_\alpha^2 \equiv \sum_\alpha a_\alpha a_\alpha .$$


The invariance of $a_\alpha^2$ is seen directly from geometrical considerations: Lorentz
transformations are a rotation in the four-dimensional space. The square of a
4-vector, like the square of the radius 4-vector $r_\alpha$, is not necessarily
positive. If the square of the vector $a_\alpha^2 > 0$, then $a_\alpha$ is called a space-like
vector. The vector for which $a_\alpha^2 < 0$ is called a time-like vector.

We shall consider the determination of two important 4-vectors: the 4- 
vector of velocity and the 4-vector of acceleration. 

We have to find a 4-vector of velocity in the form of the derivative of the
radius 4-vector with respect to a certain invariant, i.e. a scalar. The choice
of this scalar is determined by the fact that, at small velocities v << c, the 
spatial components of the 4-vector of velocity must go over into the compo- 
nents of the ordinary velocity. 

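Because of this, it is natural to define the 4-vector of velocity by the relation

$$u_\alpha = \frac{dr_\alpha}{dt_0} , \qquad (11.6)$$

$$u_x = \frac{dx}{dt_0} = \frac{v_x}{\sqrt{1-\frac{v^2}{c^2}}} , \qquad (11.7)$$

$$u_y = \frac{v_y}{\sqrt{1-\frac{v^2}{c^2}}} , \qquad (11.8)$$

$$u_z = \frac{v_z}{\sqrt{1-\frac{v^2}{c^2}}} , \qquad (11.9)$$

$$u_\tau = \frac{ic}{\sqrt{1-\frac{v^2}{c^2}}} . \qquad (11.10)$$
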


For $v \ll c$ the three spatial components of the velocity are the same as the
components of the usual three-dimensional velocity. The fourth component
$u_\tau$ of the velocity is purely imaginary. Except for the factor $(ic)$ it represents
the coefficient of transition from the proper time $dt_0$ to the time $dt$.

An important feature of the 4-vector of velocity is the fact that its components
are not independent of each other. Indeed, squaring the vector $u_\alpha$, we
find

$$u_\alpha^2 = u_x^2 + u_y^2 + u_z^2 + u_\tau^2 = -c^2 . \qquad (11.11)$$


Thus, the 4-vector of velocity is a time-like vector, and its absolute value is a
constant. This property of the 4-vector is associated, of course, with the fact
that the velocity of motion of material bodies cannot exceed the velocity of
light.
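The constancy of the square of the 4-velocity is easy to confirm numerically. The sketch below (Python, units with $c = 1$ chosen only for convenience; the names `four_velocity` and `square` are our own) builds the components (11.7)-(11.10) for a slow and a fast particle and forms the sum of their squares.

```python
import math

def four_velocity(v, c=1.0):
    """Components (u_x, u_y, u_z, u_tau) of (11.7)-(11.10) for a 3-velocity v."""
    speed_sq = sum(component * component for component in v)
    g = 1.0 / math.sqrt(1.0 - speed_sq / c**2)
    return [g * v[0], g * v[1], g * v[2], 1j * c * g]

def square(vec):
    # Sum of squares without complex conjugation, as in (11.5).
    return sum(component * component for component in vec)

c = 1.0
slow = square(four_velocity((0.001, 0.0, 0.0), c))
fast = square(four_velocity((0.3, -0.4, 0.5), c))
```

Both values should coincide with $-c^2$ up to rounding, whatever the velocity.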
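Now we define the four-dimensional acceleration $w_\alpha$ as

$$w_\alpha = \frac{du_\alpha}{dt_0} . \qquad (11.12)$$

Expressing $w_\alpha$ in terms of the velocity $\mathbf{v}$ and the acceleration $\dot{\mathbf{v}}$, we find its
components:

$$w_x = \frac{du_x}{dt}\frac{dt}{dt_0} = \frac{1}{\sqrt{1-\frac{v^2}{c^2}}}\,\frac{d}{dt}\frac{v_x}{\sqrt{1-\frac{v^2}{c^2}}} = \frac{\dot{v}_x}{1-\frac{v^2}{c^2}} + \frac{v_x(\mathbf{v}\cdot\dot{\mathbf{v}})}{c^2\left(1-\frac{v^2}{c^2}\right)^2} , \qquad (11.13)$$

$$w_y = \frac{\dot{v}_y}{1-\frac{v^2}{c^2}} + \frac{v_y(\mathbf{v}\cdot\dot{\mathbf{v}})}{c^2\left(1-\frac{v^2}{c^2}\right)^2} , \qquad (11.14)$$

$$w_z = \frac{\dot{v}_z}{1-\frac{v^2}{c^2}} + \frac{v_z(\mathbf{v}\cdot\dot{\mathbf{v}})}{c^2\left(1-\frac{v^2}{c^2}\right)^2} , \qquad (11.15)$$

$$w_\tau = \frac{du_\tau}{dt_0} = \frac{1}{\sqrt{1-\frac{v^2}{c^2}}}\,\frac{d}{dt}\frac{ic}{\sqrt{1-\frac{v^2}{c^2}}} = \frac{i(\mathbf{v}\cdot\dot{\mathbf{v}})}{c\left(1-\frac{v^2}{c^2}\right)^2} . \qquad (11.16)$$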




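A simple calculation shows that the square of the four-dimensional acceleration
is equal to

$$w_\alpha^2 = \frac{\dot{v}^2 - \frac{1}{c^2}\left[v^2\dot{v}^2 - (\mathbf{v}\cdot\dot{\mathbf{v}})^2\right]}{\left(1-\frac{v^2}{c^2}\right)^3} = \frac{\dot{v}^2 - \frac{1}{c^2}[\mathbf{v}\times\dot{\mathbf{v}}]^2}{\left(1-\frac{v^2}{c^2}\right)^3} .$$
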

Thus the 4-acceleration is a space-like vector. 
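Differentiating the equality (11.11) with respect to $t_0$, we find

$$\frac{d}{dt_0}\left(u_\alpha u_\alpha\right) = 2u_\alpha\frac{du_\alpha}{dt_0} = 0 , \qquad (11.17)$$

or

$$u_\alpha\frac{du_\alpha}{dt_0} = u_\alpha w_\alpha = 0 . \qquad (11.18)$$

This last equality means that the vectors $u_\alpha$ and $w_\alpha$ are orthogonal to each
other in the four-dimensional space.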
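Both statements, the orthogonality of $u_\alpha$ and $w_\alpha$ and the closed expression for $w_\alpha^2$, can be verified for arbitrary test values. The following Python sketch is only an illustration: the sample velocity and acceleration, the unit choice $c = 1$ and the helper names are assumptions of ours.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def four_velocity(v, c=1.0):
    g = 1.0 / math.sqrt(1.0 - dot(v, v) / c**2)
    return [g * v[0], g * v[1], g * v[2], 1j * c * g]

def four_acceleration(v, a, c=1.0):
    """Components (11.13)-(11.16); a is the ordinary acceleration dv/dt."""
    k = 1.0 - dot(v, v) / c**2
    va = dot(v, a)
    spatial = [a[i] / k + v[i] * va / (c**2 * k**2) for i in range(3)]
    return spatial + [1j * va / (c * k**2)]

c = 1.0
v = [0.3, -0.4, 0.5]     # sample velocity, |v| < c
a = [1.0, 2.0, -0.7]     # sample acceleration
u = four_velocity(v, c)
w = four_acceleration(v, a, c)

u_dot_w = dot(u, w)                      # should vanish, eq. (11.18)
w_squared = dot(w, w)                    # square of the 4-acceleration
k = 1.0 - dot(v, v) / c**2
closed_form = (dot(a, a) - dot(cross(v, a), cross(v, a)) / c**2) / k**3
```

Since $[\mathbf{v}\times\dot{\mathbf{v}}]^2 \le v^2\dot{v}^2 < c^2\dot{v}^2$, the closed form is positive, in agreement with the space-like character of the 4-acceleration.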

In addition to the definition of 4-vectors one can introduce 4-tensors. The
set of quantities $A_{\alpha\beta}$, which transform as the product of two vectors $a_\alpha b_\beta$, i.e.

$$A'_{\alpha\beta} = \gamma_{\alpha\lambda}\gamma_{\beta\mu}A_{\lambda\mu} , \qquad (11.19)$$

is called a second-rank 4-tensor. The tensor $A_{\alpha\beta}$ represents a set of sixteen
quantities (a second-rank tensor in the three-dimensional space represents a
set of nine quantities). As in the case of a 4-vector, the $\tau$-components of a
4-tensor, i.e. the quantities $A_{\alpha\tau}$, are purely imaginary, whereas all remaining
components are real. Any tensor $A_{\alpha\beta}$ can be resolved into a symmetric part
and an antisymmetric part (with respect to the permutation of subscripts).
Writing

$$A_{\alpha\beta} = \tfrac{1}{2}(A_{\alpha\beta} + A_{\beta\alpha}) + \tfrac{1}{2}(A_{\alpha\beta} - A_{\beta\alpha}) = A^{s}_{\alpha\beta} + A^{as}_{\alpha\beta} ,$$

it is easily seen that

$$A^{s}_{\alpha\beta} = A^{s}_{\beta\alpha} , \qquad A^{as}_{\alpha\beta} = -A^{as}_{\beta\alpha} .$$

Lorentz transformations leave this decomposition unchanged.




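Purely antisymmetric tensors will play an important role in what follows.
From the condition of antisymmetry it follows that such tensors have six
independent components and can be written in the form of a table:

$$A^{as}_{\alpha\beta} = \begin{pmatrix}
0 & A_{xy} & A_{xz} & A_{x\tau} \\
-A_{xy} & 0 & A_{yz} & A_{y\tau} \\
-A_{xz} & -A_{yz} & 0 & A_{z\tau} \\
-A_{x\tau} & -A_{y\tau} & -A_{z\tau} & 0
\end{pmatrix} \qquad (11.20)$$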


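Making use of the properties of the quantities $\gamma_{\alpha\beta}$ (11.1) one can find the law
of transformation of the components of a 4-tensor. We restrict ourselves to the
case of an antisymmetric tensor and write one of the components in detail:

$$A'_{xy} = \gamma_{x\alpha}\gamma_{y\beta}A_{\alpha\beta} = \gamma_{xx}\gamma_{yy}A_{xy} + \gamma_{x\tau}\gamma_{yy}A_{\tau y} = \frac{A_{xy} + i\frac{v}{c}A_{\tau y}}{\sqrt{1-\frac{v^2}{c^2}}} .$$

The other terms reduce to zero by virtue of the definition of $\gamma_{\alpha\beta}$ and $A_{\alpha\beta}$.
The law of transformation of the remaining components is easily found in
the same way. As a result, for the transformation of the components of the
antisymmetric tensor we obtain

$$A'_{xy} = \frac{A_{xy} - i\frac{v}{c}A_{y\tau}}{\sqrt{1-\frac{v^2}{c^2}}} , \qquad A'_{xz} = \frac{A_{xz} - i\frac{v}{c}A_{z\tau}}{\sqrt{1-\frac{v^2}{c^2}}} ,$$

$$A'_{y\tau} = \frac{A_{y\tau} + i\frac{v}{c}A_{xy}}{\sqrt{1-\frac{v^2}{c^2}}} , \qquad A'_{z\tau} = \frac{A_{z\tau} + i\frac{v}{c}A_{xz}}{\sqrt{1-\frac{v^2}{c^2}}} ,$$

$$A'_{x\tau} = A_{x\tau} , \qquad A'_{yz} = A_{yz} . \qquad (11.21)$$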
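The component formulae for an antisymmetric tensor can be compared with a direct application of the general law (11.19). The Python sketch below is illustrative only: the numerical entries of the tensor are arbitrary, the $\tau$-components are taken purely imaginary, and the helper names (`lorentz_matrix`, `transform_tensor`, `antisymmetric`) are assumptions of ours.

```python
import math

def lorentz_matrix(beta):
    """Coefficients gamma_{alpha beta} for a rotation in the (x, tau)-plane; beta = v/c."""
    g = 1.0 / math.sqrt(1.0 - beta**2)
    return [[g, 0, 0, 1j * beta * g],
            [0, 1, 0, 0],
            [0, 0, 1, 0],
            [-1j * beta * g, 0, 0, g]]

def transform_tensor(gamma, A):
    """A'_{ab} = gamma_{al} gamma_{bm} A_{lm}, the law (11.19)."""
    return [[sum(gamma[a][l] * gamma[b][m] * A[l][m]
                 for l in range(4) for m in range(4))
             for b in range(4)] for a in range(4)]

def antisymmetric(xy, xz, xt, yz, yt, zt):
    """Table of an antisymmetric tensor built from its six independent components."""
    return [[0, xy, xz, xt],
            [-xy, 0, yz, yt],
            [-xz, -yz, 0, zt],
            [-xt, -yt, -zt, 0]]

X, Y, Z, T = 0, 1, 2, 3
beta = 0.6
g = 1.0 / math.sqrt(1.0 - beta**2)

A = antisymmetric(1.0, -2.0, 0.5j, 3.0, -1.5j, 2.5j)
A_prime = transform_tensor(lorentz_matrix(beta), A)

expected_xy = g * (A[X][Y] - 1j * beta * A[Y][T])
expected_ytau = g * (A[Y][T] + 1j * beta * A[X][Y])
```

The components $A_{x\tau}$ and $A_{yz}$ should come out unchanged, and the transformed table should remain antisymmetric.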










The invariants of 4-tensors, i.e. combinations remaining invariant under Lorentz
transformations, are an important characteristic of 4-tensors. For antisymmetric
tensors of the second rank, which we shall need in what follows,
the invariants are scalar quantities:

1) the product $A_{\alpha\beta}A_{\beta\alpha}$ (quadratic invariant),

2) the product $A_{\alpha\beta}A_{\beta\mu}A_{\mu\alpha}$ (cubic invariant),

3) the product $A_{\alpha\beta}A_{\beta\mu}A_{\mu\nu}A_{\nu\alpha}$ (biquadratic invariant).

One can verify the invariance of these quantities by the direct substitution
of the transformation law (11.19) and the values of the quantities $\gamma_{\alpha\beta}$. Other
scalar relations can either be expressed in terms of these three invariants or
are equal to zero. Thus, for example, the scalar $A_{\alpha\alpha}$, representing the sum of
diagonal elements, is equal to zero for antisymmetric tensors.
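The direct substitution mentioned above can also be carried out numerically. In the sketch below (Python) the quadratic and biquadratic invariants are read as closed chains of contractions, i.e. as traces of the second and fourth powers of the table of components; this reading, the sample tensor and all helper names are assumptions made for illustration.

```python
import math

def lorentz_matrix(beta):
    g = 1.0 / math.sqrt(1.0 - beta**2)
    return [[g, 0, 0, 1j * beta * g],
            [0, 1, 0, 0],
            [0, 0, 1, 0],
            [-1j * beta * g, 0, 0, g]]

def matmul(P, Q):
    return [[sum(P[a][k] * Q[k][b] for k in range(4)) for b in range(4)]
            for a in range(4)]

def transpose(P):
    return [[P[b][a] for b in range(4)] for a in range(4)]

def trace(P):
    return sum(P[a][a] for a in range(4))

def invariants(A):
    """Trace of A, of A*A and of A*A*A*A (closed chains of contractions)."""
    A2 = matmul(A, A)
    return trace(A), trace(A2), trace(matmul(A2, A2))

A = [[0, 1.0, -2.0, 0.5j],
     [-1.0, 0, 3.0, -1.5j],
     [2.0, -3.0, 0, 2.5j],
     [-0.5j, 1.5j, -2.5j, 0]]

gamma = lorentz_matrix(0.6)
# A'_{ab} = gamma_{al} gamma_{bm} A_{lm} is the matrix product gamma * A * gamma^T.
A_prime = matmul(matmul(gamma, A), transpose(gamma))

before = invariants(A)
after = invariants(A_prime)
```

The first entry, the sum of the diagonal elements, vanishes in both frames, while the other two entries agree before and after the transformation.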








Relativistic Mechanics 


§12. The dynamical equations of a material point 


We now turn to the consideration of the dynamics of a material point in
the theory of relativity. 

As in classical mechanics, a material point is understood to be a body 
whose dimensions can be neglected. Having physical applications in mind, we 
shall often speak not about the motion of a material point but about the 
motion of a particle. 

First of all we note that the Newtonian law of inertia is invariant under 
Lorentz transformations. Indeed, if a particle moves without acceleration in 
an inertial reference frame K, then under a linear transformation of the coor- 
dinates to another reference frame K’ the motion will remain without accel- 
eration. However, the equations of dynamics, which are invariant under 
Galilean transformations, possess no invariance properties under Lorentz 
transformations. 

In order to find the relativistically-invariant form of the equations of 
dynamics it is necessary to represent them in the form of a four-dimensional 
relation of the type of (10.9). 

The inertial properties of a body or a particle may be characterized by a
certain scalar, the invariant mass or the rest mass $m$. The value of the rest
mass is a constant characteristic of every kind of elementary particle. The
4-momentum $p_\alpha$ of a particle is defined as follows:

$$p_\alpha = m u_\alpha . \qquad (12.1)$$


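For the components we have:

$$p_x = \frac{mv_x}{\sqrt{1-\frac{v^2}{c^2}}} , \quad p_y = \frac{mv_y}{\sqrt{1-\frac{v^2}{c^2}}} , \quad p_z = \frac{mv_z}{\sqrt{1-\frac{v^2}{c^2}}} , \quad p_\tau = \frac{icm}{\sqrt{1-\frac{v^2}{c^2}}} . \qquad (12.2)$$

In the limiting case $v \ll c$ the three spatial components of the momentum go
over into the ordinary components of the particle momentum:

$$p_x = mv_x , \quad p_y = mv_y , \quad p_z = mv_z .$$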


A natural relativistic generalization of the equations of Newtonian dynamics
are the equations

$$\frac{dp_\alpha}{dt_0} = \frac{d}{dt_0}(m u_\alpha) = F_\alpha , \qquad (12.3)$$

where $F_\alpha$ is a four-dimensional vector called the four-dimensional force or
Minkowski force, and $\alpha$ runs through the values $x, y, z, \tau$. The relativistically-invariant
character of the eqs. (12.3) follows directly from what was said in
§11: the right-hand and left-hand side of (12.3) involve four-dimensional
vectors, which change in the same way under a four-dimensional rotation
(Lorentz transformation), and the scalar $m$ which does not change at all.

In the following we shall call the relations (12.3) the equations of rela- 
tivistic dynamics. 

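We write the components of these equations. We have:

$$\frac{dp_x}{dt_0} = \frac{1}{\sqrt{1-\frac{v^2}{c^2}}}\,\frac{d}{dt}\frac{mv_x}{\sqrt{1-\frac{v^2}{c^2}}} = F_x ,$$

or

$$\frac{d}{dt}\frac{mv_x}{\sqrt{1-\frac{v^2}{c^2}}} = F_x\sqrt{1-\frac{v^2}{c^2}} . \qquad (12.4)$$
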

For $v \ll c$ eq. (12.4) must go over into the ordinary Newtonian equation.

The left-hand side of formula (12.4) is the derivative of the momentum
with respect to ordinary time. We require that the right-hand side of eq.
(12.4) should be the component $\mathcal{F}_x$ of the ordinary force. Consequently, the
component $F_x$ of the 4-force is connected with that of the ordinary force $\mathcal{F}_x$
of classical mechanics by the relation

$$F_x\sqrt{1-\frac{v^2}{c^2}} = \mathcal{F}_x . \qquad (12.5)$$

Then formula (12.4) is written in the form

$$\frac{d}{dt}\frac{mv_x}{\sqrt{1-\frac{v^2}{c^2}}} = \mathcal{F}_x . \qquad (12.6)$$


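For $v \ll c$ formula (12.6) reduces to the Newtonian equation.
Analogous relations can be written for the two remaining spatial components:

$$\frac{d}{dt}\frac{mv_y}{\sqrt{1-\frac{v^2}{c^2}}} = \mathcal{F}_y , \qquad (12.7)$$

$$\frac{d}{dt}\frac{mv_z}{\sqrt{1-\frac{v^2}{c^2}}} = \mathcal{F}_z . \qquad (12.8)$$

Now we write the fourth component of the relation (12.3). Making use of
(11.10), we find

$$\frac{d}{dt}\frac{icm}{\sqrt{1-\frac{v^2}{c^2}}} = F_\tau\sqrt{1-\frac{v^2}{c^2}} . \qquad (12.9)$$
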


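To find the physical meaning of the component $F_\tau$ of the 4-force we multiply
(12.3) by $u_\alpha$ and sum over all components $\alpha$ ($\alpha = x, y, z, \tau$).

Obviously, by virtue of (11.18) we have

$$u_\alpha\frac{d(mu_\alpha)}{dt_0} = F_\alpha u_\alpha = 0 ,$$

or

$$F_x u_x + F_y u_y + F_z u_z + F_\tau u_\tau = 0 .$$

Substituting the values of $F_x$, $F_y$, $F_z$, $u_x$, $u_y$, $u_z$ and $u_\tau$, we have

$$\frac{\mathcal{F}_x v_x}{1-\frac{v^2}{c^2}} + \frac{\mathcal{F}_y v_y}{1-\frac{v^2}{c^2}} + \frac{\mathcal{F}_z v_z}{1-\frac{v^2}{c^2}} + F_\tau\frac{ic}{\sqrt{1-\frac{v^2}{c^2}}} = 0 .$$

From which it follows that

$$F_\tau = \frac{i}{c}\,\frac{\mathcal{F}_x v_x + \mathcal{F}_y v_y + \mathcal{F}_z v_z}{\sqrt{1-\frac{v^2}{c^2}}} .$$

The right-hand side of this equation contains the work done by the force
$\boldsymbol{\mathcal{F}}$ on the particle per unit time. Thus, the component $F_\tau$ of the 4-force turns
out to be connected with the work done by the three-dimensional force $\boldsymbol{\mathcal{F}}$:

$$F_\tau = \frac{i}{c}\,\frac{\boldsymbol{\mathcal{F}}\cdot\mathbf{v}}{\sqrt{1-\frac{v^2}{c^2}}} . \qquad (12.10)$$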
c T 
c2 
Making use of (12.10) we write (12.9) in the form

$$\frac{d}{dt}\frac{mc^2}{\sqrt{1-\frac{v^2}{c^2}}} = \boldsymbol{\mathcal{F}}\cdot\mathbf{v} . \qquad (12.11)$$

On the right-hand side of (12.11) the product $\boldsymbol{\mathcal{F}}\cdot\mathbf{v}$ gives the work done by the
force on the particle in unit time. Consequently, on the left-hand side of the
equation we must have the change of the energy in unit time.
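The connection between the ordinary force and the Minkowski force can be illustrated with a short numerical check that the 4-force is orthogonal to the 4-velocity. This Python sketch assumes units with $c = 1$ and an arbitrary sample force and velocity; the function names are ours.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def four_velocity(v, c=1.0):
    g = 1.0 / math.sqrt(1.0 - dot(v, v) / c**2)
    return [g * v[0], g * v[1], g * v[2], 1j * c * g]

def four_force(f, v, c=1.0):
    """Minkowski force built from the ordinary force f via (12.5) and (12.10)."""
    root = math.sqrt(1.0 - dot(v, v) / c**2)
    return [f[0] / root, f[1] / root, f[2] / root, 1j * dot(f, v) / (c * root)]

c = 1.0
v = [0.2, 0.5, -0.4]     # sample velocity
f = [1.0, -3.0, 2.0]     # sample ordinary (three-dimensional) force

F = four_force(f, v, c)
u = four_velocity(v, c)
F_dot_u = dot(F, u)      # should vanish by virtue of (11.18)
```

The fourth component carries the power $\boldsymbol{\mathcal{F}}\cdot\mathbf{v}$, which is exactly what is needed to cancel the contribution of the three spatial components.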







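Thus, we define the total energy of a particle as follows:

$$E = \frac{mc^2}{\sqrt{1-\frac{v^2}{c^2}}} . \qquad (12.12)$$

Finally, we find the expression for the acceleration. The equations of
motion can be written in the form

$$\frac{d}{dt}\frac{m\mathbf{v}}{\sqrt{1-\frac{v^2}{c^2}}} = \frac{m}{\sqrt{1-\frac{v^2}{c^2}}}\frac{d\mathbf{v}}{dt} + \frac{\mathbf{v}}{c^2}\,\frac{d}{dt}\frac{mc^2}{\sqrt{1-\frac{v^2}{c^2}}} = \boldsymbol{\mathcal{F}} ,$$

or

$$\frac{m}{\sqrt{1-\frac{v^2}{c^2}}}\frac{d\mathbf{v}}{dt} = \boldsymbol{\mathcal{F}} - \frac{\mathbf{v}}{c^2}\frac{dE}{dt} . \qquad (12.13)$$

By means of (12.11) one can rewrite (12.13) in the form

$$\frac{d\mathbf{v}}{dt} = \frac{\sqrt{1-\frac{v^2}{c^2}}}{m}\left[\boldsymbol{\mathcal{F}} - \frac{\mathbf{v}(\mathbf{v}\cdot\boldsymbol{\mathcal{F}})}{c^2}\right] . \qquad (12.14)$$
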
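Formula (12.14) shows that the acceleration is in general not parallel to the force. A minimal Python sketch (units with $m = c = 1$, sample values and function names chosen by us) compares a force along the velocity with a force across it, and checks by a central difference that the energy (12.12) changes at the rate $\boldsymbol{\mathcal{F}}\cdot\mathbf{v}$ required by (12.11).

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def acceleration(f, v, m, c=1.0):
    """dv/dt from (12.14) for an ordinary force f and velocity v."""
    root = math.sqrt(1.0 - dot(v, v) / c**2)
    vf = dot(v, f)
    return [root / m * (f[i] - v[i] * vf / c**2) for i in range(3)]

def energy(v, m, c=1.0):
    """Total energy (12.12)."""
    return m * c**2 / math.sqrt(1.0 - dot(v, v) / c**2)

m, c = 1.0, 1.0
v = [0.6, 0.0, 0.0]
a_parallel = acceleration([1.0, 0.0, 0.0], v, m, c)   # force along the velocity
a_perp = acceleration([0.0, 1.0, 0.0], v, m, c)       # force across the velocity

# Work-energy check for a generic force and velocity.
f = [0.4, -0.3, 0.2]
v2 = [0.3, 0.4, -0.2]
a = acceleration(f, v2, m, c)
dt = 1e-6
v_plus = [v2[i] + a[i] * dt for i in range(3)]
v_minus = [v2[i] - a[i] * dt for i in range(3)]
dE_dt = (energy(v_plus, m, c) - energy(v_minus, m, c)) / (2.0 * dt)
power = dot(f, v2)
```

For $v = 0.6c$ the same force produces the acceleration $(1 - v^2/c^2)^{3/2}/m = 0.512$ along the motion but $(1 - v^2/c^2)^{1/2}/m = 0.8$ across it.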





§13. Momentum, energy and mass in relativistic mechanics 


We shall now discuss the properties of the mechanical quantities introduced
in the preceding paragraph, i.e. the rest mass, momentum and energy.

According to (12.2) the connection between the rest mass $m$ and the
three-dimensional momentum $\mathbf{p}$ is determined by the relation

$$\mathbf{p} = \frac{m\mathbf{v}}{\sqrt{1-\frac{v^2}{c^2}}} , \qquad (13.1)$$

which for $v \ll c$ is the same as the usual expression of classical mechanics.

Hence one would think that it could be taken that the scalar $m$ is the same
as the mass of a body moving with a small velocity.







However, as will be seen from what follows, the properties of the invariant
mass $m$ (the rest mass) differ essentially from those which are attributed to
mass in classical mechanics. Namely, the rest mass does not satisfy the conservation
law. There are physical processes in which the rest mass of the particles
before the beginning of the process is not equal to that of the particles
remaining after the end of the process.

Examples of phenomena of this kind will be given below. The singularity 
(from the point of view of the concept of mass in classical mechanics) of the 
rest mass m and the possibility of its non-conservation are particularly clear 
from the fact that the presence of rest mass is not a necessary property of 
particles. 

For example, the existence in nature of elementary particles whose rest 
mass is equal to zero is beyond any doubt. Light quanta (photons) are such 
particles. According to all available experimental (see §16) and theoretical 
data, the neutrino is also a particle with zero rest mass. The neutrino is a 
neutral particle playing an important role in nuclear processes (it arises in 
particular in β-decay).

It is clear that, if in the processes of mutual transformation, which play a 
most important role in the world of elementary particles, there are transitions 
of particles with non-zero rest mass into particles with zero rest mass, then 
the rest mass is not conserved. Such processes indeed occur in nature, and 
some of them are well investigated. Examples will be presented in §16. 

In spite of its unusual properties, the rest mass is a very important charac- 
teristic of bodies. 

Every elementary particle has a well defined value of rest mass (including 
also the value zero), which does not change from sample to sample. Hence
the rest mass is a fundamental characteristic of an elementary particle. 

One can speak of the rest mass of a body consisting of many elementary 
particles in the same way as of the rest mass of an elementary particle. If the 
dimensions of the body are neglected, it can be considered as a material point 
with a rest mass m. The question as to how the rest mass of the body is con- 
nected with the rest masses of the constituent particles will be discussed 
below. 

In addition to the rest mass, a mass $m(v)$, called the relativistic mass or
simply mass and defined as the factor of proportionality between the vectors
$\mathbf{p}$ and $\mathbf{v}$, is often introduced:

$$\mathbf{p} = m(v)\,\mathbf{v} , \qquad (13.2)$$

where

$$m(v) = \frac{m}{\sqrt{1-\frac{v^2}{c^2}}} . \qquad (13.3)$$


The relativistic mass depends on the velocity and is hence a function not only 
of the properties of the particle but also of the state of its motion. 

However, it should be stressed that the relativistic mass $m(v)$ is not a relativistically-invariant
quantity. Indeed, the quantity $\sqrt{1 - v^2/c^2}$ is not invariant
(the law of transformation of $\sqrt{1 - v^2/c^2}$ is found most simply from
(9.4), taking into account that $dt_0$ is an invariant and that the law of transformation
of the time $dt$ is known).

If a particle moves with different velocities relative to two reference 
frames, then its mass, measured by devices located in these reference frames, 
will be different. 

The energy of a particle was determined by the relation (12.12). Before 
going on to a discussion of this relation, we note that the following important 
relation, connecting the time component of the momentum 4-vector and the 
energy, results from (12.12) and (12.2): 


$$p_\tau = iE/c . \qquad (13.4)$$


Thus, $p_\tau$ is the same as the energy of the particle apart from the constant
factor. The importance of (13.4) lies in the fact that by means of it the 
momentum and energy may be combined into one 4-vector, called the 
energy-momentum 4-vector, 


$$p_\alpha = (p_x, p_y, p_z, iE/c) . \qquad (13.5)$$

The components of the energy-momentum 4-vector are not relativistically-invariant
quantities. The three-dimensional momentum as well as the energy
turn out to be relative quantities. In transition from one inertial reference
frame to another the components of the energy-momentum 4-vector transform
according to formulae (11.1)-(11.4), i.e.

$$p_x = \frac{p'_x + \frac{v}{c^2}E'}{\sqrt{1-\frac{v^2}{c^2}}} , \qquad (13.6)$$

$$p_y = p'_y , \qquad (13.7)$$

$$p_z = p'_z , \qquad (13.8)$$

$$E = \frac{E' + vp'_x}{\sqrt{1-\frac{v^2}{c^2}}} . \qquad (13.9)$$


These relations show that under Lorentz transformations the energy and the
components of the momentum are expressed in terms of each other. The
energy and momentum are not separately invariant quantities, but the square
of the 4-vector is, i.e. the quantity

$$p_\alpha^2 = p_x^2 + p_y^2 + p_z^2 + p_\tau^2 = p^2 - E^2/c^2 = \text{invar} . \qquad (13.10)$$

Substituting the values of the energy (12.12) and the components of the
momentum (12.2) into (13.10), one can easily calculate the value of this
invariant, which turns out to be

$$p_\alpha^2 = p_x^2 + p_y^2 + p_z^2 - E^2/c^2 = -m^2c^2 . \qquad (13.11)$$

Thus, the energy-momentum 4-vector is a time-like vector.
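The transformation (13.6)-(13.9) and the invariant (13.11) lend themselves to a quick numerical check. The sketch below is written in Python under the assumptions $c = 1$ and a relative velocity $v$ of the frames along the x-axis; the function names `energy_momentum`, `to_frame_K` and `invariant` are ours.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def energy_momentum(m, vel, c=1.0):
    """Momentum (13.1) and energy (12.12) of a particle with rest mass m."""
    root = math.sqrt(1.0 - dot(vel, vel) / c**2)
    return [m * component / root for component in vel], m * c**2 / root

def to_frame_K(p_prime, E_prime, v, c=1.0):
    """Formulae (13.6)-(13.9): quantities in K from those in K'."""
    root = math.sqrt(1.0 - v**2 / c**2)
    p_x = (p_prime[0] + v * E_prime / c**2) / root
    E = (E_prime + v * p_prime[0]) / root
    return [p_x, p_prime[1], p_prime[2]], E

def invariant(p, E, c=1.0):
    """Square of the energy-momentum 4-vector, p^2 - E^2/c^2."""
    return dot(p, p) - E**2 / c**2

m, c, v = 2.0, 1.0, 0.5
p_prime, E_prime = energy_momentum(m, [0.1, 0.2, -0.3], c)
p, E = to_frame_K(p_prime, E_prime, v, c)

# A particle at rest in K' should move with velocity v in K.
p_rest, E_rest = to_frame_K([0.0, 0.0, 0.0], m * c**2, v, c)
p_expected, E_expected = energy_momentum(m, [v, 0.0, 0.0], c)
```

In both frames the invariant equals $-m^2c^2$, although the momentum and the energy themselves differ.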


We now return to the determination of the energy (12.12) of a particle.
For $v \ll c$ formula (12.12) goes over into

$$E = \frac{mc^2}{\sqrt{1-\frac{v^2}{c^2}}} \approx mc^2 + \frac{1}{2}mv^2 . \qquad (13.12)$$

The second term is the same as the kinetic energy of a particle in classical
mechanics. However, for $v = 0$ the energy of the particle outside a force field

$$E_0 = mc^2 \qquad (13.13)$$

turns out to be different from zero.
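The quality of the approximation (13.12) can be estimated numerically. In the following Python sketch the mass of 1 kg and the velocity $v = 0.01c$ are arbitrary illustrative choices, and the variable names are ours; the next term of the expansion, $\tfrac{3}{8}mv^4/c^2$, gives a relative correction of $\tfrac{3}{4}v^2/c^2$ to the classical kinetic energy.

```python
import math

def total_energy(m, v, c):
    """Total energy (12.12)."""
    return m * c**2 / math.sqrt(1.0 - v**2 / c**2)

c = 299792458.0          # velocity of light in m/s
m = 1.0                  # rest mass in kg (illustrative value)
beta = 0.01
v = beta * c

rest_energy = m * c**2                                   # eq. (13.13)
kinetic_relativistic = total_energy(m, v, c) - rest_energy
kinetic_classical = 0.5 * m * v**2
relative_difference = (kinetic_relativistic - kinetic_classical) / kinetic_classical
```

Even at one per cent of the velocity of light the two kinetic energies differ by less than one part in ten thousand, while the rest energy exceeds both by a factor of about $2c^2/v^2$.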

At first sight it may seem that such a determination of the energy is
arbitrary. Since the energy is found from the differential relation (12.11), it
can be chosen to be

$$E = \frac{mc^2}{\sqrt{1-\frac{v^2}{c^2}}} + \text{const} . \qquad (13.14)$$

If the arbitrary constant is chosen to be equal to $(-mc^2)$, then the energy
determined in such a way at $v \ll c$ will be the same as the energy of the
particle in classical mechanics.

In fact, however, it is easy to show that the constant must be set equal to 
zero, as was done in (12.12). We cannot require in advance that, without 
exception, all quantities of relativistic mechanics should assume the classical 
form for v << c. However, in any case it is beyond any doubt that for v << c 
the Lorentz transformations must be the same as the Galilean transformations 
and, consequently, the ordinary law of addition of velocities must hold. In 
order that the transformation (13.6) of the momentum for $v \ll c$ may go
over into the theorem of addition of velocities of classical mechanics, the
condition $E' \to mc^2$ must be fulfilled. Then from (13.6) it follows that

$$p_x = p'_x + mv ,$$

or

$$v_x = v'_x + v .$$

If the energy were determined by formula (13.14) and not by (12.12) and if
for $v \ll c$ it tended to the limit $E' \to 0$, then the Lorentz transformation for
the velocity would not reduce to the formula of addition of velocities of
classical mechanics.

Thus, the theory of relativity leads to a new, very important conclusion:
the energy of a particle at rest is equal to $mc^2$. It is natural to call the quantity
$mc^2$ the rest energy. Every particle possessing a rest mass $m$ has also a
rest energy $mc^2$.

The energy of a moving particle can be connected with the relativistic 
mass by the relation 


$$E = m(v)c^2 , \qquad (13.15)$$


analogous to (13.13). 

Formulae (13.13) and (13.15), often called Einstein’s formulae, show that
every particle possessing a mass $m$ has, at the same time, an energy $E$. The
energy and mass are indissolubly connected with each other and are proportional
to each other. This statement is often called the principle of equivalence
of mass and energy.

Of course, the notions of equivalence and identity should not be confused.
The energy and mass are different physical characteristics of particles, and
the principle of equivalence establishes only their proportionality to each
other. The relation between the mass and energy is similar to the relation
between the gravitational and inertial mass in classical mechanics: the two
masses are indissolubly connected with each other and proportional to each
other, but are at the same time different characteristics. *

At present Einstein’s formula (13.15) as well as formulae (13.2) and (13.3) 
are confirmed by a vast amount of experimental material and their validity is 
beyond doubt. 

If the particle is not acted upon by any external forces, then the energy 
conservation law 


$$\frac{dE}{dt} = 0 \quad \text{or} \quad E = \text{const} , \qquad (13.16)$$

and the momentum conservation law

$$\frac{d\mathbf{p}}{dt} = 0 \qquad (13.17)$$


hold. From Einstein’s formula it follows that with the energy conservation 
law the law of conservation of the relativistic mass 


m(v) = const (13.18) 


automatically holds. 

As distinct from classical physics, where there are two independent con- 
servation laws (i.e. the energy conservation law and the mass conservation 
law) there is in the theory of relativity only one conservation law, the law of 
conservation of energy or relativistic mass. 

We shall come back to the physical interpretation of the energy conserva- 
tion law in the theory of relativity in §15—17. 

In conclusion we note that $E$ is usually called the total energy. This should
not lead to misunderstanding: $E$ does not include the potential energy of the
particle in an external field, if such a field acts on the particle. Sometimes
one introduces the kinetic energy $E_{kin}$, defining it as the energy of motion of
the particle, i.e.

$$E_{kin} = \frac{mc^2}{\sqrt{1-\frac{v^2}{c^2}}} - mc^2 = [m(v) - m]c^2 . \qquad (13.19)$$

* For more detail see: V.A.Fock, The theory of space, time and gravitation (Pergamon,
London, 1964).



We introduce one more useful formula, connecting the energy and the 
three-dimensional momentum. From (12.12) and (13.1) it follows directly 
that 


$$\mathbf{p} = E\mathbf{v}/c^2 . \qquad (13.20)$$


By means of formula (13.13) one can find the expression for the classical
radius of the electron $r_0$, introduced in Part I.

We have already discussed in §18 of Part I the fundamental difficulty of
classical field theory associated with the problem of the self energy of the
electron. For the energy of the electron in its own field one obtains the
expression

$$U = e^2/r_0 ,$$

which diverges as $r_0 \to 0$.
If it is assumed that the overall mass of the electron is associated with the
energy of the field produced by it, i.e. if

$$U \sim mc^2$$

for the electron at rest, then for $r_0$ one obtains the value

$$r_0 \sim e^2/mc^2 ,$$


which is the same as the inequality (29.6) of Part I. However, it should be 
borne in mind that the problem of the self energy is not solved at all by 
introducing the radius of the electron. 
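To give an idea of the order of magnitude, the estimate $r_0 \sim e^2/mc^2$ can be evaluated in Gaussian (CGS) units. The short Python sketch below uses rounded values of the electron charge, the electron mass and the velocity of light; the variable names are ours, and the result is only an order-of-magnitude estimate in the sense of the text.

```python
e = 4.80320e-10      # electron charge in esu (rounded)
m = 9.10938e-28      # electron rest mass in g (rounded)
c = 2.99792e10       # velocity of light in cm/s (rounded)

rest_energy = m * c**2           # in erg, about 8.19e-7 erg
r0 = e**2 / rest_energy          # in cm

print(f"r0 = {r0:.3e} cm")
```

The value obtained is close to $2.8 \times 10^{-13}$ cm.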

We have stressed that in the theory of relativity no body whatever, includ- 
ing the electron, can be considered as a perfectly rigid ball of given radius. 
Hence the quantity $r_0$ cannot be interpreted as the true “radius” of the electron.
It is the minimum dimension of the space region in which use can still
be made of the relations of classical field theory, the limit of applicability of 
its concepts. We note that, as has already been said in §29 of Part I, the true 
limit of applicability of classical field theory sets in at considerably larger 
distances. 





§ 14. Lagrange’s equations; the Lagrangian and Hamiltonian 


As in classical mechanics, equations of motion can be written in general- 
ized coordinates in the form of Lagrange’s equations. For this we must first 
of all find the Lagrangian. 

By definition the Lagrangian is a quantity whose derivatives with respect 
to the components of velocity represent the components of the momentum, 
while its derivatives with respect to coordinates represent the components of 
the force. 

Hence we have

$$\frac{\partial L}{\partial v_i} = p_i = \frac{mv_i}{\sqrt{1-\frac{v^2}{c^2}}} , \qquad (14.1)$$

$$\frac{\partial L}{\partial x_i} = \mathcal{F}_i = -\frac{\partial U_{field}}{\partial x_i} , \qquad (14.2)$$

where $U_{field}$ is the potential energy of the external field, depending only on
the coordinates of the particle. Eqs. (14.1) and (14.2) are satisfied by the
Lagrangian:

$$L = -mc^2\sqrt{1-\frac{v^2}{c^2}} - U_{field} . \qquad (14.3)$$

We see that in relativistic mechanics the Lagrangian no longer represents the
difference between the kinetic and potential energy.

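By means of the Lagrangian one can write Lagrange’s equations in generalized
coordinates $q_i$:

$$\frac{d}{dt}\frac{\partial L}{\partial \dot{q}_i} = \frac{\partial L}{\partial q_i} ,$$

or

$$\frac{dp_i}{dt} = \frac{\partial L}{\partial q_i} , \qquad (14.4)$$

where

$$p_i = \frac{\partial L}{\partial \dot{q}_i}$$

are generalized momenta.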







Knowing the Lagrangian one can then find the Hamiltonian $H$. If the
Lagrangian does not depend explicitly on time, then, by definition,

$$H = \sum_i \dot{q}_i p_i - L = \frac{mv^2}{\sqrt{1-\frac{v^2}{c^2}}} + mc^2\sqrt{1-\frac{v^2}{c^2}} + U_{field} = \frac{mc^2}{\sqrt{1-\frac{v^2}{c^2}}} + U_{field} . \qquad (14.5)$$

The velocity can be expressed in terms of the momentum by means of the
relations (12.2), from which it follows that

$$p_x^2 + p_y^2 + p_z^2 = p^2 = \frac{m^2v^2}{1-\frac{v^2}{c^2}} .$$



A simple calculation gives 


$$H = \sqrt{p^2c^2 + m^2c^4} + U_{field} . \qquad (14.6)$$


Formula (14.6) shows that in relativistic mechanics the Hamiltonian is just
the total energy expressed in terms of the momentum of the particle.
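The chain from the Lagrangian (14.3) to the Hamiltonian (14.6) can be followed numerically for a free particle in one dimension. The Python sketch below is an illustration under the assumptions $U_{field} = 0$ and $c = 1$; the numerical values and the function names are ours, and the derivative $\partial L/\partial v$ is approximated by a central difference.

```python
import math

def lagrangian(v, m, c=1.0):
    """Free-particle Lagrangian (14.3) with U_field = 0, one-dimensional motion."""
    return -m * c**2 * math.sqrt(1.0 - v**2 / c**2)

def momentum(v, m, c=1.0):
    """Momentum (14.1)."""
    return m * v / math.sqrt(1.0 - v**2 / c**2)

def hamiltonian_from_p(p, m, c=1.0):
    """Hamiltonian (14.6) with U_field = 0."""
    return math.sqrt(p**2 * c**2 + m**2 * c**4)

m, c, v = 1.5, 1.0, 0.8
h = 1e-6
dL_dv = (lagrangian(v + h, m, c) - lagrangian(v - h, m, c)) / (2.0 * h)

p = momentum(v, m, c)
H_legendre = p * v - lagrangian(v, m, c)      # definition used in (14.5)
H_from_p = hamiltonian_from_p(p, m, c)

# For p >> mc the Hamiltonian of a free particle approaches pc.
p_large = 1000.0 * m * c
ultrarelativistic_ratio = hamiltonian_from_p(p_large, m, c) / (p_large * c)
```

For the values chosen, $p = 2$ and both expressions for the Hamiltonian give $2.5$; the last ratio differs from unity only in the seventh decimal place.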

For large values of the momentum, when p >> mc, the expression for the 
Hamiltonian of a free particle attains a simple form: 


$$H \approx pc . \qquad (14.7)$$


A motion with such a large momentum, for which the approximate formula 
(14.7) is valid, is called an ultrarelativistic motion. It is clear that for particles 
with zero rest mass formula (14.7) is valid. We shall often make use of it in 
considering processes in which light quanta (photons) take part. 


§15. The mechanics of a system of particles in the theory of relativity 


Up to now we have restricted ourselves to the consideration of one par- 
ticle. The formulation of the mechanics of a system of particles in the theory 
of relativity is a much more complex problem. Nevertheless, in this case also a 
number of important general laws can be established. 







A system of particles as a whole can be characterized by its energy E, 
momentum P and rest mass M. If we are interested in the motion of the sys- 
tem as a whole, then, disregarding internal processes in the system and its 
spatial extension, the system can be considered as one material point. For the 
system as a whole one can write the equality 


$$E = Mc^2 , \qquad (15.1)$$


where $M$ is the rest mass of the entire system. Since it is always true that
$M > 0$, the energy of a system of particles, like the energy of a single free
particle, is an essentially positive quantity.

However, in general it is impossible to find expressions for the energy and 
momentum of the system in terms of the corresponding quantities for its 
individual particles, or to find general relations between the energy and mo- 
mentum. The interaction existing between the particles can lead, for example, 
to the dependence of $E$ on time (it should be recalled that $E$ denotes the sum
of the rest energy and kinetic energy, but does not include the energy of 
interaction). Hence the actual construction of the mechanics of a system of 
particles is restricted to a relatively few simple cases. These are: 

1) a system of non-interacting particles, 

2) a system of particles which are at large distances from one another and 
are moving with very large velocities, 

3) systems of particles with a weak electromagnetic interaction. 

The last case will be considered later, in §25. 

In a system of non-interacting particles the energy and momentum possess
additive properties, so that

$$E = \sum_{i=1}^{N} E_i = \sum_{i=1}^{N} \frac{m_i c^2}{\sqrt{1-\frac{v_i^2}{c^2}}} , \qquad (15.2)$$

$$\mathbf{P} = \sum_{i=1}^{N} \mathbf{p}_i = \sum_{i=1}^{N} \frac{m_i \mathbf{v}_i}{\sqrt{1-\frac{v_i^2}{c^2}}} , \qquad (15.3)$$

where $N$ is the number of particles in the system, and the index $i$ refers to
every one of the particles. The velocities of all particles are constant and, consequently,
the total energy and total momentum of the system are also constant
in time.


It is easily seen that the quantities $E$ and $\mathbf{P}$ form a 4-vector of energy and
momentum. Indeed, one can write

$$E = \sum_{i=1}^{N} \frac{m_i c^2}{\sqrt{1-\frac{v_i^2}{c^2}}} = \text{const} , \qquad (15.4)$$

$$\mathbf{P} = \sum_{i=1}^{N} \frac{m_i \mathbf{v}_i}{\sqrt{1-\frac{v_i^2}{c^2}}} = \text{const} . \qquad (15.5)$$

Hence the 4-vector

$$P_\alpha = \sum_{i=1}^{N} p_\alpha^{(i)} \qquad (15.6)$$

can be introduced. Every term $p_\alpha^{(i)}$ of the sum is the 4-vector of the energy
and momentum of an individual particle.

Before passing on to the discussion of the consequences of these proposi- 
tions, we shall show that in the very important second case cited above the 
properties of a system of interacting particles can be reduced to those of non- 
interacting particles. 

Consider a system of particles moving at large distances from one another.
By the assumption that the velocities of the particles are very large, of the
order of magnitude of the velocity of light, the energy

$$\frac{m_i c^2}{\sqrt{1-\frac{v_i^2}{c^2}}}$$

is in general large in comparison with the interaction energy at large distances.
Hence it can be assumed that, as for free particles, the total energy and
momentum of the system are expressed by formulae (15.4) and (15.5).

When particles approach each other (this situation is often called a colli- 
sion) the interaction between the particles can become appreciable and for- 
mulae (15.4) and (15.5) lose their applicability. However, when particles 
again diverge to large distances after collision the formulae (15.4) and (15.5) 
are again applicable. It is obvious, moreover, that the total energy and momen- 
tum of the particles which have diverged cannot differ from the energy and 
momentum before the interaction. Hence the conservation laws can be writ- 
ten in the form 











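$$\sum_{i=1}^{N} \frac{m_i c^2}{\sqrt{1-\frac{v_i^2}{c^2}}} = \sum_{k=1}^{N^*} \frac{m_k c^2}{\sqrt{1-\frac{v_k^2}{c^2}}} , \qquad (15.7)$$

$$\sum_{i=1}^{N} \frac{m_i \mathbf{v}_i}{\sqrt{1-\frac{v_i^2}{c^2}}} = \sum_{k=1}^{N^*} \frac{m_k \mathbf{v}_k}{\sqrt{1-\frac{v_k^2}{c^2}}} . \qquad (15.8)$$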


Here the indices i and k refer to particles before and after the interaction. The 
asterisk in the sums on the right underlines the fact that the number of par- 
ticles in the system before and after the interaction can be different (see 
§17). 

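The quantities $E$ and $\mathbf{P}$, referring to particles before and after the interaction,
form a 4-vector of energy and momentum. According to (13.11), one
can write the invariant for this 4-vector:

$$I = P_\alpha^2 = \sum_\alpha \Bigl(\sum_i p_\alpha^{(i)}\Bigr)^2 = \text{invar} = -M^2c^2 . \qquad (15.9)$$

The constancy of the invariant $I$ expresses the general energy-momentum
conservation law.
We transform this invariant to a form convenient for practical use. We have

$$\sum_\alpha \Bigl(\sum_i p_\alpha^{(i)}\Bigr)^2 = \sum_\alpha \sum_i \bigl(p_\alpha^{(i)}\bigr)^2 + 2\sum_\alpha \sum_{k<i} p_\alpha^{(i)} p_\alpha^{(k)} = \sum_i \Bigl(\mathbf{p}_i^2 - \frac{E_i^2}{c^2}\Bigr) + 2\sum_{k<i} \Bigl(\mathbf{p}_i\cdot\mathbf{p}_k - \frac{E_iE_k}{c^2}\Bigr) .$$

Here we have carried out the summation over the index $\alpha$ of the 4-vector and
made use of formula (13.4) for $p_\tau$. Regrouping the terms in the sum, we
find

$$\Bigl(\sum_i \mathbf{p}_i\Bigr)^2 - \frac{1}{c^2}\Bigl(\sum_i E_i\Bigr)^2 = -M^2c^2 . \qquad (15.10)$$

Thus, we can write

$$\Bigl(\sum_i \mathbf{p}_i\Bigr)^2 - \frac{1}{c^2}\Bigl(\sum_i E_i\Bigr)^2 = \Bigl(\sum_k \mathbf{p}_k^*\Bigr)^2 - \frac{1}{c^2}\Bigl(\sum_k E_k^*\Bigr)^2 , \qquad (15.11)$$

where the values of the quantities after the interaction are denoted by an
asterisk. The components of the 4-vector $P_\alpha$ transform, in the transition from
one inertial reference frame to another, according to the general formulae
(13.6)-(13.9).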

It often turns out to be convenient to make use of a reference frame in
which the total momentum is equal to zero. Such a reference frame, as in
classical mechanics, is called the centre of mass system $K^{(c.m.)}$.

Let a reference frame $K$ be given in which a system of particles has a
momentum $\mathbf{P}$ and an energy $E$. Let us find the velocity of motion $V^{(c.m.)}$ of
the centre of mass of the system with respect to the reference frame $K$. The
velocity $V^{(c.m.)}$ characterizes the motion of the system as a whole. For simplicity
we assume that the velocity of the centre of mass of the system is
directed along the x-axis. Then according to the inverse of (13.6)-(13.9) we
have

$$P_x^{(c.m.)} = 0 = \frac{P_x - \frac{V^{(c.m.)}}{c^2}E}{\sqrt{1-\frac{(V^{(c.m.)})^2}{c^2}}} , \qquad (15.12)$$

$$P_y^{(c.m.)} = 0 = P_y , \qquad P_z^{(c.m.)} = 0 = P_z , \qquad (15.13)$$

$$E^{(c.m.)} = \frac{E - V^{(c.m.)}P_x}{\sqrt{1-\frac{(V^{(c.m.)})^2}{c^2}}} . \qquad (15.14)$$

Quantities referred to the centre of mass system are denoted by the superscript
c.m.
From (15.12) we obtain

$$V^{(c.m.)} = \frac{P_x c^2}{E} . \qquad (15.15)$$

In the general case of an arbitrary orientation of the velocity vector of the 
centre of mass system, instead of (15.15) one obtains 


lt (poe 


IE 


‘|. 1 


ee 








eee 


§15 MECHANICS OF A SYSTEM OF PARTICLES 273 
2 
vcm) = Per ‘ (15.16) 


In contrast to classical mechanics, the velocity of the centre of mass in 
relativistic mechanics cannot be represented as the derivative of the coor- 
dinate of the centre of mass with respect to time: 


yíc-m) + da Room) k 
dt 


The quantity $\mathbf{P}c^2/E$ cannot, in general, be expressed in the form of the
derivative of any quantity with respect to time. Hence it is impossible in relativistic
mechanics to introduce the notion of the coordinate of the centre of mass
for an arbitrary system of interacting particles. In §25 it will be shown that
in the case of a system of weakly interacting particles use can be made, in a
certain approximation, of the concept of the centre of mass.
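
As a rough numerical illustration of the relation V = Pc²/E for the velocity of the centre of mass system, the following Python sketch evaluates it for two free particles moving along the x-axis and checks that the total momentum vanishes after a Lorentz transformation with this velocity. The function names, the units (c = 1) and the sample masses and velocities are assumptions made only for this illustration.

```python
import math

C = 1.0  # units with c = 1 (an assumption made for brevity)

def four_momentum(m, vx):
    """Energy and x-momentum of a free particle of rest mass m moving along x."""
    g = 1.0 / math.sqrt(1.0 - (vx / C) ** 2)
    return g * m * C**2, g * m * vx

def boost_px(E, px, V):
    """x-momentum seen from a frame moving with velocity V along the x-axis."""
    return (px - V * E / C**2) / math.sqrt(1.0 - (V / C) ** 2)

particles = [(1.0, 0.8), (3.0, -0.1)]   # hypothetical (rest mass, velocity) pairs

E_tot = sum(four_momentum(m, v)[0] for m, v in particles)
P_tot = sum(four_momentum(m, v)[1] for m, v in particles)

V = P_tot * C**2 / E_tot            # velocity of the centre of mass system
P_cm = boost_px(E_tot, P_tot, V)    # total momentum in that system

print(V, P_cm)
```

Since the Lorentz transformation is linear, it is enough to transform the total energy and momentum; the printed momentum should vanish up to rounding errors.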


We shall also discuss some properties of the rest mass of a system of par- 
ticles. 


Consider a centre of mass system K’. We write the invariant of the 4-vector 
of energy and momentum (15.10) for this system in the form 


$$ (\mathbf{P}^{(c.m.)})^2 c^2 - (E^{(c.m.)})^2 = -M^2 c^4 . \tag{15.17} $$

Since $\mathbf{P}^{(c.m.)} = 0$, formula (15.17) in the system $K'$ assumes the form

$$ E^{(c.m.)} = M c^2 , \tag{15.18} $$


where M is the rest mass. 
On the other hand, according to the general formula (15.2), 


$$ E^{(c.m.)} = \sum_{i=1}^{N} \frac{m_i c^2}{\sqrt{1 - \dfrac{v_i^2}{c^2}}} , $$

where $v_i$ is the velocity of the $i$th particle referred to the centre of mass
system. Then for the rest mass we obtain

$$ M = \sum_{i=1}^{N} \frac{m_i}{\sqrt{1 - \dfrac{v_i^2}{c^2}}} . \tag{15.19} $$


We see that the rest mass of a system of particles is not equal to the sum of
the rest masses $m_i$ of the individual particles, but depends on the velocities of
their motion with respect to the centre of mass system.

Consider, for example, the case where a system of particles represents an 
ideal gas. The ideal gas obviously satisfies the requirements regarding the char- 
acter of the interaction between the particles about which we have spoken 
earlier. Then, according to (15.17), the rest mass M of the gas depends on 
the internal motion of the gas particles or, what is the same, on the tempera- 
ture of the gas. 
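
The dependence of the rest mass of the system on the internal motion can be illustrated by a short Python sketch of formula (15.19). The function name, the units (c = 1) and the sample values are assumptions; two equal particles moving with equal and opposite velocities are taken, so that the total momentum is zero.

```python
import math

def system_rest_mass(masses, speeds, c=1.0):
    """Rest mass M of a system of free particles, formula (15.19).

    speeds are the particle speeds in the centre of mass system."""
    return sum(m / math.sqrt(1.0 - (v / c) ** 2) for m, v in zip(masses, speeds))

masses = [1.0, 1.0]                              # hypothetical equal rest masses
cold = system_rest_mass(masses, [0.0, 0.0])      # particles at rest
warm = system_rest_mass(masses, [0.1, 0.1])      # slow internal motion
hot = system_rest_mass(masses, [0.6, 0.6])       # fast internal motion

print(cold, warm, hot)
```

The faster the internal motion, the larger the rest mass of the system as a whole, although the rest masses of the particles are unchanged.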

The difference between the rest mass $M$ of the system and the sum $\sum m_i$
of the rest masses of the particles forming the system is of great significance
for processes occurring in nature. We shall discuss this in the next section.

In the general case of a system of particles with an arbitrary interaction,
formula (15.2) is not an expression for the total energy of the system; it gives
only the sum of the rest energies and kinetic energies of the particles.

It is necessary to include the energy of interaction between the particles in 
the total energy. It turns out, however, that in relativistic mechanics the idea 
of a potential energy of interaction of a system of particles does not exist. 
The potential energy of interaction of particles must depend only on their 
position. If the position of any particle is changed, then the potential energy 
of the system of particles and the forces acting on individual particles must 
change instantly. In other words, the concept of a potential energy of inter- 
action of a particle is associated with the concept of action at a distance and 
cannot be admitted in the theory of relativity. In general, it appears to be 
impossible to write an expression for the energy of a system of interacting 
particles. The same holds also for the momentum of the system, which in the
theory of relativity is not a time-independent quantity.

Besides systems interacting via collisions, an approximate expression can 
be found for the interaction of charged particles in the special theory of 
relativity. This will be done in §25. The gravitational interaction of bodies is
considered in the general theory of relativity, the subject of which is beyond
the scope of our book.


§ 16. The energy-momentum conservation law in nuclear physics 


The energy-momentum conservation law and the mass-energy relation have
not only found experimental confirmation, but have become basic propositions
in contemporary nuclear physics. As far back as the first work of Einstein
it was pointed out that the mass-energy relation could be checked
experimentally in investigating the phenomena of radioactivity. Indeed, a
characteristic feature of radioactive decay, as of all other nuclear
processes, is a large change in the energy of the system and the high energies
of the nuclear particles produced. Of the whole variety of nuclear processes in
which relativistic effects play an essential role, we have to confine ourselves
within the framework of this book to only the most essential or typical ones.

The purpose of the examples presented below is to show that the relations 
of relativistic mechanics are a necessary basis for a study of processes taking 
place with atomic nuclei and elementary particles. 

1. Reactions involving the decay of particles. A number of basic processes 
occurring with atomic nuclei and elementary particles arise in fusion reactions
and the decay of particles. The energy-momentum conservation law imposes 
a fundamental restriction on the possible reactions. Consider the decay of 
one particle or body into two others. We shall assume that the decay takes 
place spontaneously, i.e. as a result of internal changes in the system, without 
the action of external forces on the system. 

Let the disintegrating particle have a mass M. In the centre of mass system 
the momentum before the decay is zero. The invariant of the 4-vector of 
energy and momentum is equal to 


$$ I = P^2 c^2 - E^2 = -M^2 c^4 . $$

After the decay two particles arise with masses $m_1$, $m_2$, momenta $\mathbf{p}_1$, $\mathbf{p}_2$ and
energies $E_1$, $E_2$.
In the centre of mass system the total momentum after the decay is zero:

$$ \mathbf{p}_1 + \mathbf{p}_2 = 0 . $$

The invariant $I$ can be written in the form

$$ I = (\mathbf{p}_1 + \mathbf{p}_2)^2 c^2 - (E_1 + E_2)^2 = -M^2 c^4 , $$

or

$$ M c^2 = E_1 + E_2 . $$

Writing the energies $E_1$ and $E_2$ in the form

$$ E_1 = m_1 c^2 + E_1^{(kin)} , \qquad E_2 = m_2 c^2 + E_2^{(kin)} , $$

we have

$$ M c^2 = (m_1 + m_2) c^2 + E_1^{(kin)} + E_2^{(kin)} . \tag{16.1} $$
Since the kinetic energies of the particles produced after the decay are each 
greater than zero, it follows from (16.1) that the spontaneous decay of a 
body is possible only if the inequality 


M>m +m, (16.2) 


is fulfilled, i.e. only if the mass of the body is larger than the sum of the rest 
masses of the decay products. Conversely, in the cases when the mass of the 
body is smaller than the sum of the masses of the decay products, a spon- 
taneous decay of the body is impossible. For the decay to occur in this case it 
is necessary to supply energy from outside.

Making use of the equalities $E_1 = \sqrt{p_1^2 c^2 + m_1^2 c^4}$ and $E_2 = \sqrt{p_2^2 c^2 + m_2^2 c^4}$
and taking into account that $p_1 = p_2$, one can also find the energy of the
particles produced in the decay.

Namely, we have

$$ E_2^2 = p_2^2 c^2 + m_2^2 c^4 = E_1^2 + m_2^2 c^4 - m_1^2 c^4 . $$

On the other hand,

$$ E_2^2 = (M c^2 - E_1)^2 , $$

hence

$$ E_1 = \frac{(M^2 + m_1^2 - m_2^2) c^2}{2M} , \qquad E_2 = \frac{(M^2 - m_1^2 + m_2^2) c^2}{2M} . $$
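
The expressions for the energies of the decay products are easily checked numerically. The Python sketch below evaluates them for an arbitrary choice of masses and verifies that the energies add up to Mc² and that both particles carry the same magnitude of momentum. The function name, the units (c = 1) and the sample masses are assumptions made for the illustration.

```python
import math

def two_body_decay(M, m1, m2):
    """Energies and common momentum (units with c = 1) of the products of the
    decay M -> m1 + m2 in the rest frame of the decaying particle."""
    if M < m1 + m2:
        raise ValueError("spontaneous decay forbidden: M < m1 + m2, cf. (16.2)")
    E1 = (M**2 + m1**2 - m2**2) / (2.0 * M)
    E2 = (M**2 - m1**2 + m2**2) / (2.0 * M)
    p = math.sqrt(max(E1**2 - m1**2, 0.0))
    return E1, E2, p

E1, E2, p = two_body_decay(5.0, 3.0, 1.0)   # hypothetical masses
print(E1, E2, p)
```

The momentum is computed here from the first particle only; the check that the second particle has the same momentum is then an independent test of the formulae.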


2. Stability of atomic nuclei. The general results obtained above allow one 
to clarify the very important problem of stability of atomic nuclei. 

Consider an atomic nucleus consisting of $Z$ protons and $A-Z$ neutrons
(where $Z$ is the atomic number, and $A$ is the mass number) and having a mass
$M$. The protons and neutrons in the nucleus possess considerable kinetic
energies. However, strong attractive forces (nuclear forces) acting between
them ensure the stability of the system as a whole. The rest energy of the
nucleus $Mc^2$ is made up of the rest energy of all particles constituting it,
$\sum m_i c^2$, and the energy of internal motion and interaction of the particles.




In order for a nucleus to be stable, so that the motion of the nuclear particles
does not lead to its spontaneous decay, it is obviously necessary that the
inequality

$$ M c^2 < \sum m_i c^2 \tag{16.3} $$

be fulfilled. The quantity

$$ \Delta m c^2 = \sum m_i c^2 - M c^2 , \tag{16.4} $$

called the nuclear binding energy, is a measure of the stability of the nucleus.
If, in particular, $\Delta m c^2$ is negative, the nucleus is unstable and decays
spontaneously.

In addition to the binding energy, the quantity 


$$ \Delta m = \sum m_i - M , \tag{16.5} $$


called the mass defect, serves as a measure of the stability of the nucleus. In 
order for a nucleus to be stable it is necessary that the mass defect be positive. 
If the mass defect of a nucleus is positive, then according to (16.1) the 
nucleus is stable with respect to decay into its constituent particles — protons 
and neutrons. However, this does not mean that the nucleus is absolutely 
stable and that it can exist for an indefinitely long time. 

Assume that as a result of a decay of a nucleus, for example according to
the scheme

$$ M \to M_1 + M_2 , $$

nuclei with masses $M_1$ and $M_2$ are produced. Such a decay is in principle
possible if the atomic numbers $Z_1$ and $Z_2$ and mass numbers $A_1$ and $A_2$ of the
product nuclei satisfy the equalities

$$ Z_1 + Z_2 = Z , \qquad A_1 + A_2 = A . $$

Let $\Delta m_1$ and $\Delta m_2$ be the mass defects of the product nuclei. If the mass
defect $\Delta m$ of the initial nucleus is smaller than the sum of the mass defects
of the product nuclei, i.e.

$$ \Delta m < (\Delta m_1 + \Delta m_2) , $$


then the system arising after the decay possesses a higher stability than the
initial system. Hence when $\Delta m < (\Delta m_1 + \Delta m_2)$ the nucleus, which is stable
with respect to the decay into individual elementary particles, is not stable
with respect to the decay into two parts. As a result of transformations of the
nuclear particles, a decay configuration will arise in the nucleus after the
lapse of a certain time interval. The initial nucleus then decays into two
nuclei with masses $M_1$ and $M_2$.

As an example the nucleus $^8_4$Be can be considered. This nucleus has a mass
$M$ = 8.005 31 amu, which is smaller than the sum of the masses of four protons
and four neutrons: $\sum m_i$ = 4 × 1.007 825 amu + 4 × 1.008 665 amu =
8.065 960 amu. Hence the nucleus $^8_4$Be is stable with respect to decay into
individual protons and neutrons. However, the mass of $^8_4$Be is larger than the
mass of two nuclei of $^4_2$He: $2M_{He}$ = 2 × 4.002 603 amu = 8.005 206 amu.
Hence the nucleus $^8_4$Be is unstable and must decay spontaneously into two
α-particles, which indeed occurs.
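
The arithmetic of this example can be repeated with a few lines of Python. The sketch below uses only the masses quoted in the text and the conversion factor 1 amu = 931.48 MeV used later in this section; the function and variable names are assumptions.

```python
AMU_MEV = 931.48      # MeV per amu, the value used in this section

M_H = 1.007825        # amu, masses as quoted in the text
M_N = 1.008665
M_BE8 = 8.00531
M_HE4 = 4.002603

def mass_defect(Z, A, M):
    """Mass defect (16.5) in amu of a nucleus with Z protons and A - Z neutrons."""
    return Z * M_H + (A - Z) * M_N - M

dm_be8 = mass_defect(4, 8, M_BE8)
dm_he4 = mass_defect(2, 4, M_HE4)

# energy released in the decay of 8Be into two alpha-particles, in MeV
q_alpha = (M_BE8 - 2 * M_HE4) * AMU_MEV

print(dm_be8, 2 * dm_he4, q_alpha)
```

The mass defect of the initial nucleus is positive but smaller than the sum of the mass defects of the two α-particles, which is exactly the condition for instability with respect to the decay into two parts.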

On the contrary, the mass defect of the nucleus $^9_4$Be is not only positive
but also exceeds the sum of the mass defects of all nuclei into which it could
decay. Hence the nucleus $^9_4$Be is absolutely stable.

Knowing the masses of all isotopes one can easily determine their stability 
from the values of the mass defects. 

For details we refer the readers to more specialized texts. 

3. Energy yield of nuclear reactions. The application of the energy conser- 
vation law to a nuclear reaction of the type 


A+B>C+D, 


where A and B are the initial nuclei, and C and D are the reaction products, 
allows one to find the energy yield of the reaction 


$$ E = [(M_A + M_B) - (M_C + M_D)] c^2 , \tag{16.6} $$


if the masses of all the nuclei are known. 
As an important example from which the relations of the theory of rela- 
tivity can be checked in a particularly obvious way, we consider the reaction 


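$$ ^7_3\mathrm{Li} + {}^1_1\mathrm{H} \to 2\,{}^4_2\mathrm{He} . $$

The masses of all the nuclei figuring in the reaction have been measured with
a high degree of accuracy:

$$ M(^7_3\mathrm{Li}) = 7.016\,004\ \mathrm{amu} , \qquad M(^1_1\mathrm{H}) = 1.007\,825\ \mathrm{amu} . $$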




The total initial mass is 8.023 829 amu. The total final mass is $2M(^4_2\mathrm{He})$ =
2 × 4.002 603 = 8.005 206 amu. As a result of the reaction the rest mass of the
particles is reduced by

$$ \Delta m = 0.018\,623\ \mathrm{amu} . $$

The corresponding energy, representing the kinetic energy of the two α-particles,
must be equal to (1 amu = 931.48 MeV)

$$ E = \Delta m c^2 = 17.3\ \mathrm{MeV} , $$


which agrees to a high degree of accuracy with the measured values of the 
energy. 
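
The energy yield quoted above follows directly from formula (16.6). A minimal Python sketch, using only the masses and the conversion factor given in the text (the function name is an assumption), reads:

```python
AMU_MEV = 931.48   # MeV per amu, as used in the text

def energy_yield(initial_masses, final_masses):
    """Energy yield (16.6) in MeV for nuclear masses given in amu."""
    return (sum(initial_masses) - sum(final_masses)) * AMU_MEV

# 7Li + 1H -> 2 4He, masses as quoted in the text
q = energy_yield([7.016004, 1.007825], [4.002603, 4.002603])
print(q)
```

A positive value means that energy is released; a negative value would give the energy which has to be supplied for the reaction to proceed.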

In this example it is seen that the rest mass of the particles is not con- 
served. 

In the course of the reaction a rest mass equal to $\Delta m$ vanishes. However,
the energy conservation law

$$ E_{Li} + E_{H} = 2E_{He} $$

must hold and also the relativistic mass conservation law

$$ m_{Li}(v) + m_{H}(v) = 2m_{He}(v) . $$

The above reaction is only one example of nuclear reactions in which the rest 
mass conservation law is not fulfilled. 

4. Decay of elementary particles. A number of elementary particles turn
out to be unstable with respect to decay. It is impossible to discuss here all 
the known cases of decay, and we can dwell only on some of them, illustrat- 
ing in an obvious way the importance of the application of conservation laws 
to the analysis of decay. 

As an example we shall consider the decay of charged and neutral mesons.
As is well known, the existence of three kinds of π-mesons (pions) has been
established: $\pi^+$ with a positive charge, $\pi^-$ with a negative charge and $\pi^0$, the
neutral meson. The charged mesons have masses of $273m_e$, and the neutral
meson has a mass of $264m_e$. The charge of the charged mesons is equal in its
absolute value to that of an electron. In addition to π-mesons, two kinds of
μ-mesons (muons) have been discovered: the positive $\mu^+$ and the negative $\mu^-$
with the same absolute value of the charge and a mass equal to $207m_e$.

It turns out that both π-mesons and μ-mesons are unstable. The lifetime of
charged $\pi^\pm$-mesons is $2.6 \times 10^{-8}$ sec, and that of the neutral $\pi^0$-meson is
about $10^{-15}$ to $10^{-16}$ sec. As mentioned earlier, $\mu^\pm$-mesons have a lifetime of
$2 \times 10^{-6}$ sec.

It is clear that the establishment of the decay scheme of all these particles 
is of fundamental importance for contemporary physics. 

We begin with the decay of charged π-mesons. A charged $\pi^\pm$-meson decays
into a $\mu^\pm$-meson and a neutrino $\nu$:

$$ \pi^\pm \to \mu^\pm + \nu . $$


The neutrino is a neutral particle with a very small rest mass, which, in con- 
trast to photons, does not cause any ionization of atoms in its passage 
through matter. The study of the above reaction allows one to estimate most 
accurately the rest mass of the neutrino. * 

That is to say, knowing the rest masses of π-mesons and μ-mesons one can
find the kinetic energy of the μ-meson $E^\mu_{kin}$ as a function of the rest mass of
the neutrino.

In the centre of mass system the momentum of the π-meson before decay
is equal to zero. After decay the total momentum remains equal to zero, so
that the momenta of the μ-meson and neutrino are of the same magnitude,
$p_\mu = p_\nu = p$, and of opposite direction, $\mathbf{p}_\mu + \mathbf{p}_\nu = 0$. The law of conservation
of the energy-momentum 4-vector reduces to the energy conservation law

$$ E_\pi = E_\mu + E_\nu , $$

$$ m_\pi c^2 = (E^\mu_{kin} + m_\mu c^2) + \sqrt{p_\nu^2 c^2 + m_\nu^2 c^4} , $$

or, since in the centre of mass system $p_\mu = p_\nu = p$, we have

$$ m_\pi c^2 = (E^\mu_{kin} + m_\mu c^2) + \sqrt{p^2 c^2 + m_\nu^2 c^4} . \tag{16.7} $$


* For the properties of the neutrino see § 123 of Part V.

We make use of the identity

$$ E^\mu_{kin} + m_\mu c^2 = \sqrt{p^2 c^2 + m_\mu^2 c^4} , $$

or

$$ (E^\mu_{kin})^2 + 2 m_\mu c^2 E^\mu_{kin} = p^2 c^2 , \tag{16.8} $$

and rewrite (16.7) in the form

$$ m_\pi c^2 - (E^\mu_{kin} + m_\mu c^2) = \sqrt{(E^\mu_{kin})^2 + 2 m_\mu c^2 E^\mu_{kin} + m_\nu^2 c^4} . $$

Squaring the above formula we find for the energy of the μ-meson (referred
to the centre of mass system) produced in the decay of a π-meson

$$ E^\mu_{kin} = \frac{(m_\pi c^2 - m_\mu c^2)^2 - m_\nu^2 c^4}{2 m_\pi c^2} . \tag{16.9} $$
The kinetic energy $E^\mu_{kin}$ of the μ-meson may be measured from the
ionization caused by it. It turns out always to have a unique value. From the
measured value of $E^\mu_{kin}$ it follows that, within the limits of experimental
accuracy, $m_\nu = 0$. Moreover, the fixed value of $E^\mu_{kin}$ confirms the
correctness of the assumed decay scheme. If two or more neutral particles appeared
in the decay, then $E^\mu_{kin}$ would not have a well defined value but would
depend on the distribution of the kinetic energy between the neutral particles.
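
To see what value of the kinetic energy formula (16.9) gives, one can evaluate it with the meson masses quoted above (273 and 207 electron masses). In the Python sketch below the electron rest energy of 0.511 MeV, the function name and the trial value of the neutrino rest energy are assumptions introduced for the illustration.

```python
ME_C2 = 0.511   # electron rest energy in MeV (assumed value)

def muon_kinetic_energy(m_pi, m_mu, m_nu=0.0):
    """Kinetic energy (16.9) of the mu-meson from the decay pi -> mu + nu.

    All arguments are rest energies in MeV; the result is in MeV."""
    return ((m_pi - m_mu) ** 2 - m_nu**2) / (2.0 * m_pi)

m_pi = 273 * ME_C2
m_mu = 207 * ME_C2

k0 = muon_kinetic_energy(m_pi, m_mu)             # massless neutrino
k1 = muon_kinetic_energy(m_pi, m_mu, m_nu=5.0)   # hypothetical neutrino rest energy of 5 MeV

print(k0, k1)
```

With these inputs a massless neutrino gives a kinetic energy of about 4.1 MeV, and any non-zero neutrino rest mass lowers it; this is why a measurement of the kinetic energy of the μ-meson restricts the rest mass of the neutrino.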
This is just the case realized in the decays of μ-mesons, which proceed
according to the scheme

$$ \mu^\pm \to e^\pm + 2\nu . $$

Here $e^-$ stands for an electron, and $e^+$ stands for a positron.

The theory of positrons will be discussed in detail in Part V. Here we point
out only that the positron has a positive charge of the same magnitude as the
electron charge and a mass equal to that of the electron. The electron,
$\pi^-$-meson and $\mu^-$-meson on the one hand, and the positron, $\pi^+$-meson and
$\mu^+$-meson on the other hand form groups of antiparticles. Antiparticles are
produced in pairs which can be annihilated, producing γ-quanta. *

Experiment shows that in the decay of μ-mesons the kinetic energy of the
electrons or positrons produced has no well defined value but varies from one
decay to another.

Thus, it is clear that the energy carried away by neutral particles escaping
direct observation can be distributed between them in different ways. This
would be impossible for a decay according to the scheme which is valid for
π-mesons, i.e. with the production of one neutrino.

The kinetic energy of the charged particle is on the average equal to 1/3 of
the total energy of the μ-meson. The remaining 2/3 are on the average equally
distributed between the two neutrinos.

* For the properties of elementary particles see: Y.V.Novozhilov, Elementary particles
(Gordon and Breach, New York, 1961) (a popular and very good exposition), and
E.V.Shpolskii, Atomnaya fizika (Atomic physics), Vol. II (Gostekhizdat, Moscow, 1951).

Consider, finally, the decay of the $\pi^0$-meson, which proceeds according to
the scheme

$$ \pi^0 \to \gamma + \gamma , $$

where γ stands for a photon. γ-quanta are detected by the ionization they
produce. A certain relation is observed between the angle of divergence of the
photons and their energy. This relation can be established from the conservation
laws.

Before the decay the invariant of the 4-vector of energy and momentum in
a reference frame moving together with the $\pi^0$-meson reduces to

$$ I = -m_\pi^2 c^4 , $$

since in this reference frame the total momentum is equal to zero.
After the decay in the laboratory system $I$ is equal to

$$ I = (\mathbf{p}_1 + \mathbf{p}_2)^2 c^2 - (E_1 + E_2)^2 , $$

where $\mathbf{p}_1$, $\mathbf{p}_2$, $E_1$, $E_2$ are the momenta and energies of the two photons.
Hence

$$ -m_\pi^2 c^4 = (\mathbf{p}_1 + \mathbf{p}_2)^2 c^2 - (E_1 + E_2)^2 , $$

or, since the rest mass of photons is equal to zero,

$$ m_\pi^2 c^4 = (p_1 + p_2)^2 c^2 - (\mathbf{p}_1 + \mathbf{p}_2)^2 c^2 = 2 p_1 p_2 c^2 (1 - \cos\psi) = 4 p_1 p_2 c^2 \sin^2 \tfrac{1}{2}\psi , $$

where $\psi$ is the angle between the directions of flight of the photons.
Consequently,

$$ \sin\frac{\psi}{2} = \frac{m_\pi c^2}{2\sqrt{E_1 E_2}} . \tag{16.10} $$




This dependence of the angle of divergence on the energy of the photons is in 
good agreement with that observed experimentally. 
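
The relation between the angle of divergence and the photon energies can be tried out numerically. The Python sketch below solves the relation sin(ψ/2) = m c² / (2 √(E₁E₂)) for the angle ψ in two simple cases. The function name, the electron rest energy of 0.511 MeV and the chosen photon energies are assumptions; the neutral meson mass of 264 electron masses is the value quoted earlier in this section.

```python
import math

def opening_angle(E1, E2, m_c2):
    """Angle psi between the two photons: sin(psi/2) = m c^2 / (2 sqrt(E1 E2))."""
    s = m_c2 / (2.0 * math.sqrt(E1 * E2))
    if s > 1.0 + 1e-12:
        raise ValueError("photon energies incompatible with the meson rest mass")
    return 2.0 * math.asin(min(s, 1.0))

M_PI0 = 264 * 0.511   # rest energy of the neutral meson in MeV

# meson at rest: each photon carries half of the rest energy
at_rest = opening_angle(M_PI0 / 2, M_PI0 / 2, M_PI0)

# moving meson with total energy 2 m c^2, decaying symmetrically
moving = opening_angle(M_PI0, M_PI0, M_PI0)

print(math.degrees(at_rest), math.degrees(moving))
```

For a meson at rest the photons fly apart in opposite directions; the faster the meson, the smaller the angle between the photons.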

5. Electron-positron pair production by γ-quanta and positron-electron
annihilation. The experimental discovery of the theoretically predicted phenomena
of electron-positron pair production and positron-electron annihilation
with the production of γ-quanta appeared to be a basic confirmation of
the validity of relativistic quantum mechanics, which the reader will meet in
Part V of the book. At the same time, these phenomena serve as good illustrations
of the relations of the theory of relativity.

Consider first of all the question of the possibility of electron-positron
pair production by a γ-quantum in vacuum. Let a γ-quantum with energy
$E = pc$ produce an electron and a positron. It is easily seen that such a process
is incompatible with conservation laws. Indeed, if the photon produces an
electron-positron pair with the minimum possible energy (i.e. the rest
energy) and a momentum equal to zero, then it has an energy $E = 2mc^2$ and
a momentum $p = E/c > 0$, where $m$ is the electron mass. If the momentum of
the pair differs from zero, then it is always possible to transform to the
centre of mass system where it is equal to zero, and our reasoning is still
correct.

Thus, the momentum differs from zero before the pair production, and is
equal to zero after the pair production. This would clearly contradict the
momentum conservation law, and therefore such a process is impossible. Pair
production can take place only in the presence of a third body, usually an
atomic nucleus, which itself takes the excess momentum. Since the nucleus
also acquires a certain part of the energy, the energy threshold of pair production
by a γ-quantum lies above $2mc^2$. The threshold energy of the γ-quantum
is determined by the requirement that all the particles (the electron, positron
and nucleus) should have a momentum equal to zero in the centre of
mass system.

To find this threshold we shall find the invariant of the energy-momentum 
4-vector. 

Before the reaction there was a γ-quantum with the threshold energy
$E_{thresh}$ and a momentum $p = E_{thresh}/c$, and a nucleus at rest with a mass $M$.
After the reaction there was an electron-positron pair and the nucleus which
acquired a part of the momentum and energy.

The value of the invariant of the energy-momentum 4-vector before the
reaction was

$$ I = \left( \frac{E_{thresh}}{c} \right)^2 c^2 - (E_{thresh} + Mc^2)^2 . $$

The value of the invariant after the reaction can be taken in any reference
frame, including the centre of mass system. The merit of the latter is the fact
that in it the total momentum of all the particles is equal to zero and the
value of the invariant reduces to

$$ I = -(Mc^2 + 2mc^2)^2 . $$

Thus, we have

$$ (Mc^2 + E_{thresh})^2 - E_{thresh}^2 = (Mc^2 + 2mc^2)^2 . $$

Hence for the threshold energy of the γ-quantum we find

$$ E_{thresh} = 2mc^2 \left( 1 + \frac{m}{M} \right) . \tag{16.11} $$

The threshold energy for other processes producing electron-positron pairs, 
for example the collision between two electrons, can be found in a completely 
analogous way. 
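
The same reasoning with the invariant can be put into a small general routine. Equating the invariant before the collision (projectile incident on a target at rest) to its value at threshold, when all the products are at rest in the centre of mass system, gives the threshold kinetic energy as [(Σ m_final)² − (m_projectile + m_target)²] c² / (2 m_target). The Python sketch below uses this expression; the function name and the numerical rest energies are assumptions, and the proton is taken as the "nucleus" merely as an example.

```python
ME_C2 = 0.511     # electron rest energy, MeV (assumed value)
MP_C2 = 938.27    # proton rest energy, MeV (assumed value)

def threshold_energy(m_projectile, m_target, final_masses):
    """Threshold kinetic energy of a projectile hitting a target at rest
    (for a photon this is its total energy). All arguments are rest energies."""
    m_final = sum(final_masses)
    return (m_final**2 - (m_projectile + m_target) ** 2) / (2.0 * m_target)

# gamma + nucleus -> nucleus + e- + e+, compared with formula (16.11)
e_gamma = threshold_energy(0.0, MP_C2, [MP_C2, ME_C2, ME_C2])
e_formula = 2 * ME_C2 * (1 + ME_C2 / MP_C2)

# e- + e- -> e- + e- + e- + e+
e_ee = threshold_energy(ME_C2, ME_C2, [ME_C2] * 4)

print(e_gamma, e_formula, e_ee)
```

For pair production in the collision of two electrons the routine gives a threshold kinetic energy of six electron rest energies, considerably higher than for production by a γ-quantum in the field of a heavy nucleus.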

The process which is the reverse of the process of pair production by
γ-quanta is called pair annihilation. In annihilation an electron and a positron
fuse to produce γ-quanta. The essence of this process from the point of view
of contemporary quantum theory will be discussed briefly in Part V. The
process of annihilation usually takes place at small values of kinetic energy
of the positron. Hence the difference between the energy of the initial state
and the energy of the final state amounts to

$$ \Delta E = mc^2 - (-mc^2) = 2mc^2 = 1.02\ \mathrm{MeV} . $$

The simultaneous conservation of energy and momentum requires that this
energy should be emitted in the form of (at least) two γ-quanta emerging in
opposite directions and having an energy of 0.51 MeV each.

It has been established in a number of experiments that the annihilation of
positrons is indeed accompanied by such radiation. The phenomenon of the
annihilation of positrons serves as one of the most effective confirmations of
the mass-energy relation. It should be noted that the process of annihilation
is sometimes incorrectly interpreted as “the transformation of matter into
energy” or as “the disappearance of particles”. It is clear that the γ-quanta
arising in the annihilation of a positron are as material as the electron and
positron.




In §13 we have already discussed the properties of the rest mass and con- 
servation laws in the theory of relativity and have seen that the transforma- 
tion of particles with a rest mass different from zero into particles having no 
rest mass can in no way be considered as the disappearance of particles or the 
transformation of mass into energy. 

6. As the last example of the use of conservation laws in nuclear physics
we shall consider the determination of the energy threshold of the reaction in
which a π-meson is produced in a proton-proton collision:

$$ ^1_1\mathrm{H} + {}^1_1\mathrm{H} \to {}^2_1\mathrm{D} + \pi^+ , $$

where $\pi^+$ is a positive π-meson with mass $m_\pi$.
We write the value of the invariant $E^2 - p^2c^2$ before and after the collision.
Before the collision,

$$ (E_{thresh} + 2Mc^2)^2 - p^2c^2 = (E_{thresh} + 2Mc^2)^2 - E_{thresh}(E_{thresh} + 2Mc^2) . $$

Here $M$ is the mass of the proton and $E_{thresh}$ is the kinetic energy of the
incident proton; we have assumed that one of the protons was at rest before the collision
and made use of formula (16.8). If the incident proton has the threshold
energy, then the particles produced in the reaction, the deuteron and $\pi^+$-meson,
have the minimum possible energy, which corresponds to the rest
energy in the centre of mass system. Taking the mass of the deuteron to be
approximately $2M$, one can write in the centre of mass system

$$ E^2 - p^2c^2 = (2Mc^2 + m_\pi c^2)^2 . $$

From the conservation of the invariant $(E^2 - p^2c^2)$ we have

$$ (E_{thresh} + 2Mc^2)^2 - E_{thresh}(E_{thresh} + 2Mc^2) = (2Mc^2 + m_\pi c^2)^2 . $$

Hence the threshold value of the kinetic energy is

$$ E_{thresh} = \frac{[(2M + m_\pi)^2 - 4M^2] c^2}{2M} = \frac{(4 m_\pi M + m_\pi^2) c^2}{2M} = m_\pi c^2 \left( 2 + \frac{m_\pi}{2M} \right) = 292\ \mathrm{MeV} . \tag{16.12} $$







This value of $E_{thresh}$ is in good agreement with the measured value.

The examples quoted are of a purely illustrative character, and have been
used intentionally to introduce the diverse problems of nuclear physics. They
show, however, that in all nuclear processes in which rather considerable
changes in the energy of the system are to be taken into account the laws of
the theory of relativity and, in particular, the mass-energy relation play a
fundamental role.


§17. The theory of collisions between relativistic particles. Compton effect 


The theory of collisions between relativistic particles is of great importance 
for nuclear physics. In the absence of nuclear reactions between the colliding 
particles the interaction between them can, with a sufficient degree of accu- 
racy, be considered as an elastic collision (i.e. a collision without a change in 
the internal state of the nuclear particles). This refers, in particular, to colli- 
sions between elementary particles, for example, mesons, protons or photons 
with electrons.

We shall consider first of all the elastic collisions of particles with a rest
mass differing from zero. We assume that a fast particle with mass $\mu$ and
momentum $\mathbf{p}$ collides with another particle having a mass $m$. We consider the
second particle to be at rest and free. Such an approximation is legitimate for
a sufficiently large velocity of the incident particle. After collision the particle
which was initially at rest will move with a momentum $\mathbf{p}_1$, directed at an
angle $\psi$ to the momentum of the incident particle. The latter will be deflected
from the initial direction of flight by an angle $\theta$ and will have a momentum
$\mathbf{p}_2$.

Fig. II.4


We can write the energy and momentum conservation laws in the form of
(15.7) and (15.8):

$$ \mathbf{p} = \mathbf{p}_1 + \mathbf{p}_2 , \tag{17.1} $$

$$ mc^2 + E = E_1 + E_2 , \tag{17.2} $$

where $E$ and $E_2$ represent the energy of the incident particle before and after
collision respectively, and $E_1$ is the energy after collision of the particle
which was initially at rest. These relations are sufficient to find all of the
quantities characterizing the process of collision.

Suppose, for example, that we have to find the energy acquired by the
particle which was initially at rest as a function of the angle $\psi$. It is simpler to
find, not the total energy $E_1$ of the particle, but its kinetic energy
$E_1^{(kin)} = E_1 - mc^2$. From (17.2) we have, obviously,

$$ \sqrt{p^2c^2 + \mu^2c^4} = \sqrt{p_2^2c^2 + \mu^2c^4} + E_1^{(kin)} . \tag{17.3} $$

Since on the basis of (17.1)

$$ p_2^2 = p^2 + p_1^2 - 2pp_1 \cos\psi , \tag{17.4} $$

we can eliminate $p_2$ from (17.3) and (17.4). Moreover, $p_1^2$ can be expressed
in terms of $E_1^{(kin)}$ by means of (16.8). Rearranging and squaring (17.3) and
substituting the value for $p_2^2$ from (17.4), we obtain after some manipulation

$$ pp_1c^2 \cos\psi = E_1^{(kin)} \{ \sqrt{p^2c^2 + \mu^2c^4} + mc^2 \} . $$

Squaring the above relation and expressing $p_1^2$ again in terms of $E_1^{(kin)}$ we
finally obtain

$$ E_1^{(kin)} = \frac{2mp^2c^4 \cos^2\psi}{\{ \sqrt{p^2c^2 + \mu^2c^4} + mc^2 \}^2 - p^2c^2 \cos^2\psi} , \tag{17.5} $$
i.e. the dependence on $p$ and $\psi$ of the energy transferred to the motionless
particle, as well as the dependence on the masses $m$ and $\mu$ of the particles.

Formula (17.5) shows that the energy transferred has its largest value for
$\psi = 0$, i.e. for the motion of the initially stationary particle in the direction of
flight of the incident particle (i.e. head-on collision). Then,

$$ (E_1^{(kin)})_{max} = 2mc^2 \frac{p^2c^2}{\{ \sqrt{p^2c^2 + \mu^2c^4} + mc^2 \}^2 - p^2c^2} . \tag{17.6} $$
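
Formula (17.5) for the transferred energy lends itself to a direct numerical check. The Python sketch below evaluates the kinetic energy 2mc² p²c² cos²ψ / [(E + mc²)² − p²c² cos²ψ], with E the total energy of the incident particle, and verifies one simple case: in a head-on collision of two particles of equal mass the whole kinetic energy of the incident particle is handed over. The function name, the units and the sample values are assumptions.

```python
import math

def transferred_energy(pc, mu_c2, m_c2, psi):
    """Kinetic energy (17.5) given to a particle of rest energy m_c2, initially
    at rest, by a particle of rest energy mu_c2 and momentum p (pc in energy
    units); psi is the recoil angle in radians."""
    E = math.sqrt(pc**2 + mu_c2**2)          # total energy of the incident particle
    num = 2.0 * m_c2 * pc**2 * math.cos(psi) ** 2
    return num / ((E + m_c2) ** 2 - pc**2 * math.cos(psi) ** 2)

# head-on collision of equal masses (hypothetical values, arbitrary energy units)
pc, m = 3.0, 1.0
k_in = math.sqrt(pc**2 + m**2) - m           # kinetic energy of the incident particle
k_out = transferred_energy(pc, m, m, 0.0)    # energy received by the target

print(k_in, k_out)
```

For unequal masses or a non-zero recoil angle the transferred energy is smaller; the same function can be used to tabulate it as a function of the angle.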

Consider some particular cases of formula (17.6). Let, for example, the
incident particle be a proton or a meson, and let the particle at rest be an
electron. Then $\mu \gg m$. In addition, we assume the incident particle to be
very fast, so that $pc \gg \mu c^2$. Then from (17.6) we find

$$ (E_1^{(kin)})_{max} = 2mc^2 \frac{p^2c^2}{\mu^2c^4 + 2mc^2\sqrt{p^2c^2 + \mu^2c^4} + m^2c^4} \approx 2mc^2 \frac{p^2c^2}{2mc^2\,pc + \mu^2c^4} . \tag{17.6'} $$

If the momentum of the incident particle is so large that the inequality

$$ pc \gg \frac{\mu^2c^4}{mc^2} = \frac{\mu}{m}\,\mu c^2 $$

is fulfilled, then the maximum transferred energy amounts to

$$ (E_1^{(kin)})_{max} \approx pc \approx E , $$

i.e. to the energy of the incident particle.

Now consider the case when the incident particle is a light particle, for
example, an electron, and the particle at rest is a heavy particle (i.e. $\mu \ll m$).
If, moreover, the first particle possesses a momentum $pc \gg \mu c^2$, then from
(17.6') we obtain for the energy transferred

$$ (E_1^{(kin)})_{max} \approx 2mc^2 \frac{p^2c^2}{2mc^2\,pc + m^2c^4} . $$

If the inequality $pc \gg mc^2$ is satisfied, then

$$ (E_1^{(kin)})_{max} \approx pc \approx E . \tag{17.7} $$


Thus, we see that for very large momenta the laws for elastic collisions in
relativistic mechanics differ in a fundamental way from the analogous laws in
classical mechanics. For sufficiently large momenta a total energy transfer
from a heavy particle to a light one is possible, and also from a light particle
to a heavy one. It should be noted that in this case in classical mechanics the
elastic collision is accompanied by only a negligible energy transfer. *

Other limiting cases can be easily obtained from the general expression
(17.5). In particular, it is easy to show that for small momenta $pc \ll \mu c^2$ and
$pc \ll mc^2$ the energy transferred is given by a formula which is the same as
the corresponding expression for the non-relativistic theory of collisions.
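
The limiting cases discussed above can be illustrated numerically. The Python sketch below evaluates the maximum transferred energy (17.6) for a proton incident on an electron and for an electron incident on a proton, and compares the small-momentum result with the classical head-on expression 4mμ/(m + μ)² times the incident kinetic energy. The rest energies, the chosen momenta and the function name are assumptions made for the illustration.

```python
import math

def max_transfer(pc, mu_c2, m_c2):
    """Maximum transferred kinetic energy (17.6), head-on collision.

    The denominator is written in the expanded form
    mu^2 c^4 + 2 m c^2 E + m^2 c^4 to avoid cancellation at large momenta."""
    E = math.sqrt(pc**2 + mu_c2**2)
    return 2.0 * m_c2 * pc**2 / (mu_c2**2 + 2.0 * m_c2 * E + m_c2**2)

MU, M = 938.27, 0.511   # proton and electron rest energies in MeV (assumed values)

# small momentum (pc much less than both rest energies): classical limit
pc = 0.01
T = pc**2 / (math.sqrt(pc**2 + MU**2) + MU)   # incident kinetic energy, cancellation-free form
classical = 4.0 * M * MU / (M + MU) ** 2 * T
slow_ratio = max_transfer(pc, MU, M) / classical

# heavy on light with pc >> mu^2 c^4 / (m c^2): nearly complete transfer
fast_fraction = max_transfer(1.0e9, MU, M) / 1.0e9

# light on heavy with pc >> m c^2 of the heavy target: nearly complete transfer
light_on_heavy = max_transfer(1.0e6, M, MU) / 1.0e6

print(slow_ratio, fast_fraction, light_on_heavy)
```

In the first case the ratio is practically unity, i.e. the non-relativistic result is recovered; in the other two the fraction of the energy transferred approaches unity, in contrast to the classical case.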

An important case of elastic collision is the collision of a photon with an
electron. This phenomenon, called the Compton effect, was first studied
carefully in connection with the elucidation of the quantum nature of light.
The Compton effect plays a role in a number of practical problems of contemporary
nuclear physics. The photon has a rest mass $\mu$ equal to zero and
$E = pc$, $E_2 = p_2c$. Hence formulae (17.2) and (17.1) assume the forms

$$ E = E_2 + E_1^{(kin)} , \tag{17.8} $$

$$ p_1^2 = \left( \frac{E}{c} \right)^2 + \left( \frac{E_2}{c} \right)^2 - \frac{2EE_2}{c^2} \cos\theta , \tag{17.9} $$

where $E_1^{(kin)} = \sqrt{p_1^2c^2 + m_e^2c^4} - m_ec^2$, $m_e$ is the mass of the electron, and $\theta$
is the angle between the directions of flight of the photon before and after
collision (scattering angle).
Expressing $p_1^2$ in eq. (17.9) in terms of the kinetic energy of the electron
by means of (16.8), we write (17.9) in the form

$$ (E_1^{(kin)})^2 + 2m_ec^2E_1^{(kin)} = E^2 + E_2^2 - 2EE_2 \cos\theta . \tag{17.10} $$

First of all we find the energy of the photon undergoing the collision.
Eliminating $E_1^{(kin)}$ from (17.8) and (17.10), we easily find

$$ E_2 = \frac{m_ec^2 E}{m_ec^2 + E(1 - \cos\theta)} . \tag{17.11} $$

This formula relates the energy of the scattered photon to that of the incident
one and to the scattering angle $\theta$.

* See §43 of Part I.

Transforming from energies to wavelengths according to the well-known
quantum formula $E = h\nu = hc/\lambda$, one can obtain the following value for the
increase in the wavelength $\Delta\lambda$ in Compton scattering:

$$ \Delta\lambda = hc \left( \frac{1}{E_2} - \frac{1}{E} \right) = \frac{h}{m_ec} (1 - \cos\theta) = \Lambda (1 - \cos\theta) . \tag{17.12} $$

The quantity $\Lambda = h/m_ec = 0.0242$ Å is called the Compton wavelength.
Formula (17.12) shows that the change in the wavelength is independent
of the wavelength of the incident radiation. The maximum change in the
wavelength is equal to $2\Lambda$.
Knowing the energy of the scattered photon, one can easily find the
energy transferred to the electron. It turns out to be

$$ E_1^{(kin)} = E - E_2 = E \frac{\Lambda (1 - \cos\theta)}{\lambda + \Lambda (1 - \cos\theta)} , \tag{17.13} $$

i.e. relatively small for $\lambda \gg \Lambda$. On the contrary, for $\lambda \sim \Lambda$ the energy
transferred to the electron turns out to be considerable, of the order of
magnitude of $E$.
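
The Compton formulae can be checked against one another numerically. The Python sketch below computes the energy of the scattered photon from E₂ = m_e c² E / [m_e c² + E(1 − cos θ)], converts to wavelengths and compares the shift with Λ(1 − cos θ) and the recoil energy with formula (17.13). The numerical constants (electron rest energy in keV, the product hc in keV·Å), the function names and the sample photon energy and angle are assumptions.

```python
import math

ME_C2_KEV = 510.999            # electron rest energy, keV (assumed value)
HC_KEV_ANGSTROM = 12.39842     # h*c in keV * angstrom (assumed value)

def scattered_energy(E, theta):
    """Photon energy (keV) after scattering through the angle theta (radians)."""
    return ME_C2_KEV * E / (ME_C2_KEV + E * (1.0 - math.cos(theta)))

def wavelength(E):
    """Wavelength in angstrom of a photon of energy E in keV."""
    return HC_KEV_ANGSTROM / E

COMPTON = HC_KEV_ANGSTROM / ME_C2_KEV    # Compton wavelength h / (m_e c), angstrom

E, theta = 100.0, math.radians(60.0)     # hypothetical incident photon
E2 = scattered_energy(E, theta)
shift = wavelength(E2) - wavelength(E)   # change in the wavelength
recoil = E - E2                          # kinetic energy given to the electron

print(COMPTON, shift, recoil)
```

The computed shift does not depend on the energy chosen for the incident photon, in agreement with the remark made above about formula (17.12).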

The importance of this result lies particularly in the fact that it is of a
general character. We have more than once pointed out in Part I that classical
electrodynamics contains in itself the limits of its applicability. Namely,
classical electrodynamics becomes inapplicable in the region of small distances,
of the order of magnitude of the classical radius of the electron,
$r_0 = 2.5 \times 10^{-13}$ cm. We have mentioned, however (see §29 of Part I), that in fact
the limit of applicability of the classical theory lies much higher, at distances
of the order of $2 \times 10^{-10}$ cm, which corresponds to the order of magnitude
of the Compton wavelength.

The unsuitability of classical electrodynamics for phenomena taking place in a region of space with a linear dimension of the order of $\Lambda$ is obviously associated with the fact that in this region quantum effects begin to make themselves felt. The scattering of light serves as a concrete example of this.

For $\lambda \gg \Lambda$ the change in the wavelength of the scattered light and the energy transferred to the free electron are relatively small. Hence the scattering of light is sufficiently well described by the classical theory. The scattering is coherent. The change in the wavelength $\Delta\lambda \sim \Lambda$ is very small in comparison with the wavelength itself; no energy transfer to the free electron takes place, and the cross section for the scattering is given by the Thomson formula (36.13) of Part I.

As $\lambda$ approaches $\Lambda$ (hard X-rays and $\gamma$-rays) the classical scattering is replaced by the Compton effect. The change in the wavelength of the light becomes comparable with the wavelength. The radiation knocks out recoil electrons (moving mainly forward in the direction of the incident photon), and the cross section for scattering, as is seen in fig. I.17, §36, decreases with the energy of the photon. The phenomenon assumes a clearly pronounced quantum character and a classical treatment of the phenomenon of scattering becomes quite impossible.

The quantum-mechanical calculation of the cross section for Compton 
scattering will be given in Part V. 


3


Relativistic Electrodynamics 


§18. Charge conservation, the four-dimensional current 
and the equation of continuity 


We shall now turn from relativistic mechanics to relativistic electrodynamics. We shall take as the basis of relativistic electrodynamics the assumption of the invariance and conservation of electric charge. The charge is a fundamental quantity characterizing the properties of particles, and the constancy of charge is strictly observed in all known physical processes.

The charge conservation law 


$$\nabla\cdot(\rho\mathbf{u}) + \frac{\partial\rho}{\partial t} = 0 \tag{18.1}$$


must hold in all inertial reference frames. In order to give a relativistically 
invariant form to the charge conservation law, one can, following the usual 
method, write it in four-dimensional form. 

For this it is sufficient to introduce the 4-vector $j_\alpha$ called the four-dimensional current and defined by the relation

$$j_\alpha = (\rho\mathbf{u},\ ic\rho). \tag{18.2}$$


Then (18.1) is easily written in four-dimensional form:

$$\frac{\partial j_\alpha}{\partial x_\alpha} = \frac{\partial j_x}{\partial x} + \frac{\partial j_y}{\partial y} + \frac{\partial j_z}{\partial z} + \frac{\partial j_\tau}{\partial \tau} = 0. \tag{18.3}$$

It is formula (18.3), written in four-dimensional form, which is the relativistically invariant expression. Hence it follows that the quantity $j_\alpha$, which we determined formally, indeed represents a 4-vector.

From the definition of the 4-vector $j_\alpha$ it follows directly that, in passing from one reference frame to another, its components must transform according to formulae (11.1)-(11.4). If this transformation law is applied to the fourth component $j_\tau = ic\rho$, then we obtain the invariance of the electric charge $de = \rho\,dV$ which is present in an arbitrary volume element $dV$. Indeed, let us consider the change of $j_\tau$ in the transition from the reference frame $K'$ in which the charges are at rest to the reference frame $K$. The reference frame $K'$ moves relative to $K$ with a velocity $v$. In the reference frame $K'$ the velocity $\mathbf{u} = 0$ and, for the vector $j_\alpha$, only the component $j_\tau$ differs from zero.

From the formula (11.4) for the transformation of the fourth component of the 4-vector we find the expression for the change in the charge density:

$$\rho = \frac{\rho'}{\sqrt{1 - v^2/c^2}}. \tag{18.4}$$

Multiplying (18.4) by the volume element $dV$, we have

$$\rho\,dV = de = \frac{\rho'\,dV}{\sqrt{1 - v^2/c^2}}.$$

The change in the volume (6.2) gives

$$dV = dV'\sqrt{1 - \frac{v^2}{c^2}},$$

so that

$$\rho\,dV = \rho'\,dV'.$$


Thus, the charge in any volume element is invariant under Lorentz trans- 
formations. This can obviously be interpreted in the following way: as the 
volume decreases because of the Lorentz contraction the charge density 
increases in the same ratio, so that the total charge is not changed. 
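The compensation between the growth of the charge density and the Lorentz contraction of the volume can be illustrated numerically. This is a minimal sketch under stated assumptions: the function names are hypothetical, `beta` stands for $v/c$, and the charges are taken to be at rest in $K'$ as in the argument above.

```python
import math


def lorentz_gamma(beta):
    """1 / sqrt(1 - v^2/c^2) for beta = v/c."""
    return 1.0 / math.sqrt(1.0 - beta**2)


def lab_charge_density(rho_rest, beta):
    """Eq. (18.4): density seen in K when the charges rest in K'."""
    return rho_rest * lorentz_gamma(beta)


def lab_volume(volume_rest, beta):
    """Lorentz-contracted volume element, dV = dV' * sqrt(1 - v^2/c^2)."""
    return volume_rest / lorentz_gamma(beta)


def lab_four_current(rho_rest, beta, c=1.0):
    """Components (j_x, j_y, j_z, j_tau) = (rho*u, 0, 0, i*c*rho) in K."""
    rho = lab_charge_density(rho_rest, beta)
    return (rho * beta * c, 0.0, 0.0, 1j * c * rho)


if __name__ == "__main__":
    rho_rest, volume_rest, beta = 2.0, 0.5, 0.6
    charge_rest = rho_rest * volume_rest
    charge_lab = lab_charge_density(rho_rest, beta) * lab_volume(volume_rest, beta)
    print(charge_rest, charge_lab)   # the two charges agree
```

The sum of the squared components of the 4-current, $j_\alpha j_\alpha = -c^2\rho'^2$, is likewise the same in both frames, which gives a second check on the transformation.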




§19. The relativistically invariant formulation of the equations 
of the electromagnetic field potentials 


We have pointed out earlier that the theory of the electromagnetic field was formulated “correctly” from the point of view of the theory of relativity from the very beginning. This means that the Maxwell-Lorentz system of equations is relativistically invariant, satisfying the requirements of the theory of relativity.

One can verify this most simply by considering the equations for the 
potentials. 

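According to what was said in §10 of Part I the system of equations

$$\nabla^2\mathbf{A} - \frac{1}{c^2}\frac{\partial^2\mathbf{A}}{\partial t^2} = -\frac{4\pi}{c}\rho\mathbf{u}, \tag{19.1}$$

$$\nabla^2\varphi - \frac{1}{c^2}\frac{\partial^2\varphi}{\partial t^2} = -4\pi\rho, \tag{19.2}$$

taking into account the gauge condition

$$\nabla\cdot\mathbf{A} + \frac{1}{c}\frac{\partial\varphi}{\partial t} = 0, \tag{19.3}$$

is completely equivalent to the Maxwell-Lorentz equations.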

The relativistic invariance of the system (19.1)—(19.3) follows directly 
from the fact that this system can be written without any changes in the 
four-dimensional form. 

Indeed, we multiply (19.2) by $i$ and note that then the right-hand sides of eqs. (19.1)-(19.2) contain the components of the 4-vector of the current density. Hence it follows that also the left-hand sides represent the components of a 4-vector which we shall call the 4-potential $A_\alpha$:

$$A_\alpha = (\mathbf{A},\ i\varphi).$$

By means of the 4-vectors $A_\alpha$ and $j_\alpha$ eqs. (19.1) and (19.2) can be written in the form

$$\Box A_\alpha = -\frac{4\pi}{c}\,j_\alpha, \tag{19.4}$$


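where $\Box$ denotes D’Alembert’s operator (the D’Alembertian):

$$\Box \equiv \nabla^2 - \frac{1}{c^2}\frac{\partial^2}{\partial t^2} = \frac{\partial^2}{\partial x_\alpha\,\partial x_\alpha}. \tag{19.5}$$

The gauge condition may be written at once in the four-dimensional form

$$\frac{\partial A_\alpha}{\partial x_\alpha} = 0. \tag{19.6}$$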


Thus, the complete system of equations for potentials is written in the relativistically invariant form. This means that the laws of electrodynamics are the same in all inertial reference frames.

Further, we see that the electromagnetic field potentials and, consequent- 
ly, the electric and magnetic field vectors, are not themselves invariant quan- 
tities. The relative character of the values of the field vectors is in no way 
something new and unexpected. It is sufficient to recall that a moving charge 
produces a magnetic field which, however, is absent in a reference frame 
moving together with the charge. 

The law of transformation of the potentials can easily be written, since 
they transform according to the general formula for the transformation of 
the components of a 4-vector (11.1)—(11.4). 

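We have, obviously,

$$A_x = \frac{A'_x + \frac{v}{c}\varphi'}{\sqrt{1 - v^2/c^2}}, \tag{19.7}$$

$$A_y = A'_y, \tag{19.8}$$

$$A_z = A'_z, \tag{19.9}$$

$$\varphi = \frac{\varphi' + \frac{v}{c}A'_x}{\sqrt{1 - v^2/c^2}}. \tag{19.10}$$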

In the following paragraphs we shall apply these formulae to the treatment of 
the electromagnetic field of moving charges. 





§20. The field of a moving charge


Consider the simple problem of finding the electromagnetic field produced by a uniformly moving charge, when the velocity $v$ of the charge is comparable with the velocity of light. In the reference frame $K'$ moving with the charge the magnetic field is absent, and the electric field potential is expressed by the formula

$$\varphi' = e/r'.$$

In the inertial reference frame $K$ with respect to which the charge moves with a velocity $v$ the electric field potential, according to (19.10), has the form

$$\varphi = \frac{\varphi'}{\sqrt{1 - v^2/c^2}} = \frac{e}{r'\sqrt{1 - v^2/c^2}}. \tag{20.1}$$

We need the formula for the transformation of the length of the radius vector. We have

$$r' = \sqrt{x'^2 + y'^2 + z'^2} = \sqrt{\frac{(x - vt)^2}{1 - v^2/c^2} + y^2 + z^2}.$$

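The asymmetry of this formula is associated with the fact that the motion takes place along the $x$-axis.
Thus we find

$$\varphi = \frac{e}{\sqrt{(x - vt)^2 + \left(1 - \frac{v^2}{c^2}\right)(y^2 + z^2)}}. \tag{20.2}$$

We note that $x = vt$, $y = 0$, $z = 0$ represent the coordinates of the point at which the charge is found at an instant $t$, so that the radius vector drawn from the charge to the point of observation $N(x, y, z)$ can be written in the form

$$\mathbf{r} = \mathbf{i}(x - vt) + \mathbf{j}y + \mathbf{k}z \tag{20.3}$$

and

$$r = \sqrt{(x - vt)^2 + y^2 + z^2}. \tag{20.4}$$

It is convenient to simplify (20.2), expressing $\varphi$ in terms of $r$ and the angle $\psi$ formed by the vector $\mathbf{r}$ and the $x$-axis (fig. II.5).

[Fig. II.5: the radius vector $\mathbf{r}$ drawn from the charge to the observation point $N$ makes the angle $\psi$ with the $x$-axis; the distance of $N$ from the $x$-axis is $\sqrt{y^2 + z^2}$.]

From fig. II.5 it is seen that

$$x - vt = r\cos\psi, \qquad y^2 + z^2 = r^2\sin^2\psi,$$

and, consequently,

$$\sqrt{(x - vt)^2 + \left(1 - \frac{v^2}{c^2}\right)(y^2 + z^2)} = r\sqrt{\cos^2\psi + \left(1 - \frac{v^2}{c^2}\right)\sin^2\psi} = r\sqrt{1 - \frac{v^2}{c^2}\sin^2\psi}.$$

Hence we have

$$\varphi = \frac{e}{r\sqrt{1 - \frac{v^2}{c^2}\sin^2\psi}}. \tag{20.5}$$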

Passing on to the calculation of the vector potential $\mathbf{A}$ we note that in the reference frame $K'$ there is no magnetic field and $\mathbf{A}' = 0$. From formulae (19.7) and (19.10) we find

$$A_x = \frac{\frac{v}{c}\varphi'}{\sqrt{1 - v^2/c^2}} = \frac{v}{c}\varphi = \frac{ev}{c\sqrt{(x - vt)^2 + \left(1 - \frac{v^2}{c^2}\right)(y^2 + z^2)}} = \frac{ev}{cr\sqrt{1 - \frac{v^2}{c^2}\sin^2\psi}}.$$

In vector form this can be written

$$\mathbf{A} = \frac{\mathbf{v}\varphi}{c}. \tag{20.6}$$


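Knowing the field potentials one can then find the fields themselves.

In calculating the electric field $\mathbf{E}$ it should be kept in mind that in our case of a uniformly moving charge the differentiation with respect to time reduces to the differentiation with respect to the coordinate $x$ according to the formula

$$\frac{\partial}{\partial t} = -v\frac{\partial}{\partial x}.$$

Indeed, in a uniform motion $\mathbf{A}$ and $\varphi$ depend on the coordinate and time according to $f(x - vt)$. For such a function

$$\frac{\partial f}{\partial t} = -v\frac{\partial f}{\partial x},$$

from which we find

$$E_x = -\frac{\partial\varphi}{\partial x} - \frac{1}{c}\frac{\partial A_x}{\partial t} = -\left(1 - \frac{v^2}{c^2}\right)\frac{\partial\varphi}{\partial x} = \left(1 - \frac{v^2}{c^2}\right)\frac{e(x - vt)}{\left[(x - vt)^2 + \left(1 - \frac{v^2}{c^2}\right)(y^2 + z^2)\right]^{3/2}},$$

$$E_y = -\frac{\partial\varphi}{\partial y} - \frac{1}{c}\frac{\partial A_y}{\partial t} = \left(1 - \frac{v^2}{c^2}\right)\frac{ey}{\left[(x - vt)^2 + \left(1 - \frac{v^2}{c^2}\right)(y^2 + z^2)\right]^{3/2}},$$

$$E_z = -\frac{\partial\varphi}{\partial z} - \frac{1}{c}\frac{\partial A_z}{\partial t} = \left(1 - \frac{v^2}{c^2}\right)\frac{ez}{\left[(x - vt)^2 + \left(1 - \frac{v^2}{c^2}\right)(y^2 + z^2)\right]^{3/2}},$$

or, in vector form,

$$\mathbf{E} = \left(1 - \frac{v^2}{c^2}\right)\frac{e\mathbf{r}}{r^3\left(1 - \frac{v^2}{c^2}\sin^2\psi\right)^{3/2}}. \tag{20.7}$$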

When $v \ll c$ formula (20.7) goes over, of course, into the electrostatic expression for the field of a charge at rest.
We can also find the magnetic field $\mathbf{H}$ easily:

$$\mathbf{H} = \nabla\times\mathbf{A} = \frac{1}{c}\nabla\times(\mathbf{v}\varphi) = -\frac{1}{c}\mathbf{v}\times\nabla\varphi = \frac{1}{c}\mathbf{v}\times\mathbf{E}. \tag{20.8}$$

Formulae (20.5) and (20.7) show that, in contrast to the field of a charge at rest, the electric field of a moving charge does not have spherical symmetry. The scalar potential $\varphi$ has a constant value at the surface of the ellipsoid:

$$(x - vt)^2 + \left(1 - \frac{v^2}{c^2}\right)(y^2 + z^2) = \text{const}.$$

This ellipsoid is obtained from a sphere compressed in the direction of the $x$-axis by a factor $\sqrt{1 - v^2/c^2}$.

The character of the field distribution is most clearly seen from the formula

$$E = \frac{e}{r^2}\,\frac{1 - \frac{v^2}{c^2}}{\left(1 - \frac{v^2}{c^2}\sin^2\psi\right)^{3/2}}. \tag{20.9}$$

On the $x$-axis ($\psi = 0$) the field is less than the electrostatic field by a factor of $(1 - v^2/c^2)$, while in the plane perpendicular to the $x$-axis it is increased in the ratio $1/\sqrt{1 - v^2/c^2}$.

As the velocity increases the equipotential ellipsoid gets more and more oblate, while the value of the field in the direction of motion decreases and in the perpendicular direction increases. When $v \sim c$ the field is concentrated in a small angular interval near the plane perpendicular to the direction of motion. The width of this interval is $\Delta\psi \sim \sqrt{1 - v^2/c^2}$.
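The angular dependence of the field can be explored with a short numerical sketch. The function names below are illustrative, `beta` denotes $v/c$, and Gaussian units are assumed as in the text. The last function is an added sanity check, not taken from the text: averaged over all directions the ratio $E/(e/r^2)$ equals 1, which reflects the fact that the total flux through a sphere, and hence the charge, does not depend on the velocity.

```python
import math


def field_ratio(beta, psi):
    """|E| divided by the electrostatic value e/r^2, following eq. (20.9)."""
    return (1.0 - beta**2) / (1.0 - beta**2 * math.sin(psi) ** 2) ** 1.5


def fields(e, beta, dx, y, z):
    """E from eq. (20.7) and H = (v/c) x E from eq. (20.8).

    dx stands for x - v*t; the charge moves along the x-axis with v = beta*c.
    """
    d = dx * dx + (1.0 - beta**2) * (y * y + z * z)
    k = e * (1.0 - beta**2) / d**1.5
    E = (k * dx, k * y, k * z)
    # Cross product of (beta, 0, 0) with E.
    H = (0.0, -beta * E[2], beta * E[1])
    return E, H


def mean_ratio_over_sphere(beta, steps=20000):
    """Average of field_ratio over the full solid angle (midpoint rule in psi)."""
    h = math.pi / steps
    total = 0.0
    for i in range(steps):
        psi = (i + 0.5) * h
        total += field_ratio(beta, psi) * math.sin(psi) * h
    return 0.5 * total


if __name__ == "__main__":
    for beta in (0.1, 0.8, 0.99):
        print(beta, field_ratio(beta, 0.0), field_ratio(beta, math.pi / 2))
```

For `beta` close to 1 the printed values show the suppression along the direction of motion and the enhancement in the perpendicular plane discussed above.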

The magnetic field $\mathbf{H}$ is always perpendicular to the direction of motion and to the electric field vector $\mathbf{E}$. For $v \sim c$, $|\mathbf{H}| \sim |\mathbf{E}|$. At small velocities ($v \ll c$), it can be assumed that

$$\mathbf{H} = \frac{e}{c}\,\frac{\mathbf{v}\times\mathbf{r}}{r^3} \tag{20.10}$$

if it is also assumed that $\mathbf{E}$ has approximately the electrostatic value. This last formula is the same as (20.5) of Part I.

The field produced by a moving charge was found by classical methods * 
even before the appearance of the theory of relativity. There is nothing sur- 
prising in this, since the Maxwell-Lorentz equations are relativistically invari- 
ant. 

Let us apply the formulae obtained to the calculation of the force of interaction between two charges $e_1$ and $e_2$ moving with the same velocity $\mathbf{v}$ with respect to the laboratory system. In the reference frame connected with the charges the force is, obviously, equal to $\mathbf{F} = (e_1e_2/r^3)\mathbf{r}$ and is directed along the vector connecting the charges. We choose the direction of the vector $\mathbf{v}$ to be the $x$-axis.

From the point of view of the laboratory system the field of charge $e_1$ is expressed by formulae (20.8) and (20.10). The charge $e_2$ is acted upon by the Lorentz force


* See, for example, R.Becker, Electromagnetic fields and interactions, Vol. 1 (Blackie, 
London, 1964) p. 267. 




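$$\mathbf{F} = e_2\left(\mathbf{E} + \frac{1}{c}\mathbf{v}\times\mathbf{H}\right) = e_2\mathbf{E} + \frac{e_2}{c^2}\mathbf{v}\times(\mathbf{v}\times\mathbf{E}) = e_2\mathbf{E} - \frac{e_2}{c^2}v^2\mathbf{E} + \frac{e_2}{c^2}\mathbf{v}(\mathbf{v}\cdot\mathbf{E}) = e_2\left(1 - \frac{v^2}{c^2}\right)\mathbf{E} + \frac{e_2}{c^2}\mathbf{v}(\mathbf{v}\cdot\mathbf{E}).$$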


This force is no longer directed along the radius vector. The component of the force in the direction of motion is equal to

$$F_x = e_2E_x = \frac{e_1e_2\left(1 - \frac{v^2}{c^2}\right)\cos\psi}{r^2\left(1 - \frac{v^2}{c^2}\sin^2\psi\right)^{3/2}}.$$

The component of the force in the perpendicular direction is equal to

$$F_y = e_2\left(1 - \frac{v^2}{c^2}\right)E_y = \frac{e_1e_2\left(1 - \frac{v^2}{c^2}\right)^2\sin\psi}{r^2\left(1 - \frac{v^2}{c^2}\sin^2\psi\right)^{3/2}}.$$


Before the appearance of the theory of relativity it was assumed that, by 
observing the interaction between moving charges, one could determine their 
absolute velocity with respect to the ether. However, attempts at such meas- 
urements did not lead to any positive results. 

In the light of the theory of relativity the error in this reasoning is clear: only the common relative velocity $v$ of the two charges enters into the formula for the force measured in the laboratory system.
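The two force components can be compared with the Coulomb force in the rest frame of the charges. The sketch below uses illustrative function names and Gaussian units. The relations it checks, $F_x = F'_x$ and $F_y = F'_y\sqrt{1 - v^2/c^2}$, are the standard transformation law for a force and are added here as an assumption for comparison; they are not stated in the passage above.

```python
import math


def lab_force(e1, e2, x, y, beta):
    """Force components on e2 in the laboratory system.

    x, y are the laboratory separations along and across the direction of
    motion; beta = v/c. Implements the two component formulae above.
    """
    r = math.hypot(x, y)
    cos_psi, sin_psi = x / r, y / r
    denom = r**2 * (1.0 - beta**2 * sin_psi**2) ** 1.5
    f_parallel = e1 * e2 * (1.0 - beta**2) * cos_psi / denom
    f_perpendicular = e1 * e2 * (1.0 - beta**2) ** 2 * sin_psi / denom
    return f_parallel, f_perpendicular


def rest_frame_force(e1, e2, x_rest, y_rest):
    """Coulomb force components in the frame where both charges are at rest."""
    r_rest = math.hypot(x_rest, y_rest)
    return e1 * e2 * x_rest / r_rest**3, e1 * e2 * y_rest / r_rest**3


if __name__ == "__main__":
    beta = 0.6
    x_rest, y_rest = 3.0, 4.0
    contraction = math.sqrt(1.0 - beta**2)
    # The separation along the motion is Lorentz-contracted in the laboratory.
    print(lab_force(1.0, 1.0, x_rest * contraction, y_rest, beta))
    print(rest_frame_force(1.0, 1.0, x_rest, y_rest))
```

The laboratory force depends only on the common velocity of the pair, in line with the remark that no absolute velocity can be extracted from such a measurement.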

It is interesting to apply the result obtained to the motion of a uniformly 
charged sphere. In the proper reference frame moving together with the 
sphere the charge density is constant and the lines of field are normal to the 
surface. In the laboratory system the repulsion between the charges must lead 
to a non-uniform charge density distribution over the sphere. If the polar axis 
is drawn in the direction of motion, then the charge density must be largest 
at the poles of the sphere and a minimum in the equatorial plane. 

From the non-relativistic point of view, the absolute velocity of motion of the sphere could be found by measuring this charge distribution. In reality this is not the case: from the point of view of the laboratory system a moving sphere must undergo a contraction and turn into an ellipsoid. Calculation shows that the effect of contraction of the sphere compensates exactly for the effect of the charge accumulation at the poles. As a result the charge density measured in the laboratory system will be constant over the entire sphere.

This example is very interesting because it shows how the Maxwell-Lorentz theory, “correct” from the point of view of the theory of relativity, leads to false results in association with the classical notions of space and time. Only an alteration of the ideas of space and time in conjunction with the laws of electrodynamics allows one to bring the theory consistently into agreement with experiment.

In addition to the field of a uniformly moving charge it turns out to be 
possible to find the field of a charge performing an arbitrary motion. 

In §25 of Part I we found an expression for the Liénard-Wiechert poten- 
tials and pointed out that the corresponding expressions do not lose their 
applicability even at velocities close to the velocity of light. 

To convince ourselves of this, we write the Liénard-Wiechert potentials in 
the four-dimensional form. 

We introduce the 4-vector $R_\beta$ having the components $(\mathbf{r} - \mathbf{r}_0)$, $ic(t - \tau)$, where $\mathbf{r}$ is the coordinate of the observation point, $\mathbf{r}_0$ is the coordinate of the charge, and $\tau = t - |\mathbf{r} - \mathbf{r}_0|/c$. The relation between the components of the vector $R_\beta$ is given by the formula

$$R_\beta^2 = 0.$$

Indeed, substituting into it the components of $R_\beta$, we have

$$(\mathbf{r} - \mathbf{r}_0)^2 - c^2(t - \tau)^2 = 0.$$

By means of the 4-vector $R_\beta$ one can introduce the Liénard-Wiechert 4-potential by the relation

$$A_\alpha = -\frac{eu_\alpha}{R_\beta u_\beta},$$

where the summation is carried out over the index $\beta$ ($\beta = x, y, z, \tau$). Indeed, making use of the definition of the 4-velocity $u_\alpha$ (11.6), we have




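$$\begin{aligned}
R_\beta u_\beta &= R_xu_x + R_yu_y + R_zu_z + R_\tau u_\tau \\
&= \frac{(x - x_0)v_x + (y - y_0)v_y + (z - z_0)v_z}{\sqrt{1 - v^2/c^2}} + ic(t - \tau)\frac{ic}{\sqrt{1 - v^2/c^2}} \\
&= \frac{c}{\sqrt{1 - v^2/c^2}}\left[\frac{(\mathbf{r} - \mathbf{r}_0)\cdot\mathbf{v}}{c} - c(t - \tau)\right] \\
&= \frac{c}{\sqrt{1 - v^2/c^2}}\left[\frac{(\mathbf{r} - \mathbf{r}_0)\cdot\mathbf{v}}{c} - |\mathbf{r} - \mathbf{r}_0|\right] \\
&= -\frac{c}{\sqrt{1 - v^2/c^2}}\left[R - \frac{\mathbf{R}\cdot\mathbf{v}}{c}\right],
\end{aligned}$$

where, in correspondence with the notation of §25 of Part I, $\mathbf{R}(\tau) = \mathbf{r} - \mathbf{r}_0$ is the radius vector drawn from an instantaneous position of the charge to the observation point at an instant $\tau$.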

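Correspondingly, the components of $A_\alpha$ have the form

$$A_x = -\frac{eu_x}{R_\beta u_\beta} = \frac{ev_x}{\sqrt{1 - v^2/c^2}}\cdot\frac{\sqrt{1 - v^2/c^2}}{c\left[R - \frac{\mathbf{R}\cdot\mathbf{v}}{c}\right]} = \frac{ev_x}{c\left[R - \frac{\mathbf{R}\cdot\mathbf{v}}{c}\right]}, \tag{20.11}$$

with analogous expressions for $A_y$ and $A_z$;

$$A_\tau = i\varphi = \frac{iec}{\sqrt{1 - v^2/c^2}}\cdot\frac{\sqrt{1 - v^2/c^2}}{c\left[R - \frac{\mathbf{R}\cdot\mathbf{v}}{c}\right]} = \frac{ie}{R - \frac{\mathbf{R}\cdot\mathbf{v}}{c}}.$$

Hence

$$\varphi = \frac{e}{R - \frac{\mathbf{R}\cdot\mathbf{v}}{c}}. \tag{20.12}$$

Formulae (20.11) and (20.12) are identical with formulae (25.3) and (25.4) of Part I.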

Thus, the Liénard-Wiechert potentials can be written in relativistically in- 
variant form and, consequently, use can be made of them at an arbitrary value 
of the velocity of motion of the charge. 


§21. The electromagnetic field tensor and Maxwell’s equations 


We now turn to the calculation of the components of the electric and 
magnetic fields. 

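Making use of the definition of the 4-potential and introducing the coordinates of the radius-4-vector (with $\tau = ict$ and $A_\tau = i\varphi$), we can write the components of the electric field in the form

$$\begin{aligned}
E_x &= -\frac{\partial\varphi}{\partial x} - \frac{1}{c}\frac{\partial A_x}{\partial t} = i\left(\frac{\partial A_\tau}{\partial x} - \frac{\partial A_x}{\partial\tau}\right), \\
E_y &= -\frac{\partial\varphi}{\partial y} - \frac{1}{c}\frac{\partial A_y}{\partial t} = i\left(\frac{\partial A_\tau}{\partial y} - \frac{\partial A_y}{\partial\tau}\right), \\
E_z &= -\frac{\partial\varphi}{\partial z} - \frac{1}{c}\frac{\partial A_z}{\partial t} = i\left(\frac{\partial A_\tau}{\partial z} - \frac{\partial A_z}{\partial\tau}\right),
\end{aligned} \tag{21.1}$$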




while the components of the magnetic field are expressed in terms of the components of the vector potential by the usual relations

$$H_x = \frac{\partial A_z}{\partial y} - \frac{\partial A_y}{\partial z} \tag{21.2}$$

and analogously for $H_y$ and $H_z$.

The symmetry of formulae (21.1) and (21.2) suggests that we try to write the whole set in the form of one general formula. To do this we introduce a tensor $F_{\alpha\beta}$ by means of the relation

$$F_{\alpha\beta} = \frac{\partial A_\beta}{\partial x_\alpha} - \frac{\partial A_\alpha}{\partial x_\beta}. \tag{21.3}$$

The tensor $F_{\alpha\beta}$ is antisymmetric by definition. The calculation of the components of $F_{\alpha\beta}$ leads to the result

$$F_{\alpha\beta} = \begin{pmatrix}
0 & H_z & -H_y & -iE_x \\
-H_z & 0 & H_x & -iE_y \\
H_y & -H_x & 0 & -iE_z \\
iE_x & iE_y & iE_z & 0
\end{pmatrix}. \tag{21.4}$$

We see that all components of the electric and magnetic field vectors prove to be components of one tensor quantity $F_{\alpha\beta}$. It turns out that Maxwell’s equations represent a system of equations for the tensor $F_{\alpha\beta}$. That is, if we write for $F_{\alpha\beta}$ the equation

$$\frac{\partial F_{\alpha\beta}}{\partial x_\lambda} + \frac{\partial F_{\beta\lambda}}{\partial x_\alpha} + \frac{\partial F_{\lambda\alpha}}{\partial x_\beta} = 0, \tag{21.5}$$

then, assuming $\alpha$, $\beta$ and $\lambda$ to be successively equal to 1, 2, 3 and 4 and making use of the definition (21.4), we find easily that the four-dimensional equation (21.5) represents a notation of the two Maxwell-Lorentz vector equations (8.1) and (8.2) in Part I.

Analogously, writing the four-dimensional equation

$$\frac{\partial F_{\alpha\beta}}{\partial x_\beta} = \frac{4\pi}{c}\,j_\alpha, \tag{21.6}$$
Ox g c 




we can verify that it involves the two vector equations (8.3) and (8.4). Formulae (21.5) and (21.6) represent the relativistically invariant form of notation of the Maxwell-Lorentz system of equations. The indissoluble connection between the electric and magnetic fields appeared in the non-relativistic formulation of the theory of the electromagnetic field as a proposition which followed from the whole of the experimental data. In relativistic electrodynamics this connection proves to be inevitable and self-evident. From the very definition of the field it follows that its properties are characterized not by two vectors but by one antisymmetric tensor. The equations of the electromagnetic field are equations related to this tensor. The components of the fields $\mathbf{E}$ and $\mathbf{H}$ appear equivalently in the tensor $F_{\alpha\beta}$. Under Lorentz transformations, which we can now formulate, the fields $\mathbf{E}$ and $\mathbf{H}$ are expressed in terms of each other.

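Indeed, according to formulae (11.21) we have (assuming $A_{xy} = H_z$, $A_{xz} = -H_y$, $A_{yz} = H_x$, $A_{x\tau} = -iE_x$, $A_{y\tau} = -iE_y$, $A_{z\tau} = -iE_z$):

$$E_x = E'_x, \qquad E_y = \frac{E'_y + \frac{v}{c}H'_z}{\sqrt{1 - v^2/c^2}}, \qquad E_z = \frac{E'_z - \frac{v}{c}H'_y}{\sqrt{1 - v^2/c^2}}, \tag{21.7}$$

$$H_x = H'_x, \qquad H_y = \frac{H'_y - \frac{v}{c}E'_z}{\sqrt{1 - v^2/c^2}}, \qquad H_z = \frac{H'_z + \frac{v}{c}E'_y}{\sqrt{1 - v^2/c^2}}. \tag{21.8}$$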


We see that statements of the type “the field is of purely electric character” or “the field is of purely magnetic character” are relative. An electric or magnetic field can be equal to zero in one reference frame and different from zero in another. Hence it makes no sense to attribute physical reality to the electric field and magnetic field separately. It is their combination, expressed by the electromagnetic field tensor $F_{\alpha\beta}$, which is the physical reality.

It is natural to establish which quantities characterizing the electromagnetic field are invariant. Since the electromagnetic field is described by the antisymmetric tensor $F_{\alpha\beta}$, then according to the results presented in §11 these invariants are the quantities $F_{\alpha\beta}F_{\alpha\beta}$ and $F_{\alpha\beta}F_{\beta\gamma}F_{\gamma\delta}F_{\delta\alpha}$. The cubic invariant $F_{\alpha\beta}F_{\beta\gamma}F_{\gamma\alpha}$ turns out to be equal to zero. A simple calculation gives

$$I_1 = H^2 - E^2 = \text{invar}, \tag{21.9}$$

$$I_2 = (\mathbf{E}\cdot\mathbf{H})^2 = \text{invar}. \tag{21.10}$$

The invariants $I_1$ and $I_2$ are the absolute characteristics of the field. The statements “the electromagnetic field is equal to zero” ($I_1 = I_2 = 0$) or “the electric and magnetic fields have the same value and are perpendicular to each other” ($I_1 = I_2 = 0$) are examples of absolute statements.

It is also true that, if $I_2 = 0$, i.e. if the fields $\mathbf{E}$ and $\mathbf{H}$ are perpendicular to each other, then they will remain perpendicular to each other in all reference frames. If in this case $I_1 > 0$, then in all reference frames $|\mathbf{H}| > |\mathbf{E}|$. We can in such cases find a reference frame in which the electric (but not the magnetic) field is equal to zero. Analogously, if $I_1 < 0$, one can find a reference frame in which the magnetic field is absent.
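The invariance of $I_1$ and $I_2$ can be verified numerically with a short sketch. The sign convention is an assumption of this example: $K'$ moves along the $x$-axis with velocity $v = \beta c$ relative to $K$, and the function converts fields given in $K'$ to those in $K$, so that $E_y = (E'_y + \beta H'_z)/\sqrt{1 - \beta^2}$ and so on. The function names are illustrative.

```python
import math


def transform_fields(E_prime, H_prime, beta):
    """Fields in K from the fields in K' (K' moves along x with v = beta*c)."""
    g = 1.0 / math.sqrt(1.0 - beta**2)
    Ex, Ey, Ez = E_prime
    Hx, Hy, Hz = H_prime
    E = (Ex, g * (Ey + beta * Hz), g * (Ez - beta * Hy))
    H = (Hx, g * (Hy - beta * Ez), g * (Hz + beta * Ey))
    return E, H


def invariants(E, H):
    """Return (I1, I2) = (H^2 - E^2, (E.H)^2)."""
    e_squared = sum(c * c for c in E)
    h_squared = sum(c * c for c in H)
    e_dot_h = sum(a * b for a, b in zip(E, H))
    return h_squared - e_squared, e_dot_h**2


if __name__ == "__main__":
    E_prime, H_prime = (1.0, 2.0, 3.0), (-0.5, 0.25, 4.0)
    print(invariants(E_prime, H_prime))
    print(invariants(*transform_fields(E_prime, H_prime, 0.6)))

    # Perpendicular fields with I1 < 0: a frame without magnetic field exists.
    E, H = transform_fields((0.0, 1.0, 0.0), (0.0, 0.0, 0.5), -0.5)
    print(E, H)
```

In the last example the fields are perpendicular and $|\mathbf{E}| > |\mathbf{H}|$; choosing $\beta = -H'_z/E'_y$ removes the magnetic field entirely, which illustrates the statement about the case $I_1 < 0$.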

In the limiting case $v \ll c$ the Lorentz transformation formulae are essentially simplified. Disregarding $v^2/c^2$, they can be written in the form

$$\mathbf{E} = \mathbf{E}' - \frac{\mathbf{v}\times\mathbf{H}'}{c}, \qquad \mathbf{H} = \mathbf{H}' + \frac{\mathbf{v}\times\mathbf{E}'}{c}. \tag{21.11}$$

These transformations had already been obtained in non-relativistic field theory.

In conclusion we note that the relation and resemblance between the 
fields E and H does not at the same time exclude the existence of an essential 
difference between them, which we have mentioned in §7 of Part I. In 
relativistic electrodynamics this difference manifests itself in the fact that the 
components of the electric field are the temporal (imaginary) components of 
the electromagnetic field tensor, whereas the components of the magnetic 
field form the set of its spatial (real) components. 


§22. Some applications: Doppler effect, Mössbauer effect, observation of 
rapidly moving bodies, transformation of angles, intensities and cross 
sections 


We have seen that the statement “the electromagnetic field is absent at a certain point of space at a particular instant of time” has an absolute character. From this it follows that the value of the phase $\alpha$ in the electromagnetic wave

$$\mathbf{E} = \mathbf{E}_0e^{i\alpha}, \qquad \mathbf{H} = \mathbf{H}_0e^{i\alpha}$$

is an invariant. If, for example, the phase is equal to $\tfrac{1}{2}\pi$ or to an odd-integer multiple of $\tfrac{1}{2}\pi$, so that $\mathbf{E} = \mathbf{H} = 0$, then this value of the phase must be conserved in all inertial reference frames.

Formally introducing the 4-vector $k_\alpha$,

$$k_\alpha = \left(\mathbf{k},\ \frac{i\omega}{c}\right),$$

called the four-dimensional wave vector, one can write the phase of the wave in the form of the scalar product of two 4-vectors, $k_\alpha$ and $r_\alpha$,

$$\alpha = k_\alpha r_\alpha = (\mathbf{k}\cdot\mathbf{r} - \omega t).$$

Since the phase is invariant, the above formula shows that the quantity $k_\alpha$, determined formally, is clearly a 4-vector.

The law of transformation of the components of a 4-vector also allows one to find the law of transformation of frequencies in the theory of relativity. That is to say, from the definition of the four-dimensional wave vector it follows that its fourth component transforms according to the law

$$k_\tau = \frac{k'_\tau + i\frac{v}{c}k'_x}{\sqrt{1 - v^2/c^2}}.$$

Expressing $k_\tau$ in terms of the frequency $\omega$, we find

$$\omega = \frac{\omega' + vk'_x}{\sqrt{1 - v^2/c^2}}.$$

Writing $k'_x$ in the form $k'_x = (\omega'/c)\cos\theta'$, where $\cos\theta'$ is the direction cosine of the wave vector, we find

$$\omega = \frac{\omega'\left(1 + \frac{v}{c}\cos\theta'\right)}{\sqrt{1 - v^2/c^2}}. \tag{22.1}$$

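Analogously,

$$k_x = \frac{\omega}{c}\cos\theta = \frac{k'_x - i\frac{v}{c}k'_\tau}{\sqrt{1 - v^2/c^2}} = \frac{\omega'}{c}\,\frac{\cos\theta' + \frac{v}{c}}{\sqrt{1 - v^2/c^2}}. \tag{22.2}$$

Expressing $\omega$ in terms of $\omega'$, we find

$$\cos\theta = \frac{\cos\theta' + \frac{v}{c}}{1 + \frac{v}{c}\cos\theta'}, \qquad \sin\theta = \sin\theta'\,\frac{\sqrt{1 - v^2/c^2}}{1 + \frac{v}{c}\cos\theta'}. \tag{22.3}$$

The formulae for the inverse transformation from the reference frame $K$ to the reference frame $K'$ read:

$$\omega' = \frac{\omega\left(1 - \frac{v}{c}\cos\theta\right)}{\sqrt{1 - v^2/c^2}}, \tag{22.4}$$

$$\cos\theta' = \frac{\cos\theta - \frac{v}{c}}{1 - \frac{v}{c}\cos\theta}, \qquad \sin\theta' = \sin\theta\,\frac{\sqrt{1 - v^2/c^2}}{1 - \frac{v}{c}\cos\theta}. \tag{22.5}$$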


Formula (22.1) expresses the Doppler effect in the theory of relativity. As is well known, the Doppler effect consists in a change of the frequency emitted by a moving source in comparison with the frequency emitted by the same source at rest.







Formulae (22.3) and (22.5) give the law of angular transformation. The 
law of angular transformation given by these relations is the same as that 
following from (7.4). One can verify this most simply by dividing the lower 
formula in (22.3) by the upper one. 

In these formulae $\omega'$ is the frequency measured in the reference frame moving together with the source of radiation, and $\omega$ is the frequency in the reference frame at rest. It is assumed that the source moves together with the reference frame $K'$ along the $x$-axis. The angle $\theta'$ represents the angle between the direction of the emitted wave, characterized in the reference frame $K'$ by a vector $\mathbf{k}'$, and the direction of motion of the source. In the transition to the reference frame at rest the angle $\theta'$ transforms into the angle $\theta$ between the vector $\mathbf{k}$ and the direction of motion.

Formula (22.4), which can be written in the form

$$\omega = \omega'\,\frac{\sqrt{1 - v^2/c^2}}{1 - \frac{v}{c}\cos\theta}, \tag{22.6}$$

is the most important for practical use. Formula (22.6) allows one to find the frequency of light $\omega$ as a function of the frequency of light $\omega'$ emitted by the moving source in the proper reference frame $K'$ and the angle $\theta$ measured in the reference frame $K$.

If the source of light approaches or moves away from the observer who is located, for example, at the origin of the reference frame $K$, then one speaks about the longitudinal Doppler effect. For the approaching case $\cos\theta = 1$ and the frequency $\omega > \omega'$, and for the receding case $\cos\theta = -1$ and the frequency $\omega < \omega'$.
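The Doppler formulae (22.1) and (22.6) and the angular transformation (22.3) can be cross-checked numerically. This is a minimal sketch with illustrative function names; `beta` denotes $v/c$, angles are in radians, and the source is assumed to move along the $x$-axis as in the text.

```python
import math


def doppler_from_source_angle(omega_source, theta_source, beta):
    """Eq. (22.1): observed frequency in terms of the angle theta' in K'."""
    return omega_source * (1.0 + beta * math.cos(theta_source)) / math.sqrt(1.0 - beta**2)


def doppler_from_observed_angle(omega_source, theta_observed, beta):
    """Eq. (22.6): observed frequency in terms of the angle theta in K."""
    return omega_source * math.sqrt(1.0 - beta**2) / (1.0 - beta * math.cos(theta_observed))


def aberration(theta_source, beta):
    """Eq. (22.3): angle theta in K corresponding to theta' in K'."""
    denominator = 1.0 + beta * math.cos(theta_source)
    cos_theta = (math.cos(theta_source) + beta) / denominator
    sin_theta = math.sin(theta_source) * math.sqrt(1.0 - beta**2) / denominator
    return math.atan2(sin_theta, cos_theta)


if __name__ == "__main__":
    beta = 0.5
    print(doppler_from_observed_angle(1.0, 0.0, beta))          # approaching source
    print(doppler_from_observed_angle(1.0, math.pi, beta))      # receding source
    print(doppler_from_observed_angle(1.0, math.pi / 2, beta))  # observation at right angles
```

Evaluating (22.1) at an angle $\theta'$ and (22.6) at the corresponding angle $\theta$ from (22.3) gives the same frequency, which is a convenient consistency check on the signs.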

It is interesting to compare the relativistic formula for the change in the frequency in the Doppler effect with the classical one. The latter is obtained from the elementary treatment of waves emitted by the moving source and reaching the observer, and has the form

$$\omega = \omega'\left(1 + \frac{v}{c}\cos\theta\right). \tag{22.7}$$

The value of the angle was assumed in classical physics to be invariant in all reference frames.

Comparing the classical and relativistic formulae for the change in the frequency in the Doppler effect we see that the two formulae are the same when the source of radiation is moving with a velocity $v \ll c$, so that it can be assumed that

$$\sqrt{1 - \frac{v^2}{c^2}} \approx 1$$

and

$$\frac{1}{1 - \frac{v}{c}\cos\theta} \approx 1 + \frac{v}{c}\cos\theta.$$

However, if the second order terms are taken into account, then an essential difference arises between these formulae. In particular, when the source is moving in a direction perpendicular to the direction of observation, i.e. when $\theta = \tfrac{1}{2}\pi$, the classical formula (22.7) shows that no change in the frequency takes place. On the contrary, according to the theory of relativity a change in the frequency does take place (the “transverse” Doppler effect) which is, according to (22.6), equal to

$$\omega = \omega'\sqrt{1 - \frac{v^2}{c^2}}. \tag{22.8}$$


The experimental study of the Doppler effect was of particularly great importance because the change in the frequency is connected directly with the change in time in the transition from one inertial reference frame to another. The experimental investigation of the Doppler effect made it possible to confirm to a high degree of accuracy the validity of the relativistic relations.

In Ives’ experiments * the change in the frequency emitted by hydrogen atoms in canal rays was investigated. The velocity of the atoms was about $6 \times 10^{-3}\,c$. The main difficulty lay in the separation of the second order effect with respect to $(v/c)$, which is small in comparison with the ordinary (classical) line shift by an amount of the order of magnitude of $v/c$. To do this light rays emitted along and against the direction of motion were matched by means of mirrors. According to the relativistic formula (22.1) the value of the mean of the emitted frequencies of the lines is equal to


* Ives’ and Stilwell’s. See W.Pauli, Theory of relativity (Pergamon, London, 1958).


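$$\bar{\omega} = \frac{\omega_1 + \omega_2}{2} = \frac{\omega'\left(1 - \frac{v}{c}\right)}{2\sqrt{1 - v^2/c^2}} + \frac{\omega'\left(1 + \frac{v}{c}\right)}{2\sqrt{1 - v^2/c^2}} = \frac{\omega'}{\sqrt{1 - v^2/c^2}} \approx \omega'\left(1 + \frac{v^2}{2c^2}\right). \tag{22.9}$$

The classical result would be $\bar{\omega} = \omega'$.

Measurements confirmed to a high degree of accuracy the relativistic formula (22.9).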
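The size of the second order effect in (22.9) relative to the first order line shift can be estimated with a few lines of code. This is only an illustrative sketch: the function name is hypothetical, and the velocity $v = 6 \times 10^{-3}\,c$ is the value quoted in the text.

```python
import math


def mean_frequency(omega_source, beta):
    """Mean of the frequencies emitted along and against the motion, eq. (22.9)."""
    root = math.sqrt(1.0 - beta**2)
    forward = omega_source * (1.0 + beta) / root
    backward = omega_source * (1.0 - beta) / root
    return 0.5 * (forward + backward)


if __name__ == "__main__":
    beta = 6e-3                                    # velocity of the atoms in units of c
    second_order = mean_frequency(1.0, beta) - 1.0 # relative shift of the mean
    print(second_order, beta**2 / 2.0, beta)
```

The relative shift of the mean is close to $v^2/2c^2$, several hundred times smaller than the first order shift $v/c$ of each individual line, which is why the two lines had to be averaged.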

Ives’ experiments, dating from 1938, appeared to be the first direct experimental confirmation of the relativistic law of change of time under Lorentz transformations. From this point of view they played a role analogous to Michelson’s experiments. Taking these two experiments as a basis, one could have constructed the entire scheme of the theory of relativity, if both these experiments had preceded the development of the theory.

An extremely accurate measurement of the second order Doppler effect has recently become possible through the discovery of an important new phenomenon, the so-called Mössbauer effect. It is well known that $\gamma$-quanta emitted by nuclei are monochromatic to a high degree. The width of the corresponding spectral lines is smaller, by many orders of magnitude, than the natural width of the lines emitted by atoms in optical transitions. This fact was for a long time an obstacle to the observation of resonance absorption by nuclei, i.e. the absorption of $\gamma$-quanta of natural frequencies by the nuclei.

Nuclei, as well as atomic systems, absorb radiation of the same frequency which they themselves emit, very strongly. However, in contrast to emitting atomic systems, nuclei undergo a considerable recoil in emitting $\gamma$-quanta. This recoil changes the frequency of the emitted $\gamma$-quantum, which, owing to the very small width of the lines, leads to a complete shift from the resonance frequency.

Mössbauer showed that the situation is fundamentally changed if the nuclei emitting $\gamma$-quanta are in a crystal lattice. The forces of interaction between atoms and their neighbours in a crystal lattice are very large. Hence the recoil momentum acquired by the nucleus in emitting a $\gamma$-quantum is insufficient to pull the nucleus out of its position in the lattice. The recoil momentum is transferred to the crystal as a whole. * The mass of the latter is very large and the emission of a $\gamma$-quantum takes place practically without a recoil. Gamma-rays emitted by nuclei in a crystal lattice have an unshifted frequency $\nu_0$. When $\gamma$-rays emitted by a crystal emitter without recoil are passed through an absorber containing the same nuclei as the emitter a resonance absorption is observed.

Although the emission of $\gamma$-quanta takes place without a recoil and the momentum of the crystal can be assumed to be equal to zero, the presence of thermal motion leads to a small change in the frequency. Namely, since the velocity of the emitter (an atomic nucleus performing thermal motion) differs from zero, a Doppler effect must occur.

To calculate the change in the frequency use can be made of the following. 

If the nucleus emits radiation of frequency ν₀, then the energy of the
γ-quantum is equal to hν₀. The mass of the γ-quantum is equal to hν₀/c². As
a result of the emission the mass of the nucleus is reduced by this amount.
Let the mass of the nucleus before emission be equal to M. Then its kinetic
energy is

    $$ E_{\mathrm{kin}}^{(0)} = \sqrt{p^2c^2 + M^2c^4} - Mc^2 \approx \frac{p^2}{2M} . $$

In emission the mass of the nucleus is reduced by an amount hν₀/c² to

    $$ M' = M - \frac{h\nu_0}{c^2} $$

for a constant value of the momentum. Consequently the kinetic energy
changes by an amount

    $$ \Delta E_{\mathrm{kin}} = \left(\sqrt{p^2c^2 + M'^2c^4} - M'c^2\right) - E_{\mathrm{kin}}^{(0)}
       \approx \frac{p^2}{2M'} - \frac{p^2}{2M} \approx \frac{p^2}{2M^2}\,\frac{h\nu_0}{c^2} . $$

Replacing the true value p² by the mean value $M^2\overline{v^2}$ for thermal motion, we
obtain


* At least at a sufficiently low temperature. For more details about the conditions 
under which the momentum is transferred to the crystal as a whole see the articles: 
R. Mössbauer, Soviet Physics Uspekhi 3 (1961) 866; and F. L. Shapiro, Soviet Physics
Uspekhi 3 (1961) 881. 




    $$ \Delta E_{\mathrm{kin}} \approx h\nu_0\,\frac{\overline{v^2}}{2c^2} . $$

The increase in the kinetic energy of the moving emitting nucleus means
that the γ-quantum takes away an energy $h\nu_0 - \Delta E_{\mathrm{kin}} = h\nu_0\,(1 - \overline{v^2}/2c^2)$.
This energy is smaller by an amount $\Delta E_{\mathrm{kin}}$ than the energy of the γ-quantum
emitted by the source at rest.

Thus, the thermal motion of emitting nuclei leads to a frequency shift
from ν₀ to ν such that

    $$ \nu = \nu_0 \left(1 - \frac{\overline{v^2}}{2c^2}\right) . $$

This expression for the shifted frequency is the same as the formula for the
second-order Doppler effect (taking into account terms of the order of v²/c²).
The absence of the linear term in the formula for the Doppler effect is
associated with the fact that the mean velocity of thermal motion is equal to
zero.

The calculated shift of the emitted line by an amount $\nu_0\overline{v^2}/2c^2$ is in
good agreement with experimental data.
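As a rough numerical illustration of the size of this shift, the following sketch evaluates the relative shift $-\overline{v^2}/2c^2$. It assumes classical equipartition, $\overline{v^2} = 3kT/M$, which is an assumption reasonable only well above the Debye temperature. The nuclear mass of 57 atomic units and the function names are illustrative choices and do not come from the text.

```python
# Second-order Doppler (thermal) shift of a recoil-free gamma line.
# Assumption: classical equipartition, <v^2> = 3kT/M (high-temperature limit).
K_B = 1.380649e-23      # Boltzmann constant, J/K
C = 2.99792458e8        # velocity of light, m/s
AMU = 1.66053907e-27    # atomic mass unit, kg


def mean_square_velocity(temperature_k, mass_kg):
    """Equipartition estimate of the mean square thermal velocity."""
    return 3.0 * K_B * temperature_k / mass_kg


def relative_shift(temperature_k, mass_kg):
    """Relative frequency shift (nu - nu0)/nu0 = -<v^2>/(2 c^2)."""
    return -mean_square_velocity(temperature_k, mass_kg) / (2.0 * C**2)


if __name__ == "__main__":
    mass = 57 * AMU  # illustrative nucleus, mass rounded to 57 u
    print(relative_shift(300.0, mass))
```

Under these assumptions the relative shift at room temperature is of the order of 10⁻¹², which indicates how narrow the line must be for the effect to be measurable.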

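Since the mean square velocity of a molecule in a crystal depends on the
temperature, one can find the temperature dependence of the value of the
shift ν − ν₀ (written below in terms of the angular frequency, Δω = ω − ω₀):

    $$ \frac{d\,\Delta\omega}{dT} = -\frac{\omega_0}{Mc^2}\,\frac{d}{dT}\left(\frac{M\overline{v^2}}{2}\right) . $$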





But in a crystal the heat capacity at a temperature substantially higher than
the Debye temperature is equal to (see (30.1) of Part III):

    $$ C_v = \frac{d\bar{E}}{dT} = 2N\,\frac{d}{dT}\left(\frac{M\overline{v^2}}{2}\right) , $$


hence

    $$ \frac{d\,\Delta\omega}{dT} = -\frac{\omega_0}{2Mc^2N}\,C_v . $$

The accuracy of the measurements using the Mössbauer effect is so high that 
this effect has also been measured. We cannot dwell here on a number of 







We shall now consider the formulae for the transformation of some quantities
which play important roles in physics. We begin with the consideration
of the radiation emitted by a particle moving with a velocity close to the
velocity of light.

We assume that in the reference frame moving with the particle the emission
can take place at a large angle to the direction of motion, i.e. cos θ′
varies in the range

    $$ 0 < \cos\theta' < 1 . $$
Let us find the angle at which the radiation will be emitted in the motionless
(laboratory) reference frame.
It is convenient to transform with formula (22.3). Setting 


where 








For v close to c the quantity a² ≫ 1. Introducing this notation, we obtain for
a ≫ 1:

    $$ \cos\theta = \frac{\cos\theta' + 1 - \dfrac{1}{a^2}}{1 + \left(1 - \dfrac{1}{a^2}\right)\cos\theta'}
       \approx 1 + \frac{\cos\theta' - 1}{\cos\theta' + 1}\,\frac{1}{a^2}
       = 1 - \frac{\tan^2\frac{1}{2}\theta'}{a^2} . $$


* See F. L. Shapiro, Soviet Physics Uspekhi 3 (1961) 881; and also the articles by
R. Mössbauer and R. V. Pound in the same volume.




Since a² ≫ 1, cos θ is close to unity and the angle θ itself is given
approximately by

    $$ \theta \approx \frac{\sqrt{2}\,\tan\frac{1}{2}\theta'}{a} \approx \tan\tfrac{1}{2}\theta'\,\sqrt{1 - \frac{v^2}{c^2}} , $$

and is very small for a sufficiently large value of a, i.e. for a small value of the
difference c − v. This means that in the reference frame at rest the overall
radiation is directed forward and is confined to a narrow cone of angle
Δθ ≈ √(1 − v²/c²). The aperture of this cone is smaller the closer the velocity
v of the source is to the velocity of light.
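The narrowing of the cone can be checked numerically. The sketch below assumes that the transformation of angles has the usual aberration form, cos θ = (cos θ′ + v/c)/(1 + (v/c) cos θ′), and compares the exact laboratory angle with the small-angle estimate tan(θ′/2) √(1 − v²/c²). The function names and the sample velocity are illustrative.

```python
import math


def lab_angle(theta_prime, beta):
    """Exact laboratory angle from the (assumed) aberration formula."""
    cos_theta = (math.cos(theta_prime) + beta) / (1.0 + beta * math.cos(theta_prime))
    # Clamp against rounding just outside [-1, 1].
    return math.acos(max(-1.0, min(1.0, cos_theta)))


def lab_angle_estimate(theta_prime, beta):
    """Small-angle estimate: theta ~ tan(theta'/2) * sqrt(1 - beta^2)."""
    return math.tan(0.5 * theta_prime) * math.sqrt(1.0 - beta**2)


if __name__ == "__main__":
    beta = 0.999  # illustrative value of v/c
    for theta_prime in (0.25 * math.pi, 0.5 * math.pi):
        print(theta_prime, lab_angle(theta_prime, beta), lab_angle_estimate(theta_prime, beta))
```

For v/c = 0.999 radiation emitted at right angles in the frame of the particle emerges within a few hundredths of a radian of the direction of motion in the laboratory frame.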
We now obtain the formula for the transformation of an element of solid
angle dΩ, which is important for many applications. We have, by definition,

    $$ d\Omega = 2\pi\sin\theta\,d\theta = -2\pi\,d\cos\theta , \qquad d\Omega' = -2\pi\,d\cos\theta' . $$

Hence

    $$ d\Omega = \frac{d\cos\theta}{d\cos\theta'}\,d\Omega' . \qquad (22.10) $$


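From formula (22.5) we find

    $$ 1 + \frac{v}{c}\cos\theta' = \frac{1 - \dfrac{v^2}{c^2}}{1 - \dfrac{v}{c}\cos\theta} . \qquad (22.11) $$

Differentiating (22.11) with respect to cos θ, we find

    $$ \frac{d\cos\theta'}{d\cos\theta} = \frac{1 - \dfrac{v^2}{c^2}}{\left(1 - \dfrac{v}{c}\cos\theta\right)^2} ; $$

thus we finally obtain

    $$ d\Omega' = \frac{1 - \dfrac{v^2}{c^2}}{\left(1 - \dfrac{v}{c}\cos\theta\right)^2}\,d\Omega . \qquad (22.12) $$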
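A simple consistency check of formula (22.12): integrating dΩ′ over all laboratory directions must give the full solid angle 4π, whatever the velocity. The sketch below does this with a plain midpoint rule; the function names, the step count and the sample velocity are illustrative choices.

```python
import math


def solid_angle_ratio(cos_theta, beta):
    """dOmega'/dOmega according to (22.12), with beta = v/c."""
    return (1.0 - beta**2) / (1.0 - beta * cos_theta) ** 2


def total_solid_angle_in_moving_frame(beta, steps=200_000):
    """Midpoint-rule integral of dOmega' over all laboratory directions."""
    h = 2.0 / steps
    total = 0.0
    for i in range(steps):
        mu = -1.0 + (i + 0.5) * h  # mu = cos(theta)
        total += solid_angle_ratio(mu, beta) * h
    return 2.0 * math.pi * total


if __name__ == "__main__":
    print(total_solid_angle_in_moving_frame(0.9), 4.0 * math.pi)
```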




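Let us find how to transform the energy in an electromagnetic wave. From
the general relation (13.9) and formula (33.12) of Part I we have

    $$ \mathcal{E} = \frac{\mathcal{E}' + \dfrac{\mathcal{E}'}{c}\,(\mathbf{n}'\cdot\mathbf{v})}{\sqrt{1 - \dfrac{v^2}{c^2}}}
       = \mathcal{E}'\,\frac{1 + \dfrac{v}{c}\cos\theta'}{\sqrt{1 - \dfrac{v^2}{c^2}}} . \qquad (22.13) $$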


By virtue of the relation (27.10) of Part I the invariant of the
energy-momentum 4-vector for the electromagnetic field is equal to zero:

    $$ \text{invariant} = g^2 - \frac{\mathcal{E}^2}{c^2} = 0 . \qquad (22.14) $$



We note that the formulae for the energy and momentum of the field were
written in Part I as densities of the corresponding quantities. They therefore
refer to a unit volume moving with the field, i.e. with the velocity of light.
We should therefore define more precisely what such a volume means in the
theory of relativity. *

Comparing (22.13) with formula (22.1) for the transformation of frequency,
we arrive at the important equality:

    $$ \frac{\mathcal{E}}{\omega} = \text{invar} . \qquad (22.15) $$

We now find the expression for the transformation of the total power of
the radiation.


Since the total momentum emitted by the system in that reference frame 
in which it was at rest at the moment of emission is equal to zero (see (28.5) 
of Part I), one can write 


* This transformation can be found, for instance, in the book: W.Pauli, Theory of 
relativity (Pergamon, London, 1958), or in the repeatedly quoted book by R. Becker.










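    $$ \frac{d\mathcal{E}}{dt} = \frac{d\mathcal{E}'\Big/\sqrt{1 - \dfrac{v^2}{c^2}}}{dt'\Big/\sqrt{1 - \dfrac{v^2}{c^2}}} = \frac{d\mathcal{E}'}{dt'} $$

and, consequently,

    $$ I = -\frac{d\mathcal{E}}{dt} = -\frac{d\mathcal{E}'}{dt'} = \text{invar} . \qquad (22.16) $$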


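Thus, the total power radiated (the energy emitted per unit time) is an
invariant. The radiated power in a solid angle element dΩ transforms according to
the formula

    $$ dI = I(\theta)\,d\Omega = I'(\theta')\,d\Omega' = \text{invar} , $$

from which it follows that

    $$ I(\theta) = I'(\theta')\,\frac{d\Omega'}{d\Omega} = I'(\theta')\,\frac{1 - \dfrac{v^2}{c^2}}{\left(1 - \dfrac{v}{c}\cos\theta\right)^2} . \qquad (22.17) $$

Combining formulae (22.4) and (22.12) we also obtain the expression

    $$ \omega^2\,d\Omega = \omega'^2\,d\Omega' = \text{invar} . \qquad (22.18) $$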


In practice one often has to consider the law of transformation of the
quantities dk_x, dk_y, dk_z. From (22.2) we find

    $$ dk_x = \frac{dk'_x + \dfrac{v}{c^2}\,d\omega'}{\sqrt{1 - \dfrac{v^2}{c^2}}}
            = dk'_x\,\frac{1 + \dfrac{v}{c^2}\,\dfrac{d\omega'}{dk'_x}}{\sqrt{1 - \dfrac{v^2}{c^2}}} . $$

Taking into account that $\omega' = c\sqrt{k_x'^2 + k_y'^2 + k_z'^2}$, we have

    $$ \frac{d\omega'}{dk'_x} = \frac{k'_x c^2}{\omega'} , $$

so that

    $$ dk_x = \frac{dk'_x}{\omega'}\,\frac{\omega' + v k'_x}{\sqrt{1 - \dfrac{v^2}{c^2}}} = \frac{\omega}{\omega'}\,dk'_x . $$

In addition

    $$ dk_y = dk'_y , \qquad dk_z = dk'_z , $$

whence we find

    $$ \frac{dk_x\,dk_y\,dk_z}{\omega} = \frac{dk'_x\,dk'_y\,dk'_z}{\omega'} = \text{invar} , \qquad (22.19) $$

or, in spherical coordinates,

    $$ \frac{k^2\,dk\,d\Omega}{\omega} = \frac{k'^2\,dk'\,d\Omega'}{\omega'} = \text{invar} . \qquad (22.20) $$


An important problem in the contemporary physics of high-energy par- 
ticles is that of transformation of the total scattering cross section in the 
transition from one inertial reference frame to another. 

In §43 of Part I we have seen that the total cross section in classical phys- 
ics is an invariant. In the theory of relativity the cross section in general is not 
an invariant. 

According to (43.2) and (43.4) of Part I the total number of particles scattered
by a volume V in a time dt in the reference frame K in which the scatterer
is at rest is equal to

    $$ dN = \sigma I_0\,\rho V\,dt , $$

where ρ is the particle density of the scatterer, and

    $$ I_0 = n\,|\mathbf{v}_2 - \mathbf{v}_1| = n v_{\mathrm{rel}} , $$

n is the density of particles in the beam to be scattered, v₂ is the velocity
of particles to be scattered, and v₁ is the velocity of the particles of the
target, equal to zero in the reference frame K.

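In a reference frame K′ let the target move with a velocity v′₁. Then in
view of the invariance of the number of particles dN we have

    $$ dN = \sigma I_0\,\rho V\,dt = \sigma' I'_0\,\rho' V'\,dt' , $$

where

    $$ I'_0 = n'\,|\mathbf{v}'_2 - \mathbf{v}'_1| . $$

Hence it follows that

    $$ \sigma n \rho\,|\mathbf{v}_2 - \mathbf{v}_1|\,V\,dt = \sigma' n' \rho'\,|\mathbf{v}'_2 - \mathbf{v}'_1|\,V'\,dt' . $$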


From formulae (6.2) and (6.3) the product V dt is an invariant. The density
of the incident beam of particles n and the density of the particles of the
target ρ satisfy the continuity equation. In accordance with this, we can
construct the 4-vectors (ρv₁, icρ) and (nv₂, icn). We form their scalar product,
which is an invariant. We have, obviously,

    $$ (\rho\mathbf{v}_1, ic\rho)\cdot(n\mathbf{v}_2, icn) = n'\rho'\,\mathbf{v}'_1\cdot\mathbf{v}'_2 - c^2\rho' n' = -c^2\rho n , $$

since in the reference frame K the target is at rest and v₁ = 0.
On transforming we have

    $$ \rho' n' \left(1 - \frac{\mathbf{v}'_1\cdot\mathbf{v}'_2}{c^2}\right) = \rho n . $$





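Thus, we can write that

    $$ \sigma v_{\mathrm{rel}} = \frac{\sigma'\,|\mathbf{v}'_2 - \mathbf{v}'_1|}{1 - \dfrac{\mathbf{v}'_1\cdot\mathbf{v}'_2}{c^2}} . $$

Since v₁ = 0, the last formula can be written in symmetric form:

    $$ \frac{\sigma\,|\mathbf{v}_2 - \mathbf{v}_1|}{1 - \dfrac{\mathbf{v}_1\cdot\mathbf{v}_2}{c^2}}
       = \frac{\sigma'\,|\mathbf{v}'_2 - \mathbf{v}'_1|}{1 - \dfrac{\mathbf{v}'_1\cdot\mathbf{v}'_2}{c^2}} . $$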








We see that the total scattering cross section is not an invariant. 

In the particular case when the velocities of the incident particles and the 
target are in the same direction or in opposite directions we can choose the 
direction of motion as an axis and make use of the formula for the addition 
of velocities. According to the inverse of (7.1) we have 


    $$ v_2 = \frac{v'_2 - v'_1}{1 - \dfrac{v'_1 v'_2}{c^2}} . $$

Thus, in this particular case

    $$ \sigma = \sigma' = \text{invar} . \qquad (22.22) $$
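The difference between collinear and non-collinear velocities can be illustrated numerically. The sketch below evaluates the ratio σ/σ′ = |v′₂ − v′₁| / [(1 − v′₁·v′₂/c²) v_rel] that follows from the relations above. For v_rel it uses the standard invariant expression for the relative velocity of two particles, v_rel² = [(v₁ − v₂)² − (v₁ × v₂)²/c²]/(1 − v₁·v₂/c²)², which is an assumption brought in from outside this section. Units with c = 1 and all names are illustrative.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def relative_speed(v1, v2, c=1.0):
    """Speed of particle 2 in the rest frame of particle 1 (assumed standard formula)."""
    d = [x - y for x, y in zip(v1, v2)]
    cr = cross(v1, v2)
    return math.sqrt(dot(d, d) - dot(cr, cr) / c**2) / (1.0 - dot(v1, v2) / c**2)


def cross_section_ratio(v1_prime, v2_prime, c=1.0):
    """sigma / sigma' for target velocity v1' and beam velocity v2' in the frame K'."""
    d = [x - y for x, y in zip(v2_prime, v1_prime)]
    denom = (1.0 - dot(v1_prime, v2_prime) / c**2) * relative_speed(v1_prime, v2_prime, c)
    return math.sqrt(dot(d, d)) / denom


if __name__ == "__main__":
    print(cross_section_ratio((0.5, 0.0, 0.0), (-0.7, 0.0, 0.0)))  # collinear velocities
    print(cross_section_ratio((0.6, 0.0, 0.0), (0.0, 0.6, 0.0)))   # perpendicular velocities
```

With these assumptions the ratio is unity for collinear velocities and differs from unity otherwise.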


A question of interest is, what is the form in which rapidly moving objects 
appear when they are recorded on a photographic plate or when they are ob- 
served visually? In other words, will the Lorentz contraction be noticeable? 
For example, will a sphere be seen in the form of an ellipsoid, and a cube in 
the form of a parallelepiped, and so on? The corresponding investigation has 
recently been carried out by Terrell *. It turns out that rapidly moving objects
are seen not as flattened but as having undergone a rotation through some 
angle depending on the velocity of the object with respect to the observer and 
the angle of observation. In order to understand this, at first sight unex- 
pected, result, we shall define more precisely the difference between the 
results of a measurement of the form of a rapidly moving body and the spe- 
cial case of recording this form on a photographic plate. In general, to meas- 
ure the form of a body, for example by recording the radiation appearing 


* J. Terrell, Phys. Rev. 116 (1959) 1041. See also V. Weisskopf, Physics Today 13
(1960) 24.





Fig. II.6


simultaneously from different points of its surface, we have to take into 
account the finite value of the velocity of light. 

Different points of the object of observation — a moving body — are 
located at different distances from the recording device. Hence electromag- 
netic waves emitted by different points of the surface of the body traverse 
different paths for different times before arriving at the device. To obtain the 
true form of the object of observation it is necessary to introduce correspond- 
ing corrections into the measurement data. 

The situation is different in photographing or observing a body visually. 
A photographic plate (or an eye) records the radiation reaching it at a given 
instant. Consequently, a photographic plate simultaneously records electro- 
magnetic waves emitted by different points of the object of observation at 
different times. Let a photographic plate œa be at rest (fig. II.6) in a certain 
reference frame. We assume that an extended body is moving in a direction 
towards it. We denote by v the velocity of the body (and the reference frame 
K′ connected with it). The angle ϑ between a straight line drawn from the
photographic plate to the body and the velocity vector v will be called the
observation angle.

The radiation from the body can be characterized by a wave vector k′ in
the reference frame K′ connected with the body. The observation angle in the
reference frame K′ will be equal to ϑ′. The angle θ′ formed by the wave
vector k′ with the velocity v is obviously equal to θ′ = π − ϑ′.

In the reference frame K the wave vector forms an angle θ = π − ϑ with
the vector v. We assume that the solid angle subtended by the object of obser- 
vation is sufficiently small. The radiation emitted by different points of the 
object can then be characterized by one value of the wave vector k. The 
image of the body on the photographic plate obtained at a certain instant will 
be called “the picture”. Let us see how the picture is obtained, i.e. how
electromagnetic waves emitted by different points A, B, C of the surface of the
body are recorded (see fig. II.6). The waves emitted by the points A and B at
the instant when they were respectively in the positions A′ and B′ reach the
plate at the same time as the wave emitted by the point C on the surface of
the body. In other words, radiation emitted by the points A, B, C is recorded
simultaneously on the photographic plate at the points A″, B″, C″. However,
the waves coming from the more distant points A and B were emitted at
earlier times. Let us find the relation between the observation angles ϑ and ϑ′.

Taking into account the above-mentioned relation between the observation
angles ϑ, ϑ′ and the emission angles θ, θ′ and formula (22.5), we can
write





    $$ \sin\vartheta' = \frac{\sqrt{1 - \dfrac{v^2}{c^2}}\;\sin\vartheta}{1 + \dfrac{v}{c}\cos\vartheta} . \qquad (22.23) $$


In formula (22.5) we have replaced the angles θ and θ′ by π − ϑ, π − ϑ′. The
dependence ϑ′ = f(ϑ) is presented graphically in fig. II.7. The dotted line in
the same drawing shows the straight line ϑ′ = ϑ, corresponding to the case
v = 0, i.e. to the recording on the photographic plate of an object which is at
rest with respect to it. Consider first the case ϑ = π. This means that the body
to be photographed is moving directly towards the photographic plate. In this
case ϑ′ = π, i.e. we obtain the image of the front surface of the body on the
photographic plate.




Now let the angle ϑ have a value smaller than π − √(1 − v²/c²). This means
that the observation angle lies outside a narrow cone about the direction of
motion.

From formula (22.23) and the graph in fig. II.7 it is seen that for v ≈ c
there correspond to the values of the observation angle ϑ < π − √(1 − v²/c²)
values of the angle ϑ′ close to zero. To angles ϑ′ close to
zero there correspond angles θ′ close to π. This means that at velocities of
motion of the body close to the velocity of light the radiation from the back
of the body arrives at the photographic plate.

It should be stressed that this result refers to a body moving in a direction
towards the photographing device (ϑ > ½π). In this case the intensity
observed can be found from formula (22.17). Namely, we have

    $$ I(\vartheta) = I'(\vartheta')\,\frac{d\Omega'}{d\Omega} = I'(\vartheta')\,\frac{1 - \dfrac{v^2}{c^2}}{\left(1 + \dfrac{v}{c}\cos\vartheta\right)^2} . $$

If the angle ϑ < π − √(1 − v²/c²), then the observed intensity of light is small.
This result has a simple meaning: for v ≈ c in the reference frame K the
radiation is concentrated in a narrow cone with an aperture Δθ ≈ √(1 − v²/c²)
in the direction of the velocity of the body.

In general it can be said that if the object to be photographed moves with
a velocity v and is observed at a certain angle ϑ, then one obtains the same
picture as in photographing the object at rest but seen at an angle ϑ′. Thus,
the image of a moving object on a photographic plate turns out to be the
same as on a plate which is at rest with respect to the object. However, the
object turns out to have undergone a rotation through an angle ϑ′ − ϑ. Thus,
in photographing a sphere one must obtain a circle on a photographic plate,
in photographing a cube one must obtain a cube turned at a different angle,
and so on.
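The apparent rotation can be tabulated numerically. The sketch below assumes the cosine form of the aberration relation written for the observation angles, cos ϑ′ = (cos ϑ + v/c)/(1 + (v/c) cos ϑ), which is what the usual aberration formula gives after the substitution θ = π − ϑ, θ′ = π − ϑ′. The function names and sample values are illustrative.

```python
import math


def rest_frame_observation_angle(vartheta, beta):
    """Angle vartheta' at which the body would have to be viewed at rest (assumed aberration form)."""
    cos_vp = (math.cos(vartheta) + beta) / (1.0 + beta * math.cos(vartheta))
    # Clamp against rounding just outside [-1, 1].
    return math.acos(max(-1.0, min(1.0, cos_vp)))


def apparent_rotation(vartheta, beta):
    """Apparent rotation vartheta' - vartheta of the photographed body."""
    return rest_frame_observation_angle(vartheta, beta) - vartheta


if __name__ == "__main__":
    for beta in (0.0, 0.6, 0.9, 0.99):
        print(beta, apparent_rotation(0.5 * math.pi, beta))
```

At ϑ = ½π the angle ϑ′ decreases from ½π towards zero as v approaches c, so that the observer increasingly sees the rear face of the body.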

Fig. II.8 shows schematically how the image of a cube on a photographic
plate must change as the relative velocity approaches the velocity of light. To
the points A, B, C, D of the cube there correspond the points A′, B′, C′, D′ on
the photographic plate. We see how the image progressively “rotates”.

These considerations make particularly clear the lack of meaning of
the terminology “an apparent contraction” in relation to the Lorentz
contraction.
It should be noted that the entire picture becomes more complex if the 


object observed subtends a large solid angle. In this case to each point of the 


Fig. II.8


object there corresponds a proper observation angle ϑ and, consequently, also
a proper rotation. However, in this case also the image of a sphere on a
photographic plate must be a circle.





§23. The Lorentz force; the Lagrangian and the Hamiltonian for a particle 
moving in an electromagnetic field 


In electrodynamics an expression was given for the force acting on a charge
moving in an electromagnetic field. This force, called the Lorentz force, was
deduced from experimental data. We shall now present a simple derivation of
the formula for the Lorentz force, based on the transformations of the
electromagnetic field vectors.

Consider a charge e, moving with an arbitrary velocity v with respect to
the reference frame K. In the reference frame K′ moving with the charge let
there be an electric field E′. Then the charge in the reference frame K′ is
acted upon by the force

    $$ \mathbf{F}' = e\mathbf{E}' . \qquad (23.1) $$

Our problem is to find the force acting on the charge in the reference
frame K (in which it moves with the velocity v). For this it is necessary to
express the field E′ in terms of the electromagnetic field in K, and the force F′
in terms of the force F in the reference frame K. The first transformation is
given directly by the formulae of §21.

The formulae for the transformation of the force can be obtained in the
following way. We find, first of all, the formulae for the transformation of
the components of the Minkowski force (§12). The Minkowski force is a
4-vector, and its components transform according to the general formulae
(11.1)-(11.4). In our case, in the reference frame moving with the charge
the velocity v′ = 0 and, hence, $\mathcal{F}'_4 = 0$. In this case

    $$ \mathcal{F}_1 = \frac{\mathcal{F}'_1}{\sqrt{1 - \dfrac{v^2}{c^2}}} , \qquad \mathcal{F}_2 = \mathcal{F}'_2 , \qquad \mathcal{F}_3 = \mathcal{F}'_3 . $$

The components of the Minkowski force are connected with the components
of the ordinary force F by relations of the type (12.5).
In the reference frame moving with the charge v′ = 0, and

    $$ \mathcal{F}'_1 = F'_x , \qquad \mathcal{F}'_2 = F'_y , \qquad \mathcal{F}'_3 = F'_z . $$

Hence we find easily *

    $$ F_x = F'_x , \qquad F_y = F'_y\sqrt{1 - \frac{v^2}{c^2}} , \qquad F_z = F'_z\sqrt{1 - \frac{v^2}{c^2}} . \qquad (23.2) $$


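Combining (23.1), (23.2) and the transformation formulae (21.7)-(21.8), we
find the expression for the components of the force:

    $$ F_x = F'_x = eE'_x = eE_x , \qquad (23.3) $$

    $$ F_y = F'_y\sqrt{1 - \frac{v^2}{c^2}} = eE'_y\sqrt{1 - \frac{v^2}{c^2}} = e\left(E_y - \frac{v H_z}{c}\right) , \qquad (23.4) $$

    $$ F_z = F'_z\sqrt{1 - \frac{v^2}{c^2}} = eE'_z\sqrt{1 - \frac{v^2}{c^2}} = e\left(E_z + \frac{v H_y}{c}\right) . \qquad (23.5) $$

In vector form

    $$ \mathbf{F} = e\left[\mathbf{E} + \frac{1}{c}\,\mathbf{v}\times\mathbf{H}\right] . \qquad (23.6) $$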


Thus, the expression for the Lorentz force is obtained in a purely mathe- 
matical way from the general relations of the theory of relativity. 
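The chain of transformations can be verified numerically. The sketch below takes arbitrary fields E and H in K, transforms the electric field to the frame moving with the charge, applies F′ = eE′, transforms the force back, and compares the result with e[E + (1/c) v × H]. It assumes that the field transformation has the standard form E′_x = E_x, E′_y = (E_y − (v/c)H_z)/√(1 − v²/c²), E′_z = (E_z + (v/c)H_y)/√(1 − v²/c²) for motion along the x-axis. Units with c = 1 and all names are illustrative.

```python
import math


def force_via_rest_frame(e, E, H, beta):
    """Force in K obtained through the frame moving with the charge (velocity beta along x, c = 1)."""
    gamma = 1.0 / math.sqrt(1.0 - beta**2)
    Ex, Ey, Ez = E
    Hx, Hy, Hz = H
    # Assumed standard transformation of the electric field to the rest frame of the charge.
    E_rest = (Ex, gamma * (Ey - beta * Hz), gamma * (Ez + beta * Hy))
    F_rest = tuple(e * comp for comp in E_rest)       # F' = e E'
    # Back-transformation of the force: longitudinal part unchanged, transverse part divided by gamma.
    return (F_rest[0], F_rest[1] / gamma, F_rest[2] / gamma)


def force_direct(e, E, H, beta):
    """e [E + v x H] with v = (beta, 0, 0) and c = 1."""
    v = (beta, 0.0, 0.0)
    v_cross_H = (v[1] * H[2] - v[2] * H[1],
                 v[2] * H[0] - v[0] * H[2],
                 v[0] * H[1] - v[1] * H[0])
    return tuple(e * (E[i] + v_cross_H[i]) for i in range(3))


if __name__ == "__main__":
    E, H = (1.0, 2.0, -0.5), (0.3, -1.2, 0.7)
    print(force_via_rest_frame(1.0, E, H, 0.8))
    print(force_direct(1.0, E, H, 0.8))
```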

Let us write the equations of motion (12.3) for the case of the motion of a 
charged particle in an external electromagnetic field. 

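The first three equations are of the form

    $$ \frac{d\mathbf{p}}{dt} = e\left[\mathbf{E} + \frac{1}{c}\,\mathbf{v}\times\mathbf{H}\right] . \qquad (23.7) $$

In finding the fourth equation it must be taken into account that the work
done by the force in the magnetic field is equal to zero (since v · (v × H) =
= H · (v × v) = 0) and, consequently,

    $$ \frac{d\mathcal{E}}{dt} = \mathbf{F}\cdot\mathbf{v} = e\mathbf{E}\cdot\mathbf{v} . \qquad (23.8) $$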


In the following paragraphs we shall consider some particular cases of the 


* It should be stressed that the formulae presented are valid only for the transition
from the reference frame K′ moving together with the particle to the reference frame K.







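motion of a particle in an electric and magnetic field. For what follows we
shall need an expression for the Lorentz force in terms of electromagnetic
potentials.
We have

    $$ \mathbf{F} = e\left[-\nabla\varphi - \frac{1}{c}\frac{\partial\mathbf{A}}{\partial t} + \frac{1}{c}\,\mathbf{v}\times(\nabla\times\mathbf{A})\right] . $$

We make use of the formula (1.47) of vector analysis:

    $$ \nabla(\mathbf{A}\cdot\mathbf{v}) = (\mathbf{A}\cdot\nabla)\mathbf{v} + (\mathbf{v}\cdot\nabla)\mathbf{A} + \mathbf{v}\times(\nabla\times\mathbf{A}) + \mathbf{A}\times(\nabla\times\mathbf{v})
       = (\mathbf{v}\cdot\nabla)\mathbf{A} + \mathbf{v}\times(\nabla\times\mathbf{A}) . $$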
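Here we have taken into account that the differentiation with respect to
coordinates is carried out at a constant value of the velocity v.
Making use of this formula, we rewrite F in the form

    $$ \mathbf{F} = e\left[-\nabla\varphi - \frac{1}{c}\frac{\partial\mathbf{A}}{\partial t} + \frac{1}{c}\nabla(\mathbf{A}\cdot\mathbf{v}) - \frac{1}{c}(\mathbf{v}\cdot\nabla)\mathbf{A}\right]
       = -e\nabla\left(\varphi - \frac{\mathbf{A}\cdot\mathbf{v}}{c}\right) - \frac{e}{c}\frac{d\mathbf{A}}{dt} , $$

where the total derivative (see I.18) is

    $$ \frac{d\mathbf{A}}{dt} = \frac{\partial\mathbf{A}}{\partial t} + (\mathbf{v}\cdot\nabla)\mathbf{A} . $$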


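The equations of motion assume the form

    $$ \frac{d}{dt}\,\frac{m\mathbf{v}}{\sqrt{1 - \dfrac{v^2}{c^2}}} = -e\nabla\left(\varphi - \frac{\mathbf{A}\cdot\mathbf{v}}{c}\right) - \frac{e}{c}\frac{d\mathbf{A}}{dt} . \qquad (23.9) $$

These can be considered the Lagrange equations, if the Lagrangian is of the
form

    $$ L = -mc^2\sqrt{1 - \frac{v^2}{c^2}} - e\varphi + \frac{e}{c}\,\mathbf{A}\cdot\mathbf{v} . \qquad (23.10) $$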




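Indeed, in this case the generalized momentum is

    $$ \mathbf{P} = \frac{\partial L}{\partial\mathbf{v}} = \frac{m\mathbf{v}}{\sqrt{1 - \dfrac{v^2}{c^2}}} + \frac{e}{c}\,\mathbf{A} = \mathbf{p} + \frac{e}{c}\,\mathbf{A} , \qquad (23.11) $$

and, writing $\mathbf{Q} = \partial L/\partial\mathbf{r}$, the Lagrange equations read

    $$ \frac{d}{dt}\frac{\partial L}{\partial\mathbf{v}} = \frac{\partial L}{\partial\mathbf{r}} , \qquad\text{or}\qquad \frac{d\mathbf{P}}{dt} = \mathbf{Q} . \qquad (23.12) $$

The substitution of the values of P and Q into (23.12) again leads us to (23.9).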
In the non-relativistic approximation the Lagrangian assumes the form

    $$ L \approx -mc^2\left(1 - \frac{v^2}{2c^2}\right) + \frac{e}{c}\,\mathbf{A}\cdot\mathbf{v} - e\varphi = \frac{mv^2}{2} - e\varphi + \frac{e}{c}\,\mathbf{A}\cdot\mathbf{v} . \qquad (23.13) $$

Here we have dropped the constant (−mc²), since Lagrange's equations include
only the derivatives of L, and L itself is defined only up to the total
derivative with respect to time of an arbitrary function. *

Comparing the Lagrangian of a particle in an electromagnetic field with the
expression for the Lagrangian in an ordinary field of force,

    $$ L = \frac{mv^2}{2} - U , $$

we see that for the electromagnetic field the Lagrangian contains one more
term, depending on the velocity and vector potential. Hence even in the
non-relativistic approximation the Lagrangian of a particle in the electromagnetic field cannot

* L.D.Landau and E.M.Lifshitz, Mechanics (Pergamon, Oxford, 1960). 










be written in the form of the difference between the kinetic and potential 


energy. 


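Let us find the Hamiltonian of a charged particle in an electromagnetic
field. Obviously, we have

    $$ H = \sum_{i=x,y,z} \dot{q}_i P_i - L = \mathbf{v}\cdot\mathbf{P} - L
         = \mathbf{v}\cdot\left(\frac{m\mathbf{v}}{\sqrt{1 - \dfrac{v^2}{c^2}}} + \frac{e}{c}\,\mathbf{A}\right)
           + mc^2\sqrt{1 - \frac{v^2}{c^2}} + e\varphi - \frac{e}{c}\,\mathbf{A}\cdot\mathbf{v}
         = \frac{mc^2}{\sqrt{1 - \dfrac{v^2}{c^2}}} + e\varphi . \qquad (23.14) $$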

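In order to obtain the Hamiltonian the velocity v should be expressed in terms
of the generalized momentum P. This is done most simply by means of the
equations

    $$ (H - e\varphi)^2 = \frac{m^2c^4}{1 - \dfrac{v^2}{c^2}} , \qquad (23.15) $$

    $$ \frac{m^2v^2}{1 - \dfrac{v^2}{c^2}} = \left(\mathbf{P} - \frac{e}{c}\,\mathbf{A}\right)^2 , \qquad (23.16) $$

which are obtained by raising (23.14) and (23.11) to the second power.
Rewriting (23.16) in the form

    $$ \frac{m^2v^2}{1 - \dfrac{v^2}{c^2}} = \frac{m^2c^2}{1 - \dfrac{v^2}{c^2}} - m^2c^2 = \left(\mathbf{P} - \frac{e}{c}\,\mathbf{A}\right)^2 $$

and comparing with (23.15), we obtain

    $$ (H - e\varphi)^2 = m^2c^4 + \left(\mathbf{P} - \frac{e}{c}\,\mathbf{A}\right)^2 c^2 , $$

or

    $$ H = \sqrt{m^2c^4 + \left(\mathbf{P} - \frac{e}{c}\,\mathbf{A}\right)^2 c^2} + e\varphi . \qquad (23.17) $$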




As was to be expected from general considerations, the Hamiltonian in an
electromagnetic field is, essentially, the same as the Hamiltonian in an
electrostatic field: the magnetic field does not change the energy of the particle.

In the non-relativistic approximation we obtain from (23.17)

    $$ H = mc^2\sqrt{1 + \frac{\left(\mathbf{P} - \dfrac{e}{c}\,\mathbf{A}\right)^2}{m^2c^2}} + e\varphi
         \approx mc^2 + \frac{\left(\mathbf{P} - \dfrac{e}{c}\,\mathbf{A}\right)^2}{2m} + e\varphi . \qquad (23.18) $$

The last expression, if the rest energy is not taken into account, is the same as
the classical expression for the Hamiltonian of a particle in an electromagnetic
field (see (41.4) of Part I).
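The relation between (23.14), (23.17) and (23.18) can be checked numerically. The sketch below builds the generalized momentum from a given velocity, evaluates the Hamiltonian (23.17), and compares it with mc²/√(1 − v²/c²) + eφ and, for a slow particle, with the non-relativistic form (23.18). Units with c = 1, the sample values and the function names are illustrative.

```python
import math


def generalized_momentum(v, A, m, e, c=1.0):
    """P = m v / sqrt(1 - v^2/c^2) + (e/c) A, as in (23.11)."""
    gamma = 1.0 / math.sqrt(1.0 - sum(vi * vi for vi in v) / c**2)
    return [gamma * m * vi + e * Ai / c for vi, Ai in zip(v, A)]


def hamiltonian(P, A, phi, m, e, c=1.0):
    """Relativistic Hamiltonian (23.17)."""
    k2 = sum((Pi - e * Ai / c) ** 2 for Pi, Ai in zip(P, A))
    return math.sqrt(m**2 * c**4 + k2 * c**2) + e * phi


def hamiltonian_nonrelativistic(P, A, phi, m, e, c=1.0):
    """Approximate form (23.18), rest energy included."""
    k2 = sum((Pi - e * Ai / c) ** 2 for Pi, Ai in zip(P, A))
    return m * c**2 + k2 / (2.0 * m) + e * phi


if __name__ == "__main__":
    v, A, phi, m, e = (0.3, 0.4, 0.0), (0.2, -0.1, 0.5), 0.7, 2.0, 1.5
    P = generalized_momentum(v, A, m, e)
    print(hamiltonian(P, A, phi, m, e), m / math.sqrt(1.0 - 0.25) + e * phi)
```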


§ 24. The motion of particles in constant electric and magnetic fields 


The simplest case of the motion of relativistic particles is their motion in 
constant electric and magnetic fields. At the same time, the motion of 
charged particles in electric and magnetic fields is of very great practical 
interest. It is sufficient to quote some examples: the investigation of the 
motion of electrons in such fields made it possible to test to a high degree of 
accuracy the relativistic expression for momentum; the relativistic formulae 
determining the law of motion of particles in electric and magnetic fields 
represent the basis for the design of contemporary nuclear particle accelera- 
tors, allowing one to obtain particles of relativistic energies. The investigation 
of the motion of very fast particles in a Wilson cloud chamber placed in a 
magnetic field allows the determination of their momentum. The combina- 
tion of the measurements of the momentum and the energy (carried out, for 
example, on the basis of the ionization produced by the particles) makes it
possible to determine the mass of the particles.

Consider, first of all, the motion of a relativistic particle in an electric field 
E which is constant in time and uniform in space. We choose the direction of 
the vector E to be the x-axis. According to § 12, the equations of motion are 
of the form 





    $$ \frac{dp_x}{dt} = eE_x , \qquad \frac{dp_y}{dt} = 0 , \qquad \frac{dp_z}{dt} = 0 , \qquad \frac{d\mathcal{E}}{dt} = eE_x v_x . $$


As the first example we shall consider the motion of a charge in a transverse
electric field. At the time t = 0 let the charge be at the point x = 0 and
have a momentum p_x = 0, p_y = p₀, p_z = 0 and energy

    $$ \mathcal{E}_0 = \sqrt{p_0^2c^2 + m^2c^4} . $$

This means that at the initial instant the charge was moving in a direction
perpendicular to the field. The integration of the equations for the components
of the momentum in the constant electric field is carried out directly
and, taking into account the initial conditions, gives

    $$ p_x = eE_x t , \qquad p_y = p_0 , \qquad p_z = 0 . \qquad (24.1) $$

The equation for the energy is also integrated directly:

    $$ \mathcal{E} = eE_x x + \mathcal{E}_0 . $$


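On the other hand,

    $$ \mathcal{E} = \sqrt{(p_x^2 + p_y^2 + p_z^2)c^2 + m^2c^4}
       = \sqrt{\left[(eE_x t)^2 + p_0^2\right]c^2 + m^2c^4} = \sqrt{(ecE_x t)^2 + \mathcal{E}_0^2} . $$

Comparing the two expressions for the energy, we can write

    $$ \sqrt{(ecE_x t)^2 + \mathcal{E}_0^2} = eE_x x + \mathcal{E}_0 . \qquad (24.2) $$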


Formulae (24.1) show that the motion is planar and takes place in the
(xy)-plane. In order to find the trajectory one can write the relation

    $$ \frac{p_x}{p_y} = \frac{mv_x}{\sqrt{1 - v^2/c^2}}\,\frac{\sqrt{1 - v^2/c^2}}{mv_y} = \frac{v_x}{v_y} = \frac{dx}{dy} . \qquad (24.3) $$

Substituting the value of p_x/p_y from (24.1) into (24.3), we obtain

    $$ \frac{dx}{dy} = \frac{eE_x t}{p_0} . \qquad (24.4) $$

To eliminate the time we make use of formula (24.2), which gives

    $$ t = \frac{\sqrt{(eE_x x + \mathcal{E}_0)^2 - \mathcal{E}_0^2}}{ecE_x} . $$

Substituting this expression into (24.4), we arrive at the differential equation
for the trajectory:

    $$ \frac{dx}{dy} = \frac{\sqrt{(eE_x x + \mathcal{E}_0)^2 - \mathcal{E}_0^2}}{cp_0} . $$
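Integration gives the equation of the trajectory

    $$ \frac{y}{cp_0} = \int \frac{dx}{\sqrt{(eE_x x + \mathcal{E}_0)^2 - \mathcal{E}_0^2}}
       = \frac{1}{eE_x}\,\operatorname{arccosh}\frac{eE_x x + \mathcal{E}_0}{\mathcal{E}_0} , $$

or, with y measured from the initial position of the charge,

    $$ x = \frac{\mathcal{E}_0}{eE_x}\left[\cosh\frac{eE_x y}{cp_0} - 1\right] . \qquad (24.5) $$

Eq. (24.5) shows that in the transverse electric field the charge moves on a
catenary. For v ≪ c we can write

    $$ \mathcal{E}_0 \approx mc^2 , \qquad p_0 \approx mv_0 \qquad\text{and}\qquad x \approx \frac{mc^2}{eE_x}\left[\cosh\frac{eE_x y}{mv_0c} - 1\right] . \qquad (24.6) $$

If the argument of the hyperbolic cosine, containing c in the denominator, is
small, then, expanding (24.6) into a series we arrive at the classical formula
for the trajectory, which we have already found earlier (see §39 of Part I).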
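The trajectory can be checked against the time-dependent solution. In the sketch below x(t) follows from (24.2), and y(t) is obtained by integrating dy/dt = p₀c²/ℰ, which gives y = (p₀c/eE_x) arsinh(ecE_x t/ℰ₀). The points (x(t), y(t)) are then compared with the catenary x = (ℰ₀/eE_x)[cosh(eE_x y/cp₀) − 1], written with x = 0 at y = 0. The expression for y(t), the units (c = 1) and the names are assumptions of this sketch.

```python
import math


def position_at_time(t, force, p0, m, c=1.0):
    """(x, y) at time t; `force` stands for the product e*E_x."""
    energy0 = math.sqrt(p0**2 * c**2 + m**2 * c**4)
    u = force * c * t / energy0
    x = energy0 * (math.sqrt(1.0 + u * u) - 1.0) / force   # from (24.2)
    y = p0 * c / force * math.asinh(u)                      # integral of p0 c^2 / energy
    return x, y


def catenary_x(y, force, p0, m, c=1.0):
    """Trajectory x(y) in the form of a catenary passing through the origin."""
    energy0 = math.sqrt(p0**2 * c**2 + m**2 * c**4)
    return energy0 / force * (math.cosh(force * y / (c * p0)) - 1.0)


if __name__ == "__main__":
    for t in (0.5, 2.0, 10.0):
        x, y = position_at_time(t, 1.0, 1.0, 1.0)
        print(t, x, catenary_x(y, 1.0, 1.0, 1.0))
```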

The second important case is the motion of a charge in the direction of the
electric field, i.e. the case of the initial conditions

    $$ p_x = 0 , \qquad p_y = 0 , \qquad p_z = 0 , \qquad \mathcal{E}_0 = mc^2 . $$

In this case the field will be considered to be constant in time, but varying
arbitrarily in space along the x-axis.
The integration of the equations for the components of the momentum
gives

    $$ p_x = e\int_0^t E_x\,dt \quad (= eE_x t \text{ for a uniform field}) , \qquad p_y = 0 , \qquad p_z = 0 . \qquad (24.7) $$


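We write the equation for the energy in the form

    $$ \frac{d\mathcal{E}}{dt} = -e\,\frac{d\varphi}{dx}\,\frac{dx}{dt} = -e\,\frac{d\varphi}{dt} , $$

where φ is the electrostatic potential. Whence we find the integral of the
energy:

    $$ \mathcal{E} + e\varphi = \text{const} = mc^2 + e\varphi_0 , $$

or

    $$ \Delta\mathcal{E} = mc^2\left(\frac{1}{\sqrt{1 - \dfrac{v^2}{c^2}}} - 1\right) = eV , \qquad (24.8) $$

where V is the difference of potential through which the particle has passed,
and Δℰ is the corresponding change in the energy. The velocity of the particle
which has passed through the difference of potential V is equal to

    $$ v = c\,\sqrt{1 - \frac{1}{\left(1 + \dfrac{eV}{mc^2}\right)^2}} . \qquad (24.9) $$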





This expression is used in practice for the calculation of the velocity of a
particle accelerated by an electric field (fig. II.9).

For eV/mc² ≪ 1 formula (24.9) reduces to the classical expression (39.1)
of Part I. On the contrary, for eV/mc² ≫ 1 the velocity of the particle
tends to a constant limit v → c as the potential increases.
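The two limits can be seen numerically. The sketch below solves (24.8) for v/c as a function of the energy eV gained by the particle and compares it with the classical value √(2eV/mc²). The electron rest energy of about 0.511 MeV and the mass ratio 1840 for the heavy particle are used as illustrative inputs; the function names are hypothetical.

```python
import math

ELECTRON_REST_ENERGY_MEV = 0.51099895  # m_e c^2 in MeV


def beta_relativistic(gained_energy_mev, rest_energy_mev):
    """v/c from (24.8): 1/sqrt(1 - v^2/c^2) = 1 + eV/(m c^2)."""
    gamma = 1.0 + gained_energy_mev / rest_energy_mev
    return math.sqrt(1.0 - 1.0 / gamma**2)


def beta_classical(gained_energy_mev, rest_energy_mev):
    """Classical value v/c = sqrt(2 eV / (m c^2)); exceeds 1 for large eV."""
    return math.sqrt(2.0 * gained_energy_mev / rest_energy_mev)


if __name__ == "__main__":
    heavy = 1840.0 * ELECTRON_REST_ENERGY_MEV  # illustrative heavy particle
    for energy in (0.01, 1.0, 40.0):
        print(energy,
              beta_relativistic(energy, ELECTRON_REST_ENERGY_MEV),
              beta_relativistic(energy, heavy))
```

For an energy of 1 MeV the light particle is already close to the velocity of light, while the heavy particle is still well described by the classical formula.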

If the field is uniform in space, then the dependence of the velocity and 
coordinate on time is obtained directly from (24.7). Namely, from (24.7) 
we have 


    $$ v = \frac{\dfrac{eE_x t}{m}}{\sqrt{1 + \left(\dfrac{eE_x t}{mc}\right)^2}} . \qquad (24.10) $$

Fig. II.9 (panels: m = 1840 m_e and m = m_e; axes: v/c against MeV)





Formula (24.10) gives the law of uniformly accelerated motion in relativistic
mechanics. A uniformly accelerated motion in the theory of relativity is
understood to be the following. Let us introduce a number of inertial reference
frames K′₁, K′₂, K′₃, ..., moving with velocities equal to those of the
particle at different instants. Each of the reference frames K′ is
called a frame instantaneously accompanying the particle. In such a reference
frame accompanying the particle its velocity is equal to zero for an
instant.

Hence, according to (11.12) and (11.16), the components of the 4-acceleration
in K′ are equal to

    $$ w'_x = \dot{v}'_x = \dot{v}' = \frac{eE_x}{m} , \qquad w'_y = w'_z = w'_4 = 0 . $$


By virtue of (11.1) and (11.13), we have in a reference frame at rest 













Whence we find the quantity $\dot{v}$, which we shall call the acceleration:

    $$ \dot{v} = \frac{eE_x}{m}\left(1 - \frac{v^2}{c^2}\right)^{3/2} . \qquad (24.11) $$

Integrating (24.11) we again arrive at formula (24.10). Integrating (24.10) we
obtain the law of motion

    $$ x - x_0 = \frac{mc^2}{eE_x}\left[\sqrt{1 + \left(\frac{eE_x t}{mc}\right)^2} - 1\right] . \qquad (24.12) $$

This is the equation of a hyperbola. Hence in relativistic mechanics the mo- 
tion in a constant field is often called hyperbolic, as distinct from the para- 
bolic trajectory of classical mechanics. 
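The connection between (24.10), (24.11) and (24.12) can be verified numerically. The sketch below checks that the velocity is the time derivative of the displacement, that the acceleration obeys (24.11), and that for small times the displacement reduces to the classical value at²/2. Here a = eE_x/m is the acceleration in the accompanying frame; units with c = 1 and the function names are illustrative.

```python
import math


def velocity(t, a, c=1.0):
    """Velocity (24.10), with a = e*E_x/m."""
    return a * t / math.sqrt(1.0 + (a * t / c) ** 2)


def displacement(t, a, c=1.0):
    """Displacement x - x0 from (24.12)."""
    return c**2 / a * (math.sqrt(1.0 + (a * t / c) ** 2) - 1.0)


def acceleration(t, a, c=1.0):
    """Acceleration (24.11), evaluated at the velocity reached at time t."""
    return a * (1.0 - velocity(t, a, c) ** 2 / c**2) ** 1.5


if __name__ == "__main__":
    for t in (0.1, 1.0, 10.0):
        print(t, velocity(t, 1.0), displacement(t, 1.0), 0.5 * t * t)
```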

Finally, consider the motion of a particle in a constant and uniform mag- 
netic field H. We choose the direction of the latter to be the z-axis. Then the 
equations of motion take the form 

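    $$ \frac{dp_x}{dt} = \frac{e}{c}\,v_y H , \qquad \frac{dp_y}{dt} = -\frac{e}{c}\,v_x H , \qquad \frac{dp_z}{dt} = 0 , \qquad \frac{d\mathcal{E}}{dt} = 0 . \qquad (24.13) $$

Integrating the equation for energy, we have

    $$ \mathcal{E} = \text{const} = \mathcal{E}_0 . $$

By means of (13.2) we obtain

    $$ \frac{dp_x}{dt} = \frac{d}{dt}\left(\frac{\mathcal{E}}{c^2}\,v_x\right) = \frac{\mathcal{E}_0}{c^2}\,\frac{dv_x}{dt} , \qquad
       \frac{dp_y}{dt} = \frac{\mathcal{E}_0}{c^2}\,\frac{dv_y}{dt} . $$

Hence the equations for the x and y components of the momentum can be
written in the form

    $$ \frac{1}{v_y}\frac{dv_x}{dt} = \frac{ecH}{\mathcal{E}_0} , \qquad \frac{1}{v_x}\frac{dv_y}{dt} = -\frac{ecH}{\mathcal{E}_0} . \qquad (24.14) $$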


Eqs. (24.14) can be satisfied by setting

    $$ v_x = a\cos(\omega t + \alpha) , \qquad v_y = -a\sin(\omega t + \alpha) . \qquad (24.15) $$

For ω we find

    $$ \omega = \frac{ecH}{\mathcal{E}_0} . \qquad (24.16) $$

From (24.15)

    $$ v_x^2 + v_y^2 = a^2 = (v_\perp)_0^2 = \text{const} . $$

The quantity $(v_\perp)_0$, which represents the initial velocity of motion of a
particle in the (xy)-plane, remains constant in time.

The second integration leads to the equation of the trajectory in the
(xy)-plane

    $$ x = x_0 + \frac{(v_\perp)_0}{\omega}\sin(\omega t + \alpha) , \qquad y = y_0 + \frac{(v_\perp)_0}{\omega}\cos(\omega t + \alpha) . $$


The particle moves with a constant velocity along the z-axis, as is seen from 
(24.13): 


z= Zo + (v,)ot. 


If, in particular, at the initial instant the charge had no velocity along the 
z-axis, i.e. (v,)o = 0, then its trajectory represents a circle in the plane (xy) 
the radius of which is equal to 


R= <a OEE GER? (24.17) 


where pọ is the initial momentum. 

The frequency $\omega$ of rotation in the circle is proportional to the strength of
the magnetic field and depends on the energy $E_0$ which remains constant
during the motion. At small energies

$$E_0 \approx mc^2 \quad\text{and}\quad \omega \approx \frac{eH}{mc}, \tag{24.18}$$

i.e. $\omega$ reduces to the cyclotron frequency determined in §39 of Part I, which
does not depend on the energy of the particle.
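Formulae (24.16) to (24.18) are easy to evaluate numerically. The following sketch works in Gaussian units; the helper names and the choice of a proton as sample particle are assumptions made for illustration only.

```python
import math

C = 2.99792458e10       # speed of light, cm/s
E_CHARGE = 4.80320e-10  # elementary charge, esu
M_P = 1.67262e-24       # proton mass, g (sample particle)

def total_energy(p, mass):
    """Relativistic energy E0 = sqrt((p c)^2 + (m c^2)^2)."""
    return math.sqrt((p * C) ** 2 + (mass * C**2) ** 2)

def gyration_frequency(H, energy, charge=E_CHARGE):
    """omega = e c H / E0, formula (24.16)."""
    return charge * C * H / energy

def orbit_radius(p, H, charge=E_CHARGE):
    """R = c p0 / (e H), formula (24.17)."""
    return C * p / (charge * H)

def cyclotron_frequency(H, mass, charge=E_CHARGE):
    """Low-energy limit omega = e H / (m c), formula (24.18)."""
    return charge * H / (mass * C)
```

For a momentum p = mc the energy is √2 mc², so the rotation frequency has already dropped to 1/√2 of the cyclotron value; this is the frequency shift an accelerating field has to follow.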







Formulae (24.16) and (24.17) are of very great importance in contempo- 
rary nuclear physics and technology. 

The measurement of the deflection of particles in a magnetic field allows 
one to find their momentum by means of formula (24.17). 

Formula (24.16) is the basis for the design of contemporary cyclotrons, 
which allow one to obtain heavy particles (protons, deuterons and α-particles)
with relativistic velocities. 

As is well known, in the cyclotron particles move in circles in a magnetic 
field between two dees to which an alternating voltage is applied. Particles 
pass through the gap between the dees in an accelerating electric field. Thus, 
after describing a semicircle with a constant velocity the particle is acceler- 
ated and describes the next semicircle with a new value of the velocity, and so 
on. For continuous acceleration the field in the accelerating gap must be in a
definite phase at the instant when the particle enters it. So long as the
particles do not acquire relativistic velocities the frequency of the electric field
applied to the dees is determined by formula (24.18) and does not depend
on the energy of the particle. If, however, the velocity of the particles attains 
relativistic values, then, according to (24.16), the frequency of their rotation 
turns out to vary with energy. Hence, for a further acceleration of the par- 
ticles, it is necessary to change the frequency of the accelerating field in 
correspondence with formula (24.16). Cyclotrons operating in this varying 
frequency mode are called relativistic cyclotrons or synchrotrons. 

The design of synchrotrons was first suggested in 1944 by V.I. Veksler, who 
showed that owing to the particular properties of the motion of charged 
particles (the so-called phase stability), the synchrotron accelerates particles 
arriving at the accelerator chamber with different initial phases of motion. 

The acceleration of light particles — e.g. electrons — takes place in the 
relativistic mode at relatively small energies. 

One of the most important types of accelerators is the induction acceler- 
ator or the betatron. In the betatron electrons move in a magnetic field with 
axial symmetry between the poles of an electromagnet. If the magnetic field
were constant in time, then the motion of an electron with constant velocity
would take place on a circle of radius R given by formula (24.17). In the
betatron, however, the strength of the magnetic field varies in time. For
concreteness we shall assume that the strength of the magnetic field increases in time.
This variation of the strength of the magnetic field entails: 

1) The appearance of an induced electric field with a strength E
determined by the formula

$$\oint \mathbf{E}\cdot d\mathbf{l} = -\frac{1}{c}\frac{d}{dt}\int \mathbf{H}\cdot d\mathbf{S},$$

or

$$E_\varphi = -\frac{1}{2\pi R c}\frac{d\Phi}{dt},$$

where $\Phi = \int \mathbf{H}\cdot d\mathbf{S}$ is the flux of induction through the area of a circular orbit
of radius R. It is clear that, because of the axial symmetry of the magnetic 
field, the electric field is directed along a tangent to the circle of radius R. 
Consequently, the electron will be acted upon by a force $(-eE_\varphi)$, which is
also directed along a tangent to this circle. According to (24.17), the increase 
in the momentum of the electron will correspond to an increase in the radius 
of the orbit, i.e. to a tendency of the electron to move outwards on a spiral. 

2) A decrease in the radius of the circle in (24.17) as the strength H in- 
creases. This corresponds to a tendency of the electron to move inwards on a 
spiral. 

If the rate of change of H in time (and, as will be seen from what follows, in
space) is chosen in such a way that the two tendencies exactly compensate
each other, the electron will move on a circle of constant radius R with
increasing momentum. This is the so-called betatron mode of the motion of a 
particle. 

We now consider the conditions under which such a mode will occur. 

The change of the momentum of the electron can be written in the form 


$$\frac{dp}{dt} = -eE_\varphi = \frac{e}{2\pi c R}\frac{d\Phi}{dt}. \tag{24.19}$$


If it is assumed that the electron moves on a circle of constant radius, then on 
integrating (24.19) we obtain 


$$p = \frac{e\Phi}{2\pi c R}.$$

The integration constant is set equal to zero under the assumption that the
conditions H = 0 and v = 0 hold at the instant t = 0.


If p is substituted into (24.17), then the condition of constancy of the 
radius of the orbit in time can be written in the form 


$$H_{\text{orb}} = \frac{cp}{eR} = \frac{\Phi}{2\pi R^2}. \tag{24.20}$$


The flux of magnetic induction through the area of the orbit is equal to

$$\Phi = \pi R^2 \bar{H},$$

where $\bar{H}$ is the mean field inside the orbit.
Then the condition (24.20) reduces to the equality

$$H_{\text{orb}} = \tfrac{1}{2}\bar{H}.$$

This indicates that for the motion of an accelerated electron on a circular 
orbit it is necessary to produce a magnetic field not only varying in time but 
also non-uniform in space. The field on the orbit must be equal to one half 
of the mean strength inside the orbit. For this requirement to be fulfilled the 
field must decrease with increasing radius r. It is clear that the acceleration of 
particles in the betatron is of intermittent character — it takes place only when 
the magnetic field increases in time. We cannot consider here the problems of 
the stability of the motion of particles in a betatron orbit and the details of 
the design of actual accelerators. * 
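The requirement that the field on the orbit equal one half of the mean field inside it can be tested for any assumed radial profile H(r). The sketch below averages a profile over the disc r ≤ R by the midpoint rule; the linear profile used as an example is hypothetical and was chosen only because it satisfies the condition exactly.

```python
def mean_field_inside(profile, R, steps=100000):
    """Mean of H(r) over the disc r <= R: (2 / R^2) * integral of H(r) r dr."""
    dr = R / steps
    total = 0.0
    for i in range(steps):
        r = (i + 0.5) * dr          # midpoint of the i-th ring
        total += profile(r) * r * dr
    return 2.0 * total / R**2

def betatron_ratio(profile, R):
    """Field on the orbit divided by the mean field inside the orbit."""
    return profile(R) / mean_field_inside(profile, R)

def linear_profile(r, R=1.0, H_centre=1.0):
    """Hypothetical field falling linearly from H_centre to H_centre/4 at r = R."""
    return H_centre * (1.0 - 0.75 * r / R)

def uniform_profile(r):
    """Uniform field, for comparison."""
    return 1.0
```

A uniform field gives a ratio of 1 and therefore cannot hold the orbit radius fixed, whereas the decreasing profile gives 1/2, in line with the statement that the field must fall off with increasing r.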


§25. A system of weakly interacting charged particles 


We can now return to a system of particles in relativistic mechanics and 
consider the case mentioned in §15: that of particles connected by the elec- 
tromagnetic interaction. ** 

We have already shown that a potential energy of interaction cannot be
introduced for a system of interacting particles, since the retarded interaction 
depends not on the relative position of the particles at a given instant but on 
their motion during the preceding time. Moreover, in an accelerated motion 
charged particles emit radiation and a part of the energy leaves the system, so 
that the system as a whole is non-conservative. 

However, it turns out that if the motion of the particles is sufficiently 
slow, so that $v \ll c$, then to a certain approximation one can introduce the


* See, for example: A.P.Grinberg, Metody uskoreniya zaryazhonnykh chastits (Meth- 
ods of accelerating charged particles) (Gostekhizdat, Moscow, 1960); M.S.Livingston and 
J.P.Blewett, Particle accelerators (McGraw-Hill, New York, 1962). 

** V.A.Fock, The theory of space, time and gravitation (Pergamon, London, 1964);
L.D.Landau and E.M.Lifshitz, The classical theory of fields (Pergamon, London, 1962).







notion of the interaction of a system of charges which depends only on their 
mutual separations. This allows one to characterize the state of a system of 
moving charges by means of mechanical quantities and to consider the motion 
of the system, according to the laws of mechanics without a direct connection 
with the state of the electromagnetic field. 

Indeed, if the charges are moving with small velocities, so that the retarda- 
tion can be completely disregarded, then their energy of interaction is ex- 
pressed by the formula of electrostatics: 


$$U = \sum_{i<k} \frac{e_i e_k}{r_{ik}}, \tag{25.1}$$

where $r_{ik}$ is the fixed distance between the charges i and k. For an arbitrarily
chosen kth charge the Lagrangian can be written in the form


$$L_k = \tfrac{1}{2} m_k v_k^2 - e_k \varphi_k, \tag{25.2}$$

where $\varphi_k$ is the potential of the field acting on the charge k.
The Lagrangian of a system of charges is obtained by a simple summation:

$$L = \sum_k L_k = \sum_k \left(\tfrac{1}{2} m_k v_k^2 - e_k \varphi_k\right). \tag{25.3}$$


Knowing the Lagrangian, one can find the equations of motion of the system. 

Further calculations will show also that in successive approximations of 
the expansion in powers of the ratio v/c, up to terms of the order of $v^2/c^2$,
the Lagrangian of the system can be found. This will also allow us to carry 
out the programme of a purely mechanical description of a system of charges 
previously mentioned. 

At the same time it is clear that, if terms of the order of $(v/c)^3$ are not
discarded, then such a description will, generally speaking, become
impossible. Indeed, according to the results of §27 of Part I, the dipole radiation
of a system is determined by a quantity of the order of $1/c^3$. The retention
of the higher terms of the expansion of potentials corresponds to taking 
account of the dipole radiation. This makes the mechanical approach to the 
treatment of the state of the system illegitimate. 

In systems in which there is no dipole radiation the expansion of the 
potentials can be carried out up to terms of a low order of magnitude as 
follows. We write the Lagrangian of the kth particle taking into account the 
motion of the charges in the system (see eq. (23.10)):




$$L_k = -m_k c^2 \sqrt{1-\frac{v_k^2}{c^2}} - e_k \varphi_k + \frac{e_k}{c}\,(\mathbf{v}_k \cdot \mathbf{A}_k), \tag{25.4}$$

where $\varphi_k$ and $\mathbf{A}_k$ are the potentials of the field at that point where the kth
charge is located. These potentials, taking into account the finiteness of the
velocity of propagation of interaction, can be written in the form of retarded
potentials (see §24 of Part I):


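$$\varphi(\mathbf{r}_k, t) = \int \frac{\rho\left(\mathbf{r}',\, t - \frac{|\mathbf{r}_k-\mathbf{r}'|}{c}\right)}{|\mathbf{r}_k-\mathbf{r}'|}\, dV',$$

$$\mathbf{A}(\mathbf{r}_k, t) = \frac{1}{c}\int \frac{\mathbf{j}\left(\mathbf{r}',\, t - \frac{|\mathbf{r}_k-\mathbf{r}'|}{c}\right)}{|\mathbf{r}_k-\mathbf{r}'|}\, dV'.$$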


For a slow motion (v << c) the charge density and current density can be 
expanded in a series of powers of the delay time. An important difference of 
this expansion from the analogous formulae of §26 of Part I lies in the fact 
that we are interested in potentials at a point $\mathbf{r}_k$ located within the limits of
the system of charges considered. Hence the total delay time cannot be
resolved into the proper retardation and the retardation of the system, and the
expansion must be carried out with respect to the total retardation:

$$\tau = \frac{|\mathbf{r}_k-\mathbf{r}'|}{c}.$$


We then have 


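$$\rho\left(\mathbf{r}',\, t - \frac{|\mathbf{r}_k-\mathbf{r}'|}{c}\right) = \rho(\mathbf{r}', t) - \frac{|\mathbf{r}_k-\mathbf{r}'|}{c}\frac{\partial \rho(\mathbf{r}', t)}{\partial t} + \frac{|\mathbf{r}_k-\mathbf{r}'|^2}{2c^2}\frac{\partial^2 \rho(\mathbf{r}', t)}{\partial t^2} + \dots,$$

$$\mathbf{j}\left(\mathbf{r}',\, t - \frac{|\mathbf{r}_k-\mathbf{r}'|}{c}\right) = \mathbf{j}(\mathbf{r}', t) + O(1/c).$$

Terms O(1/c) are neglected because of the factors 1/c in $\mathbf{A}(\mathbf{r}_k, t)$ above and in
eq. (25.4). Hence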




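$$\begin{aligned}
\varphi(\mathbf{r}_k, t) &\approx \int \frac{\rho(\mathbf{r}', t)}{|\mathbf{r}_k-\mathbf{r}'|}\, dV' - \frac{1}{c}\int \frac{\partial \rho(\mathbf{r}', t)}{\partial t}\, dV' + \frac{1}{2c^2}\int |\mathbf{r}_k-\mathbf{r}'|\, \frac{\partial^2 \rho(\mathbf{r}', t)}{\partial t^2}\, dV' = \\
&= \int \frac{\rho(\mathbf{r}', t)}{|\mathbf{r}_k-\mathbf{r}'|}\, dV' - \frac{1}{c}\frac{\partial}{\partial t}\int \rho(\mathbf{r}', t)\, dV' + \frac{1}{2c^2}\frac{\partial^2}{\partial t^2}\int |\mathbf{r}_k-\mathbf{r}'|\, \rho(\mathbf{r}', t)\, dV'.
\end{aligned}$$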


The order of the differentiation and integration can be changed, since the
vector $\mathbf{r}_k$ is fixed, and $\mathbf{r}'$ is a set of three independent variables.

Since the integral in the second term is taken at a time t, it represents the
total charge of the system. Correspondingly,

$$\frac{\partial}{\partial t}\int \rho(\mathbf{r}', t)\, dV' = 0.$$


Finally, we find 


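$$\varphi(\mathbf{r}_k, t) = \int \frac{\rho(\mathbf{r}', t)}{|\mathbf{r}_k-\mathbf{r}'|}\, dV' + \frac{1}{2c^2}\frac{\partial^2}{\partial t^2}\int |\mathbf{r}_k-\mathbf{r}'|\, \rho(\mathbf{r}', t)\, dV', \tag{25.5}$$

$$\mathbf{A}(\mathbf{r}_k, t) = \frac{1}{c}\int \frac{\mathbf{j}(\mathbf{r}', t)}{|\mathbf{r}_k-\mathbf{r}'|}\, dV'. \tag{25.6}$$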


Further calculations become clearer if one considers point charges. Let

$$\rho(\mathbf{r}', t) = {\sum_i}' e_i\, \delta(\mathbf{r}' - \mathbf{r}_i(t)). \tag{25.7}$$

The prime on the summation sign indicates that in the sum the term i = k is
absent since we do not take into account the field due to the kth charge itself.
In what follows we shall omit the prime on the sum so as not to encumber
the formulae, but shall imply it in all summations over charges.

Substituting the expression (25.7) for $\rho(\mathbf{r}', t)$ into (25.5), we have





$$\varphi(\mathbf{r}_k, t) = \sum_i \varphi_i = \sum_i \frac{e_i}{|\mathbf{r}_k-\mathbf{r}_i(t)|} + \frac{1}{2c^2}\sum_i e_i \frac{\partial^2}{\partial t^2}|\mathbf{r}_k-\mathbf{r}_i(t)|. \tag{25.8}$$





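We stress that as a result of integration (elimination of the δ-function) the
radius vector of the ith charge $\mathbf{r}_i(t)$, depending explicitly on time, appears
instead of $\mathbf{r}'$ in the corresponding expressions. Hence the result of integration
with respect to the variables $\mathbf{r}'$ must be differentiated with respect to time.
Analogously, assuming

$$\mathbf{j}(\mathbf{r}', t) = \sum_i e_i \mathbf{v}_i\, \delta(\mathbf{r}' - \mathbf{r}_i(t)), \tag{25.9}$$

we have from (25.6)

$$\mathbf{A}(\mathbf{r}_k, t) = \sum_i \mathbf{A}_i = \sum_i \frac{e_i \mathbf{v}_i}{c\,|\mathbf{r}_k-\mathbf{r}_i|}. \tag{25.10}$$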


Here $\varphi_i$ and $\mathbf{A}_i$ are the potentials produced at an instant t at a point $\mathbf{r}_k$ by the
ith charge.

The expression (25.8) for the scalar potential taking into account the
retardation (the second term) contains the second derivative with respect to
time of the vector $\mathbf{r}_i$, i.e. the acceleration of the ith particle producing a field
at the point $\mathbf{r}_k$. However, only the coordinates and velocities of the particles
can enter into the Lagrangian. Hence it is advisable to carry out the gauge
transformation (see §11 of Part I) and to choose the function $\psi$ in such a
way that the second term will be absent in the scalar potential. That is to say,
assuming

$$\varphi_i \to \varphi_i' = \varphi_i - \frac{1}{c}\frac{\partial \psi_i}{\partial t}, \qquad \mathbf{A}_i \to \mathbf{A}_i' = \mathbf{A}_i + \nabla \psi_i,$$

where, using (I.12'),

$$\psi_i = \frac{e_i}{2c}\frac{\partial}{\partial t}|\mathbf{r}_k-\mathbf{r}_i(t)| = \frac{e_i}{2c}\,\nabla_{\mathbf{r}_i}|\mathbf{r}_k-\mathbf{r}_i| \cdot \frac{\partial \mathbf{r}_i}{\partial t} = -\frac{e_i}{2c}\frac{(\mathbf{r}_k-\mathbf{r}_i)\cdot \mathbf{v}_i}{|\mathbf{r}_k-\mathbf{r}_i|}, \tag{25.11}$$


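we find from (25.8)

$$\varphi_i' = \frac{e_i}{|\mathbf{r}_k-\mathbf{r}_i|}. \tag{25.12}$$

In differentiating with respect to time, the position of the observation point
$\mathbf{r}_k$ is fixed, the gradient is directed from the charge i to the observation point,
and $\mathbf{v}_i$ denotes the velocity of motion of the ith charge at an instant t.
Correspondingly, for $\mathbf{A}'$ we obtain

$$\mathbf{A}_i' = \mathbf{A}_i + \nabla \psi_i = \frac{e_i \mathbf{v}_i}{c\,|\mathbf{r}_k-\mathbf{r}_i|} - \frac{e_i}{2c}\,\nabla_k \frac{(\mathbf{r}_k-\mathbf{r}_i)\cdot \mathbf{v}_i}{|\mathbf{r}_k-\mathbf{r}_i|}. \tag{25.13}$$

According to the formula

$$\nabla_k \frac{(\mathbf{r}_k-\mathbf{r}_i)\cdot \mathbf{a}}{|\mathbf{r}_k-\mathbf{r}_i|} = \frac{\mathbf{a}}{|\mathbf{r}_k-\mathbf{r}_i|} - \frac{(\mathbf{r}_k-\mathbf{r}_i)\cdot \mathbf{a}}{|\mathbf{r}_k-\mathbf{r}_i|^3}\,(\mathbf{r}_k-\mathbf{r}_i),$$

where $\mathbf{a}$ is a constant vector, we find

$$\nabla_k \frac{(\mathbf{r}_k-\mathbf{r}_i)\cdot \mathbf{v}_i}{|\mathbf{r}_k-\mathbf{r}_i|} = \frac{\mathbf{v}_i}{|\mathbf{r}_k-\mathbf{r}_i|} - \frac{[(\mathbf{r}_k-\mathbf{r}_i)\cdot \mathbf{v}_i]\,(\mathbf{r}_k-\mathbf{r}_i)}{|\mathbf{r}_k-\mathbf{r}_i|^3}.$$

Hence for the total vector potential we obtain

$$\mathbf{A}'(\mathbf{r}_k, t) = \sum_i \left\{ \frac{e_i \mathbf{v}_i}{2c\,|\mathbf{r}_k-\mathbf{r}_i|} + \frac{e_i\,[(\mathbf{r}_k-\mathbf{r}_i)\cdot \mathbf{v}_i]\,(\mathbf{r}_k-\mathbf{r}_i)}{2c\,|\mathbf{r}_k-\mathbf{r}_i|^3} \right\}. \tag{25.14}$$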
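The gradient formula used in passing from (25.13) to (25.14) can be verified by finite differences. In the sketch below R stands for the difference of the radius vectors of the observation point and the charge, and a for the constant vector; all function names and sample values are illustrative assumptions.

```python
import math

def dot(a, b):
    """Scalar product of two 3-vectors given as sequences."""
    return sum(x * y for x, y in zip(a, b))

def scalar(R, a):
    """The differentiated function (R . a) / |R|."""
    return dot(R, a) / math.sqrt(dot(R, R))

def gradient_formula(R, a):
    """Closed form: a / |R| - (R . a) R / |R|^3."""
    d = math.sqrt(dot(R, R))
    s = dot(R, a)
    return [ai / d - s * Ri / d**3 for ai, Ri in zip(a, R)]

def gradient_numeric(R, a, h=1e-6):
    """Central-difference gradient with respect to the components of R."""
    grad = []
    for n in range(3):
        plus, minus = list(R), list(R)
        plus[n] += h
        minus[n] -= h
        grad.append((scalar(plus, a) - scalar(minus, a)) / (2.0 * h))
    return grad
```

Comparing the two gradients component by component for an arbitrary R and a reproduces the closed form to within the accuracy of the difference quotient.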


The expressions (25.12) and (25.14) which have been found for $\varphi'$ and $\mathbf{A}'$
must be substituted into the Lagrangian (25.4) of the kth particle. One must
first expand the first term in it in a series in powers of $v^2/c^2$ and retain
terms of the same low order of magnitude as the terms retained in $\varphi'$ and $\mathbf{A}'$.
This gives

$$m_k c^2 \sqrt{1-\frac{v_k^2}{c^2}} \approx m_k c^2 \left(1 - \frac{v_k^2}{2c^2} - \frac{v_k^4}{8c^4}\right). \tag{25.15}$$





As a result we find 








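$$\begin{aligned}
L_k &= -m_k c^2 + \frac{m_k v_k^2}{2} + \frac{m_k v_k^4}{8c^2} - e_k \varphi' + \frac{e_k}{c}\,(\mathbf{v}_k \cdot \mathbf{A}') = \\
&= -m_k c^2 + \frac{m_k v_k^2}{2} + \frac{m_k v_k^4}{8c^2} - \sum_i \left\{ \frac{e_i e_k}{|\mathbf{r}_k-\mathbf{r}_i|} - \frac{1}{2c^2}\frac{e_i e_k\, \mathbf{v}_i \cdot \mathbf{v}_k}{|\mathbf{r}_k-\mathbf{r}_i|} - \frac{e_i e_k\, [(\mathbf{r}_k-\mathbf{r}_i)\cdot \mathbf{v}_i]\,[(\mathbf{r}_k-\mathbf{r}_i)\cdot \mathbf{v}_k]}{2c^2\,|\mathbf{r}_k-\mathbf{r}_i|^3} \right\}.
\end{aligned} \tag{25.16}$$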


The first term of the Lagrangian L, refers to the particle at rest, the second 
term has the meaning of the kinetic energy in the approximation of classical 
mechanics, and the third term has the meaning of the relativistic correction 
to the kinetic energy. The term in the brackets depends only on the instan- 
taneous positions and velocities of the particles. 

The quantity 


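$$U_{ik} = \frac{e_i e_k}{|\mathbf{r}_k-\mathbf{r}_i|} - \frac{1}{2c^2}\frac{e_i e_k\, \mathbf{v}_i \cdot \mathbf{v}_k}{|\mathbf{r}_k-\mathbf{r}_i|} - \frac{e_i e_k\, [(\mathbf{r}_k-\mathbf{r}_i)\cdot \mathbf{v}_i]\,[(\mathbf{r}_k-\mathbf{r}_i)\cdot \mathbf{v}_k]}{2c^2\,|\mathbf{r}_k-\mathbf{r}_i|^3}, \tag{25.17}$$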


depending on the distance between the ith and kth particle at a given instant 
can be considered as the generalized energy of interaction between the par- 
ticles. The first term has an obvious meaning: it is the potential energy of 
interaction between two charges, the ith one and the kth one, at rest. The 
other two terms, proportional to $v_i v_k/c^2$, represent the correction to the
energy of interaction taking into account the motion of the charges and the
retardation. However, it is clear that, although $U_{ik}$ is the energy of
interaction, it does not have the meaning of a potential energy depending only on
the position of the particles.
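As an illustration of (25.17), the sketch below evaluates the generalized interaction energy of one pair of charges. The signs follow from substituting the gauge-transformed potentials into the Lagrangian (25.4); the function name and the sample configuration are hypothetical, and units with c = 1 are used in the example only for convenience.

```python
import math

def dot(a, b):
    """Scalar product of two 3-vectors given as sequences."""
    return sum(x * y for x, y in zip(a, b))

def pair_interaction_energy(e_i, e_k, r_i, r_k, v_i, v_k, c):
    """Generalized energy of interaction U_ik of two moving charges.

    First term: Coulomb energy of charges at rest.
    Second and third terms: corrections of order v_i v_k / c^2.
    """
    R = [xk - xi for xk, xi in zip(r_k, r_i)]   # vector from charge i to charge k
    d = math.sqrt(dot(R, R))
    coulomb = e_i * e_k / d
    velocity_term = e_i * e_k * dot(v_i, v_k) / (2.0 * c**2 * d)
    retardation_term = e_i * e_k * dot(R, v_i) * dot(R, v_k) / (2.0 * c**2 * d**3)
    return coulomb - velocity_term - retardation_term
```

For two unit charges a distance 2 apart, moving side by side with velocity 0.1c perpendicular to the line joining them, the third term vanishes and the result is 0.5(1 - 0.005) = 0.4975; exchanging the labels i and k leaves the value unchanged, as the symmetry of (25.17) requires.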

The expression (25.17) is completely symmetrical in the two charges. 
Hence it is easy to write the Lagrangian for a system of particles. Namely, 








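$$\begin{aligned}
L &= \sum_k \left[ -m_k c^2 + \frac{m_k v_k^2}{2} + \frac{m_k v_k^4}{8c^2} \right] - \sum_{k>i} \left\{ \frac{e_i e_k}{|\mathbf{r}_k-\mathbf{r}_i|} - \frac{e_i e_k\, \mathbf{v}_i \cdot \mathbf{v}_k}{2c^2\,|\mathbf{r}_k-\mathbf{r}_i|} - \frac{e_i e_k\, [(\mathbf{r}_k-\mathbf{r}_i)\cdot \mathbf{v}_i]\,[(\mathbf{r}_k-\mathbf{r}_i)\cdot \mathbf{v}_k]}{2c^2\,|\mathbf{r}_k-\mathbf{r}_i|^3} \right\} = \\
&= -\sum_k m_k c^2 + L_1 + L_2,
\end{aligned} \tag{25.18}$$

where $L_1$ is the Lagrangian of a system of charges when relativistic
corrections and the retardation are disregarded, given by formula (25.3), and $L_2$ is
an addition to it, found with an accuracy up to terms of order less than
$(v/c)^3$: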





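$$L_2 = \sum_k \frac{m_k v_k^4}{8c^2} + \sum_{k>i} \frac{e_i e_k}{2c^2} \left\{ \frac{\mathbf{v}_i \cdot \mathbf{v}_k}{|\mathbf{r}_k-\mathbf{r}_i|} + \frac{[(\mathbf{r}_k-\mathbf{r}_i)\cdot \mathbf{v}_i]\,[(\mathbf{r}_k-\mathbf{r}_i)\cdot \mathbf{v}_k]}{|\mathbf{r}_k-\mathbf{r}_i|^3} \right\}. \tag{25.19}$$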


Knowing the Lagrangian, one can find the energy, mass and momentum of 
the system. 


The energy of the system is found according to the usual rules and is equal 
to 


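$$E_{\text{tot}} = \sum_k \mathbf{v}_k \cdot \frac{\partial L}{\partial \mathbf{v}_k} - L = E_0 + E_1 + E_2, \tag{25.20}$$

where

$$E_0 = \sum_k m_k c^2, \qquad E_1 = \sum_k \frac{m_k v_k^2}{2} + \sum_{k>i} \frac{e_i e_k}{|\mathbf{r}_k-\mathbf{r}_i|},$$

and $E_2$ is the relativistic correction to the energy, which is equal to

$$E_2 = \sum_k \frac{3}{8}\frac{m_k v_k^4}{c^2} + \frac{1}{2c^2}\sum_{k>i} e_i e_k \left\{ \frac{\mathbf{v}_i \cdot \mathbf{v}_k}{|\mathbf{r}_k-\mathbf{r}_i|} + \frac{[(\mathbf{r}_k-\mathbf{r}_i)\cdot \mathbf{v}_i]\,[(\mathbf{r}_k-\mathbf{r}_i)\cdot \mathbf{v}_k]}{|\mathbf{r}_k-\mathbf{r}_i|^3} \right\}. \tag{25.21}$$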


We see, first of all, that the energy of the system cannot be written in the
form of a sum of the kinetic and potential energies. The relativistic
correction to the energy $E_2$ depends both on the coordinates and on the velocities,
so that taking account of this correction a potential energy for the system
does not exist. Only for charges at rest or, more precisely, charges moving so
slowly that quantities of the order of $v^2/c^2$ can completely be disregarded,
can the quantity $E_2$ be discarded and use be made of the potential energy
(25.1).

With the usual definition of mass, we find the mass of the system

$$M = \frac{E_{\text{tot}}}{c^2}. \tag{25.22}$$

Thus, the mass of the system is made up of the rest masses of the particles
and the masses due to the kinetic energy and the energy of interaction of the
particles of the system (in the approximation of charges at rest, the potential
energy). The mass M obviously possesses no additive properties and is not equal
to the sum of the masses of individual particles. The energy $E_{\text{tot}}$ of the system
and its mass M are conserved. However, a conservation law cannot be written for
individual terms entering into $E_{\text{tot}}$ and M.
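The momentum $\mathbf{P}$ of the system is by definition equal to

$$\mathbf{P} = \sum_k \frac{\partial L}{\partial \mathbf{v}_k} = \mathbf{P}_0 + \mathbf{P}_1, \tag{25.23}$$

where $\mathbf{P}_0 = \sum_k m_k \mathbf{v}_k$ is the usual value of the momentum in classical
mechanics and $\mathbf{P}_1$ is the relativistic correction to it:

$$\mathbf{P}_1 = \frac{1}{2c^2}\sum_k m_k v_k^2\, \mathbf{v}_k + \frac{1}{2c^2}\sum_k \sum_{i \neq k} e_i e_k \left\{ \frac{\mathbf{v}_i}{|\mathbf{r}_k-\mathbf{r}_i|} + \frac{[(\mathbf{r}_k-\mathbf{r}_i)\cdot \mathbf{v}_i]\,(\mathbf{r}_k-\mathbf{r}_i)}{|\mathbf{r}_k-\mathbf{r}_i|^3} \right\}. \tag{25.24}$$

We see that the correction $\mathbf{P}_1$ to the momentum depends on the coordinates
of the particles of the system.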

It is easy to show that in the approximation considered one can introduce
the notion of the centre of mass of the system, which does not exist in an
arbitrary system of interacting particles. From the definition of the velocity
of the centre of mass (15.16), $\mathbf{V}_{\text{cm}} = c^2 \mathbf{P}/E_{\text{tot}}$, it can be seen that the vector
$\mathbf{V}_{\text{cm}}$ can be written in the form of the derivative of the radius vector of the
centre of mass with respect to time:

$$\mathbf{R}_{\text{cm}} = \frac{\sum_k \mathbf{r}_k \left( m_k c^2 + \frac{1}{2} m_k v_k^2 + \frac{1}{2} e_k \sum_i \frac{e_i}{|\mathbf{r}_k-\mathbf{r}_i|} \right)}{E_0 + E_1}. \tag{25.25}$$

One can convince oneself of this by a direct verification of the equality

$$\frac{d\mathbf{R}_{\text{cm}}}{dt} = \frac{c^2 \mathbf{P}}{E_{\text{tot}}},$$

which is valid with an accuracy up to quantities of the order of $v^2/c^2$.
In addition to the energy and momentum, a system of material points
possesses angular momentum

$$\mathbf{L} = \sum_k \mathbf{r}_k \times \mathbf{p}_k = \mathbf{L}_0 + \mathbf{L}_1, \tag{25.26}$$

where $\mathbf{L}_0$ is the angular momentum of classical mechanics, and $\mathbf{L}_1$ is the
relativistic correction depending on the velocities and coordinates of all points
of the system.

We see that in this approximation (taking account of the corrections
$v^2/c^2$) in the relativistic mechanics of the system one can introduce the same
basic notions as in classical mechanics. However, in this approximation also
the system possesses no potential energy.

Thus, in addition to the case of a system of particles interacting via
collisions as discussed in §15, one can construct the general mechanics of a
system of interacting charged particles in the theory of relativity. However, in
this case the theory has an approximate character and the highest terms
retained in it are of the order of $v^2/c^2$. The account of subsequent terms of the
expansion in powers of (v/c) is possible only on concrete systems possessing no
dipole radiation (the terms $(v/c)^3$), no quadrupole radiation ($(v/c)^5$) and so
on.

In what follows we shall need another representation of the energy of
interaction in the case where the system consists of two particles. Writing the
Lagrangian of the first particle in the form

$$L_1 = -m_1 c^2 \sqrt{1-\frac{v_1^2}{c^2}} - e_1 \varphi_1 + \frac{e_1}{c}\,(\mathbf{v}_1 \cdot \mathbf{A}_1),$$





we shall consider $\varphi_1$ and $\mathbf{A}_1$ to be the potentials of the field produced by the
second particle at the point where the first particle is located at an instant t.
Taking account of the retardation and for an arbitrary law of motion the
potentials $\varphi_1$ and $\mathbf{A}_1$ represent Liénard-Wiechert potentials. The
Liénard-Wiechert potentials $\varphi_1$ and $\mathbf{A}_1$ are related by formula (25.5) of Part I:

$$\mathbf{A}_1 = \frac{\mathbf{v}_2 \varphi_1}{c}.$$

Hence

$$L_1 = -m_1 c^2 \sqrt{1-\frac{v_1^2}{c^2}} - e_1 \left(1 - \frac{\mathbf{v}_1 \cdot \mathbf{v}_2}{c^2}\right) \varphi_1.$$

Thus it follows that for the energy of interaction of two particles one can
write the expression

$$U_{\text{interac}} = e_1 \left(1 - \frac{\mathbf{v}_1 \cdot \mathbf{v}_2}{c^2}\right) \varphi, \tag{25.27}$$

where $\varphi$ is the field potential depending only on the instantaneous distance
R(t) between the charges.


§26. The radiation emitted by a moving charge 


Formula (28.4) of Part I for the radiation emitted by a moving charge is 
applicable only at velocities which are small in comparison with the velocity 
of light. In order to obtain an analogous expression valid at velocities close to 
the velocity of light, we introduce into the treatment a set of accompanying 
reference frames in one of which the particle is at rest at each instant. In 
every one of these reference frames the formula (28.4) of Part I for the radia- 
tion is valid. The radiation described by this formula has the character of 
spherical waves, so that the total momentum of the emitted electromagnetic 
waves is equal to zero. The energy emitted per unit time by a charge is, 
according to (22.16), an invariant: 


$$\frac{dE}{dt} = \frac{dE'}{dt'} = \dots = \text{invar}. \tag{26.1}$$


The rate of loss of energy due to radiation can be written, according to (28.4) 
of Part I, 


$$\frac{dE'}{dt'} = -\frac{2e^2}{3c^3}\,(\mathbf{w}')^2 = -\frac{2e^2}{3c^3}\,(w_\alpha)^2, \tag{26.2}$$

since in the accompanying reference frame $w_4 = 0$. The rate of loss of
momentum due to radiation is, according to (28.5) of Part I, equal to zero:

$$\frac{d\mathbf{P}'}{dt'} = 0. \tag{26.3}$$

To find the radiation in an arbitrary (unprimed) reference frame one need
only transform the square of the acceleration $(w_\alpha)^2$ according to formula
(11.17) to the acceleration in the unprimed reference frame.

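We then have

$$\frac{dE}{dt} = \frac{dE'}{dt'} = -\frac{2e^2}{3c^3}\,\frac{\dot{\mathbf{v}}^2 - \frac{1}{c^2}(\mathbf{v}\times\dot{\mathbf{v}})^2}{\left(1-\frac{v^2}{c^2}\right)^3} = -\frac{2e^2}{3c^3}\,\frac{\dot{\mathbf{v}}^2\left(1-\frac{v^2}{c^2}\right) + \frac{1}{c^2}(\mathbf{v}\cdot\dot{\mathbf{v}})^2}{\left(1-\frac{v^2}{c^2}\right)^3}. \tag{26.4}$$

In this case we obtain for the change in the momentum per unit time using
(13.6) and (26.4)

$$\frac{d\mathbf{P}}{dt} = \frac{\mathbf{v}}{c^2}\frac{dE}{dt} = -\frac{2e^2}{3c^5}\,\mathbf{v}\,\frac{\dot{\mathbf{v}}^2 - \frac{1}{c^2}(\mathbf{v}\times\dot{\mathbf{v}})^2}{\left(1-\frac{v^2}{c^2}\right)^3}. \tag{26.5}$$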


It is obvious that for v/c << 1 formulae (26.4) and (26.5) reduce to (26.2) 
and (26.3). 

Formulae (26.4) and (26.5) allow one to find in an arbitrary reference 
frame, for example the laboratory system, the energy and momentum of the 
radiation field produced by a charge with an accelerated motion. 
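The two equivalent forms of the radiated power in (26.4) and the low-velocity limit can be compared numerically. The sketch below returns the magnitude of the rate of energy loss, i.e. -dE/dt; the function names are assumptions, and units with e = c = 1 are used in the example values.

```python
def dot(a, b):
    """Scalar product of two 3-vectors."""
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    """Vector product of two 3-vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def radiated_power(v, acc, e, c):
    """-dE/dt in the form with the vector product (v x acc)."""
    beta2 = dot(v, v) / c**2
    vxa = cross(v, acc)
    numerator = dot(acc, acc) - dot(vxa, vxa) / c**2
    return 2.0 * e**2 / (3.0 * c**3) * numerator / (1.0 - beta2) ** 3

def radiated_power_alt(v, acc, e, c):
    """-dE/dt in the form with the scalar product (v . acc)."""
    beta2 = dot(v, v) / c**2
    numerator = dot(acc, acc) * (1.0 - beta2) + dot(v, acc) ** 2 / c**2
    return 2.0 * e**2 / (3.0 * c**3) * numerator / (1.0 - beta2) ** 3

def radiated_momentum_rate(v, acc, e, c):
    """-dP/dt = (v / c^2) * (-dE/dt), the counterpart of the momentum formula."""
    power = radiated_power(v, acc, e, c)
    return tuple(vi * power / c**2 for vi in v)
```

For a charge at rest both functions reduce to the non-relativistic expression 2e²w²/3c³ and the momentum rate vanishes.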

As a rule, an accelerated motion of rapidly moving particles is associated 
with the action of the electromagnetic field on them. To transform formulae 
(26.4) and (26.5) in this particular case we make use of the expression (12.14) 
for the acceleration of a particle in an electromagnetic field. Putting in the 
value of the Lorentz force, we find 


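$$\mathbf{w} = \dot{\mathbf{v}} = \frac{e}{m}\sqrt{1-\frac{v^2}{c^2}}\left[\mathbf{E} + \frac{1}{c}\,\mathbf{v}\times\mathbf{H} - \frac{1}{c^2}\,\mathbf{v}\,(\mathbf{v}\cdot\mathbf{E})\right]. \tag{26.6}$$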





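Making use of this expression for the acceleration, we have

$$\begin{aligned}
\dot{\mathbf{v}}^2\left(1-\frac{v^2}{c^2}\right) + \frac{(\mathbf{v}\cdot\dot{\mathbf{v}})^2}{c^2} &= \frac{e^2}{m^2}\left(1-\frac{v^2}{c^2}\right)^2\left[\left(\mathbf{E}+\frac{1}{c}\,\mathbf{v}\times\mathbf{H}\right)^2 - \frac{2(\mathbf{v}\cdot\mathbf{E})^2}{c^2} + \frac{v^2(\mathbf{v}\cdot\mathbf{E})^2}{c^4}\right] + \frac{e^2}{m^2}\left(1-\frac{v^2}{c^2}\right)^3\frac{(\mathbf{v}\cdot\mathbf{E})^2}{c^2} = \\
&= \frac{e^2}{m^2}\left(1-\frac{v^2}{c^2}\right)^2\left[\left(\mathbf{E}+\frac{1}{c}\,\mathbf{v}\times\mathbf{H}\right)^2 - \frac{(\mathbf{v}\cdot\mathbf{E})^2}{c^2}\right].
\end{aligned} \tag{26.7}$$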


m 2 2 c 


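The rate of emission of energy by a charge moving in an electromagnetic
field is equal to

$$\begin{aligned}
\frac{dE}{dt} &= -\frac{2e^4}{3m^2c^3}\,\frac{\left(\mathbf{E}+\frac{1}{c}\,\mathbf{v}\times\mathbf{H}\right)^2 - \frac{1}{c^2}(\mathbf{v}\cdot\mathbf{E})^2}{1-\frac{v^2}{c^2}} = \\
&= -\frac{2e^4 E^2}{3m^4c^7}\left[\left(\mathbf{E}+\frac{1}{c}\,\mathbf{v}\times\mathbf{H}\right)^2 - \frac{1}{c^2}(\mathbf{v}\cdot\mathbf{E})^2\right].
\end{aligned} \tag{26.8}$$

Here and below the italic $E$ in the prefactor is the energy of the particle,
$E = mc^2/\sqrt{1-v^2/c^2}$, while $\mathbf{E}$ is the strength of the electric field.

Consider several cases of formula (26.8) for the ultrarelativistic case v ~ c.
Let there be an electric field only (i.e. H = 0). Then

$$\frac{dE}{dt} = -\frac{2e^4 E^2}{3m^4c^7}\left[\mathbf{E}^2 - \frac{(\mathbf{v}\cdot\mathbf{E})^2}{c^2}\right]. \tag{26.9}$$

For $\mathbf{v} \perp \mathbf{E}$ the rate of energy loss is

$$\frac{dE}{dt} = -\frac{2e^4}{3m^4c^7}\,E^2\,\mathbf{E}^2. \tag{26.10}$$

For $\mathbf{v} \parallel \mathbf{E}$ the rate of energy loss is

$$\frac{dE}{dt} = -\frac{2e^4 \mathbf{E}^2}{3m^4c^7}\,E^2\left(1-\frac{v^2}{c^2}\right) = -\frac{2e^4}{3m^2c^3}\,\mathbf{E}^2, \tag{26.11}$$

and does not depend on the energy.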


For the motion in a magnetic field perpendicular to the direction of the
velocity ($\mathbf{v} \perp \mathbf{H}$) and $\mathbf{E} = 0$ we find

$$\frac{dE}{dt} = -\frac{2e^4}{3m^2c^5}\,\frac{H^2 v^2}{1-\frac{v^2}{c^2}} = -\frac{2e^4 H^2}{3m^4c^5}\,p^2, \tag{26.12}$$

where p is the momentum of the particle.
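The special cases of the radiation formula for a charge in given fields can be reproduced from the general expression, in which the loss rate is 2e⁴/(3m²c³) times [(E + v×H/c)² - (v·E)²/c²]/(1 - v²/c²). The sketch below returns the magnitude -dE/dt; the function names are assumptions, and the example uses units with e = m = c = 1.

```python
def dot(a, b):
    """Scalar product of two 3-vectors."""
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    """Vector product of two 3-vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def power_in_fields(v, E_field, H_field, e, m, c):
    """-dE/dt for a charge with velocity v in fields E_field and H_field."""
    vxh = cross(v, H_field)
    force = tuple(Ei + wi / c for Ei, wi in zip(E_field, vxh))  # E + v x H / c
    beta2 = dot(v, v) / c**2
    bracket = dot(force, force) - dot(v, E_field) ** 2 / c**2
    return 2.0 * e**4 / (3.0 * m**2 * c**3) * bracket / (1.0 - beta2)

def momentum(v, m, c):
    """Magnitude of the relativistic momentum p = m v / sqrt(1 - v^2/c^2)."""
    speed2 = dot(v, v)
    return m * speed2 ** 0.5 / (1.0 - speed2 / c**2) ** 0.5
```

With the velocity along the electric field the result does not depend on the speed, while in a purely magnetic field perpendicular to the velocity it equals 2e⁴H²p²/(3m⁴c⁵).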

Formulae (26.9) to (26.12) are used in nuclear physics for the
determination of energy losses by ultrarelativistic particles moving in electric and
magnetic fields. The motion of charged particles in cosmic rays in the Earth’s
magnetic field and in the magnetic field of a betatron are examples of the 
motion of ultrarelativistic particles in a magnetic field. Calculations have 
shown that energy losses due to radiation in a magnetic field determine the
upper limit of the energy of particles which can reach the Earth’s surface, as 
well as the upper limit of energies to which electrons can be accelerated in a 
betatron. 

An important application of the formulae obtained is in the calculation of 
the bremsstrahlung of ultrarelativistic particles in the electric field of a nu- 
cleus. An ultrarelativistic electron passing by a nucleus undergoes a very small 
deflection. Its velocity can be assumed to be constant, and the acceleration 
can be assumed to be perpendicular to the direction of velocity and equal to 
$w_\perp = eE_\perp/m$, where

$$E_\perp = \frac{Ze\rho}{r^3}$$

is the component of the nuclear field perpendicular to the velocity (we
choose the direction of the latter as the x-axis). For the transverse
acceleration use can be made of the non-relativistic expression, since the
corresponding velocity component is very small. As in §43 in Part I, $\rho$ is the impact
parameter, and r is the distance between the nucleus and the electron. For a
motion with constant velocity it can be assumed that

$$r = (\rho^2 + v^2t^2)^{1/2}.$$


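Formula (26.10) gives for the rate of energy loss

$$\frac{dE}{dt} = -\frac{2e^2}{3m^2c^3}\,\frac{(eE_\perp)^2}{1-\frac{v^2}{c^2}} = -\frac{2Z^2e^6}{3m^2c^3}\,\frac{1}{1-\frac{v^2}{c^2}}\,\frac{\rho^2}{(\rho^2+v^2t^2)^3}.$$

Integrating with respect to the transit time, we obtain the total energy loss
due to bremsstrahlung by an ultrarelativistic particle:

$$\Delta E = -\frac{2Z^2e^6\rho^2}{3m^2c^3\left(1-\frac{v^2}{c^2}\right)}\int_{-\infty}^{\infty}\frac{dt}{(\rho^2+v^2t^2)^3},$$

$$\int_{-\infty}^{\infty}\frac{dt}{(\rho^2+v^2t^2)^3} = \frac{2}{\rho^6}\int_0^{\infty}\frac{dt}{\left(1+\frac{v^2t^2}{\rho^2}\right)^3} = \frac{2}{\rho^5 v}\int_0^{\pi/2}\cos^4 z\,dz = \frac{3\pi}{8\rho^5 v}.$$

Hence

$$\Delta E = -\frac{\pi}{4}\,\frac{Z^2e^6}{m^2c^3v\left(1-\frac{v^2}{c^2}\right)}\,\frac{1}{\rho^3}. \tag{26.13}$$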


The magnitude of the energy loss increases rapidly with increasing atomic 
number Z of the matter in which the particle is moving. Formula (26.13) 
determines the energy loss of one particle passing by a nucleus at a distance p. 
It shows that the loss increases rapidly with decreasing p. 
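The transit-time integral behind (26.13) can be checked by direct numerical integration. In the sketch below the infinite range is cut off at a large multiple of ρ/v, where the integrand is negligible; the cutoff, the step count and the function names are assumptions of this example, and the loss is returned as a positive magnitude.

```python
import math

def transit_integral(rho, v, steps=200000, span=200.0):
    """Midpoint-rule estimate of the integral of dt / (rho^2 + v^2 t^2)^3 over all t.

    The range is truncated at |t| = span * rho / v.
    """
    t_max = span * rho / v
    dt = 2.0 * t_max / steps
    total = 0.0
    for i in range(steps):
        t = -t_max + (i + 0.5) * dt
        total += dt / (rho**2 + (v * t) ** 2) ** 3
    return total

def transit_integral_closed(rho, v):
    """Closed form 3 pi / (8 rho^5 v) quoted in the text."""
    return 3.0 * math.pi / (8.0 * rho**5 * v)

def bremsstrahlung_loss(rho, v, Z, e, m, c):
    """Magnitude of the energy loss per passage at impact parameter rho."""
    return math.pi * Z**2 * e**6 / (4.0 * m**2 * c**3 * v * (1.0 - v**2 / c**2) * rho**3)
```

Halving the impact parameter increases the loss eightfold, which is the rapid growth at small ρ mentioned above.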

In practice the particle may pass by the nucleus at any distance.
Multiplying (26.13) by $2\pi\rho\,d\rho\,n$, where n is the density of the beam, and
integrating over all values of $\rho$, we find the effective radiation of a beam of particles:

$$E_{\text{eff}} = -\frac{\pi^2}{2}\,\frac{Z^2e^6n}{m^2c^3v\left(1-\frac{v^2}{c^2}\right)}\int_{\rho_{\min}}^{\infty}\frac{d\rho}{\rho^2} = -\frac{\pi^2}{2}\,\frac{Z^2e^6n}{m^2c^3v\left(1-\frac{v^2}{c^2}\right)}\,\frac{1}{\rho_{\min}}. \tag{26.14}$$


In formula (26.14) the closest distance of approach of the electron to the
nucleus, $\rho_{\min}$, is introduced, since the integral diverges at the lower limit.
The introduction of this unknown quantity means that the classical theory
of radiation turns out to be inapplicable for the calculation of bremsstrahlung.

In quantum mechanics it will be shown that the classical treatment of the
motion of the electron is inapplicable at small distances. A
quantum-mechanical calculation leads to the value

$$\rho_{\min} \approx \frac{\hbar}{mc\sqrt{1-\frac{v^2}{c^2}}},$$

so that

$$E_{\text{eff}} = -\frac{\pi^2}{2}\,\frac{Z^2e^6n}{\hbar mc^2v}\,\frac{1}{\sqrt{1-\frac{v^2}{c^2}}}. \tag{26.15}$$


Formula (26.15) allows one to find the losses due to bremsstrahlung when 
very fast particles pass through matter. 

The comparison of (26.15) with the formula for energy loss due to
ionization shows that bremsstrahlung is the basic factor determining the
deceleration of fast electrons in matter. Losses due to the bremsstrahlung are
important for electrons at energies of the order of $200\,mc^2$ (100 MeV) in air and
$20\,mc^2$ (10 MeV) in lead. For heavy particles, for example protons, almost all
losses are associated with the ionization up to very large energies. 

In conclusion it should be noted that one cannot pass directly from 
formula (26.14) to the non-relativistic formula (43.37) of Part I, assuming 
$v \ll c$. Formula (26.14) is found for $v \sim c$, and not for the general case of an
arbitrary velocity. 








APPENDIX I 


Vector Analysis 


Assuming that the reader is familiar with vector analysis, we summarize 
below the basic formulae used in this book. 
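Vector algebra

$$\mathbf{a} = a_x\mathbf{i} + a_y\mathbf{j} + a_z\mathbf{k} = a\,\mathbf{a}_0,$$

where $\mathbf{i}$, $\mathbf{j}$, $\mathbf{k}$ are unit vectors directed along the x-axis, y-axis and z-axis
respectively, and $\mathbf{a}_0$ is the unit vector in the direction of $\mathbf{a}$.

$$\mathbf{a}\cdot\mathbf{b} = \mathbf{b}\cdot\mathbf{a} = ab\cos(\mathbf{a},\mathbf{b}) = a_xb_x + a_yb_y + a_zb_z, \tag{I.1}$$

$$\mathbf{a}\times\mathbf{b} = -\mathbf{b}\times\mathbf{a} = \begin{vmatrix}\mathbf{i} & \mathbf{j} & \mathbf{k}\\ a_x & a_y & a_z\\ b_x & b_y & b_z\end{vmatrix} = (a_yb_z - a_zb_y)\,\mathbf{i} + (a_zb_x - a_xb_z)\,\mathbf{j} + (a_xb_y - a_yb_x)\,\mathbf{k}, \tag{I.2}$$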




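$$|\mathbf{a}\times\mathbf{b}| = ab\sin(\mathbf{a},\mathbf{b}), \tag{I.2'}$$

$$\mathbf{a}\cdot(\mathbf{b}\times\mathbf{c}) = \mathbf{b}\cdot(\mathbf{c}\times\mathbf{a}) = \mathbf{c}\cdot(\mathbf{a}\times\mathbf{b}), \tag{I.3}$$

$$\mathbf{a}\times(\mathbf{b}\times\mathbf{c}) = \mathbf{b}(\mathbf{a}\cdot\mathbf{c}) - \mathbf{c}(\mathbf{a}\cdot\mathbf{b}), \tag{I.4a}$$

$$(\mathbf{a}\times\mathbf{b})\times\mathbf{c} = \mathbf{b}(\mathbf{c}\cdot\mathbf{a}) - \mathbf{a}(\mathbf{c}\cdot\mathbf{b}), \tag{I.4b}$$

$$(\mathbf{a}\times\mathbf{b})\cdot(\mathbf{c}\times\mathbf{d}) = (\mathbf{a}\cdot\mathbf{c})(\mathbf{b}\cdot\mathbf{d}) - (\mathbf{a}\cdot\mathbf{d})(\mathbf{b}\cdot\mathbf{c}), \tag{I.5}$$

$$\begin{aligned}
\mathbf{a}(\mathbf{b}\cdot\mathbf{c}) &= \tfrac{1}{2}\{\mathbf{a}(\mathbf{b}\cdot\mathbf{c}) - \mathbf{b}(\mathbf{a}\cdot\mathbf{c})\} + \tfrac{1}{2}\{\mathbf{a}(\mathbf{b}\cdot\mathbf{c}) + \mathbf{b}(\mathbf{a}\cdot\mathbf{c})\} = \\
&= \tfrac{1}{2}\,\mathbf{c}\times(\mathbf{a}\times\mathbf{b}) + \tfrac{1}{2}\{\mathbf{a}(\mathbf{b}\cdot\mathbf{c}) + \mathbf{b}(\mathbf{a}\cdot\mathbf{c})\}.
\end{aligned} \tag{I.6}$$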
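The algebraic identities above are easily spot-checked with arbitrary numerical vectors. The sketch below tests the double vector product a×(b×c) = b(a·c) - c(a·b) and the product of two vector products (a×b)·(c×d) = (a·c)(b·d) - (a·d)(b·c); the helper names and the sample vectors are illustrative assumptions.

```python
def dot(a, b):
    """Scalar product of two 3-vectors."""
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    """Vector product of two 3-vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def double_product_residual(a, b, c):
    """Largest component of a x (b x c) - [b (a.c) - c (a.b)]."""
    lhs = cross(a, cross(b, c))
    rhs = tuple(bi * dot(a, c) - ci * dot(a, b) for bi, ci in zip(b, c))
    return max(abs(x - y) for x, y in zip(lhs, rhs))

def cross_dot_residual(a, b, c, d):
    """Difference between (a x b).(c x d) and (a.c)(b.d) - (a.d)(b.c)."""
    lhs = dot(cross(a, b), cross(c, d))
    rhs = dot(a, c) * dot(b, d) - dot(a, d) * dot(b, c)
    return abs(lhs - rhs)
```

Both residuals vanish to rounding error for any choice of vectors.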
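The basic formula of spherical trigonometry is easily derived from (I.5).
Let $\mathbf{r}_1$, $\mathbf{r}_2$ and $\mathbf{r}_3$ be the unit radius vectors of the apices of a spherical
triangle ABC with angles α, β, γ as in fig. A.1. If in (I.5) we set $\mathbf{a} = \mathbf{r}_1$, $\mathbf{b} = \mathbf{r}_2$,
$\mathbf{c} = \mathbf{r}_1$, $\mathbf{d} = \mathbf{r}_3$, we find

$$(\mathbf{r}_1\times\mathbf{r}_2)\cdot(\mathbf{r}_1\times\mathbf{r}_3) = \mathbf{r}_2\cdot\mathbf{r}_3 - (\mathbf{r}_1\cdot\mathbf{r}_2)(\mathbf{r}_1\cdot\mathbf{r}_3). \tag{I.5'}$$

By definition

$$\mathbf{r}_1\cdot\mathbf{r}_2 = \cos\gamma, \qquad \mathbf{r}_2\cdot\mathbf{r}_3 = \cos\alpha, \qquad \mathbf{r}_1\cdot\mathbf{r}_3 = \cos\beta,$$

$$|\mathbf{r}_1\times\mathbf{r}_2| = \sin\gamma, \qquad |\mathbf{r}_1\times\mathbf{r}_3| = \sin\beta,$$

$$(\mathbf{r}_1\times\mathbf{r}_2)\cdot(\mathbf{r}_1\times\mathbf{r}_3) = \sin\beta\,\sin\gamma\,\cos\delta,$$

where δ is the angle between the planes OAB and OAC.

[Fig. A.1]

The substitution of these expressions into (I.5') gives the basic formula of
spherical trigonometry:

$$\cos\alpha = \cos\beta\cos\gamma + \sin\beta\sin\gamma\cos\delta. \tag{I.7}$$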
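The basic formula of spherical trigonometry, cos α = cos β cos γ + sin β sin γ cos δ, can be checked for a concrete triangle. In the sketch below the apices are given as arbitrary vectors and normalized, with r1·r2 = cos γ, r2·r3 = cos α, r1·r3 = cos β, and δ taken as the angle between the planes through the first apex; the function names and sample points are assumptions.

```python
import math

def dot(a, b):
    """Scalar product of two 3-vectors."""
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    """Vector product of two 3-vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def unit(a):
    """Vector of unit length in the direction of a."""
    length = math.sqrt(dot(a, a))
    return tuple(x / length for x in a)

def spherical_cosine_sides(A, B, C):
    """Return (cos alpha, cos beta cos gamma + sin beta sin gamma cos delta)."""
    r1, r2, r3 = unit(A), unit(B), unit(C)
    cos_alpha, cos_beta, cos_gamma = dot(r2, r3), dot(r1, r3), dot(r1, r2)
    sin_beta = math.sqrt(1.0 - cos_beta**2)
    sin_gamma = math.sqrt(1.0 - cos_gamma**2)
    n12, n13 = unit(cross(r1, r2)), unit(cross(r1, r3))  # normals of planes OAB, OAC
    cos_delta = dot(n12, n13)
    return cos_alpha, cos_beta * cos_gamma + sin_beta * sin_gamma * cos_delta
```

For the octant triangle with apices on the three coordinate axes all quantities are right angles and both sides are zero; for a general triangle the two returned numbers agree to rounding error.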
Vectors which do not change under the inversion of the coordinate axes
$\mathbf{r} \to (-\mathbf{r})$ are called polar vectors. The velocity vector, force vector etc. are
polar vectors. Vectors which change sign under the inversion, i.e.

$$\mathbf{a}(\mathbf{r}) = -\mathbf{a}(-\mathbf{r}),$$

are called axial vectors or pseudo-vectors.


The vector expressing the cross product of two polar vectors is an axial 
vector. 


Scalar field 


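A scalar field $\varphi(\mathbf{r})$ is characterized by defining the scalar $\varphi$ at every point
of space. The spatial rate of change of the scalar $\varphi(\mathbf{r})$ is characterized by the
derivative with respect to a given direction $\mathbf{l}$:

$$\frac{\partial\varphi}{\partial l} = \frac{\partial\varphi}{\partial x}\cos(\mathbf{l},\mathbf{i}) + \frac{\partial\varphi}{\partial y}\cos(\mathbf{l},\mathbf{j}) + \frac{\partial\varphi}{\partial z}\cos(\mathbf{l},\mathbf{k}) = |\nabla\varphi|\cos(\mathbf{l},\nabla\varphi), \tag{I.8}$$

where the gradient of the scalar $\varphi$,

$$\nabla\varphi = \frac{\partial\varphi}{\partial x}\,\mathbf{i} + \frac{\partial\varphi}{\partial y}\,\mathbf{j} + \frac{\partial\varphi}{\partial z}\,\mathbf{k}, \tag{I.9}$$

represents the vector which is oriented in the direction of the most rapid
increase of $\varphi$ and is equal to the derivative of $\varphi$ with respect to this direction.
The magnitude of the gradient is equal to

$$|\nabla\varphi| = \sqrt{\left(\frac{\partial\varphi}{\partial x}\right)^2 + \left(\frac{\partial\varphi}{\partial y}\right)^2 + \left(\frac{\partial\varphi}{\partial z}\right)^2}. \tag{I.10}$$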





APPENDIX | 359 
The Hamiltonian differential operator “del” V is defined by the relation 


aie eae (11) 
V is not represented as a vector. The Hamiltonian operator is a symbol indi- 
cating operations which must be done on the functions of coordinates stand- 
ing in front of it. Foy example, Vy means that one has to take the partial 
derivatives of the function y and construct a vector whose components are 
equal to these partial derivatives. 
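From the definition (1.11) it follows that

$$\nabla r = \mathbf{i}\frac{\partial r}{\partial x} + \mathbf{j}\frac{\partial r}{\partial y} + \mathbf{k}\frac{\partial r}{\partial z} = \frac{\mathbf{r}}{r}, \qquad (1.12)$$

$$\nabla f(r) = \frac{df}{dr}\nabla r = \frac{df}{dr}\frac{\mathbf{r}}{r}, \qquad (1.13)$$

$$\nabla(\varphi + \psi) = \nabla\varphi + \nabla\psi, \qquad (1.14)$$

$$\nabla(\varphi\psi) = \psi\nabla\varphi + \varphi\nabla\psi, \qquad (1.15)$$

$$\nabla f(\varphi) = \frac{df}{d\varphi}\nabla\varphi. \qquad (1.16)$$

In calculating the gradient of functions which depend on the distance $r$
between two given points,

$$r = \sqrt{(x-x_0)^2 + (y-y_0)^2 + (z-z_0)^2},$$

one has to distinguish between the gradients with respect to coordinates
$(x, y, z)$ and $(x_0, y_0, z_0)$.
We have

$$\nabla r = \frac{\mathbf{i}(x-x_0) + \mathbf{j}(y-y_0) + \mathbf{k}(z-z_0)}{r}, \qquad \nabla_0 r = -\frac{\mathbf{i}(x-x_0) + \mathbf{j}(y-y_0) + \mathbf{k}(z-z_0)}{r}. \qquad (1.12')$$

Consequently,

$$\nabla_0\varphi(r) = -\nabla\varphi(r). \qquad (1.17)$$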
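The differentiation of a vector which depends on a scalar argument:

$$\frac{d\mathbf{a}(x)}{dx} = \frac{d}{dx}\bigl(a(x)\,\mathbf{a}_0(x)\bigr) = \frac{da}{dx}\mathbf{a}_0 + a\frac{d\mathbf{a}_0}{dx} = \frac{da}{dx}\mathbf{a}_0 + \mathbf{b},$$

where $\mathbf{a}_0(x) = \mathbf{a}(x)/|\mathbf{a}(x)|$, $\mathbf{b} = a\omega\mathbf{b}_0$, $\mathbf{b}_0 \perp \mathbf{a}_0$, and $\omega = |d\mathbf{a}_0/dx|$ is the rate of
change of the angle $\varphi$, which determines the orientation of the vector $\mathbf{a}$.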
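The total derivative of $A(x, y, z, t)$ with respect to time is equal to

$$\frac{dA}{dt} = \frac{\partial A}{\partial t} + \frac{\partial A}{\partial x}\frac{dx}{dt} + \frac{\partial A}{\partial y}\frac{dy}{dt} + \frac{\partial A}{\partial z}\frac{dz}{dt} = \frac{\partial A}{\partial t} + v_x\frac{\partial A}{\partial x} + v_y\frac{\partial A}{\partial y} + v_z\frac{\partial A}{\partial z} = \frac{\partial A}{\partial t} + (\mathbf{v}\cdot\nabla)A. \qquad (1.18)$$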





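The line integral of a vector along a path is defined by the relation

$$\int \mathbf{a}\cdot d\mathbf{l} = \int (a_x\,dl_x + a_y\,dl_y + a_z\,dl_z).$$

The integral over a closed path is called the circulation:

$$\oint \mathbf{a}\cdot d\mathbf{l}.$$

If the vector $\mathbf{a}$ can be written in the form

$$\mathbf{a} = \nabla\varphi,$$

then it is called a potential, or irrotational, vector, and $\varphi$ is called the potential.
The circulation of a potential vector is equal to zero:

$$\oint \nabla\varphi\cdot d\mathbf{l} = 0. \qquad (1.19)$$

The line integral of a potential vector along the path $L$ between two points $\mathbf{r}_1$
and $\mathbf{r}_2$ is

$$\int_L \nabla\varphi\cdot d\mathbf{l} = \varphi(\mathbf{r}_2) - \varphi(\mathbf{r}_1), \qquad (1.20)$$

where $\varphi(\mathbf{r}_1)$ and $\varphi(\mathbf{r}_2)$ are the potentials at the end points of the line $L$.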
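The path independence expressed by (1.19) and (1.20) can be illustrated numerically. The sketch below integrates a gradient field along two different paths with the same end points; the potential `phi`, the two paths and the number of steps are illustrative assumptions.

```python
import math

def phi(x, y, z):
    # Hypothetical potential.
    return x * y + z * z

def grad_phi(x, y, z):
    # Analytic gradient of the potential above.
    return (y, x, 2 * z)

def line_integral(field, path, n=2000):
    """Midpoint-rule approximation to the line integral of field along path(t), 0 <= t <= 1."""
    total = 0.0
    for i in range(n):
        t0, t1 = i / n, (i + 1) / n
        p0, p1 = path(t0), path(t1)
        a = field(*path(0.5 * (t0 + t1)))
        total += sum(ai * (q1 - q0) for ai, q0, q1 in zip(a, p0, p1))
    return total

r1 = (0.0, 0.0, 0.0)
r2 = (1.0, 2.0, 3.0)

def straight(t):
    return (t, 2 * t, 3 * t)

def curved(t):
    # A different path joining the same end points.
    return (t ** 2, 2 * math.sin(0.5 * math.pi * t), 3 * t ** 3)

via_straight = line_integral(grad_phi, straight)
via_curved = line_integral(grad_phi, curved)
potential_difference = phi(*r2) - phi(*r1)
```

Both integrals should agree with the potential difference, so the circulation around the closed contour formed by the two paths is zero to within the discretization error.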

The scalar field is represented geometrically by plotting equipotential surfaces
$\varphi = \text{const}$. The vector $\nabla\varphi$ is normal to the surface $\varphi = \text{const}$. The more
rapid the variation of the function $\varphi$, the larger $|\nabla\varphi|$ and the smaller the
separation between the equipotential surfaces.

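The integral

$$\int \varphi(\mathbf{r})\,d\mathbf{S} = \int \varphi\,\mathbf{n}\,dS = \lim_{N\to\infty}\sum_{j=1}^{N}\varphi_j\,\Delta\mathbf{S}_j$$

is called the surface integral of the function $\varphi(\mathbf{r})$. The integral over a closed
surface is denoted in this book by $\oint \varphi\,d\mathbf{S}$.

Let us consider the integral over the surface of an infinitesimal parallelepiped
of volume $V \to 0$. We assume that the faces of the parallelepiped are
infinitesimal areas $(dx\,dy)$, $(dx\,dz)$, $(dy\,dz)$ in the coordinate planes $(xy)$,
$(xz)$ and $(yz)$ at the origin.

We have, obviously,

$$\oint_{V\to 0}\varphi\,d\mathbf{S} = \mathbf{i}\{\varphi(0+dx)\,dy\,dz - \varphi(0)\,dy\,dz\} + \mathbf{j}\{\varphi(0+dy)\,dx\,dz - \varphi(0)\,dx\,dz\} + \mathbf{k}\{\varphi(0+dz)\,dx\,dy - \varphi(0)\,dx\,dy\}$$
$$= \left(\mathbf{i}\frac{\partial\varphi}{\partial x} + \mathbf{j}\frac{\partial\varphi}{\partial y} + \mathbf{k}\frac{\partial\varphi}{\partial z}\right)dx\,dy\,dz = \left(\mathbf{i}\frac{\partial\varphi}{\partial x} + \mathbf{j}\frac{\partial\varphi}{\partial y} + \mathbf{k}\frac{\partial\varphi}{\partial z}\right)V,$$

whence we find the equality

$$\nabla\varphi = \mathbf{i}\frac{\partial\varphi}{\partial x} + \mathbf{j}\frac{\partial\varphi}{\partial y} + \mathbf{k}\frac{\partial\varphi}{\partial z} = \lim_{V\to 0}\frac{1}{V}\oint\varphi\,d\mathbf{S}. \qquad (1.21)$$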





Formula (1.21) allows one to give another, integral, definition of the
Hamiltonian operator:

$$\nabla(\ldots) = \lim_{V\to 0}\frac{1}{V}\oint d\mathbf{S}\,(\ldots), \qquad (1.22)$$

which is equivalent to (1.11).

We emphasize that, since the integration surface reduces to a point as V 
tends to zero, the integral operator (1.22) does not depend on the form of this 
surface. 

The following integral relation results from (1.21):

$$\oint \varphi\,d\mathbf{S} = \int \nabla\varphi\,dV. \qquad (1.23)$$

This relation connects the surface integral of the scalar $\varphi$ with the volume
integral of the vector $\nabla\varphi$. The volume of integration on the right-hand side of
(1.23) is bounded by the surface over which the surface integration on the
left-hand side of (1.23) is carried out.

In order to prove formula (1.23), we divide the finite volume into infinite- 
simal volumes, formula (1.21) being valid for each of these. We carry out the 
summation over all these volumes. The integration over all internal surfaces 
which represent interfaces of the volumes is carried out twice. In this case the 
directions of the external normals will be opposite, and the integrals over the 
internal surfaces will cancel out. There will remain only the integrals over all 
external surfaces, which form in the sum the integral over the surface bound- 
ing the volume V. 


Vector field 


The region of space in which the value of a vector $\mathbf{a}(\mathbf{r})$ is defined at each
point is called a vector field. The vector field is represented graphically by
means of field lines. The vector $\mathbf{a}(\mathbf{r}_0)$ is in the direction of a tangent to the
field line at the point $\mathbf{r}_0$. The field lines are drawn in such a way that the
density of lines is proportional to the absolute value $|\mathbf{a}|$.

For the vector field one can define the flux of the vector through an area
characterized by the vector $d\mathbf{S}$ in the form

$$dj = \mathbf{a}\cdot d\mathbf{S}. \qquad (1.24)$$





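By the surface integral of a vector $\mathbf{a}$ we mean the quantity

$$j = \int_S \mathbf{a}\cdot d\mathbf{S} = \int_S \mathbf{a}\cdot\mathbf{n}\,dS = \int_S a_n\,dS = \int a_x\,dy\,dz + \int a_y\,dx\,dz + \int a_z\,dx\,dy, \qquad (1.25)$$

where $dy\,dz = dS\cos(\mathbf{n},\mathbf{i})$, and so on.

The surface integral represents the flux of the vector $\mathbf{a}$ through the surface $S$.

If the surface $S$ is closed, then the surface integral is denoted by $\oint \mathbf{a}\cdot d\mathbf{S}$.
For the surface integral over a closed surface the Gauss-Ostrogradsky theorem

$$\oint \mathbf{a}\cdot d\mathbf{S} = \int\left(\frac{\partial a_x}{\partial x} + \frac{\partial a_y}{\partial y} + \frac{\partial a_z}{\partial z}\right)dV \qquad (1.26)$$

holds.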

The integration in (1.26) is carried out over the volume bounded by the 
surface of integration of the surface integral. 

The proof is obtained by dividing the volume into infinitesimal volumes.
It can formally be performed by means of the definition (1.22) of the
Hamiltonian operator. Namely,

$$\nabla\cdot\mathbf{a} = \lim_{V\to 0}\frac{1}{V}\oint \mathbf{a}\cdot\mathbf{n}\,dS, \qquad (1.27)$$

whence, as in deriving (1.23), summing the elementary volumes we obtain

$$\int \nabla\cdot\mathbf{a}\,dV = \oint \mathbf{a}\cdot d\mathbf{S}.$$

Making use of the definition (1.11), we arrive at theorem (1.26).
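The scalar quantity which may be expressed by two equivalent representations

$$\nabla\cdot\mathbf{a} = \frac{\partial a_x}{\partial x} + \frac{\partial a_y}{\partial y} + \frac{\partial a_z}{\partial z}, \qquad (1.28)$$

$$\nabla\cdot\mathbf{a} = \lim_{V\to 0}\frac{1}{V}\oint a_n\,dS, \qquad (1.29)$$

which are based on different forms of the Hamiltonian operator is called the
divergence of the vector $\mathbf{a}$. The Gauss-Ostrogradsky theorem can be rewritten
in the form

$$\oint \mathbf{a}\cdot d\mathbf{S} = \int \nabla\cdot\mathbf{a}\,dV. \qquad (1.30)$$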
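A simple numerical illustration of the Gauss-Ostrogradsky theorem (1.30) is sketched below for the unit cube. The vector field `a` is a hypothetical example, and the midpoint quadrature and finite-difference divergence are assumptions made only for this check.

```python
def a(x, y, z):
    # Hypothetical sample field; its divergence is 3x + y.
    return (x * x, x * y, y * z)

def div_a(x, y, z, h=1e-5):
    """Central-difference approximation to the divergence (1.28)."""
    return ((a(x + h, y, z)[0] - a(x - h, y, z)[0]) +
            (a(x, y + h, z)[1] - a(x, y - h, z)[1]) +
            (a(x, y, z + h)[2] - a(x, y, z - h)[2])) / (2 * h)

def volume_integral(n=20):
    """Integral of div a over the unit cube (midpoint rule)."""
    d = 1.0 / n
    pts = [(i + 0.5) * d for i in range(n)]
    return sum(div_a(x, y, z) for x in pts for y in pts for z in pts) * d ** 3

def surface_flux(n=20):
    """Outward flux of a through the six faces of the unit cube (midpoint rule)."""
    d = 1.0 / n
    pts = [(i + 0.5) * d for i in range(n)]
    flux = 0.0
    for u in pts:
        for v in pts:
            flux += a(1.0, u, v)[0] - a(0.0, u, v)[0]  # faces x = 1 and x = 0
            flux += a(u, 1.0, v)[1] - a(u, 0.0, v)[1]  # faces y = 1 and y = 0
            flux += a(u, v, 1.0)[2] - a(u, v, 0.0)[2]  # faces z = 1 and z = 0
    return flux * d * d

volume_side = volume_integral()
surface_side = surface_flux()
```

For this field both sides equal 2, the flux coming from the faces $x = 1$, $y = 1$ and $z = 1$.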


The divergence plays an important role in the theory of the vector field (see 
§2 of Part I of the book). 

From expression (1.29) it follows that $\nabla\cdot\mathbf{a}$ represents the flux per unit
volume of the vector $\mathbf{a}(\mathbf{r})$ through an infinitesimal surface surrounding the
given point $\mathbf{r}$ of the field.

A vector field is called solenoidal, if at each point of the field

$$\nabla\cdot\mathbf{a} = 0.$$

This means that the flux of the vector through a transverse section of a tube
formed by a group of field lines has a constant value along the tube. At those
points of the field at which $\nabla\cdot\mathbf{a} \neq 0$ there are sources ($\nabla\cdot\mathbf{a} > 0$) or sinks
($\nabla\cdot\mathbf{a} < 0$) of the field. The numerical value of $\nabla\cdot\mathbf{a}$ is called the intensity or
abundance of sources of the field.

The following obvious formulae hold: 


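$$\nabla\cdot(\mathbf{a}_1 + \mathbf{a}_2) = \nabla\cdot\mathbf{a}_1 + \nabla\cdot\mathbf{a}_2,$$
$$\nabla\cdot(c\,\mathbf{a}) = c\,(\nabla\cdot\mathbf{a}) \quad (c = \text{const}), \qquad (1.31)$$
$$\nabla\cdot\mathbf{a}(r) = -\nabla_0\cdot\mathbf{a}(r),$$

where $r = \sqrt{(x-x_0)^2 + (y-y_0)^2 + (z-z_0)^2}$, and the index 0 denotes
differentiation with respect to the coordinates $x_0$, $y_0$, $z_0$.
Let $\mathbf{a}(u)$ be a vector depending only on the scalar quantity $u$. Then

$$\nabla\cdot\mathbf{a}(u) = \frac{d\mathbf{a}}{du}\cdot\nabla u = \dot{\mathbf{a}}\cdot\nabla u, \qquad (1.32)$$

where the dot over $\mathbf{a}$ denotes differentiation with respect to the argument $u$.
Besides the operation $\nabla\cdot\mathbf{a}$ corresponding to the scalar product of the
vectors $\nabla$ and $\mathbf{a}$, one can consider the operation forming the vector

$$\nabla\times\mathbf{a} = \operatorname{curl}\mathbf{a}. \qquad (1.33)$$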





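The vector representing the cross product of the Hamiltonian operator $\nabla$ and
the vector $\mathbf{a}$ is called the curl of the vector $\mathbf{a}$.

A calculation according to formula (1.2), taking into account (1.11), gives

$$\nabla\times\mathbf{a} = \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ \dfrac{\partial}{\partial x} & \dfrac{\partial}{\partial y} & \dfrac{\partial}{\partial z} \\ a_x & a_y & a_z \end{vmatrix} = \mathbf{i}\left(\frac{\partial a_z}{\partial y} - \frac{\partial a_y}{\partial z}\right) + \mathbf{j}\left(\frac{\partial a_x}{\partial z} - \frac{\partial a_z}{\partial x}\right) + \mathbf{k}\left(\frac{\partial a_y}{\partial x} - \frac{\partial a_x}{\partial y}\right). \qquad (1.34)$$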


Making use now of the integral expression (1.22) for the Hamiltonian 
operator, we have another representation of the vector: 





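$$\nabla\times\mathbf{a} = \lim_{V\to 0}\frac{1}{V}\oint \mathbf{n}\times\mathbf{a}\,dS. \qquad (1.35)$$

From the definition (1.35), in an analogous way to (1.23) and (1.30), it
follows that

$$\int \nabla\times\mathbf{a}\,dV = \oint d\mathbf{S}\times\mathbf{a}. \qquad (1.36)$$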


Let us now consider the projection of the vector $\nabla\times\mathbf{a}$ on to an arbitrary
direction characterized by the unit vector $\mathbf{N}$. From the definition (1.35) it follows
that

$$\mathbf{N}\cdot(\nabla\times\mathbf{a}) = \lim_{V\to 0}\frac{1}{V}\oint \mathbf{N}\cdot(\mathbf{n}\times\mathbf{a})\,dS = \lim_{V\to 0}\frac{1}{V}\oint \mathbf{a}\cdot(\mathbf{N}\times\mathbf{n})\,dS.$$

Since the result obtained after passing to the limit does not depend on the
form of the surface, the latter can be chosen arbitrarily. If we direct the z-axis
along $\mathbf{N}$ and choose a cylinder with base $S$ and height $h$ as the surface, then

$$\oint \mathbf{a}\cdot(\mathbf{N}\times\mathbf{n})\,dS = h\oint \mathbf{a}\cdot d\mathbf{l}.$$

The integration in the expression on the right is carried out with respect to
the lateral surface of the cylinder, since at its bases $\mathbf{n} \parallel \mathbf{N}$ and their cross
product reduces to zero. A length element at the lateral surface perpendicular
to the vectors $\mathbf{n}$ and $\mathbf{N}$ is denoted by $d\mathbf{l}$.





Hence it follows that

$$\mathbf{N}\cdot(\nabla\times\mathbf{a}) = \lim_{V\to 0}\frac{h}{Sh}\oint \mathbf{a}\cdot d\mathbf{l} = \lim_{S\to 0}\frac{1}{S}\oint \mathbf{a}\cdot d\mathbf{l}.$$

Dividing an arbitrary volume into small volumes as previously and carrying
out the summation over them, we find that the surface integrals over the
internal surfaces and the line integrals over the edges of adjacent cells cancel
out, so that

$$\oint \mathbf{a}\cdot d\mathbf{l} = \int (\nabla\times\mathbf{a})\cdot d\mathbf{S}. \qquad (1.37)$$

Formula (1.37), connecting the line integral of the vector $\mathbf{a}$ over an arbitrary
closed contour with the surface integral of $\nabla\times\mathbf{a}$, is called Stokes'
theorem.

From the derivation of the theorem it is clear that the integration is carried 
out over an arbitrary surface bounded by the contour of integration. 
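Stokes' theorem can be illustrated with a small numerical check on the unit square in the $xy$-plane, traversed counterclockwise. The field `a`, the contour and the quadrature below are assumptions chosen for the example, not taken from the text.

```python
def a(x, y, z):
    # Hypothetical sample field; (curl a)_z = 2x + 1.
    return (-y, x * x, 0.0)

def curl_z(x, y, z, h=1e-5):
    """z-component of the curl, by central differences."""
    day_dx = (a(x + h, y, z)[1] - a(x - h, y, z)[1]) / (2 * h)
    dax_dy = (a(x, y + h, z)[0] - a(x, y - h, z)[0]) / (2 * h)
    return day_dx - dax_dy

def circulation(n=400):
    """Line integral of a around the boundary of the unit square (midpoint rule)."""
    corners = [(0, 0, 0), (1, 0, 0), (1, 1, 0), (0, 1, 0), (0, 0, 0)]
    total = 0.0
    for p, q in zip(corners, corners[1:]):
        dl = tuple((qc - pc) / n for pc, qc in zip(p, q))
        for i in range(n):
            t = (i + 0.5) / n
            mid = tuple(pc + t * (qc - pc) for pc, qc in zip(p, q))
            total += sum(ac * dc for ac, dc in zip(a(*mid), dl))
    return total

def curl_flux(n=100):
    """Flux of curl a through the unit square, with normal along +z."""
    d = 1.0 / n
    pts = [(i + 0.5) * d for i in range(n)]
    return sum(curl_z(x, y, 0.0) for x in pts for y in pts) * d * d

line_side = circulation()
surface_side = curl_flux()
```

With the contour traversed counterclockwise and the normal along $+z$, both sides equal 2 for this field.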

The geometrical meaning of the idea of curl becomes clear from Stokes’ 
theorem. In order that the integral of the vector a over a closed path be differ- 
ent from zero it is necessary that certain lines (even field lines) have the 
character of closed curves. Such lines, for example, are the lines of the vector 
of the velocity of rotation of a solid body, or the flow lines of a liquid
performing a circulatory motion. Hence the term curl.

From Stokes’ theorem it follows that: 

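1) If $\mathbf{a} = \nabla\varphi$, then $\oint \mathbf{a}\cdot d\mathbf{l} = 0$. Consequently,

$$\nabla\times\mathbf{a} = \nabla\times\nabla\varphi = 0, \qquad (1.38)$$

i.e. if $\mathbf{a}$ is a potential vector, then the field of the vector $\mathbf{a}$ is irrotational.
Conversely, any vector whose field is irrotational is a potential vector.

2) If the vector $\mathbf{a}$ is solenoidal, so that

$$\nabla\cdot\mathbf{a} = 0,$$

then it can be written in the form of the curl of a certain vector $\mathbf{c}$:

$$\mathbf{a} = \nabla\times\mathbf{c}.$$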


Conversely, if a vector field a can be written as the curl of a field c, then the 
field a has solenoidal character. 





$$\nabla\cdot(\nabla\times\mathbf{c}) = 0.$$

Sources and sinks are absent in a solenoidal field. The proof of these
statements is obtained by a direct calculation. For example,

$$\nabla\cdot(\nabla\times\mathbf{c}) = \mathbf{c}\cdot(\nabla\times\nabla) = 0.$$


The formation of the scalar $\nabla\cdot\mathbf{a}$ and the vector $\nabla\times\mathbf{a}$ are the basic
operations of differentiation of a vector.

The divergence and curl of a vector $\mathbf{a}$ determine the vector field (see §2 of
Part I).

In calculating the divergence and curl of a vector $\mathbf{a}(u)$ which depends on
the scalar argument $u$ it turns out that

$$\nabla\cdot\mathbf{a}(u) = \nabla u\cdot\frac{d\mathbf{a}}{du} = \nabla u\cdot\dot{\mathbf{a}}, \qquad (1.39)$$

$$\nabla\times\mathbf{a}(u) = \nabla u\times\frac{d\mathbf{a}}{du} = (\nabla u)\times\dot{\mathbf{a}}, \qquad (1.40)$$

where the dot over $\mathbf{a}$ denotes differentiation with respect to the scalar $u$.

Calculation of the derivatives of a product and repeated derivatives. These 
operations are carried out most simply by means of the Hamiltonian operator. 
In this case the following two rules are to be adhered to: 

1. The Hamiltonian operator must act successively on each scalar and vec- 
tor following it. 

2. The Hamiltonian operator can be treated as an ordinary vector, but it 
cannot be transposed with the quantity on which it acts. For clarity, in carry- 
ing out intermediate transformations we shall indicate the quantity acted 
upon by the Hamiltonian operator by a subscript, for example, $\nabla_\varphi$, $\nabla_{\mathbf{a}}$.

The most important examples are: 


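1. $\nabla(\varphi\psi) = \psi\nabla_\varphi\varphi + \varphi\nabla_\psi\psi = \psi\nabla\varphi + \varphi\nabla\psi. \qquad (1.41)$

2. $\nabla\cdot(\varphi\mathbf{a}) = \mathbf{a}\cdot\nabla_\varphi\varphi + \varphi\nabla_{\mathbf{a}}\cdot\mathbf{a} = \mathbf{a}\cdot\nabla\varphi + \varphi\nabla\cdot\mathbf{a}. \qquad (1.42)$

In particular

$$\nabla\times(\varphi\mathbf{a}) = \varphi(\nabla_{\mathbf{a}}\times\mathbf{a}) + (\nabla_\varphi\varphi)\times\mathbf{a} = \varphi(\nabla\times\mathbf{a}) + (\nabla\varphi)\times\mathbf{a}. \qquad (1.43)$$

3. $\nabla\cdot(\mathbf{a}\times\mathbf{b}) = \nabla_{\mathbf{a}}\cdot(\mathbf{a}\times\mathbf{b}) + \nabla_{\mathbf{b}}\cdot(\mathbf{a}\times\mathbf{b})$
$= \mathbf{b}\cdot(\nabla_{\mathbf{a}}\times\mathbf{a}) - \nabla_{\mathbf{b}}\cdot(\mathbf{b}\times\mathbf{a})$
$= \mathbf{b}\cdot(\nabla_{\mathbf{a}}\times\mathbf{a}) - \mathbf{a}\cdot(\nabla_{\mathbf{b}}\times\mathbf{b})$
$= \mathbf{b}\cdot(\nabla\times\mathbf{a}) - \mathbf{a}\cdot(\nabla\times\mathbf{b}). \qquad (1.44)$

We have carried out a cyclic permutation of the vectors. In the second
term the order of vector multiplication has been changed. Otherwise rule 2
would be violated in the cyclic permutation: the vector $\mathbf{b}$ would be moved
behind the symbol $\nabla_{\mathbf{b}}$.

4. $\nabla\times(\mathbf{a}\times\mathbf{b}) = \nabla_{\mathbf{a}}\times(\mathbf{a}\times\mathbf{b}) + \nabla_{\mathbf{b}}\times(\mathbf{a}\times\mathbf{b})$
$= (\nabla_{\mathbf{a}}\cdot\mathbf{b})\mathbf{a} - (\nabla_{\mathbf{a}}\cdot\mathbf{a})\mathbf{b} + (\nabla_{\mathbf{b}}\cdot\mathbf{b})\mathbf{a} - (\nabla_{\mathbf{b}}\cdot\mathbf{a})\mathbf{b}$
$= (\mathbf{b}\cdot\nabla_{\mathbf{a}})\mathbf{a} - \mathbf{b}(\nabla_{\mathbf{a}}\cdot\mathbf{a}) + \mathbf{a}(\nabla_{\mathbf{b}}\cdot\mathbf{b}) - (\mathbf{a}\cdot\nabla_{\mathbf{b}})\mathbf{b}$
$= (\mathbf{b}\cdot\nabla)\mathbf{a} - (\mathbf{a}\cdot\nabla)\mathbf{b} + \mathbf{a}(\nabla\cdot\mathbf{b}) - \mathbf{b}(\nabla\cdot\mathbf{a}). \qquad (1.45)$

Here $\mathbf{a}\cdot\nabla$ is the scalar differential operator

$$\mathbf{a}\cdot\nabla = a_x\frac{\partial}{\partial x} + a_y\frac{\partial}{\partial y} + a_z\frac{\partial}{\partial z}. \qquad (1.46)$$

5. $\nabla(\mathbf{a}\cdot\mathbf{b}) = \nabla_{\mathbf{a}}(\mathbf{a}\cdot\mathbf{b}) + \nabla_{\mathbf{b}}(\mathbf{a}\cdot\mathbf{b})$
$= (\mathbf{b}\cdot\nabla_{\mathbf{a}})\mathbf{a} + \mathbf{b}\times(\nabla_{\mathbf{a}}\times\mathbf{a}) + (\mathbf{a}\cdot\nabla_{\mathbf{b}})\mathbf{b} + \mathbf{a}\times(\nabla_{\mathbf{b}}\times\mathbf{b})$
$= (\mathbf{b}\cdot\nabla)\mathbf{a} + (\mathbf{a}\cdot\nabla)\mathbf{b} + \mathbf{b}\times(\nabla\times\mathbf{a}) + \mathbf{a}\times(\nabla\times\mathbf{b}). \qquad (1.47)$

6. $\tfrac{1}{2}\nabla a^2 = (\mathbf{a}\cdot\nabla)\mathbf{a} + \mathbf{a}\times(\nabla\times\mathbf{a}). \qquad (1.48)$

7. $\nabla\cdot\nabla\varphi = \nabla^2\varphi = \dfrac{\partial^2\varphi}{\partial x^2} + \dfrac{\partial^2\varphi}{\partial y^2} + \dfrac{\partial^2\varphi}{\partial z^2}. \qquad (1.49)$

8. $\nabla\times(\nabla\times\mathbf{a}) = \nabla(\nabla\cdot\mathbf{a}) - (\nabla\cdot\nabla)\mathbf{a} = \nabla(\nabla\cdot\mathbf{a}) - \nabla^2\mathbf{a}. \qquad (1.50)$

9. $\nabla\cdot(\mathbf{a}\mathbf{b}) = \mathbf{b}(\nabla\cdot\mathbf{a}) + (\mathbf{a}\cdot\nabla)\mathbf{b}. \qquad (1.51)$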


The product (ab) of the two vectors a and b is known as dyadic. The proper- 
ties of dyadics are treated in books on vector analysis. 
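Identities of this kind lend themselves to a quick numerical spot check. As a sketch, the code below tests the formula for the divergence of a cross product, $\nabla\cdot(\mathbf{a}\times\mathbf{b}) = \mathbf{b}\cdot(\nabla\times\mathbf{a}) - \mathbf{a}\cdot(\nabla\times\mathbf{b})$, with central differences. The fields `a` and `b`, the test point and the helper names are all hypothetical.

```python
import math

def a(x, y, z):
    return (y * z, math.sin(x), x * x * y)

def b(x, y, z):
    return (x * y, z * z, math.cos(y))

def dot(u, v):
    return sum(s * t for s, t in zip(u, v))

def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def partial(f, p, axis, h=1e-5):
    """Central-difference derivative of the vector field f along one coordinate axis."""
    lo, hi = list(p), list(p)
    lo[axis] -= h
    hi[axis] += h
    return tuple((u - v) / (2 * h) for u, v in zip(f(*hi), f(*lo)))

def div(f, p):
    return sum(partial(f, p, i)[i] for i in range(3))

def curl(f, p):
    dx, dy, dz = (partial(f, p, i) for i in range(3))
    return (dy[2] - dz[1], dz[0] - dx[2], dx[1] - dy[0])

def a_cross_b(x, y, z):
    return cross(a(x, y, z), b(x, y, z))

p = (0.7, -0.4, 1.3)
lhs = div(a_cross_b, p)
rhs = dot(b(*p), curl(a, p)) - dot(a(*p), curl(b, p))
```

The same helpers could be reused to spot-check the other product rules in the list, for instance the formula for $\nabla\times(\nabla\times\mathbf{a})$, at the cost of second differences.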
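From the integral representation of the Hamiltonian operator the formula

$$\nabla\cdot(\mathbf{a}\mathbf{b}) = \lim_{V\to 0}\frac{1}{V}\oint (\mathbf{n}\cdot\mathbf{a})\,\mathbf{b}\,dS$$

follows, from which one obtains directly

$$\int \nabla\cdot(\mathbf{a}\mathbf{b})\,dV = \oint (\mathbf{n}\cdot\mathbf{a})\,\mathbf{b}\,dS, \qquad (1.52)$$

or by virtue of (1.51),

$$\oint (\mathbf{n}\cdot\mathbf{a})\,\mathbf{b}\,dS = \int \mathbf{b}(\nabla\cdot\mathbf{a})\,dV + \int (\mathbf{a}\cdot\nabla)\mathbf{b}\,dV. \qquad (1.53)$$

We obtain also another integral equality:

$$\int \mathbf{b}\times(\nabla\times\mathbf{a})\,dV + \int (\mathbf{a}\times\nabla)\times\mathbf{b}\,dV = -\oint (\mathbf{n}\times\mathbf{a})\times\mathbf{b}\,dS. \qquad (1.54)$$

From the integral representation of the Hamiltonian operator the formula

$$(\nabla\times\mathbf{a})\times\mathbf{b} = \lim_{V\to 0}\frac{1}{V}\oint (\mathbf{n}\times\mathbf{a})\times\mathbf{b}\,dS$$

follows, whence

$$\int (\nabla\times\mathbf{a})\times\mathbf{b}\,dV = \oint (\mathbf{n}\times\mathbf{a})\times\mathbf{b}\,dS. \qquad (1.55)$$

On the other hand, we have

$$(\nabla\times\mathbf{a})\times\mathbf{b} = (\nabla_{\mathbf{a}}\times\mathbf{a})\times\mathbf{b} + (\nabla_{\mathbf{b}}\times\mathbf{a})\times\mathbf{b} = -\mathbf{b}\times(\nabla\times\mathbf{a}) - (\mathbf{a}\times\nabla)\times\mathbf{b}.$$

Substituting this into (1.55), we arrive at (1.54).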



















Representation of vector operations in curvilinear coordinates 


In addition to Cartesian coordinates it is often convenient to make use of
curvilinear coordinates $q_1$, $q_2$, $q_3$. To each point $\mathbf{r}$ there corresponds a set of
quantities $q_1$, $q_2$ and $q_3$, i.e.

$$\mathbf{r} = \mathbf{r}(q_1, q_2, q_3). \qquad (1.56)$$


Since vector operations are not connected with any particular coordinate 
system, the relations of vector analysis remain valid in all coordinate represen- 
tations. However, the actual expression of vector operations in curvilinear 
coordinates is, naturally, not the same as that in Cartesian coordinates. 

In this book only orthogonal coordinates are used. Let us call the three
surfaces

$$q_i = q_i(x, y, z) = \text{const} \quad (i = 1, 2, 3) \qquad (1.57)$$

the coordinate surfaces, and the lines of their intersection the coordinate
lines. It is obvious that the set of three coordinate surfaces is a generalization
of the coordinate trihedron in the Cartesian coordinate system. The directions
of the coordinate lines are characterized by unit vectors $\mathbf{e}_1$, $\mathbf{e}_2$, $\mathbf{e}_3$ (fig. A.2).
Reference frames in which the vectors $\mathbf{e}_1$, $\mathbf{e}_2$, $\mathbf{e}_3$ are mutually perpendicular
are called orthogonal coordinate systems.

[Fig. A.2]


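The derivative with respect to the coordinate $q_i$ is equal to

$$\frac{\partial\mathbf{r}}{\partial q_i} = \frac{\partial\mathbf{r}(q_1, q_2, q_3)}{\partial q_i} = H_i\mathbf{e}_i \quad (i = 1, 2, 3). \qquad (1.58)$$

The vector $\partial\mathbf{r}/\partial q_i$ is directed along a tangent to the coordinate line $q_i$, and
its value $H_i$ is a function of the coordinates $q_1$, $q_2$, $q_3$. Obviously,

$$H_i^2 = \left(\frac{\partial\mathbf{r}}{\partial q_i}\right)^2 = \left(\frac{\partial x}{\partial q_i}\right)^2 + \left(\frac{\partial y}{\partial q_i}\right)^2 + \left(\frac{\partial z}{\partial q_i}\right)^2. \qquad (1.59)$$

The three quantities $H_i$ ($i = 1, 2, 3$) are called the Lamé coefficients. By
means of them one can write

$$d\mathbf{r} = H_1\,dq_1\,\mathbf{e}_1 + H_2\,dq_2\,\mathbf{e}_2 + H_3\,dq_3\,\mathbf{e}_3, \qquad (1.60)$$

or, in components,

$$(d\mathbf{r})_1 = H_1\,dq_1; \quad (d\mathbf{r})_2 = H_2\,dq_2; \quad (d\mathbf{r})_3 = H_3\,dq_3. \qquad (1.61)$$

Squaring (1.60), we find the square of the length in orthogonal coordinates

$$(d\mathbf{r})^2 = ds^2 = H_1^2\,dq_1^2 + H_2^2\,dq_2^2 + H_3^2\,dq_3^2. \qquad (1.62)$$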


An element of area is easily found by considering an infinitesimal parallel- 
epiped formed by the coordinate surfaces. The areas of its faces are 


$$dS_1 = H_2H_3\,dq_2\,dq_3, \quad dS_2 = H_1H_3\,dq_1\,dq_3, \quad dS_3 = H_1H_2\,dq_1\,dq_2. \qquad (1.63)$$

The volume of the parallelepiped is equal to

$$dV = H_1H_2H_3\,dq_1\,dq_2\,dq_3. \qquad (1.64)$$


The most important orthogonal coordinates are cylindrical and spherical 
polar coordinates. 
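In spherical polar coordinates

$$q_1 = r, \quad q_2 = \theta, \quad q_3 = \psi. \qquad (1.65)$$

Obviously, we have

$$(d\mathbf{r})_1 = dr, \quad (d\mathbf{r})_2 = r\,d\theta, \quad (d\mathbf{r})_3 = r\sin\theta\,d\psi. \qquad (1.66)$$

Comparison with (1.61) gives

$$H_r = 1, \quad H_\theta = r, \quad H_\psi = r\sin\theta; \qquad dV = r^2\sin\theta\,dr\,d\theta\,d\psi. \qquad (1.67)$$

In the cylindrical polar coordinate system

$$q_1 = \rho, \quad q_2 = \psi, \quad q_3 = z; \qquad (d\mathbf{r})_1 = d\rho, \quad (d\mathbf{r})_2 = \rho\,d\psi, \quad (d\mathbf{r})_3 = dz, \qquad (1.68)$$

and from (1.61)

$$H_\rho = 1, \quad H_\psi = \rho, \quad H_z = 1; \qquad dV = \rho\,d\rho\,d\psi\,dz. \qquad (1.69)$$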
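The Lamé coefficients can also be obtained numerically from their definition (1.59) by differentiating the Cartesian coordinates with respect to each $q_i$. The sketch below does this for spherical polar coordinates; the function names, the sample point and the step size are assumptions made for the illustration.

```python
import math

def spherical_to_cartesian(r, theta, psi):
    # x, y, z as functions of q1 = r, q2 = theta (polar angle), q3 = psi (azimuth).
    return (r * math.sin(theta) * math.cos(psi),
            r * math.sin(theta) * math.sin(psi),
            r * math.cos(theta))

def lame_coefficients(transform, q, h=1e-6):
    """H_i = |d r / d q_i|, evaluated by central differences."""
    coefficients = []
    for i in range(3):
        lo, hi = list(q), list(q)
        lo[i] -= h
        hi[i] += h
        d = [(u - v) / (2 * h) for u, v in zip(transform(*hi), transform(*lo))]
        coefficients.append(math.sqrt(sum(c * c for c in d)))
    return coefficients

r, theta, psi = 2.0, 0.6, 1.1
H_r, H_theta, H_psi = lame_coefficients(spherical_to_cartesian, (r, theta, psi))
```

The three values should reproduce $1$, $r$ and $r\sin\theta$; passing a different `transform` (for example a cylindrical one) gives the corresponding coefficients in the same way.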


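Expressions for vector operations are obtained from their definitions, taking
into account the relations which have been written out:

1. $\nabla\varphi = \dfrac{\mathbf{e}_1}{H_1}\dfrac{\partial\varphi}{\partial q_1} + \dfrac{\mathbf{e}_2}{H_2}\dfrac{\partial\varphi}{\partial q_2} + \dfrac{\mathbf{e}_3}{H_3}\dfrac{\partial\varphi}{\partial q_3}. \qquad (1.70)$

In particular, in spherical and cylindrical polar coordinate systems we have
respectively:

$$\nabla\varphi = \mathbf{e}_r\frac{\partial\varphi}{\partial r} + \frac{\mathbf{e}_\theta}{r}\frac{\partial\varphi}{\partial\theta} + \frac{\mathbf{e}_\psi}{r\sin\theta}\frac{\partial\varphi}{\partial\psi}, \qquad (1.71)$$

$$\nabla\varphi = \mathbf{e}_\rho\frac{\partial\varphi}{\partial\rho} + \frac{\mathbf{e}_\psi}{\rho}\frac{\partial\varphi}{\partial\psi} + \mathbf{e}_z\frac{\partial\varphi}{\partial z}. \qquad (1.72)$$

2. $\nabla\cdot\mathbf{a} = \dfrac{1}{H_1H_2H_3}\left[\dfrac{\partial}{\partial q_1}(a_1H_2H_3) + \dfrac{\partial}{\partial q_2}(a_2H_3H_1) + \dfrac{\partial}{\partial q_3}(a_3H_1H_2)\right]. \qquad (1.73)$

In spherical and cylindrical polar coordinates respectively:

$$\nabla\cdot\mathbf{a} = \frac{1}{r^2}\frac{\partial(r^2a_r)}{\partial r} + \frac{1}{r\sin\theta}\frac{\partial(\sin\theta\,a_\theta)}{\partial\theta} + \frac{1}{r\sin\theta}\frac{\partial a_\psi}{\partial\psi}, \qquad (1.74)$$

$$\nabla\cdot\mathbf{a} = \frac{1}{\rho}\frac{\partial(\rho a_\rho)}{\partial\rho} + \frac{1}{\rho}\frac{\partial a_\psi}{\partial\psi} + \frac{\partial a_z}{\partial z}. \qquad (1.75)$$

3. $(\nabla\times\mathbf{a})_1 = \dfrac{1}{H_2H_3}\left[\dfrac{\partial(a_3H_3)}{\partial q_2} - \dfrac{\partial(a_2H_2)}{\partial q_3}\right]. \qquad (1.76)$

Other projections are obtained by the cyclic permutation of the coordinates
$1 \to 2 \to 3$.

In spherical and cylindrical coordinates respectively:

$$(\nabla\times\mathbf{a})_r = \frac{1}{r\sin\theta}\frac{\partial(a_\psi\sin\theta)}{\partial\theta} - \frac{1}{r\sin\theta}\frac{\partial a_\theta}{\partial\psi}, \qquad (1.77)$$

$$(\nabla\times\mathbf{a})_\theta = \frac{1}{r\sin\theta}\frac{\partial a_r}{\partial\psi} - \frac{1}{r}\frac{\partial(ra_\psi)}{\partial r}, \qquad (1.78)$$

$$(\nabla\times\mathbf{a})_\psi = \frac{1}{r}\frac{\partial(ra_\theta)}{\partial r} - \frac{1}{r}\frac{\partial a_r}{\partial\theta}, \qquad (1.79)$$

$$(\nabla\times\mathbf{a})_\rho = \frac{1}{\rho}\frac{\partial a_z}{\partial\psi} - \frac{\partial a_\psi}{\partial z}, \qquad (1.80)$$

$$(\nabla\times\mathbf{a})_\psi = \frac{\partial a_\rho}{\partial z} - \frac{\partial a_z}{\partial\rho}, \qquad (1.81)$$

$$(\nabla\times\mathbf{a})_z = \frac{1}{\rho}\frac{\partial(\rho a_\psi)}{\partial\rho} - \frac{1}{\rho}\frac{\partial a_\rho}{\partial\psi}. \qquad (1.82)$$

4. The Laplacian operator is

$$\nabla^2\varphi = \frac{1}{H_1H_2H_3}\left[\frac{\partial}{\partial q_1}\left(\frac{H_2H_3}{H_1}\frac{\partial\varphi}{\partial q_1}\right) + \frac{\partial}{\partial q_2}\left(\frac{H_3H_1}{H_2}\frac{\partial\varphi}{\partial q_2}\right) + \frac{\partial}{\partial q_3}\left(\frac{H_1H_2}{H_3}\frac{\partial\varphi}{\partial q_3}\right)\right]. \qquad (1.83)$$

In spherical and cylindrical coordinates respectively:

$$\nabla^2 = \frac{1}{r^2}\frac{\partial}{\partial r}\left(r^2\frac{\partial}{\partial r}\right) + \frac{1}{r^2\sin\theta}\frac{\partial}{\partial\theta}\left(\sin\theta\frac{\partial}{\partial\theta}\right) + \frac{1}{r^2\sin^2\theta}\frac{\partial^2}{\partial\psi^2}, \qquad (1.84)$$

$$\nabla^2 = \frac{1}{\rho}\frac{\partial}{\partial\rho}\left(\rho\frac{\partial}{\partial\rho}\right) + \frac{1}{\rho^2}\frac{\partial^2}{\partial\psi^2} + \frac{\partial^2}{\partial z^2}. \qquad (1.85)$$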
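For a function depending on $r$ only, the spherical form of the Laplacian reduces to its radial part, $\frac{1}{r^2}\frac{d}{dr}\bigl(r^2\frac{d\varphi}{dr}\bigr)$. The following sketch compares this with a Cartesian finite-difference Laplacian for the hypothetical test function $\varphi = r^3$, for which both should give $12r$; the step size and sample point are arbitrary choices.

```python
import math

def phi_radial(r):
    # Hypothetical radial test function.
    return r ** 3

def phi(x, y, z):
    return phi_radial(math.sqrt(x * x + y * y + z * z))

def cartesian_laplacian(f, p, h=1e-3):
    """Sum of second central differences along x, y and z."""
    x, y, z = p
    return (f(x + h, y, z) + f(x - h, y, z) +
            f(x, y + h, z) + f(x, y - h, z) +
            f(x, y, z + h) + f(x, y, z - h) - 6 * f(x, y, z)) / h ** 2

def radial_laplacian(g, r, h=1e-3):
    """Radial part of the spherical Laplacian: (1/r^2) d/dr (r^2 dg/dr)."""
    def flux(s):
        return s * s * (g(s + h) - g(s - h)) / (2 * h)
    return (flux(r + h) - flux(r - h)) / (2 * h * r * r)

p = (1.0, 2.0, 2.0)                 # a point at distance r = 3 from the origin
r = math.sqrt(sum(c * c for c in p))
cartesian_value = cartesian_laplacian(phi, p)
radial_value = radial_laplacian(phi_radial, r)
```

The two values agree up to discretization error of order $h^2$.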


Bibliography 


N.E.Kochin, Vektornoe ischislenie i nachalo tenzornovo ischisleniya (Vector calculus 
and the principles of tensor calculus) (Acad. Sci. USSR, 1963). 

Ya.I.Frenkel, Kurs teoreticheskoi mekhaniki (A course of theoretical mechanics)
(Gostekhizdat, 1940).

M.Schwartz, S.Green and W.A.Rutledge, Vector analysis (Harper, New York, 1962). 







APPENDIX II 


The Fourier Integral 











Any function which is periodic with period $T = 2\pi/\omega$, i.e. a function
satisfying the condition

$$f(t + T) = f\left(t + \frac{2\pi}{\omega}\right) = f(t),$$

can be expanded in a Fourier series:

$$f(t) = \sum_{n=0}^{\infty}(a_n\cos n\omega t + b_n\sin n\omega t) = \sum_{n=-\infty}^{\infty} f_n\,e^{in\omega t}.$$

The Fourier coefficients are given by the formula

$$f_n = \frac{\omega}{2\pi}\int_{-\pi/\omega}^{\pi/\omega} f(t)\,e^{-in\omega t}\,dt.$$

The expansion in a Fourier series means that an arbitrary periodic function
with a period $2\pi/\omega$ can be written in the form of a superposition (spectrum)
of an infinitely large number of monochromatic functions with periods $2\pi/\omega$,
$2\pi/2\omega$, ..., $2\pi/n\omega$ or frequencies $\omega$, $2\omega$, ..., $n\omega$ and so on.




Conditions under which the expansion in a Fourier series is possible are 
usually fulfilled in physical applications. 

Passing to the limit where the period increases indefinitely (i.e. $\omega \to 0$) and
the frequencies draw nearer together, one can obtain the expansion in a
Fourier integral:

$$f(t) = \int_{-\infty}^{\infty} F(\omega)\,e^{i\omega t}\,d\omega. \qquad (II.1)$$

The function $F(\omega)$, called the Fourier transform of the function $f(t)$, is given
by the formula

$$F(\omega) = \frac{1}{2\pi}\int_{-\infty}^{\infty} f(t)\,e^{-i\omega t}\,dt. \qquad (II.2)$$

The expansion in a Fourier integral is possible if the properties of $f(t)$ ensure
the convergence of (II.1)-(II.2). In physical applications $f(t)$ usually tends to
zero as $t \to \pm\infty$, which ensures the convergence of these expressions.

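The Fourier integral can be written in a more symmetric form:

$$f(t) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} e^{i\omega t}F(\omega)\,d\omega, \qquad (II.3)$$

$$F(\omega) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} e^{-i\omega t}f(t)\,dt. \qquad (II.4)$$

If $f(t)$ is a real function, then $F(\omega)$ is a complex function, and

$$F^*(\omega) = F(-\omega). \qquad (II.5)$$

Formulae (II.3) and (II.4) can be combined in the form

$$f(t) = \frac{1}{2\pi}\int_{-\infty}^{\infty} e^{i\omega t}\,d\omega\int_{-\infty}^{\infty} dt'\,e^{-i\omega t'}f(t'), \qquad (II.6)$$

which is also often called the Fourier integral.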
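As a numerical illustration of the symmetric convention (II.4) and of the property (II.5), the sketch below transforms two real functions by direct quadrature. The choice of test functions (a Gaussian, whose transform in this convention is $e^{-\omega^2/2}$, and a shifted Gaussian), the truncation interval and the grid are assumptions made for the example.

```python
import cmath
import math

def fourier_transform(f, omega, T=12.0, n=4000):
    """(II.4): F(omega) = (1/sqrt(2 pi)) * integral of exp(-i omega t) f(t) dt,
    truncated to [-T, T] and evaluated with the midpoint rule."""
    dt = 2 * T / n
    total = 0j
    for i in range(n):
        t = -T + (i + 0.5) * dt
        total += cmath.exp(-1j * omega * t) * f(t)
    return total * dt / math.sqrt(2 * math.pi)

def gaussian(t):
    return math.exp(-0.5 * t * t)

def shifted(t):
    # Real but not even, so its transform is genuinely complex.
    return math.exp(-0.5 * (t - 1.0) ** 2)

F_gauss = fourier_transform(gaussian, 1.5)
F_plus = fourier_transform(shifted, 0.8)
F_minus = fourier_transform(shifted, -0.8)
```

Here `F_gauss` should be close to $e^{-1.125}$, and `F_plus.conjugate()` should coincide with `F_minus`, as required by (II.5) for a real $f(t)$.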





Formula (II.3) shows that $f(t)$ represents the sum of monochromatic terms
$e^{i\omega t}$ which are taken with weights (amplitudes) $F(\omega)\,d\omega/\sqrt{2\pi}$.
The complex amplitude $F(\omega)$ can be written in the form

$$F(\omega) = A(\omega)\,e^{i\varphi(\omega)}, \qquad (II.7)$$

where $A(\omega)$ is the modulus and $\varphi(\omega)$ is the phase of the function $F(\omega)$; they
are real functions of the frequency $\omega$. In such a representation for the Fourier
integral we have

$$f(t) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} A(\omega)\,e^{i[\omega t + \varphi(\omega)]}\,d\omega. \qquad (II.8)$$


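We prove an important equality which is sometimes called the Parseval
relation for the Fourier integral (for a real function $f(t)$):

$$\int_{-\infty}^{\infty} f^2(t)\,dt = \int_{-\infty}^{\infty} |F(\omega)|^2\,d\omega. \qquad (II.9)$$

Now,

$$\int_{-\infty}^{\infty} f^2(t)\,dt = \int_{-\infty}^{\infty} f(t)\left\{\frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} A(\omega)\,e^{i[\omega t + \varphi(\omega)]}\,d\omega\right\}dt = \int_{-\infty}^{\infty} A(\omega)\,e^{i\varphi(\omega)}\left[\frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} f(t)\,e^{i\omega t}\,dt\right]d\omega.$$

But, by the definition (II.4),

$$\frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} f(t)\,e^{i\omega t}\,dt = F^*(\omega) = A(\omega)\,e^{-i\varphi(\omega)}.$$

Hence

$$\int_{-\infty}^{\infty} f^2(t)\,dt = \int_{-\infty}^{\infty} A(\omega)^2\,d\omega = \int_{-\infty}^{\infty} |F(\omega)|^2\,d\omega,$$

which was to be proved.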
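The Parseval relation (II.9) can be checked by direct quadrature for a simple real function. The sketch below uses a shifted Gaussian as a hypothetical test function, for which both sides equal $\sqrt{\pi}$; the integration limits and grid sizes are assumptions chosen so that truncation errors are negligible.

```python
import cmath
import math

def f(t):
    # Hypothetical real test function.
    return math.exp(-0.5 * (t - 1.0) ** 2)

def integrate(g, lo, hi, n):
    """Midpoint rule on [lo, hi] with n subintervals."""
    step = (hi - lo) / n
    return sum(g(lo + (i + 0.5) * step) for i in range(n)) * step

def F(omega):
    """Fourier transform in the symmetric convention (II.4)."""
    integral = integrate(lambda t: cmath.exp(-1j * omega * t) * f(t), -11.0, 13.0, 1200)
    return integral / math.sqrt(2 * math.pi)

time_side = integrate(lambda t: f(t) ** 2, -11.0, 13.0, 1200)
frequency_side = integrate(lambda w: abs(F(w)) ** 2, -7.0, 7.0, 280)
```

The nested quadrature is deliberately naive; it is meant only to make the equality of the two sides tangible, not to be an efficient transform.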




APPENDIX III 




The Delta-function and its Properties 


The δ-function was introduced by Dirac * and proved to be very useful in
considering many problems of theoretical physics.
The δ-function is defined by the relations

$$\delta(x) = 0 \quad \text{for } x \neq 0,$$
$$\delta(x) = \infty \quad \text{for } x = 0,$$

so that

$$\int_a^b \delta(x)\,dx = 1, \quad \text{where } a < 0 < b. \qquad (III.1)$$

The basic property of the δ-function

$$\int_a^b f(x)\,\delta(x)\,dx = f(0), \quad a < 0 < b, \qquad (III.2)$$


* P.A.M.Dirac, The principles of quantum mechanics (Clarendon Press, Oxford, 1958). 




(where $f(x)$ is an arbitrary continuous function of $x$) follows immediately
from the definition (III.1).

Indeed, because of the properties of the δ-function, only the neighbourhood
of the point $x = 0$ is of importance in the integral (III.2). The value of the
function $f(x)$ at the point $x = 0$ can then be taken outside the integral, and the
remaining integral is equal to unity by virtue of (III.1). The integral (III.2)
can also be rewritten in the form

$$\int f(x)\,\delta(x - x_0)\,dx = f(x_0). \qquad (III.3)$$

The range of integration in (III.3) must include the point $x = x_0$, otherwise
the integral reduces to zero:

$$\int_a^b f(x)\,\delta(x - x_0)\,dx = 0 \quad \text{for } x_0 > b \text{ or } x_0 < a. \qquad (III.3')$$
a Xo <a 


The delta-function cannot be contained in any final expression. When the
δ-function is written, a subsequent integration with respect to the variables on
which it depends is always implied; the δ-function can be considered as the limit
of a sequence of analytic functions.

In particular, such properties are possessed by the expression

$$F(\alpha, x) = \frac{\sin\alpha x}{\pi x},$$

which behaves as $\delta(x)$ when $\alpha \to \infty$. Indeed, for $x = 0$, $F(\alpha, x)|_{x=0}$ is equal to
$\alpha/\pi$ and diverges as $\alpha \to \infty$. For $x \neq 0$, $F(\alpha, x)$ strongly oscillates about the
zero value with a damped amplitude. Finally,

$$\int_{-\infty}^{\infty}\frac{\sin\alpha x}{\pi x}\,dx = 1$$

for any $\alpha > 0$. Consequently, we see that

$$\lim_{\alpha\to\infty}\frac{\sin\alpha x}{\pi x} = \delta(x). \qquad (III.4)$$





The integral of the form

$$\int_{-\infty}^{\infty} e^{ikx}\,dk,$$

which is often encountered and which should be understood as

$$\lim_{\alpha\to\infty}\int_{-\alpha}^{\alpha} e^{ikx}\,dk = \lim_{\alpha\to\infty}\frac{2\sin\alpha x}{x},$$

can be expressed in terms of the δ-function. Comparing with (III.4), we
obtain

$$\int_{-\infty}^{\infty} e^{ikx}\,dk = 2\pi\,\delta(x). \qquad (III.5)$$

Other representations of the δ-function are

$$\delta(x) = \frac{1}{\pi}\lim_{\alpha\to 0}\frac{\alpha}{x^2 + \alpha^2}, \qquad (III.4')$$
i T a>0 
exla m 
6(x) =— lim (111.4) 


ao, a(ex/% + 1)2 | 


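The delta-function can also be defined as the derivative of a particular
discontinuous function $\varepsilon(x)$:

$$\varepsilon(x) = 0, \quad x < 0; \qquad \varepsilon(x) = 1, \quad x > 0. \qquad (III.6)$$

It is obvious that $\varepsilon'(x) = 0$ for $x \neq 0$. We show that the equality (III.2) also
holds:

$$\int_a^b f(x)\,\varepsilon'(x)\,dx = f(x)\,\varepsilon(x)\Big|_a^b - \int_a^b \varepsilon(x)\,f'(x)\,dx = f(b) - \int_0^b f'(x)\,dx = f(0).$$

Consequently,

$$\varepsilon'(x) = \delta(x). \qquad (III.7)$$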
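We list certain basic properties of the δ-function:

$$\delta(-x) = \delta(x),$$
$$\delta'(-x) = -\delta'(x),$$
$$x\,\delta(x) = 0,$$
$$x\,\delta'(x) = -\delta(x),$$
$$\delta(ax) = \frac{1}{a}\,\delta(x), \quad a > 0,$$
$$\delta(x^2 - a^2) = \frac{1}{2a}\,[\delta(x - a) + \delta(x + a)], \quad a > 0, \qquad (III.8)$$
$$\int \delta(a - x)\,\delta(x - b)\,dx = \delta(a - b),$$
$$f(x)\,\delta(x - a) = f(a)\,\delta(x - a),$$
$$\int f(x)\,\delta'(x - a)\,dx = -f'(a),$$
$$\delta(f)\,df = \delta(x)\,dx,$$
$$\delta[f(x)] = \sum_{x_0}\frac{1}{|df/dx|_{x = x_0}}\,\delta(x - x_0),$$

where $x_0$ are the roots of the equation $f(x_0) = 0$,

$$\int \varphi(x)\,\delta[f(x) - a]\,dx = \left.\frac{\varphi(x)}{|df/dx|}\right|_{f(x) = a},$$
$$\delta(\mathbf{r} - \mathbf{r}_0) = \delta(x - x_0)\,\delta(y - y_0)\,\delta(z - z_0),$$
$$\delta(\mathbf{r} - \mathbf{v}t) = \frac{1}{(2\pi)^3}\int e^{i(\mathbf{k}\cdot\mathbf{r} - \omega t)}\,\delta(\omega - \mathbf{k}\cdot\mathbf{v})\,d\mathbf{k}\,d\omega.$$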


Since the δ-function has a meaning only provided that integration with
respect to its argument is implied, these equalities mean that, on multiplying
both sides of each of them by a continuous function $f(x)$ and
integrating with respect to $x$, the left-hand side gives the same result as
the right-hand side. Let us prove, for example, the third relation.
For this we consider the integral

$$\int f(x)\,x\,\delta(x)\,dx = f(x)\,x\,\Big|_{x=0} = 0,$$

and the relation is proved.
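Relations of this kind can be made plausible numerically by replacing the δ-function with a narrow peaked function and integrating against a test function. The sketch below does this for the standard property $\delta(x^2 - a^2) = [\delta(x - a) + \delta(x + a)]/2a$ ($a > 0$), using a narrow Gaussian as an assumed stand-in for the δ-function; the width `eps`, the test function `f` and the grid are illustrative choices.

```python
import math

def nascent_delta(x, eps):
    """Narrow normalized Gaussian, used here as an approximation to the delta-function."""
    return math.exp(-(x / eps) ** 2) / (eps * math.sqrt(math.pi))

def integrate(g, lo, hi, n):
    """Midpoint rule on [lo, hi] with n subintervals."""
    dx = (hi - lo) / n
    return sum(g(lo + (i + 0.5) * dx) for i in range(n)) * dx

def f(x):
    # Hypothetical continuous test function.
    return math.cos(x) + x

a = 1.5
eps = 0.01
lhs = integrate(lambda x: f(x) * nascent_delta(x * x - a * a, eps), -4.0, 4.0, 200000)
rhs = (f(a) + f(-a)) / (2 * a)
```

The agreement improves as `eps` is reduced, provided the grid remains fine enough to resolve the two peaks near $x = \pm a$.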

The δ-function often proves to be useful in considering Fourier transforms.
Thus, if we have the expansion of a certain function $f(x)$ in a Fourier
integral

$$f(x) = \int_{-\infty}^{\infty} c(k)\,e^{ikx}\,dk, \qquad (III.9)$$

then, making use of (III.5), we immediately obtain the expression for the
inverse Fourier transform. Indeed, multiplying the left-hand and right-hand sides
of the equality (III.9) by $e^{-ik'x}$ and integrating with respect to $x$, we obtain

$$\int_{-\infty}^{\infty} f(x)\,e^{-ik'x}\,dx = \iint c(k)\,e^{i(k-k')x}\,dk\,dx = \int_{-\infty}^{\infty} c(k)\,dk\int_{-\infty}^{\infty} e^{i(k-k')x}\,dx = \int_{-\infty}^{\infty} c(k)\,2\pi\,\delta(k - k')\,dk = 2\pi\,c(k').$$

Consequently, we have

$$c(k') = \frac{1}{2\pi}\int_{-\infty}^{\infty} f(x)\,e^{-ik'x}\,dx. \qquad (III.10)$$


Analogous relations are also obtained in more complex cases. Formula (III.5)
can be considered as the expansion of the δ-function in a Fourier integral.
We now consider the Legendre polynomial expansion of the δ-function:

$$\delta(1 - \xi) = \sum_l B_l\,P_l(\xi).$$

We find the coefficients $B_l$ by multiplying the left-hand and right-hand sides
of the equality by $P_{l'}(\xi)$ and integrating with respect to $\xi$. Taking into
account that

$$\int_{-1}^{+1} P_l(\xi)\,P_{l'}(\xi)\,d\xi = \frac{2}{2l + 1}\,\delta_{ll'},$$

and

$$P_l(1) = 1,$$


we obtain 


Finally we have:

$$\delta(1 - \xi) = \frac{1}{4}\sum_l (2l + 1)\,P_l(\xi). \qquad (III.11)$$
l 


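Let us prove the following relation:

$$\nabla^2\left(\frac{1}{r}\right) = -4\pi\,\delta(\mathbf{r}), \qquad (III.12)$$

where

$$\delta(\mathbf{r}) = \delta(x)\,\delta(y)\,\delta(z).$$

For this we expand the function $1/r$ in a three-dimensional Fourier integral:

$$\frac{1}{r} = \int c(\mathbf{k})\,e^{i\mathbf{k}\cdot\mathbf{r}}\,dk_x\,dk_y\,dk_z. \qquad (III.13)$$

Correspondingly, for the function $c(\mathbf{k}) = c(k_x, k_y, k_z)$ we have, making use
of (III.10),

$$c(\mathbf{k}) = \frac{1}{(2\pi)^3}\int \frac{1}{r}\,e^{-i\mathbf{k}\cdot\mathbf{r}}\,dV. \qquad (III.14)$$

In formula (III.14) we integrate first with respect to the angle, choosing the
polar axis in the direction of the vector $\mathbf{k}$:

$$c(\mathbf{k}) = \frac{1}{(2\pi)^3}\int_0^\infty r\,dr\int_0^{2\pi} d\psi\int_0^\pi e^{-ikr\cos\theta}\sin\theta\,d\theta = \frac{1}{(2\pi)^3}\frac{4\pi}{k}\int_0^\infty \sin kr\,dr.$$

The last integral is usually calculated by multiplying the integrand by the
factor $e^{-\alpha r}$ and by subsequently letting $\alpha \to 0$. We obtain finally

$$c(\mathbf{k}) = \frac{4\pi}{(2\pi)^3}\frac{1}{k^2}.$$

Substituting $c(\mathbf{k})$ into (III.13), we have

$$\frac{1}{r} = \frac{4\pi}{(2\pi)^3}\int \frac{e^{i\mathbf{k}\cdot\mathbf{r}}}{k^2}\,d\mathbf{k}.$$

Taking the Laplacian of the left-hand and right-hand sides, we obtain

$$\nabla^2\left(\frac{1}{r}\right) = -\frac{4\pi}{(2\pi)^3}\int e^{i\mathbf{k}\cdot\mathbf{r}}\,d\mathbf{k},$$

and, consequently (see III.5)

$$\nabla^2\left(\frac{1}{r}\right) = -\frac{4\pi}{(2\pi)^3}\,(2\pi)^3\,\delta(x)\,\delta(y)\,\delta(z) = -4\pi\,\delta(\mathbf{r}). \qquad (III.15)$$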
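The regularization step in this derivation, the evaluation of $\int_0^\infty \sin kr\,dr$ with the convergence factor $e^{-\alpha r}$, can be illustrated numerically. For finite $\alpha$ the damped integral equals $k/(k^2 + \alpha^2)$, which tends to $1/k$ as $\alpha \to 0$. In the sketch below the function names, the cut-off and the step size are assumptions made for the example.

```python
import math

def damped_sine_integral(k, alpha, step=0.002):
    """Midpoint approximation to the integral of sin(k r) exp(-alpha r) over r >= 0,
    truncated where the damping factor has fallen to exp(-40)."""
    upper = 40.0 / alpha
    n = int(upper / step)
    total = 0.0
    for i in range(n):
        r = (i + 0.5) * step
        total += math.sin(k * r) * math.exp(-alpha * r)
    return total * step

def c_of_k(k):
    """Fourier amplitude of 1/r obtained in the limit alpha -> 0."""
    return 4 * math.pi / ((2 * math.pi) ** 3 * k * k)

k = 2.0
results = {alpha: damped_sine_integral(k, alpha) for alpha in (1.0, 0.3, 0.1)}
```

The values in `results` approach $1/k = 0.5$ as $\alpha$ decreases, which is the limit that leads to the amplitude returned by `c_of_k`.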





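In conclusion we note that, as follows from (III.1), the δ-function is a
function with dimensions, its dimensionality being the inverse of that of its
argument.

It is often convenient to make use of the sign function $\operatorname{sgn} x$ which is
defined as

$$\operatorname{sgn} x = \begin{cases} +1, & x > 0, \\ -1, & x < 0. \end{cases} \qquad (III.16)$$

Its Fourier transform $F_{\rm sgn}$ has the integral representation

$$F_{\rm sgn}(\omega) = \lim_{\alpha\to 0}\frac{1}{2\pi}\int_{-\infty}^{\infty}\operatorname{sgn} x\;e^{-\alpha|x|}\,e^{i\omega x}\,dx = \lim_{\alpha\to 0}\frac{i}{\pi}\frac{\omega}{\omega^2 + \alpha^2} = \frac{i}{\pi}\,P\frac{1}{\omega}, \qquad (III.17)$$

where $P$ denotes the principal value of the function $1/\omega$:

$$P(1/x) = \begin{cases} 1/x, & x \neq 0, \\ 0, & x = 0. \end{cases} \qquad (III.18)$$

In this case

$$\int P\left(\frac{1}{x}\right)f(x)\,dx = P\!\!\int \frac{f(x)}{x}\,dx, \qquad (III.19)$$

where $P\!\int$ is the principal value of the integral.
The inversion gives

$$\operatorname{sgn} x = \frac{i}{\pi}\,P\!\!\int_{-\infty}^{\infty}\frac{e^{-i\omega x}}{\omega}\,d\omega. \qquad (III.20)$$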








SUBJECT INDEX 


Absorption of radiation, 163 

Accelerated motion, 241 

—— in relativistic mechanics, 335 

Acceleration, 260 

—, four, 252 

— of charge, 351 

—, self-, 123 

Action at a distance, 93, 237 

Addition of velocities, Einstein’s law of 
235, 248 

Advanced potential, 92, 100 

——, boundary condition for, 93 

Aether, universal, 220 

Algebra, vector, 356 

Amplitude of plane wave, 151 

Analysis, vector, 3, 356 

Angle, transformation of an element of 
solid, 316 

Angular momentum, 84 

—— of interacting particles, 349 

—— of the electromagnetic field, 48 

— transformation, 310 

——, relativistic, 236 

Annihilation, positron—electron, 284 

Antisymmetric tensor, 254 

Atomic model, planetary, 196 

— nuclei, quadrupole moment, 62 

—-, stability of, 276 

Axial vector, 23, 358 


Balance of forces, 121 

Betatron, 338 

— mode, 339 

Binding energy, nuclear, 277 
Biot—Savart law, 75 

Boundary condition, 164 

—— for advanced potential, 93 

—— for electromagnetic potentials, 86 


—— for retarded potential, 93 
—— for scalar potential, 51 
—— for vector potential, 95 
Bremsstrahlung, 209, 353, 355 


Carrier frequency, 152 

Causal description, 94 

Centre of charge, 61 

Centre of mass, 348 

Centre-of-mass system, 190, 272 

——, cross-section in the, 208 

——, laboratory system and, 208 

—, velocity of, 273, 349 

Charge (see also charged particle), 10, 13 

—, centre of, 61 

— conservation law, 11, 17, 29, 87, 292 

—, continuous distribution of, 14 

— density, 14,52, 109 

—— and Lorentz transformation, 293 

— dipole interaction, 65 

—, sign of the, 10 

—, total, 109 

Charged particle (see also charge) 

——, accelerated, 117 

——, electric field of uniformly moving 
relativistic, 298 

——, electromagnetic field of, 135 

——, electromagnetic field of harmonic
moving, 118, 119, 125, 137

——, electromagnetic field of relativistic, 
296 

——, equations of motion for, 328, 329, 
341 

—— in electric field, 174, 332, 334 

—— in electromagnetic field, 11, 179, 
187, 327 

—— in magnetic field, 118, 176, 182, 184 

















——, magnetic field of uniformly moving 
relativistic, 299 

——, radiation of moving, 106 

—-—, radiation of relativistic, 351 

——, system of two, 120 

——, two, interaction force between, 189, 
300 

— sphere, uniform motion of, 301

Circular polarization, 152

Circulation of vector, 3, 360

Classical field theory, 11 

— invariance transformation, 243 

Coherent scattering, 157 

Collision, 197 

— between relativistic particles, 270, 286 

—, elastic, 198 

—, electron—nucleon, 353 

—, head-on, 287 

— of particles with nonzero rest mass, 

286 

—, photon-—electron, 289 

— process, irradiated energy in, 212 

—, proton—proton, 285

—, relativistic particles in, 270 

Compton effect, 289 

— scattering, 289 

— wavelength, 124, 290 

Conservation, charge, 11, 87, 292 

— law, charge, 17, 29 

——, energy, 42, 44, 265, 279 

——, energy-momentum, 271 

—-— for the motion in a constant mag- 
netic field, 178 

——, momentum, 45, 265, 283 

Continuity, equation of, 18 

Contraction of a moving scale, 233 

Converging spherical wave, 90 

Coordinates, spherical system of, 114 

Coulomb gauge, 40 

— law, 10, 53 

— scattering, differential cross section for 

203 

Cross section, differential, 158, 197 

——, differential, for Coulomb scattering, 
203 

—— in the centre-of-mass system, 208 

—— in the laboratory system, 208 


—-—, scattering, for unpolarized radiation, 
159 
——, total, 198 
——, total scattering, transformation of 
the, 319 

Curl of a vector, 365 

Current density for quasistationary mo- 
tion, 70 

——, scalar potential and, 110 

—— vector, 16 

——, vector potential and, 110 

—, displacement, 28, 30

—, character of displacement, 33 

—, electric, 16 

—, four-dimensional, 292 

—, harmonic, 100 

—, magnetic moment and closed, 83 

—, non-steady, 23 

—, total, 28 

Curvilinear coordinates, 370 

Cyclotron, 338 

— frequency, 117, 337 

Cylindrical polar coordinates, 372 


D’Alembertian, 36 

D’Alembert’s equation, 36, 94 

— method, 88 

— operator, 294 

Damped oscillator, 126 

Damping coefficient, 126 

Decay, meson, 279 

—, μ-meson (muon), 281

—, particle, 275

——, elementary, 279

——, spontaneous, 276

—, π⁺-meson, 280

—, π⁰-meson, 282

Deceleration of fast electrons in matter, 
355 

Defect, mass, 277

Deflection angle, 199 

— of particles in a magnetic field, 338 

Delay time, 91 

——, proper, 108 

Delta-function, 379 

——, basic properties of, 379, 382 

——, Legendre polynomial expansion of, 384




Differential cross section, 158, 197 

Dimensions of wave packet, 154 

Dipole, 55 

— approximation, 55, 111 

—— of retarded potential, 106 

— dipole interaction, 65 

— electric field, 57 

— moment, 55 

—— and the origin, 56 

——, changing, 112 

—— density, 56 

——, orientation of, 65 

— radiation, 116, 341 

——, magnetic, 131

Dispersion formula of classical electrodynamics, 160

Displacement current, 28, 30, 73 

——, character of, 33 

——, experimental verification of, 31 

——, introduction of, 32 

——, logical necessity of, 31 

Divergence of a vector, 364 

Diverging spherical wave, 90 

Doppler effect, 309 

——, longitudinal, 310 

——, second order, 312 

——, transverse, 311 

Drift, 180 

Dyadic product of vectors, 369 

Dynamics, equations of, 256 


Effective potential energy, 192 

Einstein's formulae, 264 

— law of addition of velocities, 235, 248 

Electric and magnetic field, symmetry between, 33

— field, 12, 112 

—— and slow uniform motion, 77 

——, determination of, 26 

——, dipole, 57 

—— of a system of point charges, 52

—— of moving charge, 298 

——, particle accelerated by an, 334 

——, quadrupole, 61 

——, relativistic, 304 

—— strength, 12 

——, transverse, trajectory of charge in, 333


——, uniform, charged particle in, 174

——, uniform, relativistic particle in, 331 

Electrodynamics, dispersion formula of 
classical, 160 

Electromagnetic field, 18 

—— as a mechanical system, 170

——, charged particle in, 179, 187 

—— energy, 42, 67 

—— equation, 164 

—— in induction zone, 140

—— in radiation zone, 140 

—— invariances, 307 

—— of moving charge, 296 

——, particle trajectory in, 181 

——, reality of the, 43 

—— strength in the plane wave, 146 

—— tensor, 305 

— induction, 23 

— interaction, 10 

— potential, boundary conditions for, 86 

——, equations for, 85 

——, initial conditions for, 86 

— wave, propagating, 143 

——, scattering of the, 157 

Electromotive force, 16 

Electron, collision, photon—, 289 

—, fast, deceleration of, in matter, 355 

— nucleon collision, 353 

— positron pair production, 283 

—, radius of, classical, 266 

Electrostatic field, 12 

——, Maxwell’s equations for the, 15 

——, potential energy of the, 63 

——, work performed by the, 13 

— potential, 16 

Electrostatics, direct problem of, 50 

—, equations of, 15 

—, inverse problem of, 50 

Elementary particle, 9 

—— decay, 279 

Elliptical polarization, 152 

E.m.f., 16 

Emission of energy, rate of, 352 

Emitted energy, flux density of, 115 

——, spectral distribution of, 137 

——, total flux of, 116 

— line, half-width of, 128 







Emitter, 93 

— of radiation, 112 

Emitting particle, momentum loss of, 117 

— system, Poynting vector of, 115 

Energy balance, 122

— conservation law, 42, 44, 265, 279 

—, electromagnetic field, 42 

—, equivalence of mass and, 264 

—, field, 171 

— flux, 43 

—— of radiation, 117 

—, interaction, 341 

— interaction, of system of charges, 65 

—, kinetic, 265 

— momentum conservation law, 271 

—— four-vector, 262 

— of a particle, 260, 263

— of interacting particles, 347 

— of nuclear reaction, 278 

—, radiation, 350 

—, rest, 264 

—, total, 265 

—, transformation of an electromagnetic 
wave, 317 

Equipotential surface, 361 

Equivalence of mass and energy, 264 

Ether theory, 31 

Experimental verification of displacement current, 31
—— of Lorentz contraction, 321


Faraday law of induction, 35 
—— of induction (integral form), 24, 26 
Field, electric, 12 

—, electric, strength, 12 

— energy, electromagnetic, 42
— lines, 362 

— oscillators, 167 

——, energy of, 173 

——, number of, 171, 173 

—, scalar, 358 

—, self, 29 

—, vector, 3, 362 

Fluorescence, resonance, 160
Flux density, 197 

—— of emitted energy, 115 

—, total, of emitted energy, 116 
—, vector, 3, 362 


Focussing, 178 

Force on a system of charges, electric, 64 
—, transformation of a, 326 

Four acceleration, 252 

— dimensional current, 292 

—— wave vector, 308 

— force, 257 

——, fourth component of, 258 

— momentum, 257, 262 

— potential, 294 

——, Lienard—Wiechert, 302

— tensor, 253 

——, invariants of, 255 

——, second-rank, 253 

——, transformation of, 254 

— vector, 250 

——, energy—momentum, 262 

— velocity, 251 

Fourier coefficients, 375 

— integral, 375, 376 

— series, 375 

— transform, 376 

— transformation of vector potential, 96 
Fourth coordinate, 244 
Frequencies, transformation of, 308 


Galilean principle of relativity, 219 
— transformation, 218 

——, Lorentz and, 227 

Gauge condition, 295 

—, Coulomb, 40 

— invariance, 38 

—, Lorentz, 39 

— of the potentials, 38 

— transformation, 38, 147 
Gauss—Ostrogradsky theorem, 4, 14, 363 
Generalized force, 329 

— momentum, 329 

Gradient of scalar field, 358 
Group velocity, 154 


Half-width of emitted line, 128 
Hamiltonian of a particle in an electromagnetic field, 330
— operator, 359, 362, 367, 372 
Hamilton’s equations for field oscillators, 167
Helicity, 152 




Homogeneity of space, 225 


Imaginary time, 244 

Impact parameter, 199 

—— and scattering angle, 201 

Induction accelerator, 338 

—, Faraday law of, 35 

—, Faraday law of (integral form), 24, 26

— zone, 139 

Inertial reference frame, 217 

Initial conditions for electromagnetic potentials, 86

Interacting particles, angular momentum 
of, 349 

——, energy of, 347 

——, Lagrangian for, 346 

——, mass of, 348

——, momentum of, 348 

——, potential of, 342 

——, system of two, 350

Interaction, electromagnetic, 10 

— energy, generalized, 346 

— force between two charges, 300 

—, interparticle, relativistic, 269, 274 

— of charges, 11 

—, particles in electromagnetic, 340 

— potential, experimental determination of, 203
—, principle of the limiting velocity of 
propagation of, 223 

—, retarded, 340 

Interval between events, 239 

—, space-like, 240 

—, time-like, 240 

Invariance, electromagnetic field, 307 

— in Minkowski space, 245 

— of physical laws under a rotation, 242 

— of the interval, 239 

—, phase, 307 

—, relativistic, 222 

Invariant of four-tensor, 255 

— physical law, 217 

Ion, 11 

Irradiated energy in collision process, 212 

Irrotational vector, 360, 366 

Ives’ and Stillwell’s experiment, 311 


Laboratory system, 206 

—— and centre-of-mass system, 208 

——, cross section in the, 208 

Lagrange equation for charged particle, 
328, 329 

—— for charged particle in electromagnetic field, 187

——, relativistic, 267 

Lagrangian for interacting particles, 346 

— of a particle in an electromagnetic field, 329

— of a system of charges, 341

—, relativistic, 267 

Lamé coefficients, 371 

Laplacian, 5, 374 

Larmor frequency, 177 

Law of motion, 217 

Legendre polynomial expansion of the 
delta-function, 385 

Lenz rule, 25 

Lienard—Wiechert four-potential, 105, 302, 350

Light pressure, 47 

—, velocity of, 20 

Line-width, radiation reaction and, 125 

Linear radiator, 140 

Long-range action, 13 

Longitudinal Doppler effect, 310 

Lorentz and Galilean transformation, 227 

— condition, 93, 111 

— contraction, 229 

——, experimental investigation of, 321 

— force, 20, 29, 121, 326 

—— in terms of electromagnetic potential, 328

— friction force, 126 

— gauge, 39 

— invariant relations, 242 

— relation, 36 

— spectral distribution function, 126 

— transformation, 226 

——, charge density and, 293

——, graphical representation of, 247 

—— of potentials, 295 


Magnetic dipole radiation, 131, 132







Magnetic field, 20, 112

—— and quasistationary motion, 75, 81 

—— and slow uniform motion, 78 

——, deflection of particles in a, 338 

——, determination of a, 22, 26 

——, nonuniform, charged particle in, 184 

—— of moving charge, 299 

——, relativistic, 305 

—— strength, 20 

——, symmetry between electric and, 33 

——, time dependent, 24 

——, trajectory of charge in, 337 

——, uniform, charged particle in, 176 

——, varying in time, uniform, charged 
particle in, 182 

—— vector lines, 21 

Magnetic moment, 81 

—— and the origin, 82 

—— and closed current, 83 

——, conservation of, 183, 185 

Magnetomotive force, 20, 25 

Mass and energy, equivalence of, 264 

— defect, 277 

— energy relation, 284 

— of interacting particles, 348 

—, relativistic, 261 

——, conservation, 265 

—, rest, 256, 261 

——, zero, 261 

Material point, 256 

Maxwell equations, 28 

—— for static problems, 49 

—— for the electrostatic field, 15

——, quasistationary, 71, 73 

— Lorentz equations, 30, 294, 305 

Meson decay, 279 

Michelson’s experiment, 221, 230 

Minkowski force, 257 

— space, 244 

——, invariance in, 245

——, rotation transformation in, 246 

Microparticles, 9 

Microsystems, 9 

Mirror, magnetic, 186 

Momentum conservation law, 45, 265, 283


— density of the electromagnetic field, 
46 

—, generalized, 189 

— loss of emitting particle, 117 

— of a plane wave, 144

— of interacting particles, 348 

Mossbauer effect, 312 

Motion for charged particle, equations of, 
328 

Moving clock, slowing down of the running of, 231

— particle, radiation emitted by a, 315 

— scale, contraction of a, 233 

Multiplication, vector, 356 

Mu-meson (muon) decay, 232, 281 

——, energy, 281


Neutral particle, 10 

— system, electrically, 55, 111 
Neutrino, 280 

Newtonian law of inertia, 256 
Normalization cubes, 164 
Nuclear binding energy, 277 

— reaction, energy of, 278 
Nucleon collision, electron—, 353 


Oersted, law of, 21 
Orthogonality, condition of, 249 


Parseval relation, 377 

Period of motion, 71 

Phase invariance, 307 

— of plane wave, 151 

— shift, 158 

— velocity, 144, 154 

Photon-electron collision, 289 

“Picture”, 322 

Pi⁺-meson decay, 280

Pi⁰-meson decay, 282

Plane polarization, 152 

Plane wave, amplitude of, 151 

——, electromagnetic field strength in the, 
146 

——, momentum of a, 144 

——, phase of, 151 

——, Poynting vector in a, 144 




Point charges, electric field of a system 
of, 52 

——, moving, 102 

——, system of, 52 

Poisson’s equation, 5, 50, 74 

Polar coordinates, cylindrical, 372 

——, spherical, 372 

— vector, 23, 358 

Polarization, circular, 152 

— direction, 152 

—, elliptical, 152 

—, plane, 152 

Positron—electron annihilation, 284 

Potential, 360 

— energy, effective, 192 

—— of electrostatic field, 63 

—— of system of charges, 63 

—, Lorentz transformation of, 295 

— of interacting particles, 342 

—, retarded, 91 

— vector, 360 

—, —, 7, 34, 91, 131

Poynting vector, 43 

—— in a plane wave, 144

—— of emitting system, 115 

Principal value, 386 

Production, pair, electron—positron, 283 

Propagating electromagnetic wave, 143 

Propagation of interaction, principle of 
the limiting velocity of, 223 

Proper delay time, 108 

— retardation, 107, 138

— time, 231, 241 

Proton—proton collision, 285 

Pseudo-vector, 23, 358 


Quadrupole, 58 

— electric field, 61 

— moment, 58 

—— and symmetry, 62 
—— density, 58 

—— of atomic nuclei, 62 
——, sign of the, 61

—— tensor, 60 

— radiation, 129, 131 
——, intensity of, 132 


Quasistationary condition, 72 

— Maxwell equations, 71, 73 

— motion, 69 

——, current density for, 70 

——, magnetic field and, 75, 81
——, vector potential and, 74, 80 


Radiated power, total, 116, 117 

——, transformation of, 317 

Radiation, dipole, 116 

— emission, 44, 112 

— emitted by a moving particle, 315 

— energy, 350 

—— and momentum, 116 

—, energy flux of, 117 

— from a system of two charges, 120, 195

— of a beam of particles, effective, 354 

— of moving charge, 351 

— power of oscillating charge, 119 

— pressure, 47 

— reaction, 121, 124

—— and line-width, 125 

——, order of magnitude of, 123 

— recording, 322 

—, scattered, 158 

—, scattering cross section for unpolarized, 159

— zone, 114, 139 

Radioactivity, 275 

Radius, classical, of electron, 124, 266 

— vector, four-dimensional, 249 

Rayleigh scattering, 161 

Recoil shift, 312 

Reduced mass, 84, 191 

Reference frame, 216 

——, inertial, 217 

Relative motion of interacting particles, 
192 

— velocity, 84 

Relativistic angular transformations, 236 

— dynamics, equations of, 257 

— electric field, 304 

— invariance, 222, 244 

— Lagrange equations, 267 

— magnetic field, 305

— mass, 261 







Relativistic mass, conservation of, 265 

— particles, collisions between, 270, 286 

—— in uniform electric field, 331 

— system of particles, 269 

Relativistically invariant relations, 242 

Relativity of spatial extension, 228 

—, special theory of, postulates of, 222 

Resonance fluorescence, 160 

Rest energy, 264 

— mass, 256, 261, 273 

—— of a system of particles, 274

——, zero, 261 

Retardation, proper, 107 

—, scalar potential and, 344 

—, total, 342 

—, vector potential and, 345 

Retarded argument, 113 

— interaction, 340 

— potential, 91, 100, 102 

——, boundary condition for, 93 

——, dipole approximation of, 106 

Rigid body, 230 

Rotation, invariance of physical laws under, 242

— transformation in Minkowski space, 246
Rutherford formula, 203 


Scalar field, 358 

— potential, 5, 50, 90

—— and current density, 110 

—— and retardation, 344 

——, boundary condition for, 51 

——, determination of, 37 

——, equation for, 35 

— product, 250 

Scale, 228 

Scattered particle, energy lost by, 209 

— radiation, 158 

Scattering angle, 199 

——, impact parameter and, 201 

—, coherent, 157 

—, Compton, 289 

— cross section for unpolarized radiation, 159

— of the electromagnetic wave, 157 

— process, 197 


Self energy of a point charge, 68 

Separation constant, 166 

Shift, frequency, due to thermal motion, 
313 

Short-range action, 13 

Sign function, 386 

Signal, 224 

—, propagation of, 224 

— transmission, 236

Simultaneity, 237 

—, absolute, 237 

Sinks, 364 

Slow uniform motion, electric and magnetic field, 77, 78

Solenoidal vector, 366 

—— field, 6, 364 

Sources, 364 

Space-like interval, 240 

Space-like vector, 251 

Spatial extension, relativity of, 228 

Special theory of relativity, postulates of, 
222 

Spectral decomposition, 135 

— distribution functions, Lorentz, 128

—— of emitted energy, 137 

Spherical polar coordinates, 372 

— system of coordinates, 114 

— wave, 114 

——, converging, 90 

——, diverging, 90 

Spontaneous particle decay, 276 

Stability of atomic nuclei, 276 

Static problems, Maxwell's equations for, 
49 

Stokes’ theorem, 4, 366 

Superposition, principle of, 87 

—— of electromagnetic fields, 29 

— property, 12 

Surface integral, 361, 363 

Symmetry between electric and magnetic 
field, 33 

Synchrotron, 338 

System of particles, rest mass of a, 274 


Target, 197 
Tensor, antisymmetric, purely, 254 
—, field, electromagnetic, 305 




Tensor (continued) 

—, four-, 253 

—, second-rank four-, 253 

—, transformation of four-, 254 

Test charge, 12 

——, moving, 20 

— particle motion, 174 

Thermal motion, frequency shift due to, 313

Thomson atomic model, 119 

— scattering, 161 

Threshold kinetic energy, 283 

Time, imaginary, 244 

— -like interval, 240 

— -like vector, 251 

—, proper, 231, 241 

— reversal invariance, 92 

Trajectory, 217 

—, charge in magnetic field, 337 

—, charge in transverse electric field, 333 

— of interacting particles, 194 

—, particle in electric field, 175 

—, particle in electromagnetic field, 181 

—, particle in magnetic field, 177 

Transformation, invariance, classical, 243 

— of a force, 326

— of an electromagnetic wave energy, 
317 

— of an element of solid angle, 316 

— of radiated power, 317 

— of the total scattering cross section, 
319 

Transverse Doppler effect, 311 

— waves, 166 

Trochoid, 181 


Ultrarelativistic motion, 268 
Uncertainty relations, 155 


Vector algebra, 356 
— analysis, 3, 356 


—, axial, 23, 358 

—, circulation, 3, 360 
—, curl of a, 365 
— field, 3, 362 

——, determination of a, 4, 8 

——, solenoidal, 6, 364 

——, vortex-free part of a, 6

— flux, 3, 362 

—, irrotational, 360, 366 

— lines, 12 

— multiplication, 356 

—, polar, 23, 358 

— potential, 7, 34, 91, 131

—, —, 360 

—— and current density, 110 

—— and quasistationary motion, 74, 80 
—— and retardation, 345 

——, boundary condition for, 95 
——, condition on the, 8

——, determination of, 37 

——, equation for, 35, 94 

——, Fourier analysis of, 170 

——, Fourier transformation of the, 96 
—— of constant uniform field, 76 
—, pseudo-, 23, 358 

—, solenoidal, 366 

—, space-like, 251 

—, time-like, 251 

Velocity, four, 251 

— of centre of mass, 273, 349 
Vortex-free part of a vector field, 6 


Wave equation, 37, 88 

— number, 135 

— packet, 152 

——, dimensions of, 154 

— process, 90 

— vector, 135, 143 

——, four-dimensional, 308 
Wavelength, Compton, 124, 290 










Theoretical Physics 


Volume 2 









Translated from the Russian by 
S. Subotić, Belgrade 


Translation edited by 
J. Schneps, Tufts University, Medford, Mass. 
A. J. Manuel, Leeds University 








Theoretical Physics 


An Advanced Text 


Volume 2: 


STATISTICAL PHYSICS 
ELECTROMAGNETIC PROCESSES IN MATTER 


Benjamin G. Levich 


Institute of Electrochemistry 
Academy of Sciences of the USSR, Moscow 




1971 
North-Holland Publishing Company — Amsterdam - London 


















© NORTH-HOLLAND PUBLISHING COMPANY, AMSTERDAM, 1971 


All rights reserved. No part of this book may be reproduced, stored in a retrieval 
system, or transmitted, in any form or by any means, electronic, mechanical, photo- 
copying, recording or otherwise without the prior permission of the Copyright owner. 


Library of Congress Catalog Card Number: 68 54501 


ISBN complete set: 0 7204 0176 3 
Vol. 2: 0 7204 0178 X 


Printed in The Netherlands 


Title of the Russian edition: 
KURS TEORETICHESKOJ FIZIKI 


Russian edition published by: 
IZDATELSTVO ‘NAUKA’, GLAVNAJA REDAKCIJA,
FIZIKO-MATEMATIČESKOJ LITERATURY (MOSKVA, 1969)


Publishers: 
NORTH-HOLLAND PUBLISHING COMPANY - AMSTERDAM 


Sole Distributors for the Western Hemisphere: 
WILEY INTERSCIENCE DIVISION 
JOHN WILEY & SONS, INC. - NEW YORK 





FOREWORD 


The first Russian edition of ‘Theoretical Physics’, which appeared in 1962, 
has been widely used as a textbook. 

Numerous comments from colleagues, lecturers and students have been 
taken into account in preparing this new edition, which is the first one in 
English and which will also appear as the second Russian edition. 

The material has now been divided into 4 volumes covering the following 
subjects 


Volume 1 
Part I Theory of the Electromagnetic Field
Part II Theory of Relativity 


Volume 2 
Part III Statistical Physics 
Part IV Electromagnetic Processes in Matter 


Volume 3 
Part V Quantum Mechanics 


Volume 4 
Part VI Quantum Statistics and Physical Kinetics 


The rapid development of physics and the present wide interest in 
non-equilibrium and non-stationary processes have compelled us to expand the
section on physical kinetics. It has also been transferred to the end of 
Volume 4 as it is practically impossible to expound this topic without using 
quantum mechanics. 

Part IV — ‘Electromagnetic Processes in Matter’ — has been substantially 
revised. Interest in this field has increased recently, mainly in connection with 
the study of plasmas and plasma-like media, which now have sections devoted 
to them. 







The methods of calculating electrostatic and direct-current fields, and 
other problems of classical electrodynamics in a medium, are covered very 
briefly as we have assumed that students will be able to consult the many 
monographs and handbooks on general physics, electrical- and radio- 
technology, and the equations of mathematical physics. 

As for other modifications and additions, we should draw attention to the 
introduction of tensor notation, to new ideas in the theories of relativity and 
electromagnetic fields, the broadening of the introduction to the theory of 
probability, a brief presentation of the method of correlation functions in 
statistical physics, the exposition of the thermodynamic theory of ferro- 
magnetism and the theory of propagation of electromagnetic waves in plasma. 
A number of paragraphs have been rewritten. We have tried to bring the 
content of the book even closer to the interests of present-day theoretical 
physics. 

The general level of the book has been preserved and it is still intended to 
form an introduction to theoretical physics. Problems requiring the use of 
cumbersome or special mathematical apparatus are still excluded, and the 
most difficult sections are marked by an asterisk. These may be skipped at 
will, since there is no reference to them in the main text. 

In conclusion we would like to express our gratitude to all those who 
helped us in preparing this book, in particular to A.M. Brodsky, A.M. 
Golovin, B.M. Grafov, R.R. Dogonadze, V.S. Krylov and especially V.S. 
Markin and V.V. Tolmachev. I.V. Savelyev discovered a number of misprints 
which have now been corrected. 

L.D. Konkina helped us in editing the manuscript. 

We are grateful to the readers and students who used the first Russian 
edition of the book for sending us their valuable comments which have been 
taken into account in this edition. 


August 1970 




FOREWORD TO THE FIRST RUSSIAN EDITION 


The continuous development of theoretical physics and the regular 
expansion of its areas of application create increasing demand for textbooks 
and manuals. 

The rapid development and the complexity of the most recent experi- 
mental methods of physical investigation, and the corresponding development 
and extension of the mathematical apparatus of theoretical physics, have 
meant that one man usually cannot combine the two methods of investiga- 
tion. The end of the 19th century and particularly the 20th century therefore 
saw physicists divided into ‘experimentalists’ and ‘theoreticians’, the latter 
studying physical laws by means of the mathematical methods of theoretical 
physics. 

Obviously, a background in theoretical physics is essential in the education 
of experimental as well as theoretical physicists. 

The experimental and theoretical methods of physical investigation have 
penetrated into a number of branches of science related to physics (physical 
chemistry, biophysics, geophysics, astrophysics, and so on) and into technolo- 
gy (metal physics and metallurgical science, thermophysics, electrical technol- 
ogy, radiotechnology, computation, the instrument-making industry etc.). 
Workers in these branches of science and technology also need a certain 
minimum knowledge of theoretical physics. 

The compilation of a modern textbook on theoretical physics is inevitably 
associated with certain logical and methodological difficulties. It is impossible 
at present to divide theoretical physics into classical and quantum parts so 
that it is also impossible to divide it into separate chapters and sections. For 
example, the exposition of statistical physics without taking into account the 
quantum properties of atomic systems is impossible, for it would mean that 
the general theory remained without practical application. In the theory of 
electromagnetic processes in matter one has of necessity to make use of the 
ideas of statistical physics, and so on. It may be that the maximum 
consistency of composition would be obtained if the book were founded on
quantum mechanics but this is completely inadmissible in a book intended as
an introductory treatise. Quantum mechanics requires a certain preparedness 
and the student must be convinced of the necessity of renouncing obvious 
classical representations. Compromise solutions, which have justified them- 
selves during many years of teaching theoretical physics at the Moscow 
Engineering-Physical Institute and Moscow State University, are therefore 
inevitable. 

The following general principles have been applied. 

(1) The book is written as an introduction to theoretical physics so that 
aspects requiring the use of cumbersome or special mathematical apparatus 
have not been included. 

(2) As it is to be used for a systematic study of the subject the course is a 
unique whole and all material necessary for understanding the later sections is 
contained in the earlier ones. 

(3) It would not be feasible to elucidate experimental facts in addition to 
problems concerning purely theoretical physics. However, physics is a single 
science, and an attempt to expound the theoretical aspects without taking 
experiment into account would be quite wrong. The reader is assumed to 
have some basic experimental knowledge from university courses in general 
and atomic physics so that we have confined ourselves to references and, in a 
few instances, to a schematic description of basic experiments. 

(4) The acquaintance assumed with general courses in general and atomic 
physics has allowed us to rely on a certain (very restricted) knowledge of 
quantum mechanics in our treatment of statistical physics. 

(5) Classical mechanics usually forms a separate course so that this topic 
has been omitted although detailed reference has been made to handbooks of 
mechanics. 

(6) The book similarly does not cover hydrodynamics, aerodynamics, the 
theory of heat transfer, or problems related to electrical- and radio- 
technology. 

(7) Detailed reference is made to mathematical manuals. The mathematical 
apparatus utilized, except in the sections marked by an asterisk, is covered by 
the usual courses in analysis. In the case of quantum mechanics, however, the 
mathematical apparatus has been included, since it is of a specific character 
and is not taught in traditional mathematical courses. 

(8) As the book is intended as a systematic course in theoretical physics no 
attempt has been made to achieve the same level of accessibility in all 
sections. It is a well-known fact that a student’s comprehension and
assimilation of difficult material increases as a course progresses, and that this 
is also true for the associated mathematical apparatus. Moreover, experimental
physicists will constantly encounter new problems in quantum
mechanics which can only be handled using advanced methods of treatment. 
The section on quantum mechanics (Part V) therefore deals with some topics 
having a more advanced character than those in other sections. The analysis 
of applications of the kinetic equations is similarly treated rather extensively. 


The uniqueness of the book’s objectives has affected the content of individual 
sections, so that some topics in modern physics have been included at the 
expense of more traditional material. 

Part I contains the foundations of the theory of the electromagnetic field 
in a vacuum, based on the system of Maxwell-Lorentz equations. A basic 
knowledge of electromagnetism is assumed. The focus of attention is the 
theory of radiation and the motion of charged particles in external fields. 

In Part II, devoted to the theory of relativity, a four-dimensional form of 
representation is adopted which not only corresponds to the spirit of the 
theory but also predominates in contemporary literature. The problems of 
dynamics in the theory of relativity are treated in some detail. A number of 
the most recent applications of the theory of relativity, particularly those 
related to nuclear physics, are covered here for the first time in a textbook. 

Part III is a revised version of Levich’s ‘Introduction to Statistical Physics’ 
and treats statistical physics and the fundamentals of statistical thermo- 
dynamics. Classical thermodynamics would require too much space, and did 
not seem indispensable. 

Part IV contains the theory of electromagnetic processes in matter. 
Relatively little attention is paid to problems in theoretical electrical- and 
radio-technology. The phenomenological theory of electric and magnetic 
properties of matter is analyzed in some detail, and the notion of the physics 
of the plasma state of matter is given. 

In Part V the basic ideas of present-day relativistic quantum mechanics are 
included as well as the traditional problems of non-relativistic quantum 
mechanics. Applications to solid-state theory are considered at length. 

Part VI contains the essential concepts of physical kinetics, which are not 
usually presented in a general course on theoretical physics. 


The experience of teaching theoretical physics shows that the greatest 
difficulties are often encountered not in understanding new physical ideas but 
in the actual mathematical treatments. All mathematical operations have 
therefore been performed in sufficient detail. 

For convenience we have presented a brief derivation of those formulae of
vector analysis which are encountered throughout, as well as the necessary
data on Fourier integrals and δ-function theory.

The numbering of formulae and sections starts afresh in each Part and 
references to appendices have been given Roman numerals. 

The author hopes that the readers, after making themselves familiar with 
the foundations of theoretical physics expounded in this book, will be able to 
proceed to a more profound study using the many-volume treatise of Landau 
and Lifshitz. The scientific and educational ideas of their work were of great 
influence on the author, who is a disciple of Landau. 

Parts I—IV and Part VI were written by B.G. Levich. Part V was written by 
Y.A. Vdovin and V.A. Myamlin under the general scientific guidance of B.G. 
Levich. Chapter XV * of Part V was written by A.I. Naumov. 

The author expresses his gratitude to the colleagues who read the book 
and the manuscripts, and made a number of valuable remarks: B.M. Grafov, 
R.R. Dogonadze, V.A. Kiryanov, V.S. Krylov, V.S. Markin, V.P. Smilga, Y.A.
Chizmadzhev and Y.I. Yalamov. 

The creation of a textbook on theoretical physics sufficiently comprehen- 
sive in content and clear in presentation is a very complex task. The author is 
therefore conscious of the fact that shortcomings and errors will be discover- 
ed and would be grateful to receive an account of them which can be taken 
into consideration in the next edition of the book. 


1962 


* Chapter 13 of the English edition. 





Theoretical Physics: 
Outline of Vols. 1—4 


Volume 1

Part I      Theory of the Electromagnetic Field

Chapter 1   General theory of the electromagnetic field
        2   The electrostatic field
        3   The quasistationary magnetic field
        4   The electromagnetic field of arbitrarily moving charges
        5   Radiation theory
        6   Electromagnetic field in a vacuum and electromagnetic wave scattering
        7   The motion of particles in electromagnetic fields

Part II     Theory of Relativity

Chapter 1   General principles of the theory of relativity
        2   Relativistic mechanics
        3   Relativistic electrodynamics

Appendix I, II and III

Subject index


Volume 2 (for details see p. xv)

Part III    Statistical Physics

Chapter 1   The basic concepts of the theory of probability
        2   The kinetic theory of gases
        3   Statistical distribution
        4   Statistical and phenomenological thermodynamics
        5   Ideal gases
        6   Systems of interacting particles
        7   Crystals
        8   The theory of fluctuations
        9   Systems with a variable number of particles
       10   Statistical distributions in quantum statistics and some of their applications

Part IV     Electromagnetic Processes in Matter

Chapter 1   Electromagnetic fields in matter
        2   Electrostatics
        3   Direct electric current and the magnetic properties of matter
        4   Quasistationary electromagnetic fields
        5   High-frequency fields
        6   Matter in the plasma state

Appendix IV

Subject index


Volume 3 


Part V 
Chapter 1 


OMBAIDUNAARWNY 


Quantum Mechanics 


The basic concepts of quantum mechanics 

The Schrodinger equation 

The mathematical apparatus of quantum mechanics 
Motion in a centrally symmetric field 

The quasi-classical approximation 

The matrix form of quantum mechanics 
Perturbation theory 

Spin and identity of particles 

Applications of quantum mechanics to the consideration of the 
properties of atomic and nuclear systems 

The theory of diatomic molecules 

Scattering theory 





172 
13 
14 
1S 


OUTLINE OF VOLUMES 1-4 xiii 


The method of second-quantization and radiation theory 
Relativistic quantum mechanics 

Some problems of quantum electrodynamics 
Fundamentals of the theory of elementary particles 


Subject index 


Volume 4 


Part VI 


Chapter 1 
2 


3 
4 
5 


Quantum Statistics and Physical Kinetics 


Quantum statistics 

Physical kinetics 

Kinetic theory of gases and gas-like systems 

Time correlation function method and Onsager’s theory 
Solid-state theory 


Subject index 





Contents of Volume 2


Part III  Statistical physics

Chapter 1  The basic concepts of the theory of probability

§ 1  Problems of statistical physics. Necessary results from classical and quantum mechanics
§ 2  Basic ideas of the theory of probability
§ 3  Mean values and fluctuations
§ 4  Normal distribution and moments
§ 5  The correlation function

Chapter 2  The kinetic theory of gases

§ 6  The simplest statistical system: an ideal gas
§ 7  The Maxwell distribution
§ 8  Collisions of molecules with the wall of the container. Pressure. The connection of the parameter a with the absolute temperature
§ 9  Properties of the Maxwell distribution
§ 10  The calculation of characteristic quantities
§ 11  Collisions of molecules with each other
§ 12  The mean free path

Chapter 3  Statistical distribution

§ 13  Quasi-independent systems
§ 14  Statistical distribution
§ 15  The probability of a state of a system
§ 16  The Gibbs distribution
§ 17  The statistical temperature
§ 18  The properties of the Gibbs distribution and statistical equilibrium
§ 19  Transition to classical statistics
§ 20  The monatomic gas as a whole

Chapter 4  Statistical and phenomenological thermodynamics

§ 21  The internal energy of a macroscopic system. The first and second laws of thermodynamics
§ 22  Work and pressure
§ 23  The change in the energy of a system in the general case of a quasi-static process
§ 24  Entropy and the basic thermodynamic equality
§ 25  The law of increase of entropy
§ 26  The basic thermodynamic inequality
§ 27  The maximum work to be obtained from thermal processes. The impossibility of constructing a perpetual motion machine of the second kind. Phenomenological definition of the entropy
§ 28  The maximum work in non-cyclic processes and the thermodynamic potentials
§ 29  Properties of the thermodynamic potentials
§ 30  Some thermodynamic relations
§ 31  Methods for the transformation of thermodynamic quantities
§ 32  The determination of thermodynamic quantities by the methods of statistical physics
§ 33  The determination of thermodynamic quantities from experimental data
§ 34  Throttling
§ 35  The third law of thermodynamics
§ 36  The statistical character of the second law of thermodynamics

Chapter 5  Ideal gases

§ 37  The distribution function for ideal gases
§ 38  The Maxwell-Boltzmann distribution and the Boltzmann distribution in a uniform field of force
§ 39  The calculation of the heat capacity of diatomic molecules by means of classical statistics. The law of equipartition of energy over degrees of freedom
§ 40  The thermodynamic functions of a system which can be in two quantum states
§ 41  Diatomic molecules
§ 42  Thermodynamic functions of diatomic gases
§ 43  The vibrational partition function and the contribution of vibrations to the energy and heat capacity
§ 44  The rotational partition function and the contribution of rotation to thermodynamic functions
§ 45  Polyatomic molecules

Chapter 6  Systems of interacting particles

§ 46  Interaction between molecules in non-ideal gases
§ 47  The correlation function method and its application to the theory of dense gases and liquids
§ 48  Equations for correlation functions
§ 49  The equation of state and the energy of a system of interacting particles

Chapter 7  Crystals

§ 50  Crystal structure and thermal motion in the one-dimensional crystal model
§ 51  Long waves in a three-dimensional crystal
§ 52  The partition function of a crystal
§ 53  The thermodynamic functions of a crystal
§ 54  Comparison of theory with experiment

Chapter 8  The theory of fluctuations

§ 55  Small fluctuations in macroscopic systems
§ 56  Brownian motion
§ 57  Fluctuations of thermodynamic quantities in a homogeneous system
§ 58  Effect of fluctuations on the sensitivity of measuring devices

Chapter 9  Systems with a variable number of particles

§ 59  The Gibbs grand canonical distribution
§ 60  The basic thermodynamic equality and the calculation of chemical potentials
§ 61  Conditions of phase equilibrium
§ 62  The equation of the phase equilibrium curve. Equilibrium between the vapour and the condensed phase
§ 63  Theory of phase transitions
§ 64  Phase equilibrium curves
§ 65  Surface tension and surface pressure
§ 66  Gas adsorption
§ 67  Chemical equilibrium in the gas phase
§ 68  The law of mass action
§ 69  Thermal dissociation of atoms

Chapter 10  Statistical distributions in quantum statistics and some of their applications

§ 70  The identity of elementary particles and the calculation of the partition function
§ 71  A second method of deriving the statistical distribution
§ 72  Quantum distributions for an ideal gas
§ 73  Black-body radiation
§ 74  The classical theory of black-body radiation
§ 75  The Planck formula
§ 76  The statistics of the photon gas
§ 77  The properties of liquid helium II
§ 78  Statistical theory of liquid helium II
§ 79  The electron gas at absolute zero
§ 80  The electron gas at low temperatures


Part IV  Electromagnetic processes in matter

Chapter 1  Electromagnetic fields in matter

§ 1  The derivation of the basic field equations
§ 2  The polarization of a medium in an electric field
§ 3  The mean current density and mean charge density in a medium
§ 4  The system of equations for the electromagnetic field in a medium
§ 5  Boundary conditions
§ 6  The limits of applicability of the system of constitutive equations
§ 7  The law of conservation of energy in a medium

Chapter 2  Electrostatics

§ 8  The electrostatic field
§ 9  The solution of electrostatic problems
§ 10  The method of images and method of inversion
§ 11  The energy of a system of conductors
§ 12  Dielectrics and conductors in an external electrostatic field
§ 13  The thermodynamic potentials of a dielectric and the dielectric susceptibility

Chapter 3  Direct electric current and the magnetic properties of matter

§ 14  Ohm's law
§ 15  A linear conductor carrying a constant current
§ 16  Direct current in a conducting medium
§ 17  The magnetic fields of direct currents. The Biot-Savart law
§ 18  The magnetization of magnetic materials and the magnetic moment
§ 19  Paramagnetic susceptibility
§ 20  Ferromagnetism. Spontaneous magnetization and hysteresis
§ 21  Superconductivity

Chapter 4  Quasistationary electromagnetic fields

§ 22  Conditions of quasistationarity
§ 23  The law of induction in moving conductors and media
§ 24  Maxwell's equations for quasistationary fields in integral form and their integration for the case of linear conductors
§ 25  The energy of the magnetic field of a system of quasistationary currents
§ 26  Coefficients of self-inductance and mutual inductance for non-linear conductors
§ 27  Lagrange's equations for a system of quasistationary currents
§ 28  The generalized ponderomotive forces in a system with moving current loops
§ 29  Fluctuations in conductors and the Nyquist formula
§ 30  Skin effect

Chapter 5  High-frequency fields

§ 31  Electromagnetic waves in a homogeneous isotropic medium
§ 32  Dispersion relations (the Kramers-Kronig formulae)
§ 33  The electromagnetic field in a medium with spatial and time dispersion
§ 34  The dispersion of light
§ 35  Geometrical optics
§ 36  Diffraction
§ 37  The reflection and refraction of electromagnetic waves at the boundary between media
§ 38  Wave-guides
§ 39  The passage of fast particles through matter

Chapter 6  Matter in the plasma state

§ 40  The general characteristics of plasma
§ 41  Equilibrium plasma
§ 42  Plasma in static electric and magnetic fields
§ 43  Magnetic isolation and the pinch effect
§ 44  The magnetic field in a moving plasma
§ 45  Magnetohydrodynamic waves
§ 46  Plasma in a high-frequency electric field

Appendix IV  Important integrals. Stirling's formula

Subject index


PART III 


STATISTICAL PHYSICS 











Chapter 1

The Basic Concepts of the Theory of Probability


§1. Problems of statistical physics. Necessary results from classical and 
quantum mechanics 


In this part we shall acquaint ourselves with the fundamentals of the 
atomic theory of macroscopic bodies. By macroscopic bodies we shall under- 
stand systems made up of a very large number of particles. It is usual to 
divide the atomic theory of macroscopic bodies into two parts — statistical 
physics and physical kinetics. 

In statistical physics one restricts oneself to the treatment of the proper- 
ties of macroscopic systems whose states do not change in time. The states 
in which a macroscopic system can remain for an indefinitely long time are 
called equilibrium states. Hence it can be said that the problem of statistical 
physics (sometimes called statistical mechanics or physical statistics) is the 
investigation of the properties and behaviour of macroscopic systems in a 
state of equilibrium on the basis of the known properties of the particles 
constituting them.

Macroscopic bodies can be made up of elementary particles — electrons, 
protons, neutrons and so on, or their combinations — nuclei, atoms and 
molecules. For brevity we shall call all the particles, i.e. molecules, atoms, 
electrons, protons etc., microparticles. The major constituents of bodies in 
ordinary physical conditions are the atoms or molecules representing their 
structural units. Only in a high-temperature plasma (see Part IV) does one 
need to take into account the possibility of the dissociation of atoms into 
electrons and nuclei. 

In statistical physics the properties and the laws of motion of the ele- 
mentary particles, atoms and molecules, are considered known. The prob- 
lem consists of finding the behaviour of systems containing very large num- 
bers of particles with known properties. 

The investigation of the properties of such macroscopic systems by the 
methods of statistical physics allows one to show an essential feature of such 
systems. This is that the behaviour of macroscopic systems is determined by 
laws of a special type, statistical laws. 

It turns out that the general equilibrium properties of systems depend re- 
latively little on the actual properties of the particles constituting the bodies 
and their laws of interaction. Hence in statistical physics one succeeds in 
establishing the general laws of behaviour of all macroscopic bodies in a state 
of equilibrium. In particular, statistical physics allows one to find the universal 
laws of thermal behaviour of macroscopic bodies (the laws of thermody- 
namics). However, the application of a number of general relations of statis- 
tical physics to concrete systems requires a certain amount of information, 
albeit very restricted, about the laws determining the behaviour of atomic 
systems. 

We have stressed more than once that classical physics turns out to be 
inapplicable in the realm of atomic phenomena. Hence the application of the 
laws of statistical physics to real systems is, in practice, impossible if one 
tries to restrict oneself to the classical concepts of the motion of atomic 
particles. Therefore we shall be obliged to present a certain amount of data 
from quantum mechanics. 

In addition to establishing the equilibrium properties of macroscopic 
bodies it is also of great interest in physics to find the behaviour of bodies 
whose states change in time. The study of the non-equilibrium properties of 
macroscopic systems is the problem of physical kinetics. It is clear that the 
laws of change of state of macroscopic systems — the laws of physical kine- 
tics — are substantially more complex than the laws determining the beha- 
viour of equilibrium systems. In physical kinetics it has not been possible to 
obtain rigorous laws for the change of state of systems in time which have 
universal validity. Hence at present there exist only laws for the behaviour 
of non-equilibrium systems of the simplest character. 

We now turn to the presentation of the results required from classical and 
quantum mechanics. For a clear description of the behaviour of mechanical 
systems, the motion of which is described by Hamilton's equations 

$$\dot{q}_i = \frac{\partial H}{\partial p_i}, \qquad \dot{p}_i = -\frac{\partial H}{\partial q_i} \qquad (i = 1, 2, \ldots, f) \tag{1.1}$$

(where p_i and q_i are respectively the generalized momenta and generalized 
coordinates, f is the number of degrees of freedom, equal to 3N, N is the number 
of material points in the system, and H is the Hamiltonian), use is often 
made of graphical methods. One of these methods is to represent the change of 
state of a mechanical system as a motion in phase space. 

This is the space in which the generalized coordinates and momenta are 
chosen as the coordinate axes. 

Consider first the case of a system with one degree of freedom. Let the 
dependence of the coordinate q and momentum p on time be known. Then 
graphs of q(t) and p(t), showing the change of these quantities in time, can be 
plotted. It is more convenient, however, to have a graph representing a se- 
quence of the states of a system, than to have individual graphs presenting 
the change of the position and momentum of the system. In order to obtain 
the graph of a sequence of states it is necessary to match the two graphs 
q(t) and p(t), eliminating the time from them. We choose the generalized 
coordinate q as the x-axis, and the generalized momentum p as the y-axis. 
The curve in fig. III.1 shows the change of states of the system. Thus, for 
example, at point 1 the system had a coordinate q_1 and a momentum p_1, at 
point 2 it had a coordinate q_2 and a momentum p_2 and so on. As the time 
increases the coordinate q of the system and its momentum vary according to 
the law shown by the graph. The space of fig. III.1 is called phase space. It 
should be strongly emphasized that phase space has nothing in common with 
real space and is a purely imaginary concept. 

To each point of phase space there corresponds a definite state of the 
system. A point whose position in phase space characterizes a state of the 
system is called the representative point. As the state of the system changes, 
i.e. as the position and momentum of the system vary in real space, the 
position of the representative point in phase space changes and the point 
describes a certain phase trajectory. The form of this trajectory is not at all 
like that of the real trajectory. However, the phase trajectory is connected 
with the real trajectory, and with the law of change of the momentum, by a 
one-to-one correspondence. 

[Fig. III.1]

In order to picture the above more clearly, we shall consider the motion 
of a linear harmonic oscillator moving under the action of a quasi-elastic force 
F = −kq about the origin q = 0. The equation of motion has the form 

$$m\ddot{q} = -kq. \tag{1.2}$$

It is easily integrated. We have 

$$q = A \sin(\omega t + \alpha). \tag{1.3}$$

In this case the momentum of the oscillator is equal to 

$$p = m\omega A \cos(\omega t + \alpha), \tag{1.4}$$

where A is the amplitude and α is the phase, determined by the initial 
conditions; the frequency is 

$$\omega = 2\pi\nu = \sqrt{k/m}.$$

Formula (1.3) represents the equation of the real motion. In order to find 
the trajectory of the representative point in phase space it is necessary to find 
the relation between p and q. Dividing (1.3) by A and (1.4) by mωA, then 
squaring and adding, we find 

$$\left(\frac{p}{m\omega A}\right)^2 + \left(\frac{q}{A}\right)^2 = 1.$$

This is the equation of an ellipse. Thus, for oscillations about the point q = 0 
with an amplitude A the representative point in phase space describes an 
ellipse with semi-axes a = A and b = mωA (fig. III.1). Its area S is equal to 
S = ∮ p dq = πab = πmωA². Also, calculating the energy of the oscillator ε, we 
have 

$$\varepsilon = \frac{p^2}{2m} + \frac{kq^2}{2} = \frac{1}{2} k A^2, \tag{1.5}$$

from which we find the important relation 

$$\varepsilon = \frac{\omega}{2\pi} \oint p\,dq = \nu \oint p\,dq. \tag{1.5'}$$
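
The relations (1.3)-(1.5') can be checked numerically. The following is a minimal sketch, not part of the original text: the function names and the values of m, k and A are arbitrary illustrative assumptions. It traces the representative point over one period, sums p dq by the trapezoid rule, and compares the result with πmωA² and with ε/ν.

```python
import math

def phase_point(t, m, k, A, alpha):
    """Return (q, p) of the oscillator at time t, following eqs. (1.3) and (1.4)."""
    omega = math.sqrt(k / m)
    q = A * math.sin(omega * t + alpha)
    p = m * omega * A * math.cos(omega * t + alpha)
    return q, p

def ellipse_area_numeric(m, k, A, alpha=0.0, steps=20000):
    """Approximate S = closed integral of p dq over one period (trapezoid rule)."""
    omega = math.sqrt(k / m)
    period = 2.0 * math.pi / omega
    area = 0.0
    q_prev, p_prev = phase_point(0.0, m, k, A, alpha)
    for i in range(1, steps + 1):
        q, p = phase_point(period * i / steps, m, k, A, alpha)
        area += 0.5 * (p + p_prev) * (q - q_prev)
        q_prev, p_prev = q, p
    return area

# Arbitrary illustrative values (assumptions, not taken from the text).
m, k, A = 2.0, 8.0, 0.5
omega = math.sqrt(k / m)
nu = omega / (2.0 * math.pi)

S = ellipse_area_numeric(m, k, A)
energy = 0.5 * k * A**2              # eq. (1.5)

print("numerical area   :", S)
print("pi * m * omega A2:", math.pi * m * omega * A**2)
print("energy           :", energy)
print("nu * S           :", nu * S)  # eq. (1.5')
```

Any point returned by `phase_point` should also satisfy the ellipse equation (p/mωA)² + (q/A)² = 1, whatever the values of t and α.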


The notion of phase space can also be introduced for a system with more 
than one degree of freedom. In this case the number of dimensions of the 
phase space is obviously equal to twice the number of degrees of freedom, 
since for each degree of freedom the coordinate is taken as one axis, and the 
momentum is taken as another axis. In the case of systems with a large num- 
ber of degrees of freedom the phase space has very many dimensions and 
cannot be shown graphically. Nevertheless, in this case the concept of phase 
space also proves to be of great use. In what follows we shall need the expres- 
sion for a volume element in phase space. Generalizing the usual definition of 
a volume element dV = dxdydz to the case of many dimensions, one can write 
the following expression for an element of phase space 


$$d\Gamma = dq_1\,dq_2 \ldots dq_{3N}\,dp_1\,dp_2 \ldots dp_{3N}, \tag{1.6}$$

where dq_i is the differential of the ith coordinate, and dp_i is the differential 
of the ith momentum corresponding to this coordinate (i = 1, 2, ..., 3N). The 
product on the right-hand side of the expression (1.6) involves the differen- 
tials of 3N generalized coordinates and 3N momenta as factors. 

In addition to the idea of phase space, use is often made of the ideas of 
configuration space and momentum space. 

The representative space of 3N dimensions, in which generalized coordi- 
nates are chosen as the axes, is called the configuration space. The set of all 
positions of the particles of a system is characterized by the position of a 
representative point in the configuration space. An element of the configura- 
tion space is 


$$dV_{\mathrm{conf}} = dq_1 \ldots dq_{3N}. \tag{1.7}$$

In momentum space the 3N components p_1, p_2, ..., p_{3N} of the momenta of 
the particles serve as the coordinate axes. A representative point characterizes 
the value of all the momenta of the particles. An element of the momentum 
space is equal to 

$$dV_{\mathrm{mom}} = dp_1 \ldots dp_{3N}, \tag{1.8}$$

so that 

$$d\Gamma = dV_{\mathrm{conf}}\,dV_{\mathrm{mom}}. \tag{1.9}$$


The notions of phase space, configuration space and momentum space are 
very useful for a clear representation of the laws of statistical physics. 

The motions of mechanical systems are determined by the so-called 
dynamical laws. A characteristic feature of a dynamical law is that, if the 
initial state of the system and the action of surrounding bodies on the 
system are known, the state of the system at any subsequent instant of time 
can be determined unambiguously. In other words, for given forces acting 
on the system the initial state of the system unambiguously determines its 
subsequent overall motion. 
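
The deterministic character of a dynamical law can be illustrated with a short numerical sketch. It is not part of the original text: the leapfrog integration scheme, the function names and the unit values of m and k are assumptions chosen for illustration. Hamilton's equations for a harmonic oscillator are integrated step by step, so that a given initial state (q0, p0) always leads to the same later state.

```python
import math

def hamilton_step(q, p, dt, m, k):
    """One leapfrog step for the Hamiltonian H = p^2/(2m) + k q^2/2."""
    p_half = p - 0.5 * dt * k * q          # dp/dt = -dH/dq = -k q
    q_new = q + dt * p_half / m            # dq/dt = +dH/dp = p / m
    p_new = p_half - 0.5 * dt * k * q_new
    return q_new, p_new

def evolve(q0, p0, t_end, m=1.0, k=1.0, steps=10000):
    """Propagate the initial state (q0, p0) to the time t_end."""
    q, p = q0, p0
    dt = t_end / steps
    for _ in range(steps):
        q, p = hamilton_step(q, p, dt, m, k)
    return q, p

# The same initial state always yields the same final state.
first = evolve(1.0, 0.0, 3.0)
second = evolve(1.0, 0.0, 3.0)
print(first, second)

# For m = k = 1 the exact solution is q = cos t, p = -sin t.
print(math.cos(3.0), -math.sin(3.0))
```

The numerical state agrees with the exact solution only up to the discretization error of the scheme, which decreases as `steps` is increased.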

General features characteristic of a dynamical law show up not only in 
mechanics but also in a wide range of other physical phenomena, in particular 
in electrodynamics. However, it would in principle be incorrect to state, as 
was done by a number of investigators, beginning with Laplace, that dynam- 
ical laws exhaust all forms of causality and mutual conditionality of pheno- 
mena in nature. 

As we shall see below, the behaviour of macroscopic bodies is not given by 
dynamical laws but is determined by laws of another type — statistical laws. 

In the following we shall assume that atoms, molecules and other particles 
forming macroscopic systems move according to the laws of quantum mechan- 
ics. The latter are presented in Part V of this book. Here we shall give, with- 
out proof, the most necessary data and relations. 

A distinctive feature of all microsystems (atoms, molecules, etc.) is the 
fact that in certain conditions they can be in discrete, or quantum, states. 
Experimental and theoretical (see Part V) investigation of the states of atoms 
and molecules has shown that their energy can assume a discrete series of 
values ε_1, ε_2, ε_3, .... The transition between these states, for example, between 
ε_1 and ε_2, takes place without passing through states with intermediate ener- 
gies between ε_1 and ε_2. States with intermediate energies do not exist. Thus, 
atoms can absorb or emit energy only in definite amounts or quanta. 

Not only the energy but also a number of other quantities characterizing 
the states of atomic systems have a discrete, quantum, character (for example, 
the angular momentum, which in the atom can assume a discrete series of 
values and can vary only in jumps). The energy and similar quantities are 
called quantized quantities, and the set of their possible values is called the 
spectrum. The quantized values of energy are also often called energy levels. 




The existence of quantized states radically contradicts the laws of classical 
mechanics, in which the states of a system can always change continuously. 

At the beginning of the development of atomic theory some formal rules 
were obtained, by means of which it was possible to choose from all states 
allowed according to classical mechanics, those which can actually be realized 
in an atom. These rules were called Bohr’s quantum conditions. 

Subsequent development of atomic physics showed that the concepts of 
classical physics needed an even more profound alteration. 

At present the laws of motion of microparticles are adequately described 
by quantum mechanics. Postponing our detailed acquaintance with them till 
Part V, we shall, in the meantime, make use of some of them. 


1.1. Quantized motion of a particle in a box 

The simplest quantum-mechanical system — a microparticle with mass 
m confined in a one-dimensional box with impenetrable walls — will be con- 
sidered in §8 of Part V. The motion of such a particle is restricted by the size 
of the box to the region 0 < x < a, where the potential energy is equal to 
zero. At the limits of the region, at x = 0 and x =a, the potential energy of 
repulsion is infinitely large and the particle cannot go out of the box. 

From the general propositions of quantum mechanics it follows that the 
energy and momentum of such a particle should have a discrete series of 
values. The allowed values of the energy and momentum are given by the 
formulae 


$$p_n = \frac{hn}{2a}, \tag{1.10}$$

$$\varepsilon_n = \frac{p_n^2}{2m} = \frac{h^2 n^2}{8ma^2}. \tag{1.11}$$

Here h is a quantity called the universal quantum constant or Planck constant. 
It is equal to h = 6.62 × 10⁻²⁷ erg sec. n is a quantity running through the 
integer values (n = 1, 2, 3, ...), and called the quantum number. 

Formula (1.11) shows that the energy levels of the particle form a discrete 
series or spectrum. The separation between neighbouring energy levels is 
equal to 

$$\Delta\varepsilon_n = \varepsilon_{n+1} - \varepsilon_n = \frac{h^2}{8ma^2}\,(2n+1). \tag{1.12}$$


We see that these separations are smaller the larger the mass of the particle and 
the dimension a of the region of motion. 

In the case of the motion of a particle in a region of sufficiently large di- 
mensions the separation between energy levels is so small that for practical 
purposes they form a continuous spectrum. Similarly any particle with a 
large, macroscopic mass has a continuous energy spectrum. 

The relative separation between energy levels is obviously equal to 

$$\frac{\varepsilon_{n+1} - \varepsilon_n}{\varepsilon_n} = \frac{2n+1}{n^2}. \tag{1.13}$$

For n ≫ 1 the relative separation between energy levels or the magnitude of 
the "steps" of the energy spectrum is equal to 

$$\frac{\varepsilon_{n+1} - \varepsilon_n}{\varepsilon_n} \approx \frac{2}{n}. \tag{1.14}$$

For large quantum numbers the relative separation between energy levels 
decreases rapidly with increasing n, so that the discrete character of the spec- 
trum is smoothed away. 
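
How quickly the large-n form approaches the exact relative separation can be seen from a few values. The following is a small illustrative sketch, not part of the original text; the function names are hypothetical.

```python
def relative_spacing(n):
    """Exact relative separation (2n + 1) / n^2 of neighbouring levels."""
    return (2 * n + 1) / n**2

def relative_spacing_large_n(n):
    """Approximation 2 / n, valid for n much greater than 1."""
    return 2 / n

for n in (1, 10, 1000, 10**8):
    exact = relative_spacing(n)
    approx = relative_spacing_large_n(n)
    print(n, exact, approx, exact / approx)
```

The ratio of the exact to the approximate value is 1 + 1/(2n), so the two expressions differ noticeably only for the lowest levels.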

Thus we see that the discreteness of the energy levels of a quantum particle 
becomes apparent: 1) for small mass, 2) for motion of the particle in a small 
region, and 3) for small quantum numbers. On the contrary, for large masses, 
for motion in a large region, and for large quantum numbers the quantization 
shows up relatively weakly. 

In order to have a feeling for the orders of magnitude, let us put in some 
numbers. For example, let a proton with mass m_p = 1.7 × 10⁻²⁴ g move in a 
box which has atomic dimensions (a = 10⁻⁸ cm). Expressing the energy in elec- 
tron-volts we find 

$$\varepsilon_n = 0.02\,n^2 \ \text{eV}$$

and 

$$\Delta\varepsilon_n = 0.02\,(2n+1) \ \text{eV}.$$

For small n's the separations between energy levels turn out to be of the same 
order of magnitude as the energies themselves (for example, for n = 3, ε_3 ≈ 0.2 
eV, Δε_3 ≈ 0.1 eV). 
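
The numbers quoted above follow directly from the formula ε_n = h²n²/(8ma²). A minimal sketch of the arithmetic in CGS units is given below; it is not part of the original text. The function names are hypothetical, and the erg-to-eV conversion factor is a standard value assumed here, since the text does not state it.

```python
H_PLANCK = 6.62e-27      # erg * sec, value quoted in the text
ERG_PER_EV = 1.602e-12   # erg per electron-volt (standard value, assumed here)

def box_energy_ev(n, mass_g, width_cm):
    """Energy h^2 n^2 / (8 m a^2) of level n, converted from erg to eV."""
    energy_erg = H_PLANCK**2 * n**2 / (8.0 * mass_g * width_cm**2)
    return energy_erg / ERG_PER_EV

def box_spacing_ev(n, mass_g, width_cm):
    """Separation between levels n + 1 and n, in eV."""
    return box_energy_ev(n + 1, mass_g, width_cm) - box_energy_ev(n, mass_g, width_cm)

M_PROTON = 1.7e-24       # g, value quoted in the text
A_ATOMIC = 1.0e-8        # cm, atomic dimension used in the text

for n in (1, 2, 3):
    print(n, box_energy_ev(n, M_PROTON, A_ATOMIC), box_spacing_ev(n, M_PROTON, A_ATOMIC))
```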


However, the situation is different when a proton moves in a region of 
macroscopic dimensions (for example, a = 1 cm). Then 

$$\varepsilon_n = 2 \times 10^{-18}\,n^2 \ \text{eV} \tag{1.15}$$

and the separation between neighbouring levels is 

$$\Delta\varepsilon_n = 2 \times 10^{-18}\,(2n+1) \ \text{eV}. \tag{1.16}$$

Let the proton have an energy of 2 × 10⁻² eV (as will be seen in what follows 
such an energy is possessed by atoms in thermal motion at normal temper- 
ature). Then from (1.15) we find n = 10⁸. For such values of n the relative 
separation between energy levels is negligibly small. Thus, when the proton 
moves in a region of sufficiently large dimensions the discrete, quantum 
character of its states is manifested very weakly. The same holds, to an even 
higher degree, for a macroscopic ball with a mass of, say, 1 g. The motion of 
such a ball is described with a high degree of accuracy by the laws of classical 
mechanics. 
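
The estimate of the quantum number for a proton in a macroscopic box can be reproduced by inverting the level formula. The sketch below is illustrative and not part of the original text; the function name is hypothetical and the erg-to-eV factor is an assumed standard value.

```python
import math

H_PLANCK = 6.62e-27      # erg * sec, value quoted in the text
ERG_PER_EV = 1.602e-12   # erg per electron-volt (standard value, assumed here)

def quantum_number_for_energy(energy_ev, mass_g, width_cm):
    """Solve energy = h^2 n^2 / (8 m a^2) for n, treated as a continuous quantity."""
    unit_ev = H_PLANCK**2 / (8.0 * mass_g * width_cm**2) / ERG_PER_EV
    return math.sqrt(energy_ev / unit_ev)

# Proton (1.7e-24 g) with thermal energy 2e-2 eV in a box of width 1 cm.
n_thermal = quantum_number_for_energy(2.0e-2, 1.7e-24, 1.0)
print("quantum number       :", n_thermal)
print("relative level spacing:", 2.0 / n_thermal)
```

With these inputs n comes out close to 10⁸ and the relative spacing 2/n is of order 10⁻⁸, which is why the spectrum appears continuous.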

The regularities appearing in the special case which we have considered, 
of a particle moving in a box, are of general character. 

In Part V it will be shown that classical mechanics represents the limiting 
case of quantum mechanics into which the latter goes over when effects pro- 
portional to the Planck constant can be disregarded. This is possible for 
phenomena which take place on a relatively large scale, when the masses of 
the particles, the dimensions of the region of motion and so on are sufficiently 
large. 

It turns out that the transition from quantum mechanics to classical me- 
chanics can be made in two ways. Simply, by assuming h = 0, we disregard all 
quantum effects: the existence of the wave properties of microparticles, 
the quantization of energy and other quantities, and so on. However, it can 
also be assumed that h is small but nevertheless different from zero. It turns 
out that in this approximation the wave properties of particles are shown very 
weakly. Particles can be assumed to be moving on definite trajectories, as in 
classical mechanics. However, the quantization of states still arises in the fact 
that only some of the classical trajectories are possible. Such an approximation 
is called quasi-classical (as distinct from the classical approximation, in which 
the quantum properties of particles are not taken into account at all). In order 
to see what the character of the restrictions imposed upon the classical trajec- 
tories consists of, we again turn to the example of a particle in a one-dimen- 
sional box. We assume that use can be made of the concepts of classical 
mechanics and that the particle can be considered as a material point moving 
between reflecting walls. The phase diagram in fig. III.2 shows the sequence 
of its states. We now take into account the quantization of states and out of 
all possible states we choose those which satisfy the quantization condition 
(1.10). Not all states p = const are possible, only those with a separation de- 
termined by the relation (1.10). Fig. III.2 shows the nth state (the solid line) 
and the (n−1)th state (the dotted line). The number of possible quantum 
states with a momentum lying between p_n = hn/2a and p_1 = h/2a is equal to n. 

[Fig. III.2]

We now calculate the area S_n in the phase plane, corresponding to these n 
states. Obviously, 

$$S_n = \oint p\,dx = 2p_n a = hn.$$

The integral ∮ p dx denotes the integral of p taken over the total cycle of mo- 
tion, i.e. over the area bounded by the solid straight lines in fig. III.2. This 
integral is equal to 

$$\oint p\,dx = \int_0^a p\,dx - \int_a^0 p\,dx = 2\int_0^a p\,dx.$$

If the lines corresponding to other possible states are drawn, then the entire 
surface will be divided into cells. It is easily seen that the area of all the cells 
is the same and equal to h. Indeed, the distance between possible states along 
the p-axis is equal to 

$$\frac{hn}{2a} - \frac{h(n-1)}{2a} = \frac{h}{2a}.$$

The area of a cell (hatched in the drawing) is equal to 2(h/2a)a = h. 
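
The division of the phase plane into cells of area h can be verified with a few lines of arithmetic. This is a minimal sketch, not part of the original text; the function names and the choice of box width are illustrative assumptions.

```python
import math

H_PLANCK = 6.62e-27   # erg * sec, value quoted in the text

def box_momentum(n, width_cm):
    """Allowed momentum h n / (2 a) of the n-th state, eq. (1.10)."""
    return H_PLANCK * n / (2.0 * width_cm)

def phase_area(n, width_cm):
    """Area 2 p_n a enclosed by the closed path of the n-th state in the (x, p) plane."""
    return 2.0 * box_momentum(n, width_cm) * width_cm

a = 1.0e-8  # cm, an atomic-size box chosen for illustration

# Area of the cell lying between the (n-1)th and nth paths, in units of h.
cells_in_units_of_h = [
    (phase_area(n, a) - phase_area(n - 1, a)) / H_PLANCK for n in range(1, 6)
]
print(cells_in_units_of_h)
```

The result does not depend on the width a, since the factor a in the area cancels the 1/a in the momentum.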

Thus, in the quasi-classical approximation to each possible state there cor- 
responds in phase space a cell having an area h. The stationary, possible 
states of a system are those for which the condition 


$$\oint p\,dx = nh \tag{1.17}$$


is fulfilled. This equality is called the Bohr condition of the old quantum 
theory. 

The example considered is typical, and the condition (1.17) is of a 
general character. In order to convince ourselves of this we shall consider 
another example — a linear oscillator. A diatomic molecule is an example of 
an oscillator performing small vibrations about the equilibrium position, as 
we shall see below. In quantum mechanics (see §10 of Part V) it turns out 
that the state of an oscillator is characterized by a quantum number k which 
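can take on a number of half-integer values 

$$k = n + \tfrac{1}{2}, \qquad k = \tfrac{1}{2}, \tfrac{3}{2}, \tfrac{5}{2}, \ldots$$

(n is an integer). The energy of an oscillator takes on a number of values (see 
(10.13) of Part V) 

$$\varepsilon_n = h\nu\left(n + \tfrac{1}{2}\right). \tag{1.18}$$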


When an oscillator passes from a given quantum state to a neighbouring, 
lower, one it emits radiation with a frequency 

$$\nu_{mn} = \frac{\varepsilon_m - \varepsilon_n}{h} = \nu,$$

equal to the natural frequency of oscillation of the classical oscillator *. 

Comparing (1.18) with formula (1.5') we see that the quantum condi- 
tion (1.18) selects as possible those states of an oscillator for which the re- 
lation 

$$\oint p\,dq = h\left(n + \tfrac{1}{2}\right) \tag{1.19}$$

holds. 

* In quantum mechanics it turns out that for an oscillator only the transitions be- 
tween neighbouring states are possible, so that m = n + 1. See Part V. 

All possible paths of the representative point of the oscillator, corresponding to quantum states $n_1, n_2, \ldots$, are represented by ellipses. The area of the ellipse corresponding to a state $n$ differs from that of the ellipse corresponding to a state $n-1$ by an amount

$$\oint_n p\,dx - \oint_{n-1} p\,dx = h,$$

where the subscript denotes the number of the state. In fig. III.2 this area is hatched.

We arrive at the conclusion that a cell in phase space the area of which is equal to $h$ corresponds to each quantum state of an oscillator.

Thus, in the quasi-classical approximation (for large quantum numbers or 
dimensions of the region of motion, and for large particle masses) the condi- 
tion of quantization of states consists of the fact that to each quantum state 
of an arbitrary system there corresponds a cell in phase space having an area 
h. It can be shown that the form of the cell is arbitrary. 
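For the oscillator the same statement can be verified numerically. The sketch below assumes the usual oscillator energy $p^2/2m + m(2\pi\nu)^2x^2/2$, so that a path of fixed energy is an ellipse in the $(x, p)$ plane; the units with $h = 1$ and the values of the mass and frequency are arbitrary illustrative choices.

```python
import math


def ellipse_area(energy, mass, nu):
    """Area of the phase-plane ellipse p^2/(2m) + m (2 pi nu)^2 x^2 / 2 = energy."""
    omega = 2.0 * math.pi * nu
    p_max = math.sqrt(2.0 * mass * energy)                 # semi-axis along p
    x_max = math.sqrt(2.0 * energy / (mass * omega ** 2))  # semi-axis along x
    return math.pi * p_max * x_max                         # equals energy / nu


def level_energy(n, h, nu):
    """Energy of the n-th oscillator state according to (1.18)."""
    return h * nu * (n + 0.5)


h, mass, nu = 1.0, 3.0, 2.0  # arbitrary units (assumptions for illustration)
for n in range(1, 5):
    upper = ellipse_area(level_energy(n, h, nu), mass, nu)
    lower = ellipse_area(level_energy(n - 1, h, nu), mass, nu)
    # The ring between neighbouring ellipses should have area h.
    print(n, upper - lower)
```

The ring area comes out equal to $h$ whatever the mass and frequency, in line with the remark that the form of the cell is arbitrary.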

Up to now we have confined ourselves to the consideration of systems 
having one degree of freedom. However, it turns out that the results obtained 
are of general character and can be applied to a system with an arbitrary 
number of degrees of freedom. The state of such a system is characterized by 
the definition of f quantum numbers (for examples see below). The phase 
space of a system with f degrees of freedom has 2f dimensions. 

Again as an illustrative example we shall consider the motion of a free 
particle in a box with ideally reflecting walls, but now having three dimen- 
sions. For simplicity we assume the box to have the form of a cube with 
edge a. Since the motion in any of the three directions is independent and all 
the directions are in principle equivalent, we can write for each of the com- 
ponents of the momentum 


$$|p_x| = \frac{hn_1}{2a}, \qquad |p_y| = \frac{hn_2}{2a}, \qquad |p_z| = \frac{hn_3}{2a}. \qquad (1.20)$$

The motion of a particle in the three dimensions is characterized by three quantum numbers $n_1$, $n_2$, $n_3$ which can take on a number of integer values. The energy of the particle is equal to


$$\varepsilon = \frac{p_x^2 + p_y^2 + p_z^2}{2m} = \frac{h^2}{8ma^2}\left(n_1^2 + n_2^2 + n_3^2\right). \qquad (1.21)$$


It is characterized by a number $n = \sqrt{n_1^2 + n_2^2 + n_3^2}$, but for given $n$ does not depend on the value of the separate contributions to this $n$ of each of the quantum numbers $n_1$, $n_2$, $n_3$. Because of this, several different quantum states can correspond to one and the same value of the energy. For example let $n_1 = 1$, $n_2 = 2$, $n_3 = 2$ and $n_1 = 2$, $n_2 = 1$, $n_3 = 2$. In both cases $n = 3$, so that both states have the same energy. If one and the same energy corresponds to several different states, then such states are called degenerate.

The number of states having the same energy is called the degeneracy or 
statistical weight. 
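The degeneracy of a level of the particle in a cubic box can be found by direct enumeration. The following sketch assumes that each quantum number runs over the positive integers 1, 2, 3, ...; the function name is introduced here for illustration only.

```python
from itertools import product
from math import isqrt


def states_with(n_squared):
    """All triplets (n1, n2, n3) of positive integers with n1^2 + n2^2 + n3^2 = n_squared."""
    limit = isqrt(n_squared)  # no quantum number can exceed sqrt(n_squared)
    return [
        triplet
        for triplet in product(range(1, limit + 1), repeat=3)
        if sum(k * k for k in triplet) == n_squared
    ]


# n = 3, i.e. n^2 = 9, is the example considered in the text.
for triplet in states_with(9):
    print(triplet)
print("statistical weight:", len(states_with(9)))
```

Besides the two triplets named in the text, the enumeration also returns (2, 2, 1), so under the stated assumption the statistical weight of the level with $n = 3$ is 3.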

The phase space of a particle in a box has six dimensions, so that it is im- 
possible to present it graphically. However, it can be said that it is split into 
three subspaces of two dimensions, each corresponding to the motion in the 
corresponding direction. A simple calculation leads us then to the conclusion 
that to each state (the triplet of numbers $n_1$, $n_2$, $n_3$) of the particle there corresponds a volume $h^3$.

In the most general case of an arbitrary system having $f$ degrees of freedom it can be shown that in the transition to the quasi-classical approximation, the motion of the system can be considered in the same way as in classical mechanics, but with a restriction imposed on the possible states: to each quantum state of a system with $f$ degrees of freedom in the quasi-classical approximation there corresponds a cell in its phase space having a volume $h^f$. The proof of this statement will be given in §41 of Part V.

In the exposition of statistical physics we shall need, in most cases, to 
consider the motion of relatively heavy particles (for example, molecules) 
moving in macroscopic volumes, as well as the behaviour of macroscopic 
bodies containing an enormous number of molecules. 

For such systems quantum effects play a relatively minor role. Neverthe- 
less, as will be seen later, they cannot be completely disregarded. Hence we 
shall take them into account in the quasi-classical approximation, based on 
the quantization rule quoted above. In other respects, if not specified 
otherwise, the motion of the systems will be considered classically. Of course, 
in certain cases when the mass of the system is sufficiently large, one can 
pass over from the quasi-classical treatment to the purely classical one and 
completely disregard quantum effects. This will be done in Chapter 3. How- 
ever, in general discussions and conclusions we shall assume the states of the 
system to be discrete. 




1.2. Number of quantum states 

The notion of the number of quantum states corresponding to the energy of a system lying in a given interval between $\varepsilon$ and $\varepsilon + \Delta\varepsilon$ will play an important role in our overall further development. We shall denote it by $\Omega(\varepsilon)\Delta\varepsilon$.

We shall first calculate this number for a particle moving freely in a box. According to the above, there corresponds to each state a volume $h^3$ of the phase space. Hence the number of states sought for will be found by calculating the volume of the phase space corresponding to the energy of the particle lying between $\varepsilon$ and $\varepsilon + \Delta\varepsilon$, and dividing it by $h^3$.

In calculating the phase volume we shall make use of the fact that in the quasi-classical approximation quantum jumps are small, and we shall assume the momentum to vary almost continuously. Then an element of the phase space can be written in the form (1.6). The volume of the phase space corresponding to an energy of the particle smaller than a given value is obtained by integrating the expression (1.6) with respect to all coordinates and all momenta satisfying the relation $0 < p < \sqrt{2m\varepsilon}$. Passing over to spherical coordinates, we can write


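$$\Gamma = \int dx\,dy\,dz \int dp_x\,dp_y\,dp_z = 4\pi V \int_0^{\sqrt{2m\varepsilon}} p^2\,dp = \left.\frac{4\pi p^3 V}{3}\right|_0^{\sqrt{2m\varepsilon}} = \frac{4\pi (2m)^{3/2} \varepsilon^{3/2} V}{3}. \qquad (1.22)$$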


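The volume of phase space corresponding to energies in the range between $\varepsilon$ and $\varepsilon + \Delta\varepsilon$ is

$$\Delta\Gamma = \frac{\partial\Gamma}{\partial\varepsilon}\,\Delta\varepsilon = 4\pi m V \sqrt{2m\varepsilon}\,\Delta\varepsilon. \qquad (1.23)$$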


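The number of states of the particle with energies which lie between $\varepsilon$ and $\varepsilon + \Delta\varepsilon$ is equal to

$$d\Omega = \Omega(\varepsilon)\,\Delta\varepsilon = \frac{1}{h^3}\frac{\partial\Gamma}{\partial\varepsilon}\,\Delta\varepsilon = \frac{4\pi m V \sqrt{2m\varepsilon}}{h^3}\,\Delta\varepsilon. \qquad (1.24)$$
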
Since all quantities in the quasi-classical approximation vary almost continuously, we shall often write $\delta\varepsilon$ instead of $\Delta\varepsilon$ in formula (1.24), assuming $\delta\varepsilon$ to be infinitely small.

It should be borne in mind that for large values of the energy (large quantum numbers) the number of states corresponding to even a very small interval $\Delta\varepsilon$ turns out to be enormous. Thus, for example, in the case of hydrogen atoms for $\Delta\varepsilon = 0.005$ eV, $\varepsilon = 0.025$ eV and $V = 1$ cm$^3$ the value of $\Omega\,\Delta\varepsilon$ turns out to be about $4 \times 10^{28}$. This value in practice differs very little from its classical limit, i.e. infinity (as $h \to 0$). Nevertheless, the finiteness of the number of quantum states, as we shall see later, plays an important role.
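The step from the phase volume (1.22) to the number of states in a thin energy shell (1.24) can be checked numerically by differencing (1.22) directly. This is only a sketch: the function names are introduced here, and the values of $m$, $V$, $h$ and $\varepsilon$ are arbitrary numbers in arbitrary units, not physical data.

```python
import math


def phase_volume(eps, m, V):
    """Gamma(eps) of (1.22): 4 pi (2m)^(3/2) eps^(3/2) V / 3."""
    return 4.0 * math.pi * (2.0 * m) ** 1.5 * eps ** 1.5 * V / 3.0


def states_in_shell(eps, d_eps, m, V, h):
    """Omega(eps) * d_eps of (1.24): 4 pi m V sqrt(2 m eps) d_eps / h^3."""
    return 4.0 * math.pi * m * V * math.sqrt(2.0 * m * eps) * d_eps / h ** 3


m, V, h = 1.0, 1.0, 1.0e-3   # arbitrary units (assumptions for illustration)
eps, d_eps = 2.0, 1.0e-6

# Number of cells of volume h^3 between the energy surfaces eps and eps + d_eps.
by_difference = (phase_volume(eps + d_eps, m, V) - phase_volume(eps, m, V)) / h ** 3
by_formula = states_in_shell(eps, d_eps, m, V, h)
print(by_difference, by_formula, by_difference / by_formula - 1.0)
```

The two numbers agree up to a relative correction of order $\Delta\varepsilon/\varepsilon$, which is what replacing the finite difference by the derivative in (1.23) neglects.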

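For an arbitrary system having $f$ degrees of freedom

$$\Delta\Gamma = \frac{\partial\Gamma}{\partial\varepsilon}\,\Delta\varepsilon, \qquad (1.25)$$

where $\Delta\Gamma$ is determined by formula (1.6). Correspondingly, for the number of states we have

$$\delta\Omega = \Omega(\varepsilon)\,\delta\varepsilon = \frac{1}{h^f}\frac{\partial\Gamma}{\partial\varepsilon}\,\delta\varepsilon, \qquad (1.26)$$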
or 
$$\Omega = \int d\Omega = \frac{\Gamma}{h^f}. \qquad (1.26')$$


The quantity $\Omega(\varepsilon)$ can be called the number density of states per unit energy range. In what follows we shall, for brevity, simply call $\Omega(\varepsilon)$ the number of states with given energy. This should not lead to misunderstanding.

For what follows we shall need one more sufficiently obvious property of $\Omega$. That is, if there is a system consisting of two independent parts, and the number of states of each of the parts is equal to $\Omega_1$ and $\Omega_2$ respectively, then the number of states of the combined system is equal to $\Omega = \Omega_1\Omega_2$. Indeed, the phase volume of a combined system is by definition equal to $d\Gamma = d\Gamma_1\,d\Gamma_2$, from which the above mentioned property follows immediately.

In the general case 


$$\Omega = \prod_i \Omega_i, \qquad (1.27)$$

where the product $\prod_i$ is taken with respect to all parts of the system.

We shall make use of this property of $\Omega$ in order to estimate the number of states of a system consisting, for example, of 100 independent particles moving in a volume $V = 1$ cm$^3$ with an energy within the interval $\Delta\varepsilon = 0.005$ eV, for $\varepsilon = 0.025$ eV and a mass equal to that of the proton. We find

$$\Omega = \Omega_1 \Omega_2 \cdots \Omega_{100} = (4 \times 10^{28})^{100} \approx 10^{2860}.$$
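The arithmetic of this estimate is conveniently done with logarithms, since the product itself lies far outside the range of ordinary floating-point numbers. The sketch below simply takes the single-particle figure $4 \times 10^{28}$ from the text as input; the function name is introduced here for illustration.

```python
import math


def log10_combined(states_per_part, parts):
    """log10 of the product (1.27) for `parts` identical independent parts."""
    return parts * math.log10(states_per_part)


exponent = log10_combined(4.0e28, 100)
print(exponent)  # decimal exponent of the combined number of states
```

The exponent comes out close to 2860, which reproduces the order of magnitude quoted above.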


1.3. Spin 

Up to now, in considering an individual microscopic particle (for example, 
an electron or proton), we have assumed that its state is completely charac- 
terized by the definition of three quantum numbers corresponding to three 
degrees of freedom. However, it turns out that for a complete characterization 
of the state of an elementary particle one more quantum number must be 
specified. 

A large number of particles possess, besides an orbital angular momentum, 
an additional, intrinsic angular momentum which is not associated with a 
spatial displacement. It is called the spin. The smallest angular momentum 
which can be possessed by an elementary particle is its spin. Most elementary particles (electrons, neutrons, protons) possess a spin $s$ equal to $\tfrac{1}{2}\,h/2\pi$. This means that the spin projection $s_z$ onto an arbitrary $z$-axis in space can have two values: $+\tfrac{1}{2}\,h/2\pi$ and $-\tfrac{1}{2}\,h/2\pi$. It is said that the spin coordinate takes on two values: $\tfrac{1}{2}$ and $-\tfrac{1}{2}$. The spin of complex particles can be integer as well as half-integer, depending on the elementary particles constituting them.


1.4. The principle of identity of elementary particles 
It turns out that the idea of the discrete character of the states of a system and, in particular, discrete energy levels, allows one to treat a wide range
of problems which were unsolved in classical physics. However, in addition it 
will be necessary also to take into account some other features of quantum 
systems which basically affect the behaviour of real macroscopic systems. In 
the following we shall often have to deal with systems consisting of numbers 
of identical particles (for example, electrons or atoms of given type). The 
laws of the behaviour of such systems in quantum mechanics differ sharply 
from classical laws. In classical physics, however similar physical bodies may 
be in their properties, in principle one can always follow their individual 
motion and distinguish them from one another. 

In quantum mechanics the situation is radically different. The reason for 
this lies in the fact that in quantum mechanics the principle of identity of 
like particles holds. According to this principle all particles of a given kind 
(for example electrons) entering into a given quantum-mechanical system are 
completely identical. In a system consisting of particles of a given kind the 
states do not change when particles are interchanged. 




For example, let the system consist of two electrons. The first electron is 
in a state characterized by a set of quantum numbers $n_1$, and the second electron is in a state with quantum numbers $n_2$. If the states of these electrons are exchanged, then we obtain a state of the system with the same
energy. At first sight the impression that the states of the system are two- 
fold degenerate may arise. However, from arguments based on the general 
propositions of quantum mechanics as well as from statistical considerations 
it is possible to state that this is not so (see §37 and §64 of Part V). 

The identity of particles of a given kind is so complete that, for example, 
the exchange of an electron in a given state by another is not a physical 
event. Hence it makes no sense to say that electron No. 1 is in a state 1 and electron No. 2 is in a state 2. It should be noted that a system of two electrons is in a definite state. From this statement, which follows directly from
a number of experimental facts, very important consequences for statistical 
physics are obtained. We shall acquaint ourselves with these in Chapter 5, 
and, in particular, in Chapter 10. 

The properties of systems of particles with integer spin differ so funda- 
mentally from those of systems of particles with half-integer spin that, 
reasoning strictly, it is necessary to speak of two different aspects of quantum 
mechanics: one for particles with integer spin and the other for those with 
half-integer spin. This is apparent from the following. For particles with half- 
integer spin the so-called Pauli exclusion principle holds: “only one particle 
with half-integer spin can be in each quantum state”. 

The exclusion principle is often formulated in a somewhat different way: “no more than two electrons with an opposite spin orientation can be in each quantum state”. The equivalence of the two formulations is obvious.

For particles with integer spin there is no restriction upon the number of 
identical particles occupying a given quantum state. Below it will be shown 
that this fact radically affects the statistical behaviour of systems consisting 
of particles with integer spin in comparison with those of half-integer spin. 


1.5. Energy levels of a system consisting of a large number of particles 

Consider a system consisting of $N$ identical atoms or molecules, assuming $N$ to be a large number. From general considerations it is clear that, since the
system is macroscopic, the internal energy of the system must vary continu- 
ously and quantum effects should not be of essential importance. We shall see 
now how a continuous energy level distribution arises when atoms with dis- 
crete energy levels are combined. 

For simplicity we shall consider two hydrogen atoms, each in the same non-degenerate state with energy $\varepsilon_a$, a large distance apart (in comparison with their dimensions). At an infinitely large distance the atoms do not interact with each other and the energy of the entire system $\varepsilon$ is equal to the sum of the energies of the two atoms, i.e.

$$\varepsilon = 2\varepsilon_a.$$


The state of the system will be two-fold degenerate: the state of the system 
when one electron is near the first nucleus and the second electron is near the 
second nucleus will possess the same energy as when the electrons are ex- 
changed, in view of the identity of electrons. 

Now draw the atoms nearer to a distance such that they may interact with 
each other. Calculation shows that when the interaction arises the energy level 
of the system splits into two levels lying close to each other. It is said that the 
interaction has removed the degeneracy. This will be treated in detail in §54 
of Part V. 

If one continues to bring the atoms nearer forming a molecule, the split- 
ting of the levels will increase (fig. III.3, the lower level). Very often the 
energy levels of each of the atoms are in themselves degenerate. Then there 
arise from one energy level not two but a larger number of levels of the system of interacting particles (fig. III.3, two upper levels). We see that the number of energy levels in a system of interacting particles turns out to be larger
than in a system of separated particles. The degeneracy of the levels is re- 
moved by the interaction. This result is not unique to a system of two atoms, 
but is of general character. If the system is one of $N$ atoms, characterized by $f$ quantum numbers, then in forming a system of strongly interacting particles,
for example a crystal, all the energy levels of individual atoms split into indi- 
vidual energy levels of the system as a whole. The latter, in general, are not 
degenerate. 

If the number of atoms in the system (or, more precisely, the number f) is 
large, then the total number of energy levels in the system turns out to be 
enormous. They approach each other rapidly with increasing energy (as is 
seen in fig. III.3; compare the first, second and third level), and for large f 
and large excitation energies they merge almost completely, forming continu- 
ous bands of allowed energy levels. 

From this it is clear that the statement about the continuous variation of 
the energy of a macroscopic body is not completely true. The lowest energy 
levels are discrete. As the energy increases energy levels approach each other 
rapidly and the energy of the system becomes continuous. We shall see later 
that the discreteness of the lowest energy levels in macroscopic systems 
fundamentally affects their behaviour at very low temperature, close to ab- 
solute zero. 


[Fig. III.3: energy-level diagram with columns H2 and H + H; level labels 1s + 1s and 1s + 3s,p,d.]


§2. Basic ideas of the theory of probability 


Our next problem is the study of statistical laws in systems consisting of a 
very large number of particles. 

This study will be based on the mathematical apparatus of the theory of 
probability. 

We shall not discuss the theory of probability in the form in which it is 
done in mathematical courses. We shall introduce from the very beginning a 







special definition of probability, which is completely equivalent to that 
adopted in the mathematical theory of probability but is more obvious and 
convenient in considering probability processes in statistical physics. This 
definition is closely associated with the concept of the relation between the 
probability and the frequency of occurrence of an event, used in every-day 
practice. 

Consider a completely arbitrary physical system which can be in different 
physical states. We assume at first that these states form a discrete series, and 
number them by the integers 1, 2, 3, .... We denote by $L$ any quantity depending on the state of the system. The quantity $L$ can represent, for example, the energy, volume, compressibility or any other quantity which is a function of the state and changes with a change of the state of the system. We consider $L$ to be a single-valued function of the state of the system, so that to each state 1, 2, 3, ... there corresponds a well-defined value of the quantity $L$: $L_1$, $L_2$, $L_3$, .... Conversely, if the quantity $L$ has a value $L_i$, then this means that the system must be in the $i$th state.

Assume that in the course of a very long time $T$ by virtue of diverse processes taking place in the system its states change in such a way that it passes through a sequence of different states 1, 2, 3, ..., $i$, .... For clarity we assume that in the course of the entire time $T$ the value of the quantity $L$ is measured regularly every $\Delta t$ seconds.

The system will exist in certain states for a long time and will be in them often, whereas other states will be occupied by it for only a negligible time. As a result of the measurements we shall obtain some value of $L$ more often than others. Let the system spend a time $t_i$, constituting a part of the total observation time $T$, in a certain state $i$. As a result of $N_i = t_i/\Delta t$ measurements it will be found that the quantity $L$ has the value $L_i$. The total number of measurements will obviously be equal to $N = T/\Delta t$. The limit of the ratio of the number of measurements giving a value of $L$ equal to $L_i$ to the total number of measurements, as the latter increases indefinitely, i.e.

$$w_i = \lim_{N\to\infty} \frac{N_i}{N}, \qquad (2.1)$$

will be called the probability of the $i$th state $w_i$ or the probability of the value $L_i$. In other words, the probability of the $i$th state $w_i$ is defined as the limit of the ratio of the time $t_i$ during which the system is in this state to the total observation time $T$ as the latter increases indefinitely:

$$w_i = \lim_{T\to\infty} \frac{t_i}{T}. \qquad (2.2)$$


It is necessary to see clearly that the probability of a given state $i$ and the probability for the quantity $L$ to have the value $L_i$ corresponding to the state $i$ are the same. Hence instead of (2.2) we can write

$$w_{L_i} = \lim_{T\to\infty} \frac{t_i}{T}, \qquad (2.3)$$

where $w_{L_i}$ is the probability for the quantity $L$ to have the value $L_i$.

The definition (2.2) implies the assumption that the limit of the ratio $t_i/T$ exists. The existence of this limit is ensured in the case where during the course of the entire observation time the system is in unchanged external conditions. If this is not so and the external conditions may vary continuously during the course of the measurements, the ratio $t_i/T$ may not tend to any limit. Thus, for example, if we considered a gas expanding indefinitely, then the system would not be in any state for a finite time interval. Its states would vary continuously during the course of the entire observation time. Hence the limit of the ratio $t_i/T$ would not even exist.

In practice one often encounters systems whose states do not vary in a discrete way but continuously. In other words, the quantities characterizing the state of the system often run through a continuous series of values. In this case the probability definition (2.3) does not have a direct meaning. The system will spend an infinitely short time in a state in which the quantity $L$ has a value exactly equal to $L_i$. Hence, as in other cases where one has to deal with continuously changing quantities, it is necessary to speak not of the value $L_i$ but of a certain interval of the value of this quantity. We therefore have to speak of the probability for the quantity $L$ to have a value lying in the interval between $L$ and $L + dL$. We shall denote this probability by $dw_L$. By definition,

$$dw_L = \lim_{T\to\infty} \frac{\Delta t_L}{T},$$

where $\Delta t_L$ is the time during the course of which the system is in the states corresponding to the values of $L$ lying between $L$ and $L + dL$. It is obvious that the time $\Delta t_L$ and, consequently, also the probability $dw_L$, are proportional to the value of the interval $dL$, other things being equal. Hence it is convenient to write $dw_L$ in the form

$$dw_L = \rho(L)\,dL, \qquad (2.4)$$

where $\rho(L)$ is the probability for the value of $L$ to lie in a certain “unit” interval. The function $\rho(L)$ is called the probability density. It replaces the probability itself in those cases when the value of $L$ can change continuously.
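The time-of-stay definition of a probability density can be tried out on a simple motion. The sketch below uses an example that is not in the text: a coordinate oscillating as $x(t) = A\cos 2\pi t$, for which the fraction of a period spent between $x$ and $x + dx$ is $dx/\bigl(\pi\sqrt{A^2 - x^2}\bigr)$. The function names and the number of time samples are illustrative assumptions.

```python
import math


def time_fraction(x_lo, x_hi, amplitude=1.0, samples=200_000):
    """Fraction of one period that x(t) = A cos(2 pi t) spends in [x_lo, x_hi)."""
    hits = 0
    for k in range(samples):
        t = (k + 0.5) / samples  # midpoints of equal time steps over one period
        x = amplitude * math.cos(2.0 * math.pi * t)
        if x_lo <= x < x_hi:
            hits += 1
    return hits / samples


def density(x, amplitude=1.0):
    """Probability density rho(x) = 1 / (pi sqrt(A^2 - x^2)) for the same motion."""
    return 1.0 / (math.pi * math.sqrt(amplitude ** 2 - x ** 2))


measured = time_fraction(0.0, 0.1)
estimate = density(0.05) * 0.1  # rho(L) dL evaluated at the middle of the interval
print(measured, estimate)
```

For a narrow interval the measured time fraction and the product $\rho(L)\,dL$ agree closely; the agreement worsens near the turning points, where the density varies rapidly across the interval.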

In addition to the definition (2.2), use is also made in statistical physics 
of another definition of probability. 

Instead of considering the changes of state of a system in time, one can 
imagine a set of systems which are identical with the given system but which 
are at a certain instant of time distributed randomly over all possible states. 
Such a set of systems is called a statistical ensemble. We shall determine the number of systems in an ensemble which are in different possible states. Out of the total number $N$ of systems in the ensemble let $N_i$ systems be in the $i$th state. Then the probability that a system will be detected in this state in a random measurement is equal to

$$w_i = \lim_{N\to\infty} \frac{N_i}{N}. \qquad (2.5)$$

The probability defined by formula (2.5) is called the probability with respect to the ensemble.

Now let there be a complex mechanical system moving on a certain tra- 
jectory in phase space. The complexity of the trajectory rules out the possi- 
bility of observing it directly and the phase points are distributed randomly 
in phase space. The probability of detecting the system in a given region of 
phase space is, according to (2.1), determined by the time of stay of the 
system in this region. Instead of observing the process of displacement of the 
representative point in phase space in time, one can consider an ensemble of 
systems differing in their initial conditions. If the initial conditions are 
distributed randomly, then the probability for the system to be found in the 
ith region of phase space is determined by the number of representative 
points in this region corresponding to different systems of the ensemble. It is 
natural to assume that the number of such points for an ensemble is propor- 
tional to the time of stay of an individual system in this region. The ratios 
figuring in the definitions (2.1) and (2.5) lead to the same value of the prob- 
ability. This assumption in statistical physics is called the ergodic hypothesis. 
We shall return to a discussion of it in §15. 

We shall make use of the two definitions of probability, considering them 
to be equivalent. 
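The equivalence of the two definitions can be illustrated on a toy model. The sketch below is not a mechanical system: it assumes a hypothetical two-state system that hops at random between states 0 and 1 with fixed probabilities per time step, and compares the time fraction (2.1) for one long history with the ensemble fraction (2.5) over many short histories. All names and numbers are illustrative assumptions.

```python
import random

# Hypothetical probabilities of leaving each state during one time step.
P_LEAVE = {0: 0.2, 1: 0.1}


def step(state, rng):
    """Advance the system by one time step."""
    return 1 - state if rng.random() < P_LEAVE[state] else state


def time_average(steps, rng):
    """Fraction of the observation time one system spends in state 0, as in (2.1)."""
    state, in_zero = 0, 0
    for _ in range(steps):
        state = step(state, rng)
        in_zero += (state == 0)
    return in_zero / steps


def ensemble_average(systems, steps_each, rng):
    """Fraction of an ensemble found in state 0 at one instant, as in (2.5)."""
    in_zero = 0
    for _ in range(systems):
        state = rng.choice((0, 1))  # random initial condition
        for _ in range(steps_each):
            state = step(state, rng)
        in_zero += (state == 0)
    return in_zero / systems


rng = random.Random(0)
print(time_average(100_000, rng), ensemble_average(5_000, 40, rng))
```

For these hopping probabilities both fractions should lie close to 1/3, up to statistical scatter. For this toy model the agreement can be proved; for a real mechanical system it is the content of the ergodic hypothesis.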







We now consider the definition of certain propositions of the theory of 
probability. 


2.1. The law of addition of probabilities 

Consider a physical system which can be in different states. If the system is in the state $i$, then it obviously cannot simultaneously be in any other state $k$. The occupation of the state $i$ and the occupation of the state $k$ by one and the same system are two events which rule each other out. Assume that the probabilities of the states $i$ and $k$ are known. For many purposes it is very important to find the probability for a system to be found in one or other of these states, no matter in which one. In other words, we want to find the probability for a system to be either in the state $i$ or in the state $k$. To find this probability we note that the total time of stay of the system in one or other of the states is equal to the sum of the times of stay in the $i$th and $k$th state. Hence the probability $w_{i+k}$ sought for is

$$w_{i+k} = \lim_{T\to\infty} \frac{t_i + t_k}{T} = \lim_{T\to\infty} \frac{t_i}{T} + \lim_{T\to\infty} \frac{t_k}{T} = w_i + w_k. \qquad (2.6)$$





Formula (2.6) expresses the law (theorem) of addition of probabilities: 
The probability for a system to be in one or other of two states which are 
mutually exclusive, is equal to the sum of the probabilities for the system to 
be in each of the states separately. 

The theorem of addition of probabilities can easily be applied to the case 
of three states or a larger number of states. In the general case the probability 
for a system to be in one or other of the mutually exclusive states i, k, l, ... is 
equal to 


$$w_{i+k+l+\ldots} = \sum_i w_i, \qquad (2.7)$$


where the summation is carried out over all states i, k, l, ... of the system. 

From the theorem of addition of probabilities there follows an important 
consequence which we shall often use in what follows. 

Assume that the state of a system is characterized by two quantities L and 
M which are independent of each other. For example, L can represent the 
velocity of motion of the system in one direction, while M can represent the 
velocity of motion of the system in another direction, or L and M can be 
respectively the energy and volume of an ideal gas, and so on. Let L have the 
possible values $L_1, L_2, \ldots, L_i, \ldots$, and $M$ the values $M_1, M_2, \ldots, M_k, \ldots$. We assume that the probability for the system to be in a state in which $L$ is equal to $L_i$ and $M$ is equal to $M_k$ is known. Let this probability be equal to $w_{L_i M_k}$.


We now find the probability $w_{L_i}$ for the system to have the value $L_i$ for any value of the quantity $M$. According to the theorem of addition of probabilities we can write

$$w_{L_i} = w_{L_i M_1} + w_{L_i M_2} + \ldots + w_{L_i M_k} + \ldots = \sum_k w_{L_i M_k}, \qquad (2.8)$$

where the summation is carried out over all values of the quantity $M$. In the case when the quantities $L$ and $M$ vary continuously the summation in formula (2.8) should be replaced by integration.
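Formula (2.8) amounts to summing a table of joint probabilities along one of its directions. The sketch below uses a small made-up table with two values of $L$ and two values of $M$; the numbers and the names are hypothetical and serve only to show the bookkeeping.

```python
from fractions import Fraction as F

# Hypothetical joint probabilities w_{L_i M_k}; the four entries sum to unity.
joint = {
    ("L1", "M1"): F(1, 8), ("L1", "M2"): F(3, 8),
    ("L2", "M1"): F(1, 4), ("L2", "M2"): F(1, 4),
}


def probability_of_L(table, L_value):
    """w_{L_i} of (2.8): sum of w_{L_i M_k} over all values of M."""
    return sum(w for (L, M), w in table.items() if L == L_value)


print(probability_of_L(joint, "L1"))
print(probability_of_L(joint, "L2"))
```

Exact fractions are used so that the sums can be compared without rounding error.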


2.2. Statistical independence and the law of multiplication of probabilities 

The second important proposition of the theory of probability is called 
the theorem or law of multiplication of probabilities. 

Consider two physical systems and assume that they are completely independent of each other. We denote by $w_{L_i}$ the probability for the first system to be in the state characterized by the value $L_i$, and denote by $w_{M_k}$ the probability for the second system to be in the state characterized by the value $M_k$. The probabilities $w_{L_i}$ and $w_{M_k}$ are independent if the probability for the first system to be in the state $i$ does not depend on whether or not the second system is in the state $k$.

The law of multiplication of probabilities for statistically independent systems reads: “the probability of the simultaneous occupation of the $i$th state, in which $L = L_i$, by the first system, and the $k$th state, in which $M = M_k$, by the second system, is equal to the product of the probabilities $w_{L_i}$ and $w_{M_k}$”, i.e.

$$w_{L_i M_k} = w_{L_i} w_{M_k}. \qquad (2.9)$$

The law of multiplication represents a strict definition of the statistical independence of two systems.

The above reasoning can be applied to two arbitrary independent physical systems. Let the first of them spend a time $T w_{L_i}$ in the state with $L = L_i$. If this time is sufficiently long, then it can be used as the time of observation of the states of the second system. Out of the entire time of observation of the second system ($T w_{L_i}$) it spends a part equal to $(T w_{L_i}) w_{M_k}$ in the state with $M = M_k$. The probability that the first system is in the state with $L = L_i$ and that the second system is simultaneously in the state with $M = M_k$ is equal to

$$\lim_{T\to\infty} \frac{T w_{L_i} w_{M_k}}{T} = w_{L_i} w_{M_k},$$

which accounts for the law of multiplication.
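The time-of-stay argument can be followed step by step on a deliberately artificial example. The sketch below assumes two systems that run through their states periodically, with periods 3 and 4 time steps; because the periods have no common factor, every combination of phases occurs equally often over 12 steps, which plays the role of independence here. The state labels and sequences are hypothetical.

```python
from fractions import Fraction

first = ["L1", "L1", "L2"]         # first system repeats with period 3
second = ["M1", "M2", "M2", "M2"]  # second system repeats with period 4


def time_fraction(predicate, steps):
    """Fraction of `steps` time steps during which predicate(L, M) holds."""
    hits = sum(1 for t in range(steps) if predicate(first[t % 3], second[t % 4]))
    return Fraction(hits, steps)


T = 12  # one full common period of the two systems
w_L1 = time_fraction(lambda L, M: L == "L1", T)
w_M1 = time_fraction(lambda L, M: M == "M1", T)
w_joint = time_fraction(lambda L, M: L == "L1" and M == "M1", T)
print(w_L1, w_M1, w_joint, w_L1 * w_M1)
```

The joint time fraction equals the product of the separate fractions, as (2.9) requires. If the two periods shared a common factor, the phases would be locked together and the product rule would in general fail, which is one way to see that (2.9) is a genuine condition on the pair of systems.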

An important consequence of the law of addition of probabilities is the 
very obvious statement that the probability for a system to be in some al- 
lowed state is equal to unity. This means that we shall surely find our system 
in one of the states. The validity of the statement is seen from the fact that

$$\sum_i w_i = \sum_i \lim_{T\to\infty} \frac{t_i}{T} = \lim_{T\to\infty} \frac{\sum_i t_i}{T} = 1, \qquad (2.10)$$

since, by definition, $T = \sum_i t_i$.
If the quantities characterizing the states of the system vary continuously, 
then instead of (2.10) we can write 


$$\int dw = 1. \qquad (2.11)$$


In what follows we shall always assume probabilities to be normalized in 
such a way that the sum of all the probabilities is equal to unity. In this case 
we shall speak of the probability normalized to unity. In those cases when the initial probability distribution is not normalized, we shall always normalize it to unity.
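Normalizing a distribution to unity means dividing every weight by the sum of all the weights. A minimal sketch, with a made-up list of unnormalized weights and a function name introduced for illustration:

```python
def normalize(weights):
    """Divide non-negative weights by their sum so that they add up to unity."""
    total = sum(weights)
    if total <= 0:
        raise ValueError("the weights must have a positive sum")
    return [w / total for w in weights]


relative_times = [2.0, 1.0, 1.0]  # hypothetical unnormalized times of stay
probabilities = normalize(relative_times)
print(probabilities, sum(probabilities))
```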


§3. Mean values and fluctuations 


It is now necessary to give a definition of the concept of the statistical 
mean value of a quantity which depends on the state of the system. The idea 
of the statistical mean will play a basic role in the subsequent discussion. 
The statistical mean is a natural generalization of the concept of the arith- 
metical mean which we are used to. 

Suppose that we have a number of values of a certain quantity, for exam- 
ple the velocity of an arbitrary body. We understand the arithmetical mean to 
be the ratio of the sum of all these values to their total number, i.e. a sum of the form

$$\frac{\sum_i L_i N_i}{N},$$

where $L_i$ is a value of $L$, $N_i$ is the number of measurements leading to this value, and $N$ is the total number of measurements.

By the statistical mean of a quantity $L$, which we shall denote by $\bar{L}$, we mean the limit of the ratio

$$\bar{L} = \lim_{N\to\infty} \frac{\sum_i L_i N_i}{N}.$$

Since $N_i = t_i/\Delta t$ and $N = T/\Delta t$, we can write

$$\bar{L} = \lim_{T\to\infty} \frac{\sum_i L_i t_i}{T} = \sum_i L_i w_{L_i}, \qquad (3.1)$$

where $t_i$ is the time during which the system is in the $i$th state, in which the quantity $L$ has the value $L_i$, $T$ is the total time of observation, and $w_{L_i}$ is the probability that the quantity $L$ has the value $L_i$. The summation is carried out over all states of the system. Formula (3.1) is the definition of the statistical mean. In what follows we shall for brevity omit the term “statistical” and say simply “mean value”.

In the case of systems whose states vary continuously, so that instead of the probability $w_i$ we must write $dw_L$, formula (3.1) should be rewritten in the form

$$\bar{L} = \int L\,dw_L = \int L\,\rho(L)\,dL, \qquad (3.2)$$

where the integration is carried out over all possible states of the system.

In calculating mean values we shall make use of the following simple
theorem: if there are two quantities $L$ and $M$ which are functions of the state,
then the mean value of their sum $\overline{L+M}$ is equal to the sum of the means
$\bar{L} + \bar{M}$. For the proof we note that, by definition,

$$\overline{L+M} = \sum_i (L_i + M_i) w_i = \sum_i L_i w_i + \sum_i M_i w_i = \bar{L} + \bar{M}.$$




Assume that we know the distribution of the probabilities $w_i$ for a
quantity $L$ to take the values $L_i$. Then by means of formula (3.1) we can
find the mean value $\bar{L}$ of this quantity.

Thus, for example, knowing the probability distribution for different values
of the energy of a system, one can calculate the mean value of the energy
of this system. The question naturally arises: to what degree does the
mean value characterize the real value of this quantity? In the example
presented it can be asked to what degree the indication of the mean
energy can characterize the actual energy of the system. It is clear that, if the
deviations of the quantity from its mean value are sufficiently small, then the
true value of the quantity can always, without much error, be replaced by its
mean value.

In order to give a precise answer it is necessary to introduce a quantity
which can characterize the deviation of the true value of a quantity $L$ from
its mean value $\bar{L}$.

At first sight it might seem that the difference $L - \bar{L}$ could be chosen as
such a criterion. However, that is not so. The deviation of a quantity from
its mean value can be large, but will nevertheless play a negligible role if it
occurs relatively seldom. If, for example, considerable deviations of an
energy from its mean value occur so seldom that the time interval between two
successive deviations is very large in comparison with the observation time,
then such deviations will not be noticed at all in the course of the time of
observation. However, if deviations from the mean are not very large but
occur often, then in this case the indication of merely the mean value $\bar{L}$ does
not characterize sufficiently the true value of the quantity $L$. One could try
to choose as the criterion the mean value of the difference $L - \bar{L}$, i.e. $\overline{L - \bar{L}}$.
However, this quantity is exactly equal to zero:

$$\overline{L - \bar{L}} = \bar{L} - \bar{\bar{L}} = \bar{L} - \bar{L} = 0.$$

(It should be noted that it is not necessary to carry out the secondary
averaging of $\bar{L}$, since the mean $\bar{L}$ is a constant quantity. But the mean value of a
constant quantity is obviously equal to the quantity itself.)

The fact that the quantity $\overline{L - \bar{L}}$ is equal to zero expresses the fact that
the deviations of $L$ from $\bar{L}$ in the two directions, the direction of larger values
and the direction of smaller values, balance one another on the average. In order that
deviations from $\bar{L}$ in the two directions may not cancel but be added, it is
necessary to choose as the criterion not the mean difference $\overline{\Delta L} = \overline{L - \bar{L}}$ but
the mean square of the difference, $\overline{(\Delta L)^2}$. The values of $\overline{(\Delta L)^2}$ will be larger
the larger the deviations of $L$ from $\bar{L}$, independently of the signs of the
deviations, and the more often these deviations occur. The quantity $\overline{(\Delta L)^2} =
\overline{(L-\bar{L})^2}$ is called the mean square deviation. The mean square deviation is an
essentially positive quantity. It takes on its lowest possible value, zero, only
in the case when $L$ is always exactly equal to its mean value $\bar{L}$. Every
deviation from the mean gives a contribution to the value of $\overline{(\Delta L)^2}$.

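From the definition of $\overline{(\Delta L)^2}$ we have

$$\overline{(\Delta L)^2} = \overline{(L-\bar{L})^2} = \overline{L^2 - 2L\bar{L} + (\bar{L})^2}
= \overline{L^2} - 2\bar{L}\,\bar{L} + (\bar{L})^2 = \overline{L^2} - (\bar{L})^2. \tag{3.3}$$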
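The two facts just used, that the mean deviation vanishes and that the mean square deviation equals the mean of the square minus the square of the mean, can be checked numerically. The following is a minimal sketch on an assumed three-state distribution; the values and probabilities are illustrative only.

```python
def mean(values, probs):
    """Probability-weighted mean, sum_i x_i * w_i."""
    return sum(x * w for x, w in zip(values, probs))


# Assumed illustrative distribution (probabilities sum to unity).
values = [1.0, 2.0, 4.0]
probs = [0.5, 0.3, 0.2]

L_bar = mean(values, probs)

# Mean of the deviation L - L_bar: vanishes identically.
mean_deviation = mean([x - L_bar for x in values], probs)

# Mean square deviation computed directly from its definition ...
msd_direct = mean([(x - L_bar) ** 2 for x in values], probs)

# ... and from the identity: mean of L^2 minus the square of the mean of L.
msd_identity = mean([x * x for x in values], probs) - L_bar ** 2

print(mean_deviation, msd_direct, msd_identity)
```

Both routes give the same mean square deviation, while the plain mean deviation is zero to rounding accuracy.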


It is clear that for the absolute deviation to be small it is necessary that large
departures of $L$ from $\bar{L}$ should have small probability, i.e. that they should
occur relatively seldom. Thus, the quantity $\overline{(\Delta L)^2}$ can characterize the
deviation of $L$ from its mean value. If $\overline{(\Delta L)^2}$ is small, then the value of $L$ is always
close to its mean value. In this case the mean value $\bar{L}$ can with sufficient
accuracy characterize the value of $L$. The relative error which we shall incur
by replacing $L$ by its mean value $\bar{L}$ can be estimated from the value of the
quantity $\delta_L = \sqrt{\overline{(\Delta L)^2}}/\bar{L}$, called the relative fluctuation.

If $\delta_L \ll 1$, then this means that the value of $L$ is on the average so close
to $\bar{L}$ that replacing $L$ by $\bar{L}$ does not introduce any considerable error. We
shall now prove a theorem of basic importance for subsequent developments.
This theorem reads:

If there is a system consisting of $N$ independent parts, then the relative
fluctuation of any additive function * of the state, $L$, is inversely
proportional to the square root of the number of parts $N$, i.e.

$$\delta_L \sim \frac{1}{\sqrt{N}}. \tag{3.4}$$

In order to prove this theorem we shall calculate the value of $\delta_L$. By the
definition of an additive quantity, $L = \sum_{k=1}^{N} L^{(k)}$, where $L^{(k)}$ is the value of
the quantity $L$ for the $k$th independent part of the system (in order to avoid
misunderstandings, we write the index characterizing the number of the
part as a superscript), and the summation is carried out over all independent
parts constituting the system.


* By an additive function is meant a function possessing the property that the value 
of this function for a complex system is equal to the sum of its values for all independent 
parts. 


From the law of addition of probabilities it follows that

$$\bar{L} = \sum_{k=1}^{N} \overline{L^{(k)}}. \tag{3.5}$$

We now calculate the mean square deviation of $L$, i.e. the quantity

$$\overline{(\Delta L)^2} = \overline{\left[\Delta \sum_{k=1}^{N} L^{(k)}\right]^2}.$$





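For simplicity we assume at first that the system consists of only two
independent parts. Then we have

$$\overline{[\Delta(L_1 + L_2)]^2} = \overline{(\Delta L_1)^2} + 2\,\overline{\Delta L_1 \cdot \Delta L_2} + \overline{(\Delta L_2)^2}.$$

Now, since $L_1$ and $L_2$ are independent quantities, the mean of the product
$(\Delta L_1)(\Delta L_2)$ is equal to the product of the means

$$\overline{(\Delta L_1)(\Delta L_2)} = \overline{(\Delta L_1)} \cdot \overline{(\Delta L_2)}.$$

But $\overline{(\Delta L_1)} = \overline{(\Delta L_2)} = 0$, so that

$$\overline{[\Delta(L_1 + L_2)]^2} = \overline{(\Delta L_1)^2} + \overline{(\Delta L_2)^2},$$

i.e. the mean square deviation of a system of two independent quantities is
equal to the sum of the mean square deviations of these quantities. Generalizing
this to the case of $N$ independent parts constituting a system we can write

$$\overline{\left[\Delta \sum_{k=1}^{N} L^{(k)}\right]^2} = \sum_{k=1}^{N} \overline{(\Delta L^{(k)})^2}. \tag{3.6}$$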


The number of terms in the sum (3.6) is equal to the number of independent
parts in the system, i.e. to $N$. We assume that the fluctuations in different
independent parts of the system are of the same order of magnitude
(since all the parts of the system are equivalent). Then the value of the sum
on the right-hand side of formula (3.6) is proportional to the number of
terms, i.e. to the value of $N$, so that

$$\overline{\left[\Delta \sum_{k=1}^{N} L^{(k)}\right]^2} \sim N. \tag{3.7}$$


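The mean value $\bar{L}$ is also proportional to the number of terms in the sum
of formula (3.5), i.e. to $N$. Hence the relative fluctuation of the quantity $L$ is
equal to

$$\delta_L = \frac{\sqrt{\overline{(\Delta L)^2}}}{\bar{L}}
= \frac{\sqrt{\overline{\left[\Delta \sum_{k=1}^{N} L^{(k)}\right]^2}}}{\sum_{k=1}^{N} \overline{L^{(k)}}}
\sim \frac{\sqrt{N}}{N} = \frac{1}{\sqrt{N}}. \tag{3.8}$$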





Thus, the theorem is proved. 
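The theorem can be illustrated on a toy model. The sketch below assumes that every independent part takes the value 0 or 1 with equal probability (an illustrative choice, not taken from the text), builds the exact distribution of the additive quantity by repeated convolution, and evaluates the relative fluctuation from that distribution. The function names are hypothetical.

```python
import math


def convolve(dist_a, dist_b):
    """Distribution of the sum of two independent quantities (dict: value -> probability)."""
    out = {}
    for a, pa in dist_a.items():
        for b, pb in dist_b.items():
            out[a + b] = out.get(a + b, 0.0) + pa * pb
    return out


def relative_fluctuation(dist):
    """sqrt(mean square deviation) divided by the mean, cf. formula (3.8)."""
    mean = sum(v * p for v, p in dist.items())
    msd = sum((v - mean) ** 2 * p for v, p in dist.items())
    return math.sqrt(msd) / mean


# Assumed single-part distribution: value 0 or 1, each with probability 1/2.
PART = {0: 0.5, 1: 0.5}


def system_distribution(n_parts):
    """Exact distribution of L = sum of n_parts independent, identical parts."""
    total = {0: 1.0}
    for _ in range(n_parts):
        total = convolve(total, PART)
    return total


deltas = {n: relative_fluctuation(system_distribution(n)) for n in (1, 4, 16, 64)}
for n, delta in deltas.items():
    print(n, delta, 1.0 / math.sqrt(n))
```

For this particular model the single-part relative fluctuation is unity, so the computed values coincide with $1/\sqrt{N}$; for other single-part distributions only the proportionality is expected.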

As has already been pointed out in the introduction, the problem of statis- 
tical physics is the study of the properties of macroscopic systems consisting 
of an enormous number of particles — atoms or molecules. We shall see later 
that the methods of investigating the properties of such systems are based on 
the application of statistical laws. The application of these laws allows one to 
find the mean values of different quantities characterizing the state of a sys- 
tem. From the theorem presented above it follows that the relative fluctua- 
tions of all physical quantities whose value for the entire system is equal to 
the sum of their values for all the particles are inversely proportional to the 
square root of the number of particles. Since the number of particles in a 
macroscopic system is expressed as a rule by enormous numbers (of the order
of $6 \times 10^{23}$), the relative fluctuation of any additive quantity turns out to
be practically equal to zero. This means that all additive quantities have values 
very close to the means. Therefore the replacement of actual quantities by 
their mean values can be carried out with a high degree of accuracy. The 
mean values of different quantities, calculated on the basis of the laws of 
statistical physics, agree to a very high degree of accuracy with their true 
values. This means that predictions based on probability in practice assume a 
completely reliable character. 

Imagine, for example, that we want to find the pressure exerted by a mole
of a gas on the walls of the container in which it is confined. By means of
the propositions of statistical physics it turns out that it is possible to
calculate the mean pressure $\bar{p}$ of the gas. The true pressure $p$ exerted on the wall
is not equal to the mean pressure. Depending on the complex laws of motion
of molecules in a gas, it will take on different values varying rapidly in time
(fig. II.4), which can be both larger and smaller than the mean pressure.

Fig. II.4

Nevertheless, the theorem on fluctuations shows that the relative error
which we shall incur by replacing the true pressure varying in time by its
mean value (shown in fig. II.4 by the horizontal line) will be of the order of
$\delta_p \sim (6 \times 10^{23})^{-1/2}$, i.e. the error will amount to $\sim 10^{-10}\,\%$.
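The order of magnitude quoted for one mole of gas can be reproduced with a one-line estimate. This is only the arithmetic of the $1/\sqrt{N}$ law; the proportionality constant is assumed to be of order unity.

```python
# Number of molecules in one mole (order of magnitude used in the text).
n_particles = 6e23

# Relative fluctuation from the 1/sqrt(N) law, prefactor assumed to be ~1.
delta_p = n_particles ** -0.5

# The same number expressed in per cent.
delta_p_percent = 100.0 * delta_p

print(delta_p, delta_p_percent)
```

The result is of the order of $10^{-12}$, i.e. about $10^{-10}$ per cent.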

It is obvious that such an error lies far beyond the limits of accuracy of 
measurements by the best manometers, and is of no practical importance. 
Hence we can make use of the mean value of pressure, without being con- 
cerned that any error will be committed. The same holds also for other func- 
tions of the state of the system. Examples of these functions will be given 
later. 


§4. Normal distribution and moments 


Returning to the discussion of the properties of the additive quantity
$L = \sum_{k=1}^{N} L^{(k)}$, it should be noted that in the theory of probability the
following very important theorem, called the central limit theorem, has been proved:
as the number of terms in the sum increases (as $N \to \infty$) the statistical
probability distribution for the quantity $L$ tends to the normal (Gaussian)
distribution having the form:

$$\rho(L)\,dL = \frac{1}{\sqrt{2\pi\,\overline{(\Delta L)^2}}} \exp\left[-\frac{(L-\bar{L})^2}{2\,\overline{(\Delta L)^2}}\right] dL. \tag{4.1}$$


Applied to physical systems this means that the normal distribution for addi- 
tive quantities, for example energy, will be established in any physical system 
containing a sufficiently large number of independent particles. We shall not 
present the proof of the central limit theorem, but shall confine ourselves to 
the consideration of a characteristic example. 




Consider a system of $N$ identical statistically independent particles. Let the
probability for one of the particles to get into a certain state be equal to $p$.
We shall find the probability that $n$ particles be found in this state. For this
we write the probability that $n$ particles be found in the given state and that the
remaining $(N-n)$ particles be found in other states in the form $p^n(1-p)^{N-n}$
(on the basis of (2.9), since the particles are independent).

The number of ways in which $n$ arbitrary particles can be chosen out of
the total number of particles $N$ is equal to the number of combinations of
$N$ elements taken $n$ at a time. The latter is equal to

$$\binom{N}{n} = \frac{N!}{n!\,(N-n)!}.$$

Hence the total probability that $n$ arbitrarily chosen particles be found
simultaneously in the given state is equal to

$$w_N(n) = \frac{N!}{n!\,(N-n)!}\, p^n (1-p)^{N-n}. \tag{4.2}$$


The expression obtained is called the binomial law. We shall now assume that
the system contains a very large number of particles, so that $N \gg n$. Then
omitting $n$ in the exponent of the last factor, we can write

$$w_N(n) = \frac{N!}{n!\,(N-n)!}\, p^n (1-p)^N
= \frac{N(N-1)\cdots(N-n+1)}{n!}\, p^n (1-p)^N
\approx \frac{\bar{n}^n}{n!}\left(1 - \frac{\bar{n}}{N}\right)^N,$$

where $\bar{n} = pN$. It is obvious that $\bar{n}$ represents the mean number of particles in
the given state. In the limit $N \to \infty$ we obtain

$$w(n) = \lim_{N\to\infty} w_N(n) = \frac{\bar{n}^n e^{-\bar{n}}}{n!}. \tag{4.3}$$

This formula is called the Poisson formula. Finally, we shall find the
asymptotic expression of the Poisson formula for the case when not only is $N$ very
large but also the number of particles $n$ in a given state is large. This means
that $n$ and $\bar{n}$ can be considered to be large numbers (in comparison with
unity), and the difference $n - \bar{n} \ll \bar{n}$.
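The passage from the binomial law to the Poisson formula can be watched numerically. The sketch below keeps the mean number $\bar{n} = pN$ fixed (here 3, an assumed value), increases $N$, and records the largest pointwise difference between the two laws over a range of $n$. The function names are hypothetical.

```python
import math


def binomial(n, N, p):
    """Binomial law: probability that n of N independent particles are in the given state."""
    return math.comb(N, n) * p ** n * (1.0 - p) ** (N - n)


def poisson(n, n_bar):
    """Poisson formula with mean number n_bar."""
    return n_bar ** n * math.exp(-n_bar) / math.factorial(n)


N_BAR = 3.0  # assumed mean number of particles, held fixed while N grows


def max_gap(N):
    """Largest |binomial - Poisson| over n = 0..15 for p = N_BAR / N."""
    p = N_BAR / N
    return max(abs(binomial(n, N, p) - poisson(n, N_BAR)) for n in range(16))


gaps = [max_gap(N) for N in (10, 100, 1000, 10000)]
for N, gap in zip((10, 100, 1000, 10000), gaps):
    print(N, gap)
```

The gap shrinks roughly in proportion to $1/N$, in line with the neglect of $n$ compared with $N$ in the derivation.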


Taking the logarithm of the Poisson formula, we have

$$\ln w(n) = n \ln \bar{n} - \bar{n} - \ln n!.$$

Making use of Stirling’s formula (Appendix IV), we can write

$$\ln n! \approx n \ln n - n,$$

so that

$$\ln w(n) = n \ln(\bar{n}/n) + (n - \bar{n})
= -n\left[\ln\left(1 + \frac{n-\bar{n}}{\bar{n}}\right) - \frac{n-\bar{n}}{n}\right].$$

Using $\ln(1+x) = x - \tfrac{1}{2}x^2 + \ldots$ and $\bar{n} \approx n$, we obtain

$$w(n) = \mathrm{const} \cdot \exp\left[-\frac{(n-\bar{n})^2}{2\bar{n}}\right].$$

The constant is found from the normalization condition. For large values
of $n$ the summation can be replaced by integration. We then obtain

$$dw(n) = w(n)\,dn = \frac{1}{\sqrt{2\pi\bar{n}}} \exp\left[-\frac{(n-\bar{n})^2}{2\bar{n}}\right] dn, \tag{4.4}$$


i.e. the normal (Gaussian) probability distribution. The mean square deviation
of the number of particles in the state considered is equal to

$$\overline{(\Delta n)^2} = \overline{(n-\bar{n})^2} = \int (n-\bar{n})^2\, dw(n) = \bar{n}. \tag{4.5}$$

In the case given formula (3.7) turns out to be exact, not approximate. Hence
the Gaussian distribution can be written in the form

$$dw(n) = \rho(n)\,dn = \frac{1}{\sqrt{2\pi\,\overline{(\Delta n)^2}}} \exp\left[-\frac{(n-\bar{n})^2}{2\,\overline{(\Delta n)^2}}\right] dn, \tag{4.6}$$





which is the same as (4.1). We see, in this particular example, that for large
values of the numbers $N$ and $n$ the normal probability distribution is
established, since the deviations of the numbers $n$ from their mean values are on the
average sufficiently small.
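A short numerical check of this example is sketched below. It compares the Poisson formula with the Gaussian form for a large assumed mean number ($\bar{n} = 10^4$), at the mean and one root-mean-square deviation away from it, and separately verifies for a smaller mean ($\bar{n} = 20$) that the mean square deviation of the Poisson law equals $\bar{n}$. The Poisson probabilities are evaluated through logarithms (`math.lgamma`) to avoid overflow of the factorial.

```python
import math


def poisson(n, n_bar):
    """Poisson formula, evaluated via logarithms to stay finite for large n."""
    return math.exp(n * math.log(n_bar) - n_bar - math.lgamma(n + 1))


def gaussian(n, n_bar):
    """Gaussian form with mean n_bar and mean square deviation n_bar."""
    return math.exp(-(n - n_bar) ** 2 / (2.0 * n_bar)) / math.sqrt(2.0 * math.pi * n_bar)


# Large mean number: the two laws nearly coincide close to the mean.
n_bar = 10_000.0
ratio_at_mean = poisson(10_000, n_bar) / gaussian(10_000, n_bar)
ratio_one_sigma = poisson(10_100, n_bar) / gaussian(10_100, n_bar)

# Smaller mean number: mean and mean square deviation of the Poisson law by direct summation.
small_mean = 20.0
ns = range(0, 201)
mean_n = sum(n * poisson(n, small_mean) for n in ns)
msd_n = sum((n - mean_n) ** 2 * poisson(n, small_mean) for n in ns)

print(ratio_at_mean, ratio_one_sigma, mean_n, msd_n)
```

The ratios differ from unity only by a fraction of a per cent, and the residual difference away from the mean reflects the slight asymmetry of the Poisson law.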

The mean square deviation characterizes the effective width of the normal
distribution. The smaller $\overline{(\Delta n)^2}$, or in the general case $\overline{(\Delta L)^2}$, the smaller the
width of the Gaussian distribution. In the limit $\overline{(\Delta L)^2} \to 0$ the Gaussian
distribution goes over into the $\delta$-function. In this case the probability of finding
a value $L \neq \bar{L}$ tends to zero, while the probability of the value $L = \bar{L}$ tends to
unity. The normal distribution has a symmetric character, so that $w(\Delta L) =
w(-\Delta L)$, i.e. the probability of a deviation from the mean is the same in both
directions. If a distribution is not Gaussian, then, generally speaking, it is not
symmetric with respect to the sign of $\Delta L$. The degree of asymmetry of the
distribution is characterized by a quantity called the skewness and equal to

$$\overline{(\Delta L)^3} = \overline{L^3} - 3\bar{L}\,\overline{(\Delta L)^2} - (\bar{L})^3
= \overline{L^3} - 3\bar{L}\left[\overline{L^2} - (\bar{L})^2\right] - (\bar{L})^3. \tag{4.7}$$
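The mean cube of the deviation can be evaluated either directly or from the first three moments. The sketch below does both for two assumed discrete distributions, one symmetric about its mean and one not; the numbers are illustrative only.

```python
def moment(values, probs, order):
    """Moment of the given order: probability-weighted mean of L**order."""
    return sum(x ** order * w for x, w in zip(values, probs))


def third_central_direct(values, probs):
    """Mean value of (L - L_bar)**3 computed from the definition."""
    m1 = moment(values, probs, 1)
    return sum((x - m1) ** 3 * w for x, w in zip(values, probs))


def third_central_from_moments(values, probs):
    """The same quantity from the first three moments: m3 - 3*m1*m2 + 2*m1**3."""
    m1 = moment(values, probs, 1)
    m2 = moment(values, probs, 2)
    m3 = moment(values, probs, 3)
    return m3 - 3.0 * m1 * m2 + 2.0 * m1 ** 3


# Assumed distributions: one symmetric about its mean, one asymmetric.
symmetric = ([-1.0, 0.0, 1.0], [0.25, 0.5, 0.25])
asymmetric = ([0.0, 1.0, 5.0], [0.5, 0.3, 0.2])

skew_symmetric = third_central_direct(*symmetric)
skew_direct = third_central_direct(*asymmetric)
skew_moments = third_central_from_moments(*asymmetric)

print(skew_symmetric, skew_direct, skew_moments)
```

The symmetric distribution gives zero, while the distribution with a long tail towards large values gives a positive result by either route.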














The mean square deviation and the skewness are expressed in terms of the
quantities $\bar{L}$ and $\overline{L^n}$ ($n = 2, 3$). The latter, defined in general form by the formula

$$\overline{L^n} = \int L^n \rho(L)\, dL, \tag{4.8}$$

are called the moments of $n$th order. It turns out that, if $\rho(L)$ is an analytical
function having derivatives of all orders, then the set of moments of all orders
($\bar{L}, \overline{L^2}, \overline{L^3}, \ldots$) completely determines the form of the function $\rho(L)$. Indeed,
carrying out the Fourier transformation on $\rho(L)$, we have

$$\psi(\omega) = \int_{-\infty}^{\infty} \rho(L)\, e^{i\omega L}\, dL. \tag{4.9}$$

The function $\psi(\omega)$ is called the characteristic function of the probability
distribution.
Differentiating (4.9) with respect to $\omega$ and setting $\omega = 0$, we find

$$\left(\frac{d\psi}{d\omega}\right)_{\omega=0} = i\int_{-\infty}^{\infty} L\,\rho(L)\,dL = i\bar{L},$$

$$\left(\frac{d^2\psi}{d\omega^2}\right)_{\omega=0} = -\int_{-\infty}^{\infty} L^2 \rho(L)\,dL = -\overline{L^2},$$

$$\left(\frac{d^n\psi}{d\omega^n}\right)_{\omega=0} = i^n \int_{-\infty}^{\infty} L^n \rho(L)\,dL = i^n\, \overline{L^n}.$$

Hence, if all moments are known, then the coefficients in the expansion

$$\psi(\omega) = \psi(0) + \left(\frac{d\psi}{d\omega}\right)_{\omega=0} \omega
+ \frac{1}{2}\left(\frac{d^2\psi}{d\omega^2}\right)_{\omega=0} \omega^2 + \ldots \tag{4.10}$$

are known and, consequently, the function $\psi(\omega)$ itself is known. Then the
probability distribution is obtained directly from the inverse transformation

$$\rho(L) = \frac{1}{2\pi}\int_{-\infty}^{\infty} \psi(\omega)\, e^{-i\omega L}\, d\omega. \tag{4.11}$$

Sometimes the moments $\overline{L^n}$ or, in any case, the first few moments are known,
but the probability distribution itself is not known. Then, by finding exactly
(or approximately) the characteristic function $\psi(\omega)$, one can find exactly (or
approximately) the probability distribution $\rho(L)$.
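The link between the derivatives of the characteristic function at zero and the moments can be illustrated numerically. The sketch below assumes a discrete distribution (so that the integral becomes a sum) and the convention that the characteristic function is the mean value of $e^{i\omega L}$ with no numerical prefactor; the derivatives are approximated by central finite differences.

```python
import cmath

# Assumed discrete distribution: values L_k with probabilities w_k.
values = [0.0, 1.0, 5.0]
probs = [0.5, 0.3, 0.2]


def psi(omega):
    """Characteristic function: mean value of exp(i * omega * L)."""
    return sum(w * cmath.exp(1j * omega * x) for x, w in zip(values, probs))


h = 1e-3  # step of the finite-difference approximation

# Central differences for the first and second derivatives at omega = 0.
d1 = (psi(h) - psi(-h)) / (2.0 * h)
d2 = (psi(h) - 2.0 * psi(0.0) + psi(-h)) / (h * h)

# The first derivative equals i times the first moment,
# the second derivative equals minus the second moment.
first_moment = (d1 / 1j).real
second_moment = (-d2).real

print(first_moment, second_moment)
```

For these numbers the direct sums give 1.3 and 5.3 for the first and second moments, which the finite differences reproduce to within the discretization error.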


§5. The correlation function 


In what follows we shall often have to consider random functions. A 
random function is a function f(x) the values of which do not depend in an 
unambiguous way on the variable x. For a fixed value of x, the function f(x) 
can take on randomly all possible values. In this case one can only speak of 
the probability that, for a given x, the function f(x) has a value lying between 
f(x) and f(x) + df(x). For concreteness we shall assume in what follows that 
the random quantity depends on time, i.e. we shall consider a random func- 
tion of time f(t). A process described by a random function of time is 
called a stochastic process. Physical examples of stochastic processes and 
random functions depending on time will be given below. 

The most important quantitative characteristic of random processes is
their correlation function (or, more precisely, autocorrelation function). By the
correlation function $K(\tau)$ is meant the mean value (with respect to time or with
respect to the ensemble) of the product of a random function describing a
stochastic process in a certain system, taken at time $t$, and the same function
taken at time $t + \tau$:


$$K(\tau) = \lim_{T\to\infty} \frac{1}{T} \int_0^T f(t)\, f(t+\tau)\, dt, \tag{5.1}$$

where $\tau$ can be positive as well as negative. For brevity we shall write

$$K(\tau) = \overline{f(t)\, f(t+\tau)}. \tag{5.2}$$

The bar denotes averaging over time.

In addition to averaging over time, averaging with respect to an ensemble of
identical physical systems in which a random physical process takes place
can be carried out. By virtue of what was said in §2, the two averages are
equivalent. Hence we can write

$$K(\tau) = \langle f(t)\, f(t+\tau) \rangle. \tag{5.3}$$

The brackets $\langle\ \rangle$ denote the average with respect to the ensemble.

The correlation function $K(\tau)$ is a quantitative measure of the connection
between the values of a random function at successive instants of time. In
other words, $K(\tau)$ is a measure of the rate of change in time of a function
$f(t)$ describing a stochastic process.

The values of the correlation function depend only on $\tau$, and not on the
choice of the value of $t$. Indeed, because of the uniformity of time, a change
of the origin cannot affect mean values, so that

$$K(\tau) = \overline{f(t)\, f(t+\tau)} = \overline{f(t')\, f(t'+\tau)}. \tag{5.4}$$
If the values of a random function $f(t)$ vary so rapidly that its value at an
instant $t + \tau$ does not depend at all on the value at the instant $t$, then
(the mean value of $f$ itself being taken equal to zero)

$$K(\tau) = \overline{f(t)} \cdot \overline{f(t+\tau)} = 0. \tag{5.5}$$

When $\tau \to \infty$ we have the obvious equality

$$K(\tau \to \infty) \to 0, \tag{5.6}$$

called the property of attenuation of correlation at infinity.
In the other limiting case $\tau = 0$ the equality

$$K(0) = \overline{[f(t)]^2} = \langle f^2 \rangle \tag{5.7}$$

shows that $K(0)$ is the same as the mean-square value (the second moment) of
the random function $f(t)$. Finally, the symmetry of the stochastic process in
time allows one to write the condition

$$K(\tau) = K(-\tau). \tag{5.8}$$
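The properties listed above can be observed on a simulated stochastic process. The sketch below is an illustration under stated assumptions: the random function is modelled by a discrete-time sequence in which each value is 0.9 times the previous one plus independent Gaussian noise (a model chosen here for convenience, not taken from the text), and the time average (5.1) is replaced by an average over a long finite record. For this model the correlation is expected to fall off as $0.9^{|\tau|}$.

```python
import random


def ar1_signal(n_samples, a, seed=0):
    """Discrete-time random signal: x_next = a * x + unit Gaussian noise (assumed model)."""
    rng = random.Random(seed)
    x = 0.0
    samples = []
    for _ in range(n_samples):
        x = a * x + rng.gauss(0.0, 1.0)
        samples.append(x)
    return samples


def correlation(samples, lag):
    """Time-average estimate of K(lag) = mean of f(t) * f(t + lag); lag may be negative."""
    n = len(samples)
    start = max(0, -lag)
    stop = min(n, n - lag)
    total = 0.0
    for i in range(start, stop):
        total += samples[i] * samples[i + lag]
    return total / (stop - start)


a = 0.9
f = ar1_signal(200_000, a)

K0 = correlation(f, 0)          # mean-square value of the signal
K10 = correlation(f, 10)
K_minus10 = correlation(f, -10)
K200 = correlation(f, 200)      # far beyond the correlation time

print(K0, K10 / K0, K_minus10 / K0, K200 / K0)
```

The estimate at zero lag is the mean-square value, the estimates at lags $+10$ and $-10$ coincide, and the estimate at a lag much longer than the memory of the process is close to zero.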


The actual form of a correlation function depends, of course, on the nature of
the random process. However, there exists a theorem relating two important
characteristics of a random process: the correlation function and the
so-called spectral density. The random function $f(t)$ can be expanded in a
Fourier integral:

$$f(t) = \int_{-\infty}^{\infty} e^{i\omega t} f(\omega)\, d\omega. \tag{5.9}$$

The frequencies $\omega$ form a continuous spectrum.
We write the mean-square value $\langle f^2 \rangle$ in the form

$$\langle f^2 \rangle = \int_{-\infty}^{\infty} I(\omega)\, d\omega = 2\int_0^{\infty} I(\omega)\, d\omega, \tag{5.10}$$

where the function $I(\omega)$ is called the spectral density. By definition $I(\omega)$ is
an essentially positive function, $I(\omega)$ being equal to $I(-\omega)$.
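Substituting (5.9) into the definition (5.7) of $\langle f^2 \rangle$, we find

$$\langle f^2(t) \rangle = \iint d\omega\, d\omega'\, e^{i(\omega+\omega')t}\, \langle f(\omega) f(\omega') \rangle. \tag{5.11}$$

We write the expression for $\langle f(\omega) f(\omega') \rangle$, making use of the inversion formula:

$$f(\omega) = \frac{1}{2\pi} \int_{-\infty}^{\infty} f(t)\, e^{-i\omega t}\, dt. \tag{5.12}$$

We then obtain

$$\langle f(\omega) f(\omega') \rangle = \frac{1}{4\pi^2} \iint \langle f(t) f(t') \rangle\, e^{-i(\omega t + \omega' t')}\, dt\, dt'.$$

Setting $t' = t + \tau$, we have

$$\langle f(\omega) f(\omega') \rangle
= \frac{1}{4\pi^2} \iint e^{-i(\omega+\omega')t}\, e^{-i\omega'\tau}\, \langle f(t) f(t+\tau) \rangle\, dt\, d\tau
= \frac{1}{4\pi^2} \iint e^{-i(\omega+\omega')t}\, e^{-i\omega'\tau}\, K(\tau)\, dt\, d\tau. \tag{5.13}$$

According to (5.4) the correlation function $K(\tau)$ does not depend on $t$, and in
the last expression one can take $K(\tau)$ out of the integral with respect to $t$:

$$\langle f(\omega) f(\omega') \rangle
= \frac{1}{4\pi^2} \int e^{-i\omega'\tau} K(\tau)\, d\tau \int_{-\infty}^{\infty} e^{-i(\omega+\omega')t}\, dt
= \frac{1}{2\pi} \int e^{-i\omega'\tau} K(\tau)\, d\tau\; \delta(\omega+\omega'). \tag{5.14}$$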


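Substituting (5.14) into (5.11), we have

$$\langle f^2(t) \rangle
= \frac{1}{2\pi} \iiint d\tau\, d\omega\, d\omega'\, e^{i(\omega+\omega')t}\, e^{-i\omega'\tau}\, K(\tau)\, \delta(\omega+\omega')
= \frac{1}{2\pi} \int_{-\infty}^{\infty}\!\!\int_{-\infty}^{\infty} K(\tau)\, e^{i\omega\tau}\, d\omega\, d\tau. \tag{5.15}$$

Comparing (5.15) and (5.10), we finally obtain

$$I(\omega) = \frac{1}{2\pi} \int_{-\infty}^{\infty} e^{i\omega\tau} K(\tau)\, d\tau. \tag{5.16}$$

Inverting the last integral, we can also write

$$K(\tau) = \int_{-\infty}^{\infty} e^{-i\omega\tau} I(\omega)\, d\omega. \tag{5.17}$$

Formulae (5.17) and (5.16) represent the content of the Wiener-Khintchine
theorem. They relate the correlation function and the spectral density. The
latter turns out to be the Fourier transform of the correlation function.

In studying stochastic processes in physical systems we shall encounter
applications of the Wiener-Khintchine theorem.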
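The relation between the correlation function and the spectral density can be checked on a simple case. The sketch below assumes an exponentially decaying correlation function $K(\tau) = e^{-\gamma|\tau|}$ with $\gamma = 1$ (an illustrative choice, not taken from the text), for which the transform with the prefactor $1/2\pi$ is the Lorentzian $\gamma/[\pi(\gamma^2+\omega^2)]$. The integrals are evaluated by the trapezoidal rule over a finite range.

```python
import math

GAMMA = 1.0  # assumed decay rate of the correlation function


def K(tau):
    """Assumed correlation function: exponential attenuation, even in tau."""
    return math.exp(-GAMMA * abs(tau))


def spectral_density(omega, tau_max=40.0, steps=40_000):
    """(1/2pi) * integral of exp(i*omega*tau) * K(tau) over all tau.

    Because K is even, this equals (1/pi) * integral from 0 to infinity
    of cos(omega*tau) * K(tau); the trapezoidal rule is used up to tau_max.
    """
    h = tau_max / steps
    total = 0.5 * (K(0.0) + math.cos(omega * tau_max) * K(tau_max))
    for j in range(1, steps):
        tau = j * h
        total += math.cos(omega * tau) * K(tau)
    return total * h / math.pi


def lorentzian(omega):
    """Closed-form transform of exp(-GAMMA*|tau|) with the 1/2pi convention."""
    return GAMMA / (math.pi * (GAMMA ** 2 + omega ** 2))


def integrate_density(omega_max=200.0, steps=40_000):
    """Integral of the spectral density over [-omega_max, omega_max] (trapezoidal rule)."""
    h = 2.0 * omega_max / steps
    total = 0.5 * (lorentzian(-omega_max) + lorentzian(omega_max))
    for j in range(1, steps):
        total += lorentzian(-omega_max + j * h)
    return total * h


frequencies = (0.0, 1.0, 2.0)
numeric = [spectral_density(w) for w in frequencies]
exact = [lorentzian(w) for w in frequencies]
K0_from_spectrum = integrate_density()

print(numeric, exact, K0_from_spectrum, K(0.0))
```

The integral of the spectral density over all frequencies returns the mean-square value $K(0)$, up to the small contribution of the truncated tails.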





The Kinetic Theory of Gases 


§6. The simplest statistical system: an ideal gas


It is natural to begin the study of systems containing a very large number 
of particles with the simplest case — an ideal gas. 

The density of matter in the gaseous state is small, so that the mean dis- 
tance between particles is very large in comparison with the geometrical 
dimensions of the particles, the atoms or molecules. Because of this each of 
the particles is relatively distant from the other gas particles for most of the 
time of motion. 

The forces of intermolecular interaction decrease rapidly with increasing 
distance and become negligibly small when molecules are at a distance con- 
siderably exceeding their geometrical size. Thus, a characteristic feature of the 
motion of molecules in a gas is the smallness of the intermolecular interaction 
during most of the time. Because of the absence of interaction the gas mole- 
cules move rectilinearly and uniformly until a collision takes place between 
a given molecule and any other molecule or the wall of the container. In 
collisions of gas molecules with each other or with the molecules of the wall 
of the container the molecules can be considered to be undeformable. This 
means that the collisions between molecules obey the same laws as the
collisions between ordinary hard balls. In the process of a collision an exchange
of kinetic energy and momentum takes place between molecules. Analogously,
when a gas molecule collides with the wall of the container or, more
precisely, with a molecule of the substance of this wall, it can be assumed 
that the gas molecule is reflected elastically from the wall. 

An ideal gas is a statistical system the particles of which interact with 
one another only through collisions, while during the remaining time they 
move as free particles. 

The motion of every gas molecule is determined strictly by the laws of 
mechanics (in the first approximation those of classical mechanics). Hence, 
by integrating the equations of motion of all the molecules constituting a 
gas, one could in principle find the trajectory of every one of the molecules. 
However, such a calculation would, in practice, present great difficulties. 
Even the integration of the equations of motion of three interacting material 
points (the three-body problem) is a very complicated problem, the general 
case being still unsolved. The general solution of the four-body problem is so 
complicated that ways of solving it are not even known. And in a gas the
number of interacting particles is expressed by numbers of the order of $10^{20}$.
From the macroscopic point of view, innumerable collisions between the
molecules and with the wall of the container take place in an extremely
short time interval. Hence, in order to find the trajectories of all the
molecules of a gas it would be necessary to write down and solve $3 \times 10^{20}$
interrelated equations of motion taking into account the corresponding initial
conditions. From this it is clear that such a problem is difficult not only in 
practice but also in principle. 

At first sight it may seem that the above statement deprives us of any 
possibility of studying physical laws in systems consisting of a very large 
number of particles. In reality, however, this is not so. Although every one of 
the particles constituting a system is in itself a “mechanical system” and obeys 
the laws of mechanics, the set of an enormous number of molecules is a sys- 
tem differing qualitatively from a system consisting of a small number of 
particles. Such a system exhibits laws of a special type, called statistical
laws, which are not at all inherent in simple mechanical systems.

Consider a gas consisting of an enormous number of molecules confined 
in a closed container. Such a gas represents a mechanical system with an 
enormous number of degrees of freedom. Knowing the initial conditions, 
it would, in principle, be possible to integrate the equations of motion of all 
the gas molecules and to find their trajectories. Putting aside the question as 
to the practical feasibility of such a calculation, it should be noted that such 
a solution would be of no interest. We know from experience that the
properties of a gas do not depend at all on the initial conditions, i.e., the initial
positions and velocities of the molecules. Thus, for example, the properties
of a gas in a closed container in no way depend on the method of filling the 
container: independently of whether the gas flowed into the container 
through one opening and slowly, or through two openings and rapidly, after 
the lapse of a certain time interval from the admission, the gas will come into 
a definite state in which it will henceforth be found. 

It is said that the gas comes into an equilibrium state. The properties of a 
gas in an equilibrium state do not depend on its previous history and do not 
change in time. 

It is well known from experience that a gas always tends to occupy the 
entire volume available completely and uniformly. Hence a form of motion 
of a gas in which the density is not equal in different parts of the container is
ruled out or, more precisely, is extremely improbable. No obvious con- 
tradiction with the laws of mechanics would appear in this case. If the state 
of a system depended on the initial conditions, then the latter could in prin- 
ciple be chosen in such a way that the density of the gas in different parts of 
the container would be different. The fact that the state of a gas does not 
depend on the initial conditions of the positions and velocities of its mole- 
cules makes of no use the knowledge of the trajectories of individual mole- 
cules. Assume that we succeeded in overcoming all mathematical difficulties 
and found the trajectory of an individual molecule. Suppose it turned out 
that the trajectory of the given molecule lay almost entirely in one of the 
corners of a cube. It is clear that we could not draw from this any conclu- 
sion about the behaviour of the entire gas. 

A gas as a whole, containing an enormous number of particles, appears to 
be a system differing qualitatively from an individual molecule, and its be- 
haviour obeys different, statistical, laws. In this connection one of the 
founders of statistical physics, Smoluchowski, wrote that even if we were 
able to find the trajectories of the gas molecules we would still make use of 
laws of probability theory in describing the properties of the gas. 

In finding statistical laws we shall look for the mean value of the quanti- 
ties characterizing the state of a gas as a whole. Because the number of 
particles in a gas is very large, it follows from the results of §3 that the mean
values found will agree to a high degree of accuracy with the true values of 
these quantities. 

Earlier reasoning allows us to bring forward the most important feature of 
“mass” processes, i.e. processes which are characterized by the presence of
a large number of more or less equivalent events. This feature is the fact that 
in such processes peculiar statistical laws are shown which are not inherent 
in individual systems or processes. 




Consider a closed container filled by a gas. Assume that an equilibrium 
state is established in the gas. We shall try to find the statistical laws deter- 
mining the behaviour of the gas. In correspondence with direct experimental 
data we shall assume that the molecules of the gas are distributed over the 
whole volume of the closed container with a uniform density (i.e. that the 
number of molecules per unit volume is constant everywhere in the container).
We shall also assume that the molecules of the gas have velocities which are 
uniformly distributed over all directions in space. This means that the number 
of molecules moving in any direction must be the same. If this were not so 
and if there existed a preferential direction of motion of molecules, then a 
flow of gas would arise in this direction. It follows from experiment that in 
a gas confined in a closed container and not subjected to any action from 
without no stationary gas flow can arise. The assumption of a uniform distri- 
bution of molecules in space and a uniform velocity distribution over all 
directions is called the assumption of molecular disorder. 

The question naturally arises as to how the uniform distribution of the 
velocities of molecules in all directions is established. It is clear that, if mole- 
cules did not interact with one another at all, then there would be no way 
of changing the initial direction of motion of a molecule. Hence the pres- 
ence or absence of a directed flow would entirely be determined by the 
initial conditions. The establishment of molecular disorder is due to the 
existence of interaction between the molecules. Because of collisions be- 
tween molecules their directions of motion change continually, and a random 
motion with a uniform velocity distribution over directions in space is estab- 
lished in the gas. 

Molecular collisions do more than establish a uniform velocity distribution
over directions. In molecular collisions a change in the absolute value of the
velocities of the molecules takes
place in addition to a change of the direction of flight. If initially all mole- 
cules had the same velocity, then random collisions between them would 
lead to the fact that some of the molecules would randomly get an excess of 
kinetic energy at the expense of other molecules which would correspondingly 
lose a part of their energy. Owing to this the equality of the velocities of gas 
molecules would be violated, and a certain number of molecules having larger 
and smaller velocities would appear in the gas. In other words, a certain dis- 
tribution of molecules over velocities will arise in the gas. A number of 
molecules having large velocities, and a number of molecules with medium 
and small velocities will appear. 

Our problem is to find the velocity distribution of the molecules of an 
ideal gas. This distribution will be characterized by the mean number of 
molecules having a given value of the velocity. 




From the assumption of the random character of molecular motion it 
follows that the appearance of molecules with a variety of velocities is pos- 
sible, so that the distribution of molecules can be characterized by a certain 
continuous function. Since the velocities of motion of molecules vary con- 
tinually, one has, of course, to speak not of the number of molecules having 
a given exact velocity but of the number of molecules having a velocity close 
to the given one. 


§7. The Maxwell distribution 


Let us denote by $dn_{\mathbf{v}}$ the mean number of molecules in unit volume of a
gas having velocity components lying within the interval between $v_x$ and
$v_x + dv_x$, $v_y$ and $v_y + dv_y$, $v_z$ and $v_z + dv_z$.

We shall assume that the gas is in a stationary state, so that the state of 
molecular disorder, which does not change in time, is established. The num- 
ber of particles with given velocity components does not depend on time. 

It is clear that the mean number of molecules $dn_{\mathbf{v}}$ can be written
in the following form

$$dn_{\mathbf{v}} = n(v_x, v_y, v_z)\,d\mathbf{v} = n(\mathbf{v})\,dv_x\,dv_y\,dv_z, \qquad (7.1)$$

where $n(v_x, v_y, v_z) = n(\mathbf{v})$ is the mean number of molecules with
velocity components $v_x$, $v_y$, $v_z$ per unit interval. The function
$n(\mathbf{v})$ is called the velocity distribution function of the molecules.

Since all directions of motion of molecules in space are equivalent, the
velocity distribution must be isotropic and the distribution function
$n(\mathbf{v})$ cannot depend on the direction of the velocity. This means that
$n(v_x, v_y, v_z)$ cannot be an arbitrary function of the velocity components
$v_x$, $v_y$, $v_z$ but must be a function of the argument
$v = |\mathbf{v}| = (v_x^2 + v_y^2 + v_z^2)^{1/2}$, i.e. of the absolute value
of the velocity, and

$$dn_{\mathbf{v}} = dn_v. \qquad (7.2)$$

Changing from velocity components to the absolute value of the velocity
and its direction characterized by the polar angles $\vartheta$ and $\varphi$,
we can, by virtue of (1.67), write that

$$dn_v = n(v)\,v^2\,dv\,\sin\vartheta\,d\vartheta\,d\varphi. \qquad (7.3)$$







The total number of particles in unit volume $n = N/V$ determines the
normalization condition

$$n = \int n(v)\,v^2\,dv\,\sin\vartheta\,d\vartheta\,d\varphi = 4\pi \int n(v)\,v^2\,dv. \qquad (7.4)$$

The range of integration with respect to the velocity components $v_x$, $v_y$,
$v_z$ or with respect to the absolute value of the velocity $v$ will be
considered later.

Our problem is to find the explicit form of the distribution function n(v). 
Here we shall confine ourselves to the simplest, although not completely 
rigorous, derivation of a form of the distribution function called the Maxwell 
distribution function or, briefly, the Maxwell distribution. In this derivation 
the role of molecular collisions and the proposition of molecular disorder, in 
the establishment of the equilibrium velocity distribution of molecules, is 
particularly clear. 

Let us consider the process of collision between two particles moving with
velocities $\mathbf{v}_1$ and $\mathbf{v}_2$. Since the forces of intermolecular
interaction decrease rapidly with increasing distance and are, in fact,
different from zero only at the moment of direct contact of the particles, we
can replace the real process of collision by an idealized scheme of the
elastic collision between two material points. Here the concrete form of the
interaction forces does not play any role. As a result of the collision, let
the velocities of the molecules be changed. After the collision the first
particle moves with a velocity $\mathbf{v}_3$, and the second one with a
velocity $\mathbf{v}_4$. The number of such collisions per unit time in unit
volume of the gas must be proportional to the number of molecules with
velocity $\mathbf{v}_1$ and the number with velocity $\mathbf{v}_2$, i.e. to
the product $n(v_1)\,n(v_2)$. Consider further the reverse process of
collision. Then the velocities of the molecules change from the values
$\mathbf{v}_3$ and $\mathbf{v}_4$ to the values $\mathbf{v}_1$ and
$\mathbf{v}_2$. The number of such collisions per unit time in unit volume is
proportional to the number of molecules with velocity $\mathbf{v}_3$ and the
number with velocity $\mathbf{v}_4$, i.e. to $n(v_3)\,n(v_4)$.

Because of our assumption that the number of molecules with given values
of velocity does not change by the processes of molecular collision in a gas in
a stationary state, we can consider that the number of molecules for which
the velocities are changing from the values $\mathbf{v}_1$, $\mathbf{v}_2$ to
the values $\mathbf{v}_3$, $\mathbf{v}_4$ is equal to the number of molecules
for which the velocities are changing from $\mathbf{v}_3$, $\mathbf{v}_4$ to
$\mathbf{v}_1$, $\mathbf{v}_2$, i.e. it can be assumed that

$$n(v_1)\,n(v_2) = n(v_3)\,n(v_4). \qquad (7.5)$$


The equality (7.5) expresses the balance of particles acquiring and losing 
corresponding velocities. 




Since in the process of collision the energy of the molecules is conserved,
for the direct and reverse process we can write

$$v_1^2 + v_2^2 = v_3^2 + v_4^2. \qquad (7.6)$$

The equality (7.6) expresses the energy conservation law for the collision. The
common factor $\tfrac{1}{2}m$, appearing on both sides, has been dropped here.

The equalities (7.4), (7.5) and (7.6) represent the whole set of conditions 
which must be satisfied by the distribution function sought. 

From formula (7.6), into which only the squares of the velocities enter, it
is seen that the functional eq. (7.5) will look much simpler if the square of
the velocity is chosen instead of its absolute value as the argument of the
distribution function, writing the required function in the form $n(v^2)$. This
does not change the essence of the problem, but allows one to rewrite (7.5)
in a form which is simpler from the mathematical point of view

$$n(v_1^2)\,n(v_2^2) = n(v_3^2)\,n(v_4^2). \qquad (7.7)$$


The functional eq. (7.7) is easy to transform into a simple differential
equation. For this we express $v_4$ in terms of $v_1$, $v_2$ and $v_3$ by means
of (7.6) and rewrite (7.7) in the form

$$n(v_1^2)\,n(v_2^2) = n(v_3^2)\,n(v_1^2 + v_2^2 - v_3^2).$$


Taking the logarithm of this equality, we have

$$\ln n(v_1^2) + \ln n(v_2^2) = \ln n(v_3^2) + \ln n(v_1^2 + v_2^2 - v_3^2). \qquad (7.8)$$

We differentiate the above equality with respect to the argument $v_1^2$. Here
we have to keep in mind that eq. (7.8) is valid for completely arbitrary and
independent values of $v_1$, $v_2$ and $v_3$ [the value of $v_4$ is determined
by formula (7.6)]. We have

$$\frac{1}{n(v_1^2)}\,\frac{dn(v_1^2)}{dv_1^2} = \frac{1}{n(v_1^2+v_2^2-v_3^2)}\,\frac{dn(v_1^2+v_2^2-v_3^2)}{d(v_1^2+v_2^2-v_3^2)}.$$

Analogously,

$$\frac{1}{n(v_2^2)}\,\frac{dn(v_2^2)}{dv_2^2} = \frac{1}{n(v_1^2+v_2^2-v_3^2)}\,\frac{dn(v_1^2+v_2^2-v_3^2)}{d(v_1^2+v_2^2-v_3^2)}.$$

Comparing the two equations, we obtain

$$\frac{1}{n(v_1^2)}\,\frac{dn(v_1^2)}{dv_1^2} = \frac{1}{n(v_2^2)}\,\frac{dn(v_2^2)}{dv_2^2}. \qquad (7.9)$$


Since $v_1$ and $v_2$ are independent variables, and the equality (7.9) must
hold for completely arbitrary values of the independent quantities $v_1$ and
$v_2$, it is clear that it can be fulfilled only when the right and left hand
sides of (7.9) are each equal to a constant.

We denote this constant by $-\alpha$. Then instead of (7.9) we can write

$$\frac{1}{n(v^2)}\,\frac{dn(v^2)}{dv^2} = -\alpha. \qquad (7.10)$$


In the above equation we have dropped the subscript on the velocity, since 
from the preceding considerations it is clear that the equation must be valid 
for any value of the velocity. 

Integrating (7.10) we find

$$n(v^2) = A\,e^{-\alpha v^2}, \qquad (7.11)$$

where $A$ is the integration constant. The integration constant $A$ can be
determined from the normalization condition (7.4). By virtue of (7.11) and
(7.4) we have

$$4\pi A \int e^{-\alpha v^2}\,v^2\,dv = n. \qquad (7.12)$$


Now let us consider the integration range in the normalization condition 
(7.12). The lower limit of integration in (7.12) corresponds to the lowest 
possible value of the velocity v which is, obviously, equal to zero. As to the 
upper limit, we cannot, of course, indicate the value of the highest velocity 
which can be possessed by a molecule of the gas. However, the form of the 
distribution entering into (7.12) shows that, in essence, there is no need to 
know this value. The integrand decreases so rapidly with increasing argument 
that we shall not make any error by substituting infinity for the upper limit 
in (7.12). Hence the normalization condition (7.12) can be written in the 
form 


$$4\pi A \int_0^\infty e^{-\alpha v^2}\,v^2\,dv = n. \qquad (7.13)$$

From (7.13) it follows directly that $\alpha > 0$. Otherwise the integral does
not exist. The integration gives

$$A = n\left(\frac{\alpha}{\pi}\right)^{3/2}. \qquad (7.14)$$

Finally, the distribution function can be written in the form

$$n(v) = n\left(\frac{\alpha}{\pi}\right)^{3/2} e^{-\alpha v^2}. \qquad (7.15)$$
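As a numerical cross-check of (7.13)-(7.15), the following Python sketch integrates the distribution and evaluates both sides of the balance condition (7.5). The function names, the midpoint rule, the cutoff at $10/\sqrt{\alpha}$ and the sample values are illustrative assumptions and are not part of the text.

```python
import math

def maxwell_density(v, n, alpha):
    """n(v) of eq. (7.15): n (alpha/pi)^(3/2) exp(-alpha v^2)."""
    return n * (alpha / math.pi) ** 1.5 * math.exp(-alpha * v * v)

def integrate(f, a, b, steps=20000):
    """Composite midpoint rule on [a, b] (an assumed, simple quadrature)."""
    h = (b - a) / steps
    return h * sum(f(a + (i + 0.5) * h) for i in range(steps))

def total_density(n, alpha):
    """Left-hand side of (7.13), with infinity replaced by 10/sqrt(alpha)."""
    v_max = 10.0 / math.sqrt(alpha)
    return integrate(
        lambda v: 4.0 * math.pi * maxwell_density(v, n, alpha) * v * v,
        0.0, v_max)

def balance_defect(v1, v2, v3, n, alpha):
    """Relative difference of the two sides of (7.5); v4 follows from (7.6)."""
    v4 = math.sqrt(v1 * v1 + v2 * v2 - v3 * v3)
    lhs = maxwell_density(v1, n, alpha) * maxwell_density(v2, n, alpha)
    rhs = maxwell_density(v3, n, alpha) * maxwell_density(v4, n, alpha)
    return abs(lhs - rhs) / lhs

if __name__ == "__main__":
    # Arbitrary dimensionless test values.
    print(total_density(2.5, 0.7))
    print(balance_defect(1.0, 2.0, 1.5, 2.5, 0.7))
```

The first printed number should reproduce the chosen density $n$, and the second should vanish up to rounding, since the exponents on the two sides of (7.5) coincide by (7.6).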


The number of molecules per unit volume with velocities between $v$ and
$v+dv$ is, thus, equal to

$$dn_v = 4\pi n\left(\frac{\alpha}{\pi}\right)^{3/2} e^{-\alpha v^2}\,v^2\,dv. \qquad (7.16)$$

Formula (7.16) is called the Maxwell distribution.
In addition to the velocity distribution one can also write the distribution
over the components of velocities

$$dn_{\mathbf{v}} = n\left(\frac{\alpha}{\pi}\right)^{3/2} \exp\left[-\alpha(v_x^2+v_y^2+v_z^2)\right] dv_x\,dv_y\,dv_z. \qquad (7.17)$$

The transition from (7.16) to (7.17) corresponds to the ordinary
transformation of polar coordinates into Cartesian coordinates.

In formula (7.17) it can be assumed that the components of the velocity
vary from $-\infty$ to $+\infty$.

It should be noted that the velocity component distribution function can
be written in the form of the product of three velocity component
distribution functions

$$dn_{\mathbf{v}} = n\left(\sqrt{\alpha/\pi}\;e^{-\alpha v_x^2}\,dv_x\right)\left(\sqrt{\alpha/\pi}\;e^{-\alpha v_y^2}\,dv_y\right)\left(\sqrt{\alpha/\pi}\;e^{-\alpha v_z^2}\,dv_z\right). \qquad (7.18)$$


Before proceeding to a discussion of the results following from the
distributions (7.16) and (7.17) it is necessary to explain the meaning of the
parameter $\alpha$ appearing in them.


§8. Collisions of molecules with the wall of the container. Pressure. The
connection of the parameter $\alpha$ with the absolute temperature


During their motion the molecules of a gas confined in a container undergo
collisions with its walls. The walls of the container have molecular structure
and do not form geometrically sharp boundaries. A gas molecule approaching
the wall undergoes a very strong repulsion by the wall molecules, and is
reflected back into the container. Fig. III.5 shows schematically the form of
the potential energy of a molecule near the walls of the container. The latter
can be considered as an infinitely high impenetrable potential barrier for
molecules. It can be assumed that the reflection of a molecule from the wall of
the container is completely elastic. This means that, in the reflection, the
velocity component perpendicular to the surface of the wall changes to its
exact opposite.

Fig. III.5. Potential energy of a molecule near the wall of the container
(schematic).

Consider a certain wall area $dS$, perpendicular to the x-axis. Then a
molecule having velocity components $v_x$, $v_y$, $v_z$ acquires, when
reflected, velocity components $-v_x$, $v_y$, $v_z$. In the reflection of the
molecule from the area the momentum component along the x-axis changes from the
value $mv_x$ to $-mv_x$, i.e. by the amount $2mv_x$. This momentum is
transferred to the reflecting wall.

Thus, collisions of molecules with the wall lead to the appearance of a 
force acting on the surface of the container. We identify the force acting on 
unit surface of the wall by all molecules of the gas with the macroscopic 
pressure. This statement which, in essence, is the basis of the kinetic theory 
of gases, seemed at one time to be very radical. However, at present it ap- 
pears to be natural and completely obvious. 

In order to find the pressure exerted on the wall it is necessary to calculate
the total change in the momentum of gas molecules undergoing reflection
from unit surface area of the container in unit time. It is obviously equal to
the change in the momentum in one collision with the wall multiplied by the
total number of collisions per cm² of the surface per sec. The change in the
momentum is equal to $2mv_x$. Multiplying this expression by the number of
impacts per cm² per sec by molecules having a given velocity component $v_x$,
and summing or, more exactly, integrating this product over all values of
$v_x$, we find the pressure sought. In unit time the surface of the wall will
be reached by all molecules which are a distance from it smaller than or equal
to $v_x$ (since $v_x$ is the path traversed in unit time by a molecule moving
in the positive direction of the x-axis). All molecules contained in a
parallelepiped of a height $v_x$ and a base of 1 cm² will impinge on 1 cm² of
the wall surface per second (fig. III.6). The volume of this parallelepiped is,
obviously, equal to $v_x$ cm³. It contains $v_x\,dn_{\mathbf{v}}$ molecules
whose velocity components lie between $v_x$ and $v_x+dv_x$, $v_y$ and
$v_y+dv_y$, $v_z$ and $v_z+dv_z$. The wall surface will be reached by all
molecules contained in this parallelepiped, independently of the values of the
velocity components $v_y$ and $v_z$ parallel to this surface.

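The number of particles with given velocity component $v_x$ (for arbitrary
values of the two other components $v_y$ and $v_z$) per unit volume is equal to

$$dn_{v_x} = n\left(\frac{\alpha}{\pi}\right)^{3/2} e^{-\alpha v_x^2}\,dv_x \int\!\!\int_{-\infty}^{\infty} e^{-\alpha(v_y^2+v_z^2)}\,dv_y\,dv_z = n\left(\frac{\alpha}{\pi}\right)^{1/2} e^{-\alpha v_x^2}\,dv_x.$$

Fig. III.6. Parallelepiped of height $v_x$ standing on unit area of the wall
surface.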


The number of particles contained in the parallelepiped of volume $v_x$ is
correspondingly equal to

$$d\nu = v_x\,dn_{v_x} = n\left(\frac{\alpha}{\pi}\right)^{1/2} e^{-\alpha v_x^2}\,v_x\,dv_x. \qquad (8.1)$$

This expression gives the number of molecules having the given value of the
velocity component $v_x$ and reaching 1 cm² of the wall surface per second.

Each of the $d\nu$ molecules impinging on the wall transfers to it a momentum
$2mv_x$, so that molecules with a given value of $v_x$ transfer to the wall
each second a momentum equal to $2mv_x\,d\nu$. Integrating this expression over
all possible values of the velocity component $v_x$, we find the momentum
transferred to 1 cm² of the wall surface per second by all the gas molecules
striking it. This rate of transfer of momentum is, obviously, equal to the
force acting on 1 cm² of the surface, i.e. to the gas pressure $p$

$$p = n\left(\frac{\alpha}{\pi}\right)^{1/2} \int_0^\infty 2mv_x\,e^{-\alpha v_x^2}\,v_x\,dv_x. \qquad (8.2)$$


The integration in (8.2) is carried out only with respect to positive values of
$v_x$, since molecules with negative values of the velocity component along the
x-axis are moving away from the wall considered, not toward it*. It gives

$$p = \frac{nm}{2\alpha} \qquad (8.3)$$

or

$$p = \frac{mN}{2\alpha V}. \qquad (8.4)$$


To determine the numerical value of $\alpha$ it is necessary to compare eq.
(8.4) with the experimental value of the pressure of a sufficiently rarefied
gas. The latter is given by the equation of state

$$pV = NkT.$$

Comparing this expression with (8.4) we see that the parameter $\alpha$ is
connected with the absolute temperature $T$ by the relation

$$\alpha = \frac{m}{2kT}. \qquad (8.5)$$

* In expression (8.2) collisions between molecules are not taken into account.
However, molecules which do not reach the wall transfer their momentum to those
reaching it.


The equality (8.5) confirms that the parameter $\alpha$, which we introduced
formally, is essentially a positive quantity. In determining the connection
between $\alpha$ and the temperature $T$ we had to resort to experimental data.
In what follows we shall encounter a parameter analogous to $\alpha$. The
problem of the determination of the meaning and value of the parameter will
then be discussed in more detail. We shall see that the meaning of $\alpha$ can
be explained without directly involving experimental data.
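The chain (8.2) → (8.3) → (8.5) can be checked numerically. The sketch below evaluates the integral (8.2) with $\alpha = m/(2kT)$ and compares the result with $nkT$. The function name, the midpoint rule, the velocity cutoff and the sample values (in SI units rather than the cm and sec of the text) are assumptions made for illustration.

```python
import math

K_BOLTZMANN = 1.380649e-23  # Boltzmann constant, J/K

def pressure_from_impacts(n, m, T, steps=20000):
    """Evaluate (8.2) by the midpoint rule, with alpha = m / (2 k T) from (8.5)."""
    alpha = m / (2.0 * K_BOLTZMANN * T)
    v_max = 10.0 / math.sqrt(alpha)  # cutoff standing in for the infinite limit
    h = v_max / steps
    total = 0.0
    for i in range(steps):
        vx = (i + 0.5) * h
        # momentum 2 m vx per impact, times exp(-alpha vx^2) vx from (8.1)
        total += 2.0 * m * vx * math.exp(-alpha * vx * vx) * vx
    return n * math.sqrt(alpha / math.pi) * total * h

if __name__ == "__main__":
    n, m, T = 2.5e25, 5.3e-26, 300.0  # illustrative values: m^-3, kg, K
    print(pressure_from_impacts(n, m, T), n * K_BOLTZMANN * T)
```

The two printed pressures should agree to within the quadrature error, which is what the identification of $\alpha$ with $m/2kT$ requires.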

Let us find the number $\nu$ of the impacts of molecules per cm² of the wall
per second. Formula (8.1) gives the number of molecules reaching the wall
per second and having a velocity between $v_x$ and $v_x+dv_x$. The total number
of molecules striking 1 cm² of the wall per second is obtained by integration
of (8.1) over all values of $v_x$ from zero to infinity. This gives

$$\nu = n\left(\frac{\alpha}{\pi}\right)^{1/2} \int_0^\infty e^{-\alpha v_x^2}\,v_x\,dv_x = n\left(\frac{kT}{2\pi m}\right)^{1/2}. \qquad (8.6)$$
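To get a feeling for the magnitude of (8.6), the following Python sketch evaluates the closed form and, as a check, the integral of (8.1). All names and numerical inputs are assumptions for illustration: SI units are used instead of cm and sec, the number density is taken from $p = nkT$ at an assumed pressure, and the molecular mass is taken as 32 atomic mass units.

```python
import math

K_BOLTZMANN = 1.380649e-23       # J/K
ATOMIC_MASS_UNIT = 1.66053906660e-27  # kg

def wall_impact_rate(n, m, T):
    """Closed form of (8.6): impacts per unit area per unit time."""
    return n * math.sqrt(K_BOLTZMANN * T / (2.0 * math.pi * m))

def wall_impact_rate_numeric(n, m, T, steps=20000):
    """Midpoint-rule integration of (8.1) over v_x from 0 to a large cutoff."""
    alpha = m / (2.0 * K_BOLTZMANN * T)
    v_max = 10.0 / math.sqrt(alpha)
    h = v_max / steps
    total = 0.0
    for i in range(steps):
        vx = (i + 0.5) * h
        total += math.exp(-alpha * vx * vx) * vx
    return n * math.sqrt(alpha / math.pi) * total * h

if __name__ == "__main__":
    T = 273.15                      # K, assumed
    pressure = 101325.0             # Pa, assumed
    n = pressure / (K_BOLTZMANN * T)  # number density from p = n k T
    m = 32.0 * ATOMIC_MASS_UNIT     # assumed molecular mass
    print(wall_impact_rate(n, m, T), "impacts per m^2 per s")
```

Since $\nu$ is proportional to $n$ and to $T^{1/2}$, doubling the density doubles the impact rate, while quadrupling the temperature at fixed density only doubles it.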


§9. Properties of the Maxwell distribution 


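Now we rewrite the Maxwell distributions (7.17) and (7.16), expressing
the parameter $\alpha$ in terms of the absolute temperature of the gas

$$dn_{\mathbf{v}} = n\left(\frac{m}{2\pi kT}\right)^{3/2} \exp\left[-\frac{m(v_x^2+v_y^2+v_z^2)}{2kT}\right] dv_x\,dv_y\,dv_z, \qquad (9.1)$$

$$dn_v = 4\pi n\left(\frac{m}{2\pi kT}\right)^{3/2} \exp\left[-\frac{mv^2}{2kT}\right] v^2\,dv. \qquad (9.2)$$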


Instead of making use of the state distribution function of gas particles, we
can introduce the equivalent distribution of the probabilities for a particle
to be in a given state.

If the mean number of molecules having given velocity is equal to $dn$ per
cm³ of the gas, while the total number of molecules is equal to $n$, then,
obviously, the probability for an arbitrarily chosen molecule to be in a state
with given velocity is equal to

$$dw = \frac{dn}{n}.$$

Hence the Maxwell distribution function can be treated as the function of
distribution of the probabilities for a molecule to be in a given state. The
latter is characterized by the values of the velocity components $v_x$, $v_y$
and $v_z$.

Figs. III.7 and III.8 show, respectively, the functions of distribution of
the probability density $n^{-1}(dn_{v_x}/dv_x)$ over the velocity components
and $n^{-1}(dn_v/dv)$ over the absolute value of the velocity. Since it is
impossible to show graphically a function of three variables, the quantity
$n^{-1}(dn_{v_x}/dv_x)$ is plotted on the vertical axis, and the quantity
$(m/2kT)^{1/2}v_x$ is plotted on the horizontal axis in fig. III.7. In fig.
III.8 the function $n^{-1}(dn_v/dv)$ is plotted on the vertical axis, and the
quantity $(m/2kT)^{1/2}v$ is plotted on the horizontal axis.

Fig. III.7

Fig. III.8

The numbers of molecules in a gas with very low and very high velocities
turn out to be relatively small. Nevertheless, it is always possible to find a
certain number of very fast and very slow molecules. At first sight it may
seem strange that the velocity distribution function has a maximum. Indeed,
the factor $\exp(-mv^2/2kT)$ decreases exponentially with the square of the
velocity of the molecule. Hence the number of molecules with given velocity
must be lower the higher the velocity. However, the second factor $v^2$ varies
in the opposite direction and increases with increasing velocity. This factor
characterizes the number of states of a molecule having a velocity lower than
the given one. The competition of the two factors leads to the appearance of
a maximum in the distribution function.

With increasing temperature the distribution becomes more and more
gently sloping. This means that the relative number of molecules with a
large value of the velocity increases progressively. Fig. III.9 shows the
variation of the Maxwell distribution with increasing temperature (for oxygen
molecules).
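The flattening of the distribution with temperature can be reproduced from (9.2). The Python sketch below tabulates the probability density of the speed and locates its maximum on a grid. The function names, the grid scan, the molecular mass of 32 atomic mass units and the temperatures other than 273 K are assumptions chosen for illustration.

```python
import math

K_BOLTZMANN = 1.380649e-23                 # J/K
OXYGEN_MASS = 32.0 * 1.66053906660e-27     # kg, O2 taken as 32 atomic mass units

def speed_density(v, m, T):
    """(1/n) dn_v/dv from (9.2): probability density of the speed v."""
    a = m / (2.0 * K_BOLTZMANN * T)
    return 4.0 * math.pi * (a / math.pi) ** 1.5 * math.exp(-a * v * v) * v * v

def peak_on_grid(m, T, v_max=3000.0, step=1.0):
    """Return (speed, density) at the maximum found by scanning a uniform grid."""
    best_v, best_f = 0.0, 0.0
    v = 0.0
    while v <= v_max:
        f = speed_density(v, m, T)
        if f > best_f:
            best_v, best_f = v, f
        v += step
    return best_v, best_f

if __name__ == "__main__":
    for T in (273.0, 573.0, 1273.0):
        v_peak, f_peak = peak_on_grid(OXYGEN_MASS, T)
        print(f"T = {T:6.0f} K: maximum near {v_peak:.0f} m/s, height {f_peak:.2e} s/m")
```

As the temperature rises the maximum moves to higher velocities and becomes lower, while the area under the curve stays equal to unity.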

In addition to the velocity distribution one often has to make use of the 
momentum distribution and energy distribution of molecules. 

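Fig. III.9. Maxwell distribution for oxygen molecules at several temperatures
(the curve label T = 273 °K is legible); horizontal axis: v (m/sec), from 0 to
1200.

Introducing into the distribution (7.17) the new variables

$$p_x = mv_x, \qquad p_y = mv_y, \qquad p_z = mv_z,$$

we find for the number of molecules with given momentum

$$dn_{\mathbf{p}} = n\left(\frac{1}{2\pi mkT}\right)^{3/2} \exp\left[-\frac{p_x^2+p_y^2+p_z^2}{2mkT}\right] dp_x\,dp_y\,dp_z. \qquad (9.3)$$

Analogously, the number of molecules with absolute value of the momentum
lying between $p$ and $p+dp$ is equal to

$$dn_p = 4\pi n\left(\frac{1}{2\pi mkT}\right)^{3/2} \exp\left[-\frac{p^2}{2mkT}\right] p^2\,dp. \qquad (9.4)$$

Expressing the momentum in terms of the energy of the molecule,
$p = (2m\varepsilon)^{1/2}$, we obtain for the number of particles with given
energy

$$dn_\varepsilon = \frac{2n}{\sqrt{\pi}\,(kT)^{3/2}}\;e^{-\varepsilon/kT}\,\varepsilon^{1/2}\,d\varepsilon. \qquad (9.5)$$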
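The step from the momentum distribution (9.4) to the energy distribution (9.5) can be written out explicitly. The following worked lines are a sketch of that substitution, using only $p = (2m\varepsilon)^{1/2}$:

```latex
% Change of variable from the momentum p to the energy \varepsilon
p = (2m\varepsilon)^{1/2}, \qquad
dp = \left(\frac{m}{2\varepsilon}\right)^{1/2} d\varepsilon, \qquad
p^{2}\,dp = \sqrt{2}\,m^{3/2}\,\varepsilon^{1/2}\,d\varepsilon .

% Substitution into (9.4)
dn_\varepsilon
  = 4\pi n \left(\frac{1}{2\pi m k T}\right)^{3/2}
    e^{-\varepsilon/kT}\,\sqrt{2}\,m^{3/2}\,\varepsilon^{1/2}\,d\varepsilon
  = \frac{4\pi\sqrt{2}}{(2\pi)^{3/2}}\,
    \frac{n}{(kT)^{3/2}}\,e^{-\varepsilon/kT}\,\varepsilon^{1/2}\,d\varepsilon
  = \frac{2n}{\sqrt{\pi}\,(kT)^{3/2}}\,
    e^{-\varepsilon/kT}\,\varepsilon^{1/2}\,d\varepsilon .
```

The mass of the molecule drops out, so the energy distribution is the same for all ideal gases at a given temperature.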


Experimental verification of the Maxwell distribution was one of the most 
important problems of molecular physics. Hence several methods of measur- 
ing the velocity distribution were devised. The most obvious of these is an ex- 
periment similar to the well-known Fizeau experiment on the determination 
of the velocity of light. 

Molecules evaporating from the surface of a hot filament are let through a
collimating system of slits, forming a narrow molecular beam travelling in
vacuum in the direction of a cold trap. Two rotating discs with radial slits,
through which the beam must pass, are placed in the path of the beam. In
order that molecules having a velocity $v$ may pass through both slits for a
given angular velocity $\omega$ of the rotation of the discs and distance $l$
between them, the slits must be displaced by an angle $\varphi$ equal to

$$\varphi = \omega t = \frac{\omega l}{v}.$$

Thus, to each velocity $v$ at given angular velocity $\omega$ there corresponds
a definite displacement of the slits by an angle $\varphi$.
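The selection rule $\varphi = \omega l / v$ is easy to turn into numbers. The Python sketch below computes the slit displacement for a given velocity and inverts the relation. The function names and the sample values of the rotation rate, disc separation and velocity are hypothetical and are not data from the experiment described.

```python
import math

def slit_angle(omega, l, v):
    """Angular displacement phi = omega * l / v (radians) between the two slits."""
    return omega * l / v

def selected_velocity(omega, l, phi):
    """Velocity of the molecules transmitted for a slit displacement phi."""
    return omega * l / phi

if __name__ == "__main__":
    omega = 2.0 * math.pi * 50.0   # rad/s, assumed 50 revolutions per second
    l = 0.10                       # m, assumed distance between the discs
    v = 200.0                      # m/s, assumed molecular velocity
    phi = slit_angle(omega, l, v)
    print(math.degrees(phi), "degrees for", selected_velocity(omega, l, phi), "m/s")
```

Scanning $\varphi$ (or $\omega$) therefore scans the transmitted velocity, and the deposit collected for each setting gives one bin of the velocity histogram.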

The number of particles in the beam having given velocity was determined 
directly from photometric measurement of the thickness of the deposit on 
the trap cooled by liquid air. 

After evaluating the velocity distribution in the molecular beam corre- 
sponding to the velocity distribution in isotropic conditions, one obtains the 
histogram shown in fig. III.10. The curve in the drawing represents the Max- 
well distribution (for mercury vapour). 


Fig. III.10. Measured velocity histogram and Maxwell distribution curve for
mercury vapour; horizontal axis: v (m/sec), from 90 to 390.


Another very accurate method of measuring the velocity distribution in 
the molecular beam of the vapour of lithium, sodium or similar gases is based 
on the investigation of the behaviour of the beam in a magnetic field per- 
pendicular to the direction of motion of the beam. It represents a reproduc- 
tion of the well-known Stern—Gerlach experiment for the determination of 
magnetic moments, but it cannot be described here in detail (see Part V). 


§ 10. The calculation of characteristic quantities 


Knowing the distributions (9.1) and (9.5), one can find the mean values
of any quantities characterizing the properties of gas molecules. Let us find,
first of all, the mean value of any velocity component, for example $v_x$. By
definition of the mean value

$$\bar{v}_x = \int v_x\,\frac{dn_{\mathbf{v}}}{n} = \left(\frac{m}{2\pi kT}\right)^{3/2} \int_{-\infty}^{\infty} v_x \exp\left(-\frac{mv_x^2}{2kT}\right) dv_x \times \int\!\!\int_{-\infty}^{\infty} \exp\left[-\frac{m(v_y^2+v_z^2)}{2kT}\right] dv_y\,dv_z.$$

The integral with respect to $v_x$, whose integrand is an odd function, is
equal to zero. Hence the mean value of $v_x$ is equal to zero:

$$\bar{v}_x = 0.$$

This result is quite obvious. It shows that both directions of motion along
the x-axis are equally probable. We would obtain a similar result in
calculating the mean value of any other velocity component.

We now find the mean value of the absolute value of the velocity. We have

$$\bar{v} = \int v\,\frac{dn_v}{n} = 4\pi\left(\frac{m}{2\pi kT}\right)^{3/2} \int_0^\infty \exp\left[-\frac{mv^2}{2kT}\right] v^3\,dv = \left(\frac{8kT}{\pi m}\right)^{1/2}. \qquad (10.1)$$


In accordance with the foregoing the mean velocity of a molecule increases 
with increasing temperature. This increase is proportional to the square root 
of the absolute temperature of the gas. We see also that the mean velocity of 
a molecule is inversely proportional to the square root of the mass of the 
molecule. 

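Of great interest is the mean value $\bar{\varepsilon}$ of the kinetic energy
of a gas molecule. From (9.5) it is equal to

$$\bar{\varepsilon} = \int \varepsilon\,\frac{dn_\varepsilon}{n} = \frac{2}{\sqrt{\pi}\,(kT)^{3/2}} \int_0^\infty \exp\left[-\varepsilon/kT\right] \varepsilon^{3/2}\,d\varepsilon.$$

In calculating the above integral it is necessary to introduce a new variable
$x = \varepsilon^{1/2}$. A simple calculation leads to the formula for the mean
value of the kinetic energy of the translational motion of a molecule:

$$\bar{\varepsilon} = \tfrac{3}{2}kT. \qquad (10.2)$$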
It is seen that the mean energy of a molecule does not depend on its nature
and is proportional to the gas temperature $T$. The mean energy $E$ of all gas
molecules in a container is equal to the sum of the energies of the
translational motion of all the molecules, since there is no interaction
between them:

$$E = N\bar{\varepsilon} = \tfrac{3}{2}NkT, \qquad (10.3)$$

where $N$ is the total number of molecules in the gas.
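The "simple calculation" behind (10.2) can be spelled out. The lines below are a sketch of it, using the substitution $x = \varepsilon^{1/2}$ and the standard Gaussian integral $\int_0^\infty x^4 e^{-\beta x^2}dx = \tfrac{3}{8}\sqrt{\pi}\,\beta^{-5/2}$ with $\beta = 1/kT$:

```latex
% Substitution x = \varepsilon^{1/2}, so that \varepsilon = x^2 and d\varepsilon = 2x\,dx
\int_0^\infty e^{-\varepsilon/kT}\,\varepsilon^{3/2}\,d\varepsilon
  = 2\int_0^\infty x^{4}\,e^{-x^{2}/kT}\,dx
  = 2\cdot\frac{3\sqrt{\pi}}{8}\,(kT)^{5/2}
  = \frac{3\sqrt{\pi}}{4}\,(kT)^{5/2} .

% Inserting this into the expression for the mean energy
\bar{\varepsilon}
  = \frac{2}{\sqrt{\pi}\,(kT)^{3/2}}\cdot\frac{3\sqrt{\pi}}{4}\,(kT)^{5/2}
  = \frac{3}{2}\,kT .
```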




The energy of a given portion of an ideal gas does not depend on the vol- 
ume of the container and is determined only by the absolute temperature *. 

We shall identify this mean energy of the mechanical motion of gas mole- 
cules with the macroscopic thermal energy. In this connection we must treat 
the absolute temperature, from the kinetic point of view, as a quantity charac- 
terizing the mean energy of the motion of molecules. At present such a treat- 
ment is the only possible one. In the following chapter we shall discuss these 
statements in detail. 

The expressions obtained for the mean energy of a molecule and for the 
gas as a whole can be interpreted in the following way. Every molecule has 
three translational degrees of freedom, and its motion can be resolved into the 
motion in three mutually perpendicular directions. By virtue of the equiva- 
lence of all directions in space, the mean energy of motion in each direction 
must be the same. Thus, formula (10.2) means that there corresponds to each
degree of freedom of the motion, on the average, an energy equal to $\tfrac{1}{2}kT$. This
statement is a particular case of the general law of equipartition of energy
over degrees of freedom. In §39 we shall discuss this law in detail and point 
out the limits of its applicability. 

* It may seem that the independence of the mean energy of a gas molecule of the
size of the container contradicts the quantum formula (1.11) for the energy,
according to which $\varepsilon_n \sim 1/a^2$, where $a$ is the linear
dimension of the container. However, it should be borne in mind that the mean
energy is determined by the integral of the product of the energy and the
number of quantum states. The latter is, according to (1.24), proportional to
$\Omega(\varepsilon)\,\Delta\varepsilon \sim V\sqrt{\varepsilon}\,\Delta\varepsilon \sim a^3(\Delta\varepsilon/a) \sim a^2\,\Delta\varepsilon$.
This leads to the independence of the energy of the volume.

It is of interest to establish the connection between the energy of the gas
molecules, $\bar{\varepsilon}$, and the gas pressure $p$.

Writing the expression (8.2) in the form


$$\begin{aligned}
p &= n\left(\frac{m}{2\pi kT}\right)^{1/2} \int_0^\infty \exp\left[-\frac{mv_x^2}{2kT}\right] (2mv_x^2)\,dv_x \\
  &= n\left(\frac{m}{2\pi kT}\right)^{3/2} \int_{-\infty}^{\infty} mv_x^2 \exp\left[-\frac{mv_x^2}{2kT}\right] dv_x \int\!\!\int_{-\infty}^{\infty} \exp\left[-\frac{m(v_y^2+v_z^2)}{2kT}\right] dv_y\,dv_z \\
  &= \frac{2}{3}\,n\left(\frac{m}{2\pi kT}\right)^{3/2} \int\!\!\int\!\!\int_{-\infty}^{\infty} \frac{m(v_x^2+v_y^2+v_z^2)}{2} \exp\left[-\frac{m(v_x^2+v_y^2+v_z^2)}{2kT}\right] dv_x\,dv_y\,dv_z \\
  &= \tfrac{2}{3}\,n\bar{\varepsilon},
\end{aligned} \qquad (10.4)$$

we see that the pressure of an ideal gas turns out to be numerically equal to
$\tfrac{2}{3}$ of the kinetic energy of the translational motion of gas
molecules contained in unit volume.
Finally, the velocity of gas molecules at which the Maxwell distribution
has a maximum, i.e. the most probable velocity $v_{mp}$, is determined from the
condition

$$\frac{d}{dv}\left(e^{-mv^2/2kT}\,v^2\right) = 0.$$

We easily find

$$v_{mp} = \left(\frac{2kT}{m}\right)^{1/2}. \qquad (10.5)$$

Comparing (10.5) with (10.1) we see that the mean velocity of the molecules
is 13% higher than the most probable velocity.





The notion of the root-mean-square velocity $(\overline{v^2})^{1/2}$, which
characterizes the energy of gas molecules, is also often introduced. By virtue
of (10.2) this quantity is equal to

$$(\overline{v^2})^{1/2} = \left(\frac{3kT}{m}\right)^{1/2} = 1.22\left(\frac{2kT}{m}\right)^{1/2}. \qquad (10.6)$$





The root-mean-square velocity is 22% higher than the most probable velocity. 
This is quite natural, since the contribution of fast molecules to the energy 
must be larger than that of slow molecules. 
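The three characteristic velocities (10.1), (10.5) and (10.6) can be compared directly. The Python sketch below evaluates them and their ratios. The function names and the sample mass and temperature are assumptions for illustration; the ratios themselves do not depend on these values.

```python
import math

K_BOLTZMANN = 1.380649e-23  # J/K

def most_probable_speed(m, T):
    """Eq. (10.5): position of the maximum of the Maxwell distribution."""
    return math.sqrt(2.0 * K_BOLTZMANN * T / m)

def mean_speed(m, T):
    """Eq. (10.1): mean absolute value of the velocity."""
    return math.sqrt(8.0 * K_BOLTZMANN * T / (math.pi * m))

def rms_speed(m, T):
    """Eq. (10.6): root-mean-square velocity."""
    return math.sqrt(3.0 * K_BOLTZMANN * T / m)

if __name__ == "__main__":
    m, T = 5.3e-26, 300.0  # kg and K, illustrative values
    v_mp = most_probable_speed(m, T)
    print("mean / most probable:", mean_speed(m, T) / v_mp)
    print("rms  / most probable:", rms_speed(m, T) / v_mp)
```

The ratios are $(4/\pi)^{1/2}$ and $(3/2)^{1/2}$, which correspond to the 13% and 22% quoted in the text.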


§ 11. Collisions of molecules with each other 


Consider two molecules moving in an ideal gas with the velocities
$\mathbf{v}_1$ and $\mathbf{v}_2$. It is obvious that for the collision of
these molecules with each other the absolute values and directions of the
velocities do not in themselves play any role. Only the relative motion of one
molecule with respect to the other is of importance. If, for example, both
molecules are moving on a straight line one following the other, then a
collision will take place in unit time if the second molecule manages to
“reach” the first molecule in 1 second. The velocities of motion of the two
molecules in space with respect to the walls of the container are not relevant.

Thus, in solving the problem of collisions it is necessary to consider their 
relative motion. In §42 of Part I it has been shown that the motion of two 
particles can always be resolved into the motion in space of the common 
centre of mass and their relative motion. 

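We write the probability that the first molecule has a velocity $\mathbf{v}_1$
and the second $\mathbf{v}_2$ in the form

$$dw_{12} = dw_1\,dw_2 = \left(\frac{m_1}{2\pi kT}\right)^{3/2} \exp\left[-\frac{m_1v_1^2}{2kT}\right] d\mathbf{v}_1 \left(\frac{m_2}{2\pi kT}\right)^{3/2} \exp\left[-\frac{m_2v_2^2}{2kT}\right] d\mathbf{v}_2.$$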


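We pass over from the variables $\mathbf{v}_1$ and $\mathbf{v}_2$ to the new
variables $\dot{\mathbf{R}}$ (the velocity of the centre of mass) and
$\mathbf{v}_{\mathrm{rel}}$ (the relative velocity), making use of formulae
(42.4) of Part I. Then we have

$$m_1v_1^2 + m_2v_2^2 = M\dot{R}^2 + \mu v_{\mathrm{rel}}^2, \qquad d\mathbf{v}_1\,d\mathbf{v}_2 = |I|\,d\dot{\mathbf{R}}\,d\mathbf{v}_{\mathrm{rel}},$$

where $\mu = m_1m_2/(m_1+m_2)$ is the reduced mass, $M = m_1+m_2$ and $|I|$
stands for the determinant of the Jacobian of the transformation from
$\mathbf{v}_1$, $\mathbf{v}_2$ to $\dot{\mathbf{R}}$, $\mathbf{v}_{\mathrm{rel}}$:

$$I = \begin{vmatrix} \dfrac{\partial v_1}{\partial v_{\mathrm{rel}}} & \dfrac{\partial v_1}{\partial \dot{R}} \\[2mm] \dfrac{\partial v_2}{\partial v_{\mathrm{rel}}} & \dfrac{\partial v_2}{\partial \dot{R}} \end{vmatrix} = \begin{vmatrix} \dfrac{m_2}{m_1+m_2} & 1 \\[2mm] -\dfrac{m_1}{m_1+m_2} & 1 \end{vmatrix} = 1.$$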
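Thus,

$$dw_{12} = \left(\frac{\mu}{2\pi kT}\right)^{3/2} \exp\left[-\frac{\mu v_{\mathrm{rel}}^2}{2kT}\right] d\mathbf{v}_{\mathrm{rel}} \left(\frac{m_1+m_2}{2\pi kT}\right)^{3/2} \exp\left[-\frac{M\dot{R}^2}{2kT}\right] d\dot{\mathbf{R}} = dw_{\mathrm{rel}}\,dw_{\mathrm{c.m.}}$$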


We see that the probability of a given state of motion of two particles is
equal to the product of the probabilities of two independent events: the
probability for the particles to have a given relative velocity,

$$dw_{\mathrm{rel}} = \left(\frac{\mu}{2\pi kT}\right)^{3/2} \exp\left[-\frac{\mu v_{\mathrm{rel}}^2}{2kT}\right] d\mathbf{v}_{\mathrm{rel}}, \qquad (11.1)$$

and the probability for the centre of mass of the system of the two particles
to move in space with a given velocity

$$dw_{\mathrm{c.m.}} = \left(\frac{M}{2\pi kT}\right)^{3/2} \exp\left[-\frac{M\dot{R}^2}{2kT}\right] d\dot{\mathbf{R}}.$$
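The separation into relative and centre-of-mass motion rests on the identity $m_1v_1^2 + m_2v_2^2 = M\dot{R}^2 + \mu v_{\mathrm{rel}}^2$. The Python sketch below checks it for arbitrary velocities. The function names are illustrative, and the sign convention $\mathbf{v}_{\mathrm{rel}} = \mathbf{v}_1 - \mathbf{v}_2$ is an assumption, since formulae (42.4) of Part I are not reproduced here.

```python
import random

def to_cm_and_relative(m1, v1, m2, v2):
    """Centre-of-mass velocity and relative velocity (assumed v_rel = v1 - v2)."""
    M = m1 + m2
    R_dot = tuple((m1 * a + m2 * b) / M for a, b in zip(v1, v2))
    v_rel = tuple(a - b for a, b in zip(v1, v2))
    return R_dot, v_rel

def from_cm_and_relative(m1, m2, R_dot, v_rel):
    """Inverse transformation back to the velocities of the two particles."""
    M = m1 + m2
    v1 = tuple(r + m2 / M * u for r, u in zip(R_dot, v_rel))
    v2 = tuple(r - m1 / M * u for r, u in zip(R_dot, v_rel))
    return v1, v2

def square(v):
    """Squared absolute value of a velocity given as a tuple of components."""
    return sum(c * c for c in v)

def energy_split(m1, v1, m2, v2):
    """Return (m1 v1^2 + m2 v2^2, M R_dot^2 + mu v_rel^2) for comparison."""
    R_dot, v_rel = to_cm_and_relative(m1, v1, m2, v2)
    mu = m1 * m2 / (m1 + m2)
    return (m1 * square(v1) + m2 * square(v2),
            (m1 + m2) * square(R_dot) + mu * square(v_rel))

if __name__ == "__main__":
    rng = random.Random(0)
    v1 = tuple(rng.uniform(-1.0, 1.0) for _ in range(3))
    v2 = tuple(rng.uniform(-1.0, 1.0) for _ in range(3))
    print(energy_split(2.0, v1, 5.0, v2))
```

Because the exponent of $dw_{12}$ contains only this sum of kinetic energies, the probability factorizes once the Jacobian is known to be unity.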


In the problem of collisions, only the first probability is of interest. In the
case of particles with the same mass $\mu = \tfrac{1}{2}m$, so that

$$dw_{\mathrm{rel}} = \left(\frac{m}{4\pi kT}\right)^{3/2} \exp\left[-\frac{mv_{\mathrm{rel}}^2}{4kT}\right] d\mathbf{v}_{\mathrm{rel}}. \qquad (11.2)$$


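By means of the probability distribution (11.2) the mean value
$\bar{v}_{\mathrm{rel}}$ of the velocity of relative motion can be found

$$\bar{v}_{\mathrm{rel}} = 4\pi\left(\frac{m}{4\pi kT}\right)^{3/2} \int_0^\infty \exp\left[-\frac{mv_{\mathrm{rel}}^2}{4kT}\right] v_{\mathrm{rel}}^3\,dv_{\mathrm{rel}} = \sqrt{2}\,\bar{v}. \qquad (11.3)$$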





Thus, the mean velocity of relative motion is almost one and a half times 
the mean velocity of the thermal motion. 
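The factor $\sqrt{2}$ in (11.3) can also be seen in a small Monte Carlo experiment. The Python sketch below draws pairs of velocities whose Cartesian components are Gaussian, as in (9.1), and averages the speed and the relative speed. The function names, the seed, the sample size and the units with $kT/m = 1$ are assumptions for illustration; the result is a statistical estimate, not an exact value.

```python
import math
import random

def speed(v):
    """Absolute value of a velocity given as a tuple of three components."""
    return math.sqrt(v[0] * v[0] + v[1] * v[1] + v[2] * v[2])

def sample_velocity(rng, kT_over_m=1.0):
    """Each Cartesian component of (9.1) is Gaussian with variance kT/m."""
    s = math.sqrt(kT_over_m)
    return (rng.gauss(0.0, s), rng.gauss(0.0, s), rng.gauss(0.0, s))

def mean_speeds(samples=100_000, seed=1):
    """Return (mean |v|, mean |v1 - v2|) estimated from independent pairs."""
    rng = random.Random(seed)
    total_v = 0.0
    total_rel = 0.0
    for _ in range(samples):
        v1 = sample_velocity(rng)
        v2 = sample_velocity(rng)
        total_v += speed(v1)
        total_rel += speed((v1[0] - v2[0], v1[1] - v2[1], v1[2] - v2[2]))
    return total_v / samples, total_rel / samples

if __name__ == "__main__":
    v_bar, v_rel_bar = mean_speeds()
    print("ratio of mean relative speed to mean speed:", v_rel_bar / v_bar)
```

With enough samples the printed ratio approaches $\sqrt{2} \approx 1.41$, in agreement with (11.3).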

Let us now find the number of collisions undergone per unit time by a
molecule of a gas having a number density $n$. We assume that the gas is so
rarefied that the molecules only collide in pairs, that is, collisions in which
three or more molecules come simultaneously into direct contact can be
disregarded. The process of collision of molecules can be characterized by
their cross section $\sigma$.

In considering collisions between gas molecules we shall assume that all
the gas molecules except one are at rest. The molecule singled out moves
relative to those at rest with a velocity $v_{\mathrm{rel}}$. In unit time it
traverses a path $v_{\mathrm{rel}}$ and collides with all particles lying in a
cylinder of volume $\sigma v_{\mathrm{rel}}$. The number of such collisions is,
obviously, equal to $\sigma v_{\mathrm{rel}}\,dn_{v_{\mathrm{rel}}}$, where
$dn_{v_{\mathrm{rel}}} = n\,dw_{\mathrm{rel}}$ and $dw_{\mathrm{rel}}$ is given
by formula (11.2).

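The total number of collisions undergone by the molecule per unit time is
obtained by integrating this expression over all possible values of
$v_{\mathrm{rel}}$:

$$\nu = \int \sigma v_{\mathrm{rel}}\,dn_{v_{\mathrm{rel}}} = 4\pi\left(\frac{m}{4\pi kT}\right)^{3/2} n \int_0^\infty \sigma(v_{\mathrm{rel}}) \exp\left[-\frac{mv_{\mathrm{rel}}^2}{4kT}\right] v_{\mathrm{rel}}^3\,dv_{\mathrm{rel}}. \qquad (11.4)$$

If the cross section for the collision can be assumed to be independent of
the velocity, then instead of (11.4) we obtain

$$\nu = 4\pi\left(\frac{m}{4\pi kT}\right)^{3/2} n\sigma \int_0^\infty \exp\left[-\frac{mv_{\mathrm{rel}}^2}{4kT}\right] v_{\mathrm{rel}}^3\,dv_{\mathrm{rel}} = n\sigma\bar{v}_{\mathrm{rel}} = n\sigma\sqrt{2}\,\bar{v} = 4n\sigma\left(\frac{kT}{\pi m}\right)^{1/2}. \qquad (11.5)$$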


This is just the number of collisions undergone by the molecule per second. 


§ 12. The mean free path 


Let us now find the mean path traversed by a molecule between two
successive collisions, which is called the mean free path.

In one second a molecule traverses in space a path equal, on the average,
to $\bar{v}$, and undergoes $\nu$ collisions. The mean path between collisions
is equal to

$$\lambda = \frac{\bar{v}}{\nu} = \frac{\bar{v}}{n\sigma\bar{v}_{\mathrm{rel}}} = \frac{1}{n\sigma\sqrt{2}}. \qquad (12.1)$$

The quantity $\lambda$ is thus the ratio of the mean path traversed per unit
time by the molecule to the number of collisions undergone by it in that time.
Hence the quantity $\lambda$ is called the mean free path. The mean free
path $\lambda$ turns out to be inversely proportional to the gas density $n$
and the cross section $\sigma$.
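Formulae (11.5) and (12.1) can be evaluated together. The Python sketch below computes the collision frequency and the mean free path and checks that $\lambda = \bar{v}/\nu$. The function names and the sample values of the density, cross section, mass and temperature are assumed orders of magnitude in SI units, not data from the text.

```python
import math

K_BOLTZMANN = 1.380649e-23  # J/K

def mean_speed(m, T):
    """Eq. (10.1): mean absolute value of the velocity."""
    return math.sqrt(8.0 * K_BOLTZMANN * T / (math.pi * m))

def collision_frequency(n, sigma, m, T):
    """Eq. (11.5): nu = sqrt(2) n sigma v_mean, for a velocity-independent sigma."""
    return math.sqrt(2.0) * n * sigma * mean_speed(m, T)

def mean_free_path(n, sigma):
    """Eq. (12.1): lambda = 1 / (sqrt(2) n sigma)."""
    return 1.0 / (math.sqrt(2.0) * n * sigma)

if __name__ == "__main__":
    n = 2.7e25       # m^-3, assumed number density
    sigma = 4.0e-19  # m^2, assumed cross section
    m, T = 5.3e-26, 273.0  # kg and K, assumed
    print("collisions per second:", collision_frequency(n, sigma, m, T))
    print("mean free path (m):   ", mean_free_path(n, sigma))
```

Note that the temperature enters the collision frequency through $\bar{v}$ but cancels in the mean free path, as long as $\sigma$ is taken to be independent of the velocity.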

Formula (12.1) gives the mean free path. However, it is often important to 
know the probability for a molecule to traverse an arbitrary path x without 
undergoing any collision. In other words, what is of interest here is the law 
of the probability distribution of free paths for the molecules. We denote by 
w(x) the probability for a molecule to traverse a distance x without under- 
going any collision. 

Correspondingly, w(x+dx) represents the probability for a molecule to 
traverse a path x +dx without undergoing any collision. 

The traversal of the path x+dx is a complex event, consisting of two in- 
dependent stages: the traversal of the path x without collisions and the sub- 
sequent traversal of the path dx also without collisions. 

Since these events are independent, we can write that 


$$w(x+dx) = w(x)\, w(dx) . \qquad (12.2)$$



It is convenient to rewrite the above probability in another form. It is
obvious that the probability $w'(dx)$ for a molecule to undergo a collision on
the infinitesimally small path $dx$ is proportional to the length $dx$ and can
be written in the form $a\,dx$, where $a$ is a certain coefficient of
proportionality. The probability that a molecule will traverse the path $dx$
without collisions is equal to

$$w(dx) = 1 - w'(dx) = 1 - a\,dx .$$

Substituting this into (12.2) we find

$$w(x+dx) = w(x)\,(1 - a\,dx) . \qquad (12.3)$$


Expanding $w(x+dx)$ in powers of $dx$ and keeping only terms of the first
order in $dx$, we have

$$w(x+dx) = w(x) + \frac{dw}{dx}\,dx ,$$

whence, substituting into (12.3), we obtain

$$dw = -a\,w(x)\,dx .$$

Integrating, we find

$$w(x) = A e^{-ax} .$$

In order to determine the arbitrary constant A, we note that the proba- 
bility for a molecule to traverse an arbitrarily small path without collisions is 
equal to unity: 

$$w(x \to 0) = 1 .$$

Whence it follows that $A = 1$ and, finally,

$$w(x) = e^{-ax} . \qquad (12.4)$$
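
The passage from (12.3) to (12.4) can be illustrated numerically: if the path x is cut into many short segments, the survival probability is the product of the factors (1 - a dx), and this product approaches the exponential as the segments shrink. A minimal sketch follows; the function name and the sample values of a and x in the usage note are our own assumptions.

```python
import math


def survival_by_steps(x, a, steps):
    """Multiply the no-collision factors (1 - a*dx) over `steps` segments of the path x."""
    dx = x / steps
    w = 1.0
    for _ in range(steps):
        w *= 1.0 - a * dx
    return w


def survival_exact(x, a):
    """The limiting form (12.4)."""
    return math.exp(-a * x)
```

With, say, a = 1.5 and x = 2.0, a coarse subdivision into 10 segments is visibly off, while 100000 segments should reproduce the exponential closely.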
In order to determine the meaning of the constant $a$ we shall find the mean
free path $\lambda$, making use of formula (12.4).
By definition the mean free path $\lambda$ is equal to

$$\lambda = \int_0^\infty x\, dP , \qquad (12.5)$$


where dP is the probability that a molecule, after traversing a path x, without 
collisions, will undergo a collision on the segment x, x+dx. According to the 
foregoing, we can write 


$$dP = w(x)\, w'(dx) = w(x)\, a\,dx = a e^{-ax}\, dx . \qquad (12.6)$$


Substituting the value of dP into (12.5) we find 


$$\lambda = a \int_0^\infty x e^{-ax}\, dx = \frac{1}{a} . \qquad (12.7)$$

Thus, the constant $a$ turns out to be the reciprocal of the mean free path.
Formula (12.4) for the probability that a molecule will traverse a path x 
without undergoing any collision can be rewritten in the form 


$$w(x) = e^{-x/\lambda} . \qquad (12.8)$$


This probability turns out to be an exponentially decreasing function of the 
distance. It should be stressed that w(x) gives the probability for a molecule 
to traverse a path x without collisions irrespective of the position where it 
underwent the last collision. This means that the distance x is measured from 
an arbitrary point and not from the position of the last collision. 
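
The exponential law of free paths lends itself to a simple Monte Carlo check: draw free paths from the distribution a e^{-ax} dx with a = 1/λ, then verify that their mean is close to λ and that the fraction of paths longer than x is close to e^{-x/λ}. The sketch below uses inverse-transform sampling; the function names, the fixed seed and the sample size are our own assumptions.

```python
import math
import random


def sample_free_paths(lam, count, seed=0):
    """Draw `count` free paths from the density (1/lam) * exp(-x/lam)."""
    rng = random.Random(seed)
    # 1 - rng.random() lies in (0, 1], so the logarithm is always defined
    return [-lam * math.log(1.0 - rng.random()) for _ in range(count)]


def surviving_fraction(paths, x):
    """Fraction of sampled paths that exceed the distance x."""
    return sum(1 for p in paths if p > x) / len(paths)
```

For example, with lam = 2.0 and 200000 samples, the sample mean should lie close to 2.0 and `surviving_fraction(paths, 2.0)` close to e^{-1}.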

The probability for a molecule to traverse a path $x$ without collisions and
then to collide in the segment $x$, $x+dx$ is, according to (12.6) and (12.7),
equal to

$$dP = \lambda^{-1} e^{-x/\lambda}\, dx . \qquad (12.9)$$


Formula (12.8) is important for the experimental determination of the
mean free path of gas molecules. Imagine a narrow collimated molecular
beam entering a container, evacuated to a relatively low pressure, containing
a cooled plate placed in the path of the beam, first at a distance $x_1$ and then
at a distance $x_2$ from the inlet. Molecules which traverse the paths $x_1$ and $x_2$
without collisions will reach the plate, forming a deposit on it. The ratio of
the numbers of particles deposited on the plate in the two positions is,
according to (12.8), equal to

$$\frac{N(x_1)}{N(x_2)} = e^{-(x_1 - x_2)/\lambda} . \qquad (12.10)$$


By measuring the numbers $N(x_1)$ and $N(x_2)$ and assuming that $\lambda$ is the same
for all molecules of the beam, $\lambda$ can be determined by means of (12.10).
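
Solving (12.10) for the mean free path gives λ = (x_2 - x_1) / ln[N(x_1)/N(x_2)]. A minimal sketch of this inversion is given below; the function name and the synthetic numbers in the usage note are hypothetical and are not measurements from the text.

```python
import math


def mean_free_path_from_counts(n1, n2, x1, x2):
    """Invert (12.10): n1 and n2 are the deposits at plate distances x1 and x2."""
    if n1 <= 0 or n2 <= 0 or n1 == n2:
        raise ValueError("counts must be positive and different")
    return (x2 - x1) / math.log(n1 / n2)
```

As a self-consistency test one can generate the counts from an assumed λ = 8 (in the units of x), for example n1 = 1e6 * exp(-2/8) at x1 = 2 and n2 = 1e6 * exp(-6/8) at x2 = 6; the function should then return 8 up to rounding.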

In accordance with the requirements of the theory, $\lambda$ turns out to be
inversely proportional to the density or, what is the same, to the pressure of the
gas. The order of magnitude of $\lambda$ amounts to about 10 cm at $p \approx 10^{-3}$ mm of
mercury and about $10^{-5}$ cm at atmospheric pressure.
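
These orders of magnitude can be reproduced from (12.1) if the density is taken from the ideal-gas relation n = p/kT and the cross section is modelled as σ = πd² for hard spheres of diameter d. Both of these relations, the diameter d = 3.7e-10 m (roughly that of a nitrogen molecule) and the temperature of 300 K are assumptions of this sketch and are not taken from the text.

```python
import math

K_B = 1.380649e-23      # Boltzmann constant, J/K
PA_PER_MMHG = 133.322   # pascals in one mm of mercury


def mean_free_path_cm(pressure_mmhg, temperature=300.0, diameter=3.7e-10):
    """Mean free path (12.1) in cm for an assumed hard-sphere gas."""
    n = pressure_mmhg * PA_PER_MMHG / (K_B * temperature)  # molecules per m^3
    sigma = math.pi * diameter ** 2                        # hard-sphere cross section, m^2
    return 100.0 / (math.sqrt(2.0) * n * sigma)            # metres converted to cm
```

Calling `mean_free_path_cm(760.0)` and `mean_free_path_cm(1e-3)` should give values of the order of 10^-5 cm and of a few centimetres respectively, and the two results differ exactly by the ratio of the pressures.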










Statistical Distribution 


§ 13. Quasi-independent systems 


Consider a macroscopic system consisting of a very large number of par- 
ticles. We assume that the motion of all the particles is determined by the 
laws of quantum mechanics; the state of each particle is characterized by 
certain quantum numbers. We divide the whole system into a large number 
of parts in such a way that the interaction between these parts is very weak 
and can be neglected in the first approximation. We shall call the whole 
system under investigation the ensemble of its almost independent parts. 

These weakly interacting parts of the system are, in the first approxima- 
tion, moving independently of one another. However, the interaction exist- 
ing between them leads to the fact that, in reality, the motion of one of the 
parts affects the motion of another, and that they are not completely in- 
dependent. In the following we shall call the weakly interacting parts of a 
large system the quasi-independent subsystems, or simply subsystems. 

We first dwell on the conditions under which the subsystems forming a
large system can be considered as independent.

It is obvious that the subsystems are quasi-independent if the energy of 
their interaction is, on the average, small in comparison with the energy of 
each of the subsystems. This means that in some cases the interaction between
the subsystems can be relatively large, but the duration of the interaction
must be so small that they spend an overwhelmingly large part of the time 
without interacting with each other. 

As an example of such subsystems one can mention the molecules of an 
ideal gas, which interact strongly with one another only seldom and then for 
a very short time interval. 

In other cases a continuous, but weak, interaction can exist between sub- 
systems. Imagine, for example, that each of the subsystems of the system 
contains a very large number of particles (atoms or molecules) and, thus, re- 
presents a macroscopic system. Then the total energy of a subsystem, made 
up of the energies of motion of the individual particles, is proportional to 
the total number of particles of the subsystem. The number of particles is in 
its turn proportional to the volume of that subsystem. The interaction be- 
tween various subsystems is mainly due to the forces of molecular inter- 
action between the molecules which are at the surface of each of the inter- 
acting subsystems *. These forces decrease so rapidly with distance that the 
contribution to the energy of interaction of molecules which are deep inside 
the subsystems is small in comparison with that of the surface molecules. 
Hence the energy of interaction between the subsystems is proportional to 
the number of molecules at their surface, i.e. to the size of the surface. 

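Thus, the energy $\varepsilon$ of a subsystem is proportional to $R^3$, where $R$ is the
characteristic linear dimension of the subsystem, and the interaction energy is
$\varepsilon_{\rm int} \sim R^2$. Their ratio

$$\frac{\varepsilon_{\rm int}}{\varepsilon} \sim \frac{1}{R} \sim N^{-1/3}$$

(where $N \sim R^3$ is the number of particles in the subsystem) becomes
negligibly small for a sufficiently large $N$.

The energy of the entire ensemble of quasi-independent systems can then
be assumed to be equal to the sum of the energies of the individual parts, i.e.

$$E \approx \sum \varepsilon , \qquad (13.1)$$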


* We shall nowhere take into account the gravitational interaction between the mole- 
cules, because it is very weak and does not play any part in molecular processes. How- 
ever, it should be noted that, in studying the macroscopic properties of matter in 
astrophysical problems, the gravitational field is in a number of cases of essential im- 
portance and must be taken into account. 







where the sign $\approx$ underlines the fact that in writing (13.1) we have
disregarded the energy of interaction between the subsystems forming the
ensemble. The summation in (13.1) is carried out over all parts of the system
(the subsystems). 
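
The surface-to-volume argument of this section can be made concrete with a toy count. Suppose, purely for illustration, that a subsystem is a cube of L × L × L molecules and that only the molecules in its outermost layer interact with the neighbouring subsystems; the fraction of such molecules then falls off as 6/L, i.e. as N^{-1/3}. The cubic arrangement and the function name are assumptions of this sketch.

```python
def surface_fraction(side):
    """Fraction of molecules in the outer layer of a cube of side**3 molecules."""
    total = side ** 3
    inner = max(side - 2, 0) ** 3   # molecules not touching any face
    return (total - inner) / total


# The fraction decreases roughly as 6 / side, i.e. as N**(-1/3):
fractions = {side: surface_fraction(side) for side in (10, 100, 1000)}
```

Already for a cube of 1000 molecules along each edge (N = 10^9) fewer than one molecule in a hundred lies at the surface, and for macroscopic subsystems the fraction is smaller by many more orders of magnitude.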


§ 14. Statistical distribution 


From the whole system we single out mentally an arbitrarily chosen sub- 
system. This subsystem consists of a certain number of molecules (as we have 
just explained, it can be either small or large, depending on the actual nature 
of the subsystems constituting the system) moving according to the laws of 
quantum mechanics. The energy of our subsystem is not strictly constant, but, 
on the contrary, varies continuously within a range of the order of $\varepsilon_{\rm int}$, where
$\varepsilon_{\rm int}$ is the energy of the interaction of the system with its surroundings.
Although $\varepsilon_{\rm int}$ is very small and can be disregarded in the energy balance, it
nevertheless plays an essential part in the behaviour of the system. The inter- 
action of the subsystem with bodies surrounding it is the cause of its transi- 
tions from one quantum state into others. This interaction is of an extremely 
complicated character. Even in the simplest case, when individual molecules 
were taken as subsystems, we saw that an attempt to determine the motion of 
each molecule, i.e. the sequence of change of states, presented great diffi- 
culties. In a gas consisting of a very large number of particles new laws are 
introduced which we have briefly formulated in the proposition of molecular 
disorder. The situation is completely analogous also in the general case of the 
macroscopic system consisting of a large number of quasi-independent sub- 
systems. The interaction between the subsystems is so complex that an ac- 
curate determination of the state of each of the subsystems becomes a prob- 
lem even more difficult than finding the motion of individual gas molecules. 
At the same time, such a determination loses any physical meaning. Indeed, 
even if we succeeded in the determination of the state in which a certain sub- 
system is at a given instant, it would, as a result of the interaction with other 
subsystems, pass over into another state in a very short time. Hence the 
particular known state of a subsystem entering into a large ensemble of 
systems, at a certain instant, does not characterize the overall ensemble, just 
as the velocity of an individual gas molecule does not characterize the 
properties of the gas as a whole. In this connection we renounce beforehand 
any attempt to describe the behaviour of an individual subsystem, and in- 
stead look for the statistical laws characterizing the behaviour of the overall 
ensemble of the subsystems as a whole. This means that we shall not try to
trace in detail the successive changes of state of an individual subsystem in
the course of time, but shall seek to find the probability $w_i$ for an arbitrarily
chosen subsystem of the ensemble to get into a certain ith state. If we find
this distribution, we shall then be able: 

(1) to find the mean number of subsystems in a given state, if an ensemble 
consisting of M identical subsystems is given (for example, the number of 
molecules in a given state, if the subsystems correspond to individual gas 
molecules); 

(2) to find the mean value of any quantity characterizing the state of an 
individual system, for example its energy, from the general rules set out in
§4;

(3) to find the deviations of quantities from their mean values, charac- 
terized by the mean square deviation. 
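
The three uses of the distribution listed above amount to elementary weighted sums. A minimal sketch is given below, assuming that the probabilities w_i are normalized to unity; the function names and the three-state example in the usage note are hypothetical.

```python
def mean_occupancies(w, total_subsystems):
    """Item (1): mean number of subsystems in each state."""
    return [total_subsystems * wi for wi in w]


def mean_value(w, values):
    """Item (2): mean of a quantity taking values[i] in the ith state."""
    return sum(wi * v for wi, v in zip(w, values))


def mean_square_deviation(w, values):
    """Item (3): mean square deviation of the quantity from its mean."""
    mean = mean_value(w, values)
    return sum(wi * (v - mean) ** 2 for wi, v in zip(w, values))
```

For an illustrative distribution w = [0.5, 0.3, 0.2] over states with energies [0, 1, 2] (arbitrary units), the mean energy is 0.7 and the mean square deviation is 0.61.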

In this case we can apply the general statistical considerations of §5 to a
macroscopic quasi-independent subsystem consisting in its turn of a very 
large number of particles. These considerations show that in such a system 
all quantities have values differing very little from their mean values. Hence if 
we know the latter, then we can assume, with a high degree of accuracy, that 
we know the true values of all the quantities characterizing a state of a sub- 
system. 

Thus we see that the statement of the problem in statistical physics differs 
in no way in principle from that in the kinetic theory of gases. We shall in- 
vestigate the statistical laws shown in systems consisting of a very large 
number of particles. Knowing these laws, we can calculate the mean values of 
all sorts of quantities. Since, however, the objects in which we are interested 
are macroscopic bodies consisting of a very large number of particles, the 
predictions obtained from statistical laws have a completely reliable character. 
The mean values of all the quantities are the same as the true values to a 
high degree of accuracy. 

However, besides this similarity between the general statement of the prob- 
lem in statistical physics and in the kinetic theory of gases there is also a 
fundamental difference between them. In the kinetic theory of gases an in- 
dividual quasi-independent system was always a molecule of a rarefied gas. 
The molecule was assumed to be monatomic, since only its translational 
motion was considered. The gas as a whole corresponded to our ensemble of 
systems. In statistical physics the problem is stated much more broadly. An 
individual subsystem can be any quasi-independent system. It can be a 
monatomic molecule in a rarefied gas as well as a polyatomic molecule per- 
forming not only a translational motion but also a rotational and vibratory 
motion. The system can also be a gas as a whole, confined to a certain
container. The walls of this container and the bodies surrounding it play the role
of other systems with which the gas in the container (a quasi-closed system) 
interacts weakly and exchanges energy. The gas, the walls of the container and 
the bodies surrounding it form an ensemble of subsystems. A subsystem can, 
for example, be a solid body containing a sufficiently large number of 
particles. The bodies surrounding it play the role of other parts of the
ensemble.

Thus, an ideal gas, considered in the kinetic theory of gases, is a particular
and the simplest case of a general statistical system.

In the preceding chapter we have, for the particular case of an ideal gas, 
partly fulfilled the programme which we outlined at the beginning of this 
section. We saw that the stationary probability density distribution of dif- 
ferent states of a molecule in the gas is established because of the interaction 
between gas molecules in collisions. 

In the same way, in the general case of an ensemble of arbitrary quasi- 
stationary systems, a certain distribution of the probabilities for a subsystem 
to get into a definite energy state $\varepsilon_i$ will be established. In the following
sections the derivation of the statistical probability distribution for an arbi- 
trary subsystem will be developed. 


§ 15. The probability of a state of a system 


Let us observe mentally the changes of state of a subsystem which we 
have singled out arbitrarily. We shall, for brevity, call all other parts of the 
system, constituting the surroundings of this subsystem, the reservoir. The 
meaning of this term will be clear from what follows. We shall call the sub- 
system itself, if not specified otherwise, simply the system. Each state of the 
system is characterized by a set of quantum numbers. If the system has f 
degrees of freedom, then its state is characterized by a set of f quantum
numbers. 

To each set of quantum numbers there corresponds a certain quite definite 
energy of the system *. If the system consists of a large number of particles 
and has many degrees of freedom, different energy levels corresponding to 
different, but mutually similar sets of quantum numbers, lie very close to 
one another. In the limiting case where the number of particles is very large, 


* In what follows we shall dwell in more detail on this problem and take into account 
the case where different sets of values of the quantum numbers correspond to one and 
the same energy, i.e. the case of degenerate systems. 




so that the system is macroscopic, we pass over from a quantum system to a 
classical system *. In this case all the energy levels merge into a continuous 
spectrum, and instead of discrete levels use can be made of the continuous 
energy range of the classical theory. 

As we have already stressed, because of the interaction with the
surroundings the energy of the system is not constant. Hence it is meaningless to
speak of a sharply defined energy of the system; one can say only that its
energy lies between $\varepsilon$ and $\varepsilon + \delta\varepsilon$. To energy values of the system lying
within this range there corresponds a certain number $\Omega(\varepsilon)\,\delta\varepsilon$ of quantum
states. We shall often call $\Omega(\varepsilon)$ the number of quantum states
corresponding to the energy $\varepsilon$ of the system, or the degeneracy of a given state.
This should not lead to misunderstanding: we shall always have in mind the
definition given above. It is obvious that to different energy values $\varepsilon$ there
corresponds a different number $\Omega(\varepsilon)$ of quantum states. It is different also
for different physical systems.

In the case where the system is an individual atom or molecule, the 
number of quantum states corresponding to a given energy is small for small 
excitation energies, but increases rapidly with increasing energy. If the system 
is a macroscopic body, there is always a practically continuous energy spec- 
trum. 
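
The growth of the number of states with energy can be seen in a simple model which we introduce here only as an illustration: a single particle in a cubic box, whose energy in suitable units is nx² + ny² + nz² with positive integer quantum numbers. The model, the units and the function name are assumptions of this sketch; the values of Ω(ε) actually used in the text are those derived in ch. 1.

```python
def count_states(e_low, e_high):
    """Count triples (nx, ny, nz) of positive integers with
    e_low <= nx^2 + ny^2 + nz^2 < e_high."""
    n_max = int(e_high ** 0.5) + 1
    count = 0
    for nx in range(1, n_max + 1):
        for ny in range(1, n_max + 1):
            for nz in range(1, n_max + 1):
                if e_low <= nx * nx + ny * ny + nz * nz < e_high:
                    count += 1
    return count


# Number of states in an energy window of fixed width 10 at increasing energies
windows = [count_states(e, e + 10) for e in (10, 100, 1000)]
```

The counts in `windows` increase with the energy at which the window is placed, so that even for one particle the levels crowd together at high excitation; for a macroscopic body the crowding is incomparably stronger.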

In what follows we shall make use of the actual value of $\Omega(\varepsilon)$ for the
simplest systems, as derived in ch. 1. 

As a result of complex and random interactions between the system and 
its surroundings (the reservoir) the states of the system will change and it will 
go over from one quantum state into others. The system will perform
transitions between different states corresponding to a given energy value $\varepsilon$ (more
precisely, corresponding to an energy lying within the range $\varepsilon$, $\varepsilon + \delta\varepsilon$) as well
as between states with different energies $\varepsilon_1$, $\varepsilon_2$, ....

For example, in the case where the subsystem is an individual molecule, 
collisions with other molecules and the wall of the container, forming the 
thermostat, lead to transitions into other states with the same energy (changes 
in the direction of motion only) or into states with a different energy (inelas- 
tic collisions or elastic collisions with a relatively large momentum transfer). 

If one observes the change of the system during a relatively long time 
interval, then it will pass through all possible states. The state of the system at 
a given instant will depend neither on its initial state nor on the initial state 
of the reservoir. The effect of initial conditions will be completely concealed 


* The problem of the transition to classical systems will be considered more pre- 
cisely in Part V. 




by complicated transitions, interactions, and so on. Hence we can say that 
the state of the system at any instant will be determined by the complex 
pattern of random interaction between the system and its surroundings. 

If attention is first drawn to transitions between different states belonging
to a given energy value (between $\varepsilon$ and $\varepsilon + \delta\varepsilon$), then it is physically obvious
that all these states are mutually equivalent and that none of them has any
advantage over the others. The equivalence of the states of the system belonging
to a given energy is a generalization of the proposition of molecular disorder 
in an ideal gas. In fact, the latter means that all states with the same energy 
but with different directions of motion in space and different positions in the 
container are equally probable in an ideal gas. 

If the system is observed during a sufficiently long time interval, then, 
since all states with a given energy are equivalent, it will appear in all these 
states, irrespective of the state which it was in initially. Moreover, since the 
transition of the system from one state into another occurs as a result of 
accidental perturbations and the action of its surroundings and all quantum 
states belonging to a given energy are completely equivalent, it can be said 
that the system will have an equal chance to get into each of them. Hence 
the time during which the system is in each of the quantum states belonging 
to a given energy is the same for all these states. To characterize a state, one
usually gives not the time interval during which the system is in it, but the 
ratio of this time to the overall observation time, i.e. the probability of re- 
alization of a given state. Then the preceding statement can be formulated 
briefly as the following principle: in a quasi-closed system all quantum states 
belonging to a given energy (lying between $\varepsilon$ and $\varepsilon + \delta\varepsilon$) are equivalent. This
statement is called the law of equal probability of elementary quantum states. 
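
The identification of probability with the fraction of the observation time can be illustrated by a toy model, which is our own and is not taken from the text: a system with a few states of equal energy is knocked at every time step into one of the other states, all of them being equally likely. Over a long observation the fraction of time spent in each state tends to the same value, whatever the initial state.

```python
import random


def time_fractions(num_states, steps, seed=0):
    """Fraction of time steps spent in each state of a randomly perturbed system."""
    rng = random.Random(seed)
    state = 0                        # arbitrary initial state
    visits = [0] * num_states
    for _ in range(steps):
        visits[state] += 1
        # jump to one of the other states, all equally likely
        new_state = rng.randrange(num_states - 1)
        if new_state >= state:
            new_state += 1
        state = new_state
    return [v / steps for v in visits]
```

With 5 states and 200000 steps every entry of the returned list should lie close to 1/5.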

The question as to whether, in fact, a macroscopic system can be in all 
states with a given energy, without exceptions, has been the object of dis- 
cussion of physicists and mathematicians over the course of a number of 
years. 

Systems which can enter any state of a given energy during a sufficiently 
long time interval are called ergodic systems. If at the initial instant the 
ergodic system is in a certain state, then sooner or later it will get into any 
other state previously chosen out of a given group of states (ergodic hypo- 
thesis). Although this proposition seems to be very likely, there is no proof 
for it. Difficulties arising in connection with this hypothesis are overcome in 
quantum statistics by considering idealized systems in which transitions into 
some of the states are forbidden by selection rules. These rules are understood
to be certain restrictions on the possibility of transitions due to various
causes, which we cannot go into now *. Then all the microscopic states can
be divided into two groups — those between which transitions are possible, 
and those into which systems of the first group cannot get. If the second 
group of states is indeed completely forbidden, then, in considering the 
properties of the system, the states of this group can be assumed to be non- 
existent. On the contrary, the system sooner or later gets into all states of 
the first group. 

In reality, there are no such idealized systems in nature. Also, more or 
less probable transitions between any states of the system are possible. The 
probabilities of these transitions can be considerably different from each 
other. Imagine, for example, a system consisting of atoms which can be in 
different states and which can interact with one another. Various processes 
can take place in the system as a result of this interaction, provided they do 
not contradict the basic laws of motion. For example, a fraction of the 
atoms can obtain energy from other atoms and pass over into an ionized 
state. However, such processes occur with such a low probability and, hence, 
proceed so slowly that in considering the behaviour of the system during 
any practically feasible time interval in terrestrial conditions, we can com- 
pletely disregard the possibility of their occurrence. 

The same thought can be expressed as follows: the transition of the system 
into a state corresponding to bare nuclei and free electrons is forbidden, and 
only the states of the system consisting of atoms should be considered. Since 
our purpose is the calculation of mean values, and the contribution of highly 
improbable states to them is very small, we shall not commit any significant 
error by ignoring completely transitions into forbidden states which are 
possible, but improbable. If it is assumed that transitions are possible be- 
tween other (allowed) states and that the system spends a sufficiently long 
time in all these states (in this restricted form the ergodic hypothesis is 
physically sufficiently convincing), then the principle of equal probability of 
all microscopic states can be illustrated by the following considerations. 

Assume that there is a system of particles which can be in states 1, 2, ...,
i, ... with one and the same energy. Let us denote by $N_i$ the number of
particles in the ith state and by $N_k$ the number of particles in the kth state.
Further, let $w_{ik}$ be the probability of transition of a particle from the ith
state into the kth state. This probability can be calculated by means of the
laws of quantum mechanics. Finally, let $w_{ki}$ be the probability of transition
from the kth state into the ith state.


In quantum mechanics it turns out that all processes taking place with 


* See Part V, §106. 




individual microscopic particles are strictly reversible, so that the probabilities
of a direct transition $w_{ik}$ and a reverse transition $w_{ki}$ are always equal to each
other *. This is the so-called principle of microscopic reversibility. The
number of particles passing over from the ith state into the kth state per unit
time is, obviously, equal to the number $N_i$ of particles in the ith state
multiplied by the probability $w_{ik}$ of the transition. The number of reverse
transitions is correspondingly equal to $N_k w_{ki}$. In the stationary state the numbers
of direct and reverse transitions must be the same, since the system must
remain on the average in a steady state:

$$N_i w_{ik} = N_k w_{ki} , \qquad (15.1)$$

but, by virtue of the principle of microscopic reversibility, $w_{ik} = w_{ki}$, whence
it follows that $N_i = N_k$.

This reasoning can be extended to all other states between which there are 
transitions (i.e. $w_{ik} \neq 0$). It turns out that the numbers of particles in all
states must be equal to one another and, consequently, all quantum states 
must be equally probable. 
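
The argument can be followed numerically by letting the populations evolve under the balance of direct and reverse transitions, dN_i/dt = Σ_k (N_k w_ki - N_i w_ik), with a symmetric set of transition probabilities per unit time. The rate equation in this explicit form, the Euler time stepping and the numerical values in the usage note are assumptions of this sketch and do not appear in the text.

```python
def relax(populations, rates, dt, steps):
    """Evolve populations under dN_i/dt = sum_k (N_k w_ki - N_i w_ik) by Euler steps.

    rates[i][k] is the transition probability per unit time from state i to state k.
    """
    n = list(populations)
    size = len(n)
    for _ in range(steps):
        flow = [0.0] * size
        for i in range(size):
            for k in range(size):
                if i != k:
                    flow[i] += n[k] * rates[k][i] - n[i] * rates[i][k]
        n = [n[i] + dt * flow[i] for i in range(size)]
    return n


# Symmetric rates (microscopic reversibility): w_ik = w_ki
SYMMETRIC_RATES = [
    [0.0, 1.0, 0.2],
    [1.0, 0.0, 0.5],
    [0.2, 0.5, 0.0],
]
```

Starting from the strongly unequal populations [900, 100, 0], a call such as `relax([900.0, 100.0, 0.0], SYMMETRIC_RATES, 0.01, 5000)` should end with all three populations equal to one third of the total, although the individual rates differ from pair to pair.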

If we now pass over to transitions between states of the system with dif- 
ferent energy (more precisely, with an energy differing by an amount much 
larger than the interaction energy $\varepsilon_{\rm int}$), then it can be stated that, because of
the open character of the system, it can also make such transitions. 

In transitions into states with a higher energy the energy difference will be 
drawn by the system from its surroundings. In transitions into states with a 
lower energy the energy excess will be transferred to its surroundings. The 
probability of such transitions is ensured by the existence of an interaction. 
It is just the interaction that is the cause of transitions of the system from 
one state into others. These considerations are a generalization of the prin- 
ciple of molecular disorder, which we have used in the kinetic theory of 
gases. However, it should be stressed that the reasoning given is physically
plausible rather than rigorously substantiated. Hence the ergodic hypothesis should
be assumed as a postulate the validity of which is proved by comparison of 
theory with experiment. In any case, for statistics it is sufficient that this 
hypothesis be fulfilled approximately in most cases. 


* See Part V, §56. 


§ 16. The Gibbs distribution 


We now ask the question: what is the probability $w_i$ of finding our system
in states with an energy lying between $\varepsilon_i$ and $\varepsilon_i + \delta\varepsilon_i$ (where $\delta\varepsilon_i \ll \varepsilon_i$, and
the subscript $i$ runs through the values 1, 2, 3, ...)? To each value of the energy
$\varepsilon_i$ there corresponds a certain group $\Omega(\varepsilon_i)$ of quantum states.

Consider at first a closed system which does not interact with the bodies 
surrounding it. In reality, no perfectly closed systems can exist in nature. 
Whatever the physical nature of the system may be, it always interacts, if 
only very weakly, with the bodies surrounding it. In quantum mechanics it 
turns out that a system can have a strictly constant energy only when it is 
in the ground state (for a macroscopic system this corresponds to the state at 
absolute zero; see §34 of Part V). 

Hence we shall understand a closed system to be a system whose energy 
during the entire time of observation remains within a given narrow range 
$\delta\varepsilon_i$.

Since all states with a given energy are equally probable, the probability 
that a closed system be found in one of the states with given energy is simply 
proportional to the number of states with that energy: 


$$w(\varepsilon_i) \sim \Omega(\varepsilon_i) . \qquad (16.1)$$


Formula (16.1) is called the Gibbs microcanonical distribution. The micro- 
canonical distribution shows that the probability for a closed system to be in 
a state with given energy is proportional to its degeneracy. 

The Gibbs microcanonical distribution is the basis of statistical physics. 
It shows that a closed system will be found with greater probability in a state 
with a higher degeneracy. 

In phase space the states of a closed system lying in a narrow interval $\delta\varepsilon$
form a very thin layer, which reduces to a constant-energy surface as $\delta\varepsilon \to 0$.
Each cell in the constant-energy layer (surface) corresponds to one possible
quantum state.

As an example of the application of the microcanonical distribution we 
shall consider a system of N particles which do not interact with one another 
and which can be in two different states. Specifically, we shall speak of 
particles with a spin one half (in units of $h/2\pi$). In this case the projection of
the spin of every one of the particles onto an arbitrary axis can take on two
values: $s_z = \frac{1}{2}$ and $s_z = -\frac{1}{2}$. We shall conditionally call these spins upward and
downward directed respectively. In the absence of an external magnetic field
the energy of the system depends neither on the orientation of the spins of
the particles nor on that of the total spin $S = \sum s$ of the system. Hence to a
given value of the energy of the system e there corresponds a large number of 
different states corresponding to different orientations of the spins of the in- 
dividual particles. 

According to what has been said, all states with a given distribution of spin 
orientation are equally probable. 

We apply formula (16.1) for finding the probability for a system of $N$
independent particles to have a total spin $S$. The total spin $S$ of the system is
obviously equal to $s(N_1 - N_2) = sn$, where $N_1$ and $N_2$ are the numbers of particles
with spin oriented upward and downward respectively, and $n = N_1 - N_2$.
Since $N_1 + N_2 = N$, there correspond to a total spin $S$, $\frac{1}{2}(N+n)$ particles with
a spin directed upward and $\frac{1}{2}(N-n)$ particles with a spin oriented downward.
Let us find the number $\Omega(n)$ of independent distributions of $\frac{1}{2}(N+n)$ particles
with one spin orientation and $\frac{1}{2}(N-n)$ particles with the other spin orientation
for a given total number $N$ of particles.

Each of these distributions leads to one of the equally probable states of 
the system, so that the probability of the state with spin S which interests us, 
is according to (16.1), equal to 


$$w \sim \Omega(n) .$$
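
For a small number of particles the count Ω(n) can be obtained by brute force: list all 2^N assignments of upward (+1) and downward (-1) spins and group them by the difference n between the numbers of upward and downward spins. The sketch below also compares the result with the standard combinatorial count N!/{[(N+n)/2]! [(N-n)/2]!}; the function names are our own.

```python
import itertools
import math
from collections import Counter


def count_by_enumeration(num_particles):
    """Map each n = N_up - N_down to the number of spin configurations giving it."""
    counts = Counter()
    for spins in itertools.product((+1, -1), repeat=num_particles):
        counts[sum(spins)] += 1
    return counts


def count_by_formula(num_particles, n):
    """N! / ([(N+n)/2]! [(N-n)/2]!) for n of the same parity as N."""
    up = (num_particles + n) // 2
    down = num_particles - up
    return math.factorial(num_particles) // (math.factorial(up) * math.factorial(down))
```

For N = 10 the enumeration runs over 1024 configurations, and the two counts should coincide for every admissible n.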


$N$ independent particles can be distributed in a given order in $N!$ ways. The
mutual permutation of particles with one and the same spin orientation, i.e.
of the $\frac{1}{2}(N+n)$ particles with a spin oriented upward or of the $\frac{1}{2}(N-n)$ particles
with a spin oriented downward, does not change the total spin. The numbers of such
permutations are equal to $[\frac{1}{2}(N+n)]!$ and $[\frac{1}{2}(N-n)]!$. Hence the number of
states of the system with a spin $S$ is equal to the number of such independent
distributions of $N$ particles between the two states in which $\frac{1}{2}(N+n)$ of the
particles are in the first state (with the spin upward) and $\frac{1}{2}(N-n)$ are in the
second state (with the spin downward). Correspondingly,


$$w \sim \Omega(n) = \frac{N!}{\left(\frac{N+n}{2}\right)!\,\left(\frac{N-n}{2}\right)!} .$$

To obtain a more transparent formula use can be made of Stirling's formula
(see Appendix IV) and it can be assumed that $n \ll N$. Then, upon taking the
logarithm, we have

$$\ln \Omega \approx N \ln\frac{N}{e} - \frac{N+n}{2}\ln\frac{N+n}{2e} - \frac{N-n}{2}\ln\frac{N-n}{2e} \approx N\ln 2 - \frac{n^2}{2N} ,$$

or

$$w(n) \sim e^{-n^2/2N} .$$


We have arrived at the probability of a state expressed by a Gaussian
distribution. The factor of proportionality can be found from the normalization
condition. It is obvious that the most probable state is the state with $n = 0$,
i.e. the state in which the number of spins oriented upward is equal to that of
spins oriented downward. This state is an analogue of the state of molecular
disorder in a gas. The probability distribution has a sharper maximum at the
point $n = 0$ the larger the total number of particles in the system.
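
Both statements can be checked against the exact combinatorial count: the ratio Ω(n)/Ω(0) should follow the Gaussian e^{-n²/2N} when n is much smaller than N, and at a fixed relative imbalance n/N it should fall off more and more steeply as N grows. The sketch below assumes even N and even n; the function names and the sample values in the usage note are our own.

```python
import math


def exact_ratio(num_particles, n):
    """Omega(n) / Omega(0) from binomial coefficients (N and n both even)."""
    up = (num_particles + n) // 2
    return math.comb(num_particles, up) / math.comb(num_particles, num_particles // 2)


def gaussian_ratio(num_particles, n):
    """The approximation exp(-n^2 / 2N) obtained from Stirling's formula."""
    return math.exp(-n * n / (2.0 * num_particles))
```

For N = 1000 and n = 20 the two ratios should agree to within a fraction of a per cent. At a fixed relative imbalance n/N = 0.1 the exact ratio is of order unity for N = 100 but is vanishingly small for N = 10000, which is the sharpening of the maximum referred to above.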

As we have already stressed, the Gibbs microcanonical distribution, 
establishing the probability of a given state of a closed system, is of funda- 
mental importance. However, in practice one has much more often to deal 
not with closed systems, but with subsystems in a reservoir. We shall there- 
fore turn to the consideration of such subsystems. 

A subsystem and a reservoir together form a closed system the energy of 
which (with the above reservation) can be considered to be constant: 


E= const. 


We are interested, however, not in the probability distribution for a com- 
plex system but in the probability distribution for the subsystem (for any 
probability distribution for the reservoir). In order to find this it is necessary 
to take into account the particular character of the interaction between the 
subsystem and the reservoir. 

As has already been pointed out, this interaction is weak, so that the 
energy of interaction can be neglected in the total energy balance. We write
the latter in the form

$$E = E_0^{(k)} + \varepsilon_i = \text{almost a constant}, \tag{16.2}$$

where $E_0^{(k)}$ is the energy of the reservoir in the $k$th state, $\varepsilon_i$ is the energy of
the subsystem in the $i$th state, and the term "almost a constant" underlines
the fact that terms expressing the interaction between the subsystem and the
reservoir, as well as between the complex system and the bodies surrounding
it, are omitted in the energy conservation law (16.2).

The neglect of the energy of interaction between the system and the reser- 
voir means that we can consider the quasi-closed system and the reservoir to 
be independent systems for an overwhelmingly large part of the time. 

The subsystem can be in any of $\Omega(\varepsilon_i)$ states with an energy $\varepsilon_i$, while the
reservoir can be in any of $\Omega_0(E_0^{(k)})$ states with an energy $E_0^{(k)}$.

If the transitions mentioned do not take the subsystem out of the group of
states with an energy $\varepsilon_i$, and correspondingly do not take the reservoir out of
the group of states with an energy $E_0^{(k)}$, then a change in the state of the
subsystem in no way affects the state of the reservoir and, conversely, a change
in the state of the reservoir has no effect on the state of the subsystem. On the
other hand, by virtue of the energy conservation law (16.2) the energy of the
reservoir and that of the subsystem are unambiguously interrelated. If the
subsystem has an energy $\varepsilon_i$, then the reservoir must have an energy $E_0^{(k)}$.

After these remarks we can now turn to finding the probability that the
subsystem is in one of the states with an energy $\varepsilon_i$.

From the last remark this probability $w_i$ is equal to the probability that
the complex system (subsystem + reservoir) is in such a state that the
subsystem has an energy $\varepsilon_i$ and the reservoir an energy $E_0^{(k)}$. Since $w_i$ is the
probability of a given state of a closed system, it is expressed in terms of the
number of states according to formula (16.1):

$$w_i \sim \Omega(E) = \Omega(E_0^{(k)} + \varepsilon_i). \tag{16.3}$$


On the other hand, the number of states of a closed system consisting of
two independent parts is equal to the product of the numbers of states of the
two parts, i.e.

$$\Omega(E_0^{(k)} + \varepsilon_i) = \Omega_0(E - \varepsilon_i)\,\Omega(\varepsilon_i). \tag{16.4}$$

Here, in the expression for the number of states of the reservoir $\Omega_0$, we have
written as the argument the expression $E - \varepsilon_i$ on the basis of (16.2).
Substituting the expression (16.4) into (16.3), we find

$$w_i \sim \Omega_0(E - \varepsilon_i)\,\Omega(\varepsilon_i). \tag{16.5}$$


The very weak interaction between the system and the surroundings serves
as the cause of transitions of the system from one state into another. Since
the dimensions of the reservoir are very large in comparison with those of the
system, we can assume that the energy $E_0^{(k)}$ of the reservoir for all values of
$k$ is also very large in comparison with the energy of the system. Hence, whatever
the changes of the energy of the system may be, the energy of the reservoir
can be considered to be almost constant. All different states in which
the reservoir is found when the system passes from one energy state into
another can be assumed to have one and the same energy.

Because of this we can expand $\Omega_0(E-\varepsilon_i)$ in a series in powers of the small
quantity $\varepsilon_i$ and confine ourselves to the first terms of the expansion. However,
it should be noted that the function $\Omega_0(E-\varepsilon_i)$ cannot itself be expanded
directly in a power series. Indeed, we know that the number of states is a
multiplicative function, while the energy is an additive function. The number
of states of a system consisting of independent parts is equal to the product of
the numbers of states of these parts, while the energy is equal to the sum of
the corresponding energies. If we expanded $\Omega_0(E-\varepsilon_i)$ in a series in powers of
the small quantity $\varepsilon_i$, then we would obtain the expression

$$\Omega_0(E-\varepsilon_i) \approx \Omega_0(E) - \frac{\partial \Omega_0}{\partial E}\,\varepsilon_i, \tag{16.6}$$

which does not possess the required properties. If, for example, we considered
two systems with the numbers of states $\Omega_0^{(1)}$ and $\Omega_0^{(2)}$ and energies $(E^{(1)} - \varepsilon^{(1)})$
and $(E^{(2)} - \varepsilon^{(2)})$ respectively, then the number of states should be equal to
$\Omega_0^{(1)}\Omega_0^{(2)}$ and the energy should be equal to $[E^{(1)} - \varepsilon^{(1)}] + [E^{(2)} - \varepsilon^{(2)}]$. However,
on multiplying the left-hand sides of the expansion (16.6) written for the two
systems, the right-hand sides do not multiply into an expression of the same
form for the combined system.

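Hence, before expanding the number of states $\Omega_0(E-\varepsilon_i)$ in a series, we write
it in the form

$$\Omega_0(E-\varepsilon_i) = e^{\sigma(E-\varepsilon_i)}, \tag{16.7}$$

where $\sigma(E-\varepsilon_i)$ is a new function of the argument $(E-\varepsilon_i)$. Such a representation
is always possible, since the number of states is by its very nature an
essentially positive quantity, the values of which are obviously not less than
unity.

Writing $\sigma(E-\varepsilon_i)$ in the form

$$\sigma(E-\varepsilon_i) = \ln \Omega_0(E-\varepsilon_i), \tag{16.8}$$

expanding it in a series in powers of the small quantity $\varepsilon_i$ and confining
ourselves to the term linear in $\varepsilon_i$, we have

$$\sigma(E-\varepsilon_i) \approx \sigma(E) - \frac{\partial \sigma}{\partial E}\,\varepsilon_i = \sigma(E) - \frac{\varepsilon_i}{\theta},$$

where $\theta$ stands for the quantity defined by

$$\frac{1}{\theta} = \left(\frac{\partial \sigma}{\partial E}\right). \tag{16.9}$$

Then for $\Omega_0(E-\varepsilon_i)$ we find

$$\Omega_0(E-\varepsilon_i) = e^{\sigma(E)}\, e^{-\varepsilon_i/\theta}. \tag{16.10}$$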


It is easily seen that (16.10) satisfies the requirement that the numbers of
states $\Omega$ are multiplied when the energies of independent systems are added.
Substituting the expression (16.10) into (16.5), we have

$$w_i = \text{const}\; e^{-\varepsilon_i/\theta}\,\Omega(\varepsilon_i), \tag{16.11}$$

where const stands for the product of the factor of proportionality and the
quantity $e^{\sigma(E)}$, which does not depend on the value of $\varepsilon_i$ or the properties of
the subsystem.
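
The difference between expanding the number of states directly and expanding its logarithm can be illustrated numerically. The sketch below assumes, purely for illustration, a reservoir whose number of states grows as $\Omega_0(E) \sim E^{3N/2}$ (the ideal-gas form quoted later for §20); the function names are ours and are not part of the text.

```python
import math

def sigma(E, N):
    """sigma = ln Omega_0 for the assumed reservoir, up to an additive constant."""
    return 1.5 * N * math.log(E)

def theta(E, N):
    """Distribution modulus from 1/theta = d(sigma)/dE = 3N / (2E)."""
    return E / (1.5 * N)

def ratio_exact(E, eps, N):
    """Omega_0(E - eps) / Omega_0(E), computed without any expansion."""
    return math.exp(sigma(E - eps, N) - sigma(E, N))

def ratio_exponential(E, eps, N):
    """The same ratio from the expansion of sigma: exp(-eps / theta)."""
    return math.exp(-eps / theta(E, N))

def ratio_linear(E, eps, N):
    """The same ratio from the direct linear expansion of Omega_0, as in (16.6)."""
    return 1.0 - eps / theta(E, N)

if __name__ == "__main__":
    N = 10**6
    E = 1.5 * N          # chosen so that theta = 1
    eps = 5.0            # subsystem energy, small compared with E but not with theta
    print(ratio_exact(E, eps, N), ratio_exponential(E, eps, N), ratio_linear(E, eps, N))
```

With these illustrative numbers the exponential form reproduces the exact ratio closely, whereas the direct linear expansion even gives a negative "number of states", since $\varepsilon_i$ is small compared with $E$ but not compared with $\theta$.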

Formula (16.11) determines the probability that a certain system, representing
a small weakly interacting part of a certain ensemble of arbitrary
physical systems, is found in one of $\Omega(\varepsilon_i)$ states with an energy lying between
$\varepsilon_i$ and $\varepsilon_i + \delta\varepsilon_i$, and that the reservoir is found in one of the states
with an energy lying between $E - \varepsilon_i$ and $E - (\varepsilon_i + \delta\varepsilon_i)$. Since the state of the
reservoir is of no interest, we shall for brevity say that $w_i$ is the probability
for the subsystem to be in one of the states with energy $\varepsilon_i$.

From the definition of probability it follows that the normalization
condition

$$\sum_i w_i = 1 \tag{16.12}$$

must hold, where the summation is carried out over all possible quantum states
of the system.

From the normalization condition and the form of $w_i$ it follows immediately
that the coefficient $\theta$, introduced formally, is an essentially positive
quantity. Only if this is so does the probability of states of arbitrarily large
energies tend to zero, as is to be expected from the very meaning of the notion
of physical probability, and as follows formally from the normalization
condition. The constant in (16.11) can be found from the normalization
condition. Substituting (16.11) into (16.12), we find

$$\text{const} = \frac{1}{\sum_i e^{-\varepsilon_i/\theta}\,\Omega(\varepsilon_i)}.$$

Hence the probability distribution can be put in the final form

$$w_i = \frac{e^{-\varepsilon_i/\theta}\,\Omega(\varepsilon_i)}{\sum_i e^{-\varepsilon_i/\theta}\,\Omega(\varepsilon_i)}. \tag{16.13}$$

The distribution (16.13) is the distribution sought, and will serve as the
basis of all further discussion. It was first found by Gibbs in 1901 for systems
obeying the laws of classical mechanics. This distribution is called the Gibbs
distribution or the canonical distribution. The transition from quantum states
possessing a discrete set of energy levels to classical systems presents no
difficulty and will be made in one of the following sections. The quantity
$\theta$ figuring in the Gibbs distribution is called the distribution modulus or
statistical temperature.

The Gibbs distribution describes the probability distribution of different
states of a subsystem representing a small quasi-independent part of an
arbitrary system in a state of statistical equilibrium. It should be stressed
that, if the system is not in an equilibrium state, then all the foregoing
reasoning is invalid. The principle of equal probability of states with a given
energy is inapplicable to a non-equilibrium system.

The sum $\sum_i \exp(-\varepsilon_i/\theta)\,\Omega(\varepsilon_i)$ figuring in the denominator of (16.13) will
play an important role in what follows. We introduce for it the special notation

$$Z = \sum_i e^{-\varepsilon_i/\theta}\,\Omega(\varepsilon_i), \tag{16.14}$$


and call it the partition function, since all states of the system give a contri- 
bution to it. In the literature it is often called the sum over states, or the 
statistical sum. However, this terminology does not appear to us to be quite 
apt. By introducing the partition function the Gibbs distribution can be
written in the form

$$w_i = Z^{-1}\, e^{-\varepsilon_i/\theta}\,\Omega(\varepsilon_i). \tag{16.15}$$


The Gibbs distribution for any concrete physical system can be considered
to be known if the energy levels of the system, i.e. the possible values of the
energy $\varepsilon_i$, and the degeneracy of the states of the system, i.e. the number of
different states $\Omega(\varepsilon_i)$ corresponding to a given value of the energy $\varepsilon_i$, are
known. For a number of systems which will be considered below these
physical characteristics can be found.

A remarkable feature of the Gibbs distribution is the fact that the
mechanism of interaction of the subsystem with its surroundings in no way
figures in it.

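By means of the Gibbs distribution one can calculate the mean value of any
quantity depending on the state of the system. If $L(\varepsilon_i)$ is the value of a
certain physical quantity for the states corresponding to an energy $\varepsilon_i$, then,
according to the general rules for finding a mean value, we can write

$$\bar{L} = \sum_i L(\varepsilon_i)\, w_i = Z^{-1} \sum_i L(\varepsilon_i)\, e^{-\varepsilon_i/\theta}\,\Omega(\varepsilon_i) = \frac{\sum_i L(\varepsilon_i)\, e^{-\varepsilon_i/\theta}\,\Omega(\varepsilon_i)}{\sum_i e^{-\varepsilon_i/\theta}\,\Omega(\varepsilon_i)}. \tag{16.16}$$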
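
As an illustration of how (16.13) to (16.16) are used in practice, the following Python sketch evaluates the Gibbs distribution, the partition function and a mean value for a system specified by a list of energy levels and degeneracies. The function names and the two-level example are assumptions made here for illustration and are not taken from the text.

```python
import math

def gibbs_distribution(levels, degeneracies, theta):
    """Return (probabilities w_i, partition function Z) for the given levels.

    levels       -- energies eps_i
    degeneracies -- numbers of states Omega(eps_i)
    theta        -- distribution modulus (statistical temperature), theta > 0
    """
    weights = [g * math.exp(-e / theta) for e, g in zip(levels, degeneracies)]
    Z = sum(weights)                      # partition function, as in (16.14)
    return [w / Z for w in weights], Z    # normalized probabilities, as in (16.15)

def mean_value(values, probabilities):
    """Mean of a quantity L(eps_i) over the distribution, as in (16.16)."""
    return sum(L * w for L, w in zip(values, probabilities))

if __name__ == "__main__":
    levels = [0.0, 1.0]        # hypothetical two-level system
    degeneracies = [1, 1]
    w, Z = gibbs_distribution(levels, degeneracies, theta=1.0)
    print(w, Z, mean_value(levels, w))
```

Only the energy levels and their degeneracies enter the calculation, in line with the remark that the mechanism of interaction with the surroundings does not figure in the distribution.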





§17. The statistical temperature 


Let us first of all consider the properties of the distribution modulus $\theta$
which we have already introduced. From its definition it follows that it
characterizes the properties of the entire ensemble of systems, i.e. the
reservoir, and not the subsystem that we singled out. Indeed, in formula
(16.9) there appear only quantities referring to the entire ensemble of
subsystems: its energy $E$ and the function $\sigma$, whose derivative $(\partial\sigma/\partial E)$ is taken for
$\varepsilon_i = 0$, so that $\sigma = \sigma(E)$. Hence the modulus $\theta$ always refers to a macroscopic
system and is a function of the state of this system. When the state, and in
particular the energy, of the entire system changes, the distribution modulus
$\theta$ changes. Since the function $\sigma$, determined from formula (16.8) and
representing the logarithm of the number of states with a given energy, is a
single-valued function of the state (energy) of the system, $\theta$ is also a
single-valued function of the energy of the system.

Further, the distribution modulus $\theta$ is an essentially positive quantity.
Indeed, if the energy $\varepsilon_i$ can assume any arbitrarily large value, then the
probability of a state with given energy $\varepsilon_i$ must decrease with increasing
energy. If that were not so, the normalization condition (16.12) could not
be fulfilled.

Thus, $\theta$ can refer only to a macroscopic system and is an essentially positive
single-valued function of its state. We shall show that the distribution modulus
$\theta$ is a characteristic of the equilibrium state in the system. For this we shall
consider two subsystems belonging to different systems, having distribution
moduli $\theta_1$ and $\theta_2$. Each of the subsystems will be assumed to be in a state of
statistical equilibrium, so that the probabilities of their states are determined
by formula (16.11):

$$w_1 = A_1\, e^{-\varepsilon_1/\theta_1}\,\Omega_1, \qquad w_2 = A_2\, e^{-\varepsilon_2/\theta_2}\,\Omega_2.$$


We assume that both systems come into a weak interaction, so that an
energy exchange can take place between them. The two interacting subsystems
can be considered as one unified subsystem. If the latter turns out
to be in a state of statistical equilibrium, then the probability distribution of
its states must also be described by a law of the form

$$w = A\, e^{-\varepsilon/\theta}\,\Omega. \tag{17.1}$$

On the other hand, since the interaction is weak, the interaction energy can
be disregarded and each of the subsystems can be assumed to be quasi-independent.
Then to find the probability distribution of the complex system
use can be made of the theorem of multiplication, and one can write

$$w = w_1 w_2 = A_1 A_2\, e^{-\varepsilon_1/\theta_1}\, e^{-\varepsilon_2/\theta_2}\,\Omega_1 \Omega_2. \tag{17.2}$$

In order that the distribution (17.2) may be identical with (17.1), it is
necessary that

$$\theta_1 = \theta_2 = \theta.$$

Thus, if two equilibrium subsystems with equal moduli $\theta_1 = \theta_2$ are brought
into interaction, then a unified equilibrium system with the same modulus
$\theta = \theta_1 = \theta_2$ will be obtained. If $\theta_1$ were different from $\theta_2$, then in establishing
the interaction a system would arise with a probability distribution
expressed by formula (17.2). This distribution is not the Gibbs distribution for
a system with energy $\varepsilon = \varepsilon_1 + \varepsilon_2$. Hence the system formed when $\theta_1 \neq \theta_2$
will not be in an equilibrium state. The equilibrium state is not violated when
an interaction is established between subsystems, if their moduli $\theta_1$ and $\theta_2$
are equal to each other, and is violated if $\theta_1 \neq \theta_2$.
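
The argument can be made concrete with a small numerical check. In a Gibbs distribution of the form (17.1) the probability per state, $w/\Omega$, depends only on the total energy $\varepsilon = \varepsilon_1 + \varepsilon_2$. The Python sketch below (function names and numerical values are illustrative assumptions, not part of the text) tests whether the product (17.2) has this property for different ways of sharing a fixed total energy between the two subsystems.

```python
import math

def joint_weight_per_state(eps1, eps2, theta1, theta2):
    """w / (Omega_1 Omega_2) from the product (17.2), with the constants A_1 A_2 omitted."""
    return math.exp(-eps1 / theta1) * math.exp(-eps2 / theta2)

def depends_only_on_total(theta1, theta2, total=4.0, splits=(0.5, 1.0, 2.0, 3.5)):
    """True if the weight per state is the same for every split of the total energy."""
    values = [joint_weight_per_state(e1, total - e1, theta1, theta2) for e1 in splits]
    return all(math.isclose(v, values[0], rel_tol=1e-12) for v in values)

if __name__ == "__main__":
    print(depends_only_on_total(2.0, 2.0))   # equal moduli
    print(depends_only_on_total(1.0, 2.0))   # unequal moduli
```

Only for equal moduli does the weight per state reduce to a function of $\varepsilon_1 + \varepsilon_2$ alone, which is the content of the condition $\theta_1 = \theta_2 = \theta$.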

That is why the quantity $\theta$ is called the statistical temperature. In the
case where the subsystem contains such a large number of particles that it
can be considered as macroscopic, one can also speak of its proper statistical
temperature. Its temperature is determined from the condition of equilibrium
of the subsystem and the reservoir and, consequently, is equal to the temperature
of the latter. Therefore, $\theta$ can, for brevity, be called the temperature of
the system.

It goes without saying that, if a quasi-closed subsystem contains an insuf- 
ficiently large number of particles, then the notion of its temperature be- 
comes approximate and, in the case of a subsystem represented by an indi- 
vidual molecule of an ideal gas, it completely loses any meaning. 

The value of the statistical temperature is determined by formula (16.9)
and depends on the energy of the system. It is, in the general case, impossible
to find the form of this dependence, since it is determined by the particular
properties of the system. However, in practice one is interested not in the
dependence of $\theta$ on $E$ but in the reverse, the dependence of the energy on
the temperature, $E = E(\theta)$. In what follows we shall see that the energy is a
monotonic function of temperature. We shall find the concrete form of the
dependence of the energy on the temperature $\theta$ for certain simple systems
(gas, ideal crystal and so on).


§18. The properties of the Gibbs distribution and statistical equilibrium 


The Gibbs distribution characterizes the probability distribution of dif- 
ferent states of a quasi-closed system. The conditions for the Gibbs distri- 
bution to be applicable are: 

(1) the presence of a certain macroscopic system constituting the surround- 
ings (reservoir) of the system considered. 

(2) the presence of a weak interaction between the system and the reser- 
voir. 

Other properties of the system are completely arbitrary. 

The Gibbs distribution, as well as the Maxwell energy distribution, has a
maximum at a certain energy value. At first sight the existence of this maximum
is not obvious: the exponentially decreasing factor $\exp(-\varepsilon_i/\theta)$ figures
in the Gibbs distribution.

However, it should be recalled that the number of states with given energy
$\Omega(\varepsilon_i)$ increases rapidly with the energy of the system. The more particles the
system contains, the more states $\Omega(\varepsilon_i)$ correspond to a given energy interval
$\varepsilon_i$, $\varepsilon_i + \delta\varepsilon_i$. Hence the increase of $\Omega(\varepsilon_i)$ with energy proceeds the more
rapidly the more particles there are in the system. As will be shown, for
example, in §20, if the subsystem is a gas consisting of $N$ independent monatomic
molecules, confined in a container with a constant temperature (reservoir),
then $\Omega(\varepsilon) \sim \varepsilon^{3N/2}$.

The product of two functions, the one decreasing rapidly and the other
increasing with energy, leads to the appearance of a sharp maximum in the
Gibbs distribution. This maximum is sharper, the more rapidly $\Omega(\varepsilon_i)$ increases,
i.e. the more particles there are in the system. We shall see in the
same example that if the system is macroscopic, so that it contains a very
large number of particles, then the width of the maximum is negligible. It is
so sharp that it is impossible to represent the Gibbs distribution graphically
without distorting the scale. This means that the probability for the system
to be in states with an energy differing appreciably from the energy
$\varepsilon_{m.p} = \varepsilon_{\max}$ corresponding to the maximum of the Gibbs distribution is negligibly
small (fig. III.11). The system spends an overwhelmingly large part of the
time of observation in states with an energy very close to the latter. The state
corresponding to the maximum of the Gibbs distribution is the most probable.
The most probable state will give the basic contribution to the mean value
of the quantities characterizing the system (for example, the energy). This
follows from the very definition of the idea of mean value: each state gives
to the mean value a contribution proportional to its probability.

Fig. III.11 (axes: $w(\varepsilon)$ against $\varepsilon$; the maximum is marked at $\varepsilon_{\max}$)

Hence in the case of a macroscopic system the partition function $Z$ can be
written in the following form:

$$Z = \sum_i e^{-\varepsilon_i/\theta}\,\Omega(\varepsilon_i) \approx e^{-\varepsilon_{m.p}/\theta}\,\Omega(\varepsilon_{m.p}) \approx e^{-\bar\varepsilon/\theta}\,\Omega(\bar\varepsilon), \tag{18.1}$$

where only one term, the largest, referring to the most probable energy, is
retained in the sum over states. Here we have made use of the approximate
equality of the most probable energy and the mean energy in a macroscopic
system.

Analogously, for the mean value of any quantity $L$ we can write

$$\bar{L} = \sum_i L(\varepsilon_i)\, w(\varepsilon_i) \approx L(\varepsilon_{m.p}) \approx L(\bar\varepsilon), \tag{18.2}$$

i.e. the state with $\varepsilon = \varepsilon_{m.p}$ is realized with a probability $w(\varepsilon_{m.p}) \approx 1$, while the
probability of other states with $\varepsilon \neq \varepsilon_{m.p}$ is close to zero. The mean value of all
quantities will be close to their most probable value. This refers, in particular,
also to the energy $\bar\varepsilon \approx \varepsilon_{m.p}$ of the system. This result is in complete agreement
with the general conclusions arrived at in §5 on the properties of systems
containing a large number of particles. The true values of all quantities
are close to their mean values, and the latter are close to the most probable
values.

The presence of a sharp maximum in the Gibbs distribution is a concrete
manifestation of the general properties of systems with a large number of
particles considered in §5.

Systems which are in such a state that the true values of the quantities 
characterizing them are close to mean values are called systems in a state of 
statistical equilibrium. 

Thus we see that every macroscopic quasi-closed system described by 
the Gibbs distribution is in a state of statistical equilibrium during most of 
the time of observation. 


§ 19. Transition to classical statistics 


In most cases we shall have to deal with systems the energy levels of which
are so close to each other that they can be considered to be distributed
continuously. Then the set of discrete values of the energy levels $\varepsilon_1, \varepsilon_2, \ldots, \varepsilon_i, \ldots$
can be replaced by a continuous function $\varepsilon$. In other words, we shall pass over
from the quantal description of the system to the quasi-classical one in the
sense explained in §1.

From the Gibbs distribution it follows that, in order to replace a discontinuous
function $\exp(-\varepsilon_i/\theta)$ by a smooth function $\exp(-\varepsilon/\theta)$, it is
necessary that the magnitude of the steps, i.e. of the spacing between the
levels $\Delta\varepsilon_i = \varepsilon_{i+1} - \varepsilon_i$, should be small in comparison with the value of $\theta$.
Thus, other things being equal, the transition to quasi-classical statistics should
take place in the range of high temperature. We shall frequently return to this
statement in what follows. The application of the classical approach is of
great importance and is often encountered in practice, since, in ordinary
conditions, the energy spectrum of every macroscopic system is almost continuous.
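
The condition $\Delta\varepsilon \ll \theta$ can be illustrated by comparing a sum over discrete levels with the corresponding integral. The Python sketch below assumes, for illustration only, non-degenerate equally spaced levels $\varepsilon_n = n\,\Delta\varepsilon$ (this spectrum and the function names are our assumptions, not taken from the text).

```python
import math

def z_sum(delta, theta, n_terms=20_000):
    """Sum over discrete levels eps_n = n * delta of exp(-eps_n / theta)."""
    return sum(math.exp(-n * delta / theta) for n in range(n_terms))

def z_integral(delta, theta):
    """Continuum replacement: integral of exp(-eps / theta) d(eps) / delta from 0 to infinity."""
    return theta / delta

if __name__ == "__main__":
    for delta in (0.01, 0.1, 1.0, 5.0):
        print(delta, z_sum(delta, 1.0) / z_integral(delta, 1.0))
```

For $\Delta\varepsilon/\theta = 0.01$ the sum and the integral differ by about half a per cent, while for $\Delta\varepsilon/\theta = 5$ they differ by a factor of about five, so the smooth description is acceptable only at sufficiently high temperature.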

The statistical physics of systems obeying the laws of classical mechanics 
is called classical statistics. 

In the classical approximation we have to replace a discrete set of probabilities
for different states by a continuous distribution. In the quasi-classical
approximation the state of a system of $N$ particles having $3N$ degrees of freedom
is determined by the values of the coordinates $q_1, q_2, \ldots, q_{3N}$ and
momenta $p_1, p_2, \ldots, p_{3N}$. The energy of the system $\varepsilon(p,q)$ is expressed as a continuous
function of all the coordinates and momenta. Since energy in the classical
approximation can be considered to be a continuous function, the probability
distribution of different states of the system is also expressed by a continuous
function. Namely, in the classical approximation one can give the probability
$dw$ for a system to have an energy lying between $\varepsilon(p,q)$ and $\varepsilon(p,q) + d\varepsilon(p,q)$.
According to (1.26), there corresponds to an energy lying in this interval a
number of states $d\Omega = h^{-3N}\,(\partial\Gamma/\partial\varepsilon)\,d\varepsilon$, where $N$ is the number of particles in
the subsystem.

In the classical approximation the Gibbs distribution can be written in the
form

$$dw = \frac{1}{Z}\, e^{-\varepsilon(p,q)/\theta}\, \frac{1}{h^{3N}}\, \frac{\partial\Gamma}{\partial\varepsilon}\, d\varepsilon. \tag{19.1}$$

Here, according to (16.14), the partition function of the system can be
written in the form

$$Z = \int e^{-\varepsilon(p,q)/\theta}\, \frac{1}{h^{3N}}\, \frac{\partial\Gamma}{\partial\varepsilon}\, d\varepsilon. \tag{19.2}$$

The difference between (19.2) and (16.14) lies in the fact that in the classical
formula for $Z$ an integration is carried out instead of a summation. $Z$ is often
called the statistical integral. The integration in (19.2) is carried out over the
entire phase space available for the system, i.e. over all allowed values of the
coordinates and momenta. Which values of the coordinates and momenta are
allowed depends on the particular properties of the system and on the conditions
in which it is found.
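Substituting (19.2) into (19.1), we have

$$dw = \frac{e^{-\varepsilon(p,q)/\theta}\,(\partial\Gamma/\partial\varepsilon)\,d\varepsilon}{\int e^{-\varepsilon(p,q)/\theta}\,(\partial\Gamma/\partial\varepsilon)\,d\varepsilon} = \frac{e^{-\varepsilon(p,q)/\theta}\,d\Gamma}{\int e^{-\varepsilon(p,q)/\theta}\,d\Gamma}. \tag{19.3}$$

From this it is clear that Planck's constant drops out of the Gibbs classical
distribution $dw$, as was to be expected. The Gibbs classical distribution is
often written in the form

$$dw = \rho(p,q)\,d\Gamma, \tag{19.4}$$

where $\rho(p,q)$ is the normalized probability density:

$$\rho(p,q) = \frac{e^{-\varepsilon(p,q)/\theta}}{\int e^{-\varepsilon(p,q)/\theta}\,d\Gamma}, \qquad \int \rho(p,q)\,d\Gamma = 1. \tag{19.5}$$

Here $dw$ represents the probability of a given state of the system, i.e. the
probability for the representative point of the system to be in a given element
of phase space.

In other words, $dw$ represents the probability for the system to be in a
state in which its momenta and coordinates lie in the intervals $p_1, p_1 + dp_1; \ldots;$
$p_{3N}, p_{3N} + dp_{3N}$ and $q_1, q_1 + dq_1; \ldots; q_{3N}, q_{3N} + dq_{3N}$.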

Consider, in particular, the case when the quasi-closed subsystem is an
individual molecule in an ideal gas. Then the energy of the subsystem $\varepsilon(p,q)$
represents the energy of this molecule. For $N = 1$ our subsystem will have
three degrees of freedom. Correspondingly, its phase space will have six
dimensions. An element $d\Gamma$ of the phase volume will have the form

$$d\Gamma = dp_x\, dp_y\, dp_z\, dx\, dy\, dz$$

or, in spherical coordinates for the momenta, replacing for conciseness $dx\,dy\,dz$
by a volume element $dV$,

$$d\Gamma = p^2\, dp\, \sin\vartheta\, d\vartheta\, d\varphi\, dV.$$

If there are no external fields of force the energy of the molecule reduces to
its kinetic energy:

$$\varepsilon(p,q) = p^2/2m,$$

and depends neither on the direction of its motion (angles $\vartheta$ and $\varphi$) nor on
its position in the container. Hence there corresponds to an energy lying in
the interval between $\varepsilon$ and $\varepsilon + d\varepsilon$ a number of states equal to

$$d\Omega = \frac{p^2\,\dfrac{dp}{d\varepsilon}\,d\varepsilon \displaystyle\int dV \int \sin\vartheta\, d\vartheta\, d\varphi}{h^3} = \frac{4\pi V}{h^3}\, p^2\, \frac{dp}{d\varepsilon}\, d\varepsilon.$$

The calculation of $dp/d\varepsilon$ gives

$$d\Omega = \frac{4\pi V m^{3/2}\,(2\varepsilon)^{1/2}\, d\varepsilon}{h^3}.$$

Thus, the Gibbs distribution for one molecule has the form

$$dw = \frac{4\pi V m^{3/2}}{z h^3}\, e^{-\varepsilon/\theta}\,(2\varepsilon)^{1/2}\, d\varepsilon, \tag{19.6}$$

where $z$ is the partition function of an individual molecule. Comparison of
(19.6) with the Maxwell energy distribution (9.5) convinces us of their
identity, provided the statistical temperature $\theta$ is identified with the quantity
$kT$. It should be stressed that the absolute temperature, appearing in the
Maxwell distribution, refers not to an individual molecule (the subsystem)
but to the entire gas (the reservoir). In §26 we shall show that this relationship
between $\theta$ and $T$ is of a general character. At first sight it may seem that
the normalization constant in the Maxwell distribution differs from that in
(19.6). In particular, it does not contain the Planck constant $h$. In reality,
however, this is not so. In order to convince ourselves of this, we write the
explicit expression for $z$:

$$z = \int e^{-\varepsilon/\theta}\,\frac{d\Gamma}{h^3} = \frac{4\pi V m^{3/2}}{h^3} \int_0^\infty e^{-\varepsilon/\theta}\,(2\varepsilon)^{1/2}\,d\varepsilon = \frac{4\pi V m^{3/2}}{h^3}\,\frac{\sqrt{2\pi}}{2}\,\theta^{3/2} = \left(\frac{2\pi m\theta}{h^2}\right)^{3/2} V. \tag{19.7}$$

Hence

$$dw = \frac{2}{(\pi\theta^3)^{1/2}}\, e^{-\varepsilon/\theta}\,\varepsilon^{1/2}\,d\varepsilon. \tag{19.8}$$

Thus, the Planck constant vanishes from the distribution, and the constants
in (19.8) and (9.5) are the same.
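
The normalization of (19.8) can be verified numerically. The Python sketch below integrates the distribution with a simple midpoint rule; the value of $\theta$, the cut-off of the integration range and the function names are arbitrary choices made here for illustration. It also evaluates the mean energy of one molecule, which for this distribution is $\tfrac32\theta$.

```python
import math

def maxwell_density(eps, theta):
    """dw/d(eps) from (19.8): 2 / sqrt(pi theta^3) * exp(-eps/theta) * sqrt(eps)."""
    return 2.0 / math.sqrt(math.pi * theta**3) * math.exp(-eps / theta) * math.sqrt(eps)

def integrate(f, a, b, steps=200_000):
    """Midpoint rule on [a, b]; adequate here because the integrand is smooth for eps > 0."""
    h = (b - a) / steps
    return h * sum(f(a + (k + 0.5) * h) for k in range(steps))

theta = 1.7                      # arbitrary value of the distribution modulus
upper = 50.0 * theta             # the integrand is negligible beyond this energy
norm = integrate(lambda e: maxwell_density(e, theta), 0.0, upper)
mean = integrate(lambda e: e * maxwell_density(e, theta), 0.0, upper)

if __name__ == "__main__":
    print(norm, mean, 1.5 * theta)
```

The first number should be close to unity for any positive $\theta$, confirming that no factor of $h$ is needed in the normalization constant.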

As we have stressed in the preceding section, the Gibbs distribution has
a very sharp maximum for a certain value of energy. At first sight this statement
contradicts the broad maximum in the Maxwell distribution. However,
it should be borne in mind that the sharp maximum in the Gibbs distribution
arises as a result of the competition of the exponentially decreasing factor
$\exp(-\varepsilon/\theta)$ and the increasing factor $\Omega(\varepsilon)$. The latter increases as $\varepsilon^{3N/2}$, or as
$\varepsilon^{1/2}$ in the case when $N = 1$. Hence for $N \gg 1$ the quantity $dw/d\varepsilon$ varies rapidly
and a sharp maximum arises, while for $N = 1$ it increases relatively slowly and
the maximum in the distribution turns out to be broad.

If the quasi-classical subsystem contains a very large number of particles,
then the integrand of the integral over states, figuring in formula (19.2), has a
very sharp maximum for an energy value $\varepsilon_{\max} \approx \bar\varepsilon$, i.e. in the region of states
corresponding to statistical equilibrium of the system.

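In this case, analogously to (18.1), we can write

$$Z \approx e^{-\varepsilon_{m.p}/\theta}\,\frac{\Delta\Gamma}{h^{3N}} \approx e^{-\bar\varepsilon/\theta}\,\frac{\Delta\Gamma}{h^{3N}}, \tag{19.9}$$

where $\Delta\Gamma$ is the volume of that region of the phase space which corresponds
to a state of statistical equilibrium, i.e. to $\varepsilon \approx \varepsilon_{m.p}$. It is obvious that the
number of states corresponding to statistical equilibrium of the system is
equal to

$$\Omega(\varepsilon_{m.p}) = \frac{\Delta\Gamma}{h^{3N}}. \tag{19.10}$$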


§20. The monatomic gas as a whole 


The properties of the Gibbs distribution, described in §18 and 19, can be 
shown most clearly by a concrete example. 

Let us consider as a single quasi-closed system a gas as a whole, confined
in a container of volume $V$. If the walls of the container are impenetrable to
the gas molecules but can exchange energy with them, then the walls of the
container and the bodies surrounding the container form a reservoir. The entire
container with the gas can be characterized by a definite temperature $\theta$, equal
to the temperature of the surrounding bodies. It can be assumed that the size
of the latter and their energy are very large in comparison with the size and
energy of the gas.

We see that all the conditions for the applicability of the Gibbs distribution
to a gas as a whole are present, and that this distribution can be written for
the gas as a whole. We assume that the gas is monatomic, and that there are no
external fields of force. Then the energy of the gas is equal to the sum of the
kinetic energies of all the particles constituting it. This is given by the classical
expression and is a continuous variable. Let the gas contain $N$ molecules of
mass $m$. The state of the system is completely characterized by giving the
coordinates and momenta of all molecules, $q_1, q_2, \ldots, q_{3N}$; $p_1, p_2, \ldots, p_{3N}$.
The phase space of the system has $6N$ dimensions. An element $d\Gamma$ of the
phase space is equal to the product of the differentials of all the momenta and
coordinates:

$$d\Gamma = dp_1 \ldots dp_{3N}\, dq_1 \ldots dq_{3N}. \tag{20.1}$$


The energy of the system depends only on the momenta of the molecules and
can be written in the form

$$\varepsilon(p,q) = \frac{1}{2m}\left(p_1^2 + p_2^2 + \ldots + p_{3N}^2\right). \tag{20.2}$$

In order to write down the Gibbs distribution it is necessary to find an
expression for the number of states corresponding to the energy of the
system lying between $\varepsilon$ and $\varepsilon + d\varepsilon$. According to the general formula (1.26')
we have

$$d\Omega = \frac{1}{h^{3N}}\,\frac{\partial\Gamma}{\partial\varepsilon}\,d\varepsilon. \tag{20.3}$$





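The Gibbs distribution for the gas as a whole has the form

$$dw = \frac{1}{h^{3N} Z}\,\exp\left[-\frac{p_1^2 + p_2^2 + \ldots + p_{3N}^2}{2m\theta}\right]\frac{\partial\Gamma}{\partial\varepsilon}\,d\varepsilon. \tag{20.4}$$

Let us find the value of $\partial\Gamma/\partial\varepsilon$. The volume of the part of phase space in
which the energy of the gas does not exceed $\varepsilon$ is, by definition, equal to

$$\Gamma = \int \ldots \int dp_1 \ldots dp_{3N}\, dq_1 \ldots dq_{3N}. \tag{20.5}$$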


In formula (20.5) the integration range is determined in such a way that the
condition

$$\frac{p_1^2 + p_2^2 + \ldots + p_{3N}^2}{2m} \le \varepsilon(p,q) \tag{20.6}$$

is fulfilled. This condition does not involve the coordinates of the molecules,
with respect to which one can integrate directly. This gives

$$\Gamma = V^N \int dp_1 \ldots dp_{3N}, \tag{20.7}$$

where $V = \int dq_1\, dq_2\, dq_3$ is the entire volume of the gas.

Formula (20.6) determines, from the geometrical point of view, a sphere in the
space of $3N$ dimensions whose radius is equal to $R = (2m\varepsilon)^{1/2}$. Then
the integral in (20.7) represents the volume of this sphere. The dependence
of the volume of the sphere of $3N$ dimensions on its radius can be found
from considerations of dimensionality. Namely, it must be proportional
to the radius to a power equal to the number of dimensions. In
three-dimensional space it is proportional to $R^3$, and in $3N$-dimensional space
it is proportional to $R^{3N}$. Hence (20.7) can be written in the form

$$\Gamma = \text{const}\; V^N R^{3N} = \text{const}\; V^N \varepsilon^{\frac32 N}. \tag{20.8}$$

Differentiating (20.8) we have

$$\frac{\partial\Gamma}{\partial\varepsilon} = \text{const}\; V^N \varepsilon^{\frac32 N - 1}. \tag{20.9}$$

The value of the constant in (20.9) is of no particular interest, since it will
cancel against the same constant arising in calculating $Z$. Hence, from (20.4)
and (20.9) we finally have

$$dw = \frac{\text{const}}{h^{3N} Z}\, e^{-\varepsilon/\theta}\,\varepsilon^{\frac32 N - 1}\, V^N\, d\varepsilon. \tag{20.10}$$
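
The dimensional argument leading to (20.8) can be checked against the explicit formula for the volume of an $n$-dimensional ball, $\pi^{n/2} R^n / \Gamma(n/2 + 1)$. This formula is a standard result that is not quoted in the text, where the constant is left unspecified; the Python sketch below uses it only to confirm the scaling with the radius, and the function name is ours.

```python
import math

def ball_volume(n, R):
    """Volume of an n-dimensional ball of radius R: pi^(n/2) R^n / Gamma(n/2 + 1)."""
    return math.pi ** (n / 2) * R ** n / math.gamma(n / 2 + 1)

if __name__ == "__main__":
    # Doubling the radius multiplies the volume by 2^n, i.e. the volume scales as R^n.
    for n in (3, 6, 30):
        print(n, ball_volume(n, 2.0) / ball_volume(n, 1.0), 2.0 ** n)
```

With $n = 3N$ and $R = (2m\varepsilon)^{1/2}$ this scaling gives the factor $\varepsilon^{\frac32 N}$ in (20.8).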


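The Gibbs distribution function for a system with a large number of particles
$N$ has a very sharp maximum, since the factor $\varepsilon^{\frac32 N - 1} \approx \varepsilon^{\frac32 N}$ increases
very rapidly with increasing $\varepsilon$, whereas the factor $\exp(-\varepsilon/\theta)$ decreases
sharply. We shall find the position, width and height of this maximum.

The maximum of the expression (20.10) is at a point determined by the
condition

$$\frac{d}{d\varepsilon}\left[e^{-\varepsilon/\theta}\,\varepsilon^{\frac32 N - 1}\right] = -\frac{1}{\theta}\, e^{-\varepsilon/\theta}\,\varepsilon^{\frac32 N - 1} + \left(\tfrac32 N - 1\right) e^{-\varepsilon/\theta}\,\varepsilon^{\frac32 N - 2} = 0. \tag{20.11}$$

Hence we find that the condition for the maximum reads

$$\frac{\varepsilon_{\max}}{\theta} = \tfrac32 N - 1$$

or

$$\varepsilon_{\max} = \varepsilon_{m.p} = \left(\tfrac32 N - 1\right)\theta,$$

where $\varepsilon_{\max}$ is the energy at the maximum. Since the number of particles $N$
is very large, unity can be neglected in comparison with $\tfrac32 N$, so that

$$\varepsilon_{\max} = \varepsilon_{m.p} \approx \tfrac32 N\theta. \tag{20.12}$$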


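It is easy to show that the quantity $\tfrac32 N\theta$ represents the mean energy of the
entire gas. By definition,

$$\bar\varepsilon = \int \varepsilon\, dw = \frac{\text{const}\; V^N}{h^{3N} Z} \int_0^\infty e^{-\varepsilon/\theta}\,\varepsilon^{\frac32 N - 1}\,\varepsilon\, d\varepsilon = \frac{\dfrac{\text{const}\; V^N}{h^{3N}} \displaystyle\int_0^\infty e^{-\varepsilon/\theta}\,\varepsilon^{\frac32 N}\, d\varepsilon}{\dfrac{\text{const}\; V^N}{h^{3N}} \displaystyle\int_0^\infty e^{-\varepsilon/\theta}\,\varepsilon^{\frac32 N - 1}\, d\varepsilon} = \frac{\partial}{\partial(-\theta^{-1})} \ln \int_0^\infty e^{-\varepsilon/\theta}\,\varepsilon^{\frac32 N - 1}\, d\varepsilon. \tag{20.13}$$

The integral in (20.13) is calculated in Appendix IV. This calculation leads
to the relation

$$\bar\varepsilon = \tfrac32 N\theta. \tag{20.14}$$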


Comparing the expressions (20.12) and (20.14), we see that the most probable 
energy lies very close to the mean energy. If N is sufficiently large, then these 
energies can be identified with each other to a high degree of accuracy. Thus, 
the subsystem (ideal gas) during an overwhelmingly large part of the time is 
in the state in which its energy is equal to the mean energy
$\bar{\varepsilon}$. This property is not possessed by a subsystem containing
few particles. For example, for one molecule the difference between the mean
and the most probable energy is relatively large.

In order to picture how sharp the maximum in the Gibbs distribution is,
i.e. how often the subsystem can get into a state with an energy different
from the most probable one $\varepsilon_{\max}$, we shall find the form of the
distribution function near the maximum. In the vicinity of the maximum, when
the difference $\varepsilon-\varepsilon_{\max}$ is small, the distribution
function can be expanded in a series in powers of
$\varepsilon-\varepsilon_{\max}$ and we can restrict ourselves to the first
terms of the expansion. If the distribution function (ignoring the immaterial
constant) is denoted by $f$, then


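$$f = e^{-\varepsilon/\theta}\, \varepsilon^{\frac{3N}{2}-1} = e^{-\varepsilon/\theta + \left(\frac{3N}{2}-1\right)\ln\varepsilon} = e^{\chi(\varepsilon)},$$

where

$$\chi(\varepsilon) = -\frac{\varepsilon}{\theta} + \left(\frac{3N}{2}-1\right)\ln\varepsilon.$$

Since at the point $\varepsilon = \varepsilon_{\max}$ the distribution function
$f$ and, therefore, also the function $\chi$ has a maximum, for
$\chi(\varepsilon)$ near this maximum one can write the expansion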


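$$\chi(\varepsilon) \approx \chi(\varepsilon_{\max})
+ \left(\frac{d\chi}{d\varepsilon}\right)_{\varepsilon=\varepsilon_{\max}} (\varepsilon-\varepsilon_{\max})
+ \frac{1}{2}\left(\frac{d^2\chi}{d\varepsilon^2}\right)_{\varepsilon=\varepsilon_{\max}} (\varepsilon-\varepsilon_{\max})^2 + \ldots
= \chi(\varepsilon_{\max}) + \frac{1}{2}\left(\frac{d^2\chi}{d\varepsilon^2}\right)_{\varepsilon=\varepsilon_{\max}} (\varepsilon-\varepsilon_{\max})^2 + \ldots$$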




It is easy to see that 


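$$\left(\frac{d^2\chi}{d\varepsilon^2}\right)_{\varepsilon=\varepsilon_{\max}} = -\frac{\frac{3N}{2}-1}{\varepsilon_{\max}^2}$$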





and, consequently, 


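$$f \approx \exp[\chi(\varepsilon_{\max})]\, \exp\left[-\left(\frac{3N}{2}-1\right)\frac{(\varepsilon-\varepsilon_{\max})^2}{2\varepsilon_{\max}^2}\right]
= \exp[-\varepsilon_{\max}/\theta]\; \varepsilon_{\max}^{\frac{3N}{2}-1} \times \exp\left[-\left(\frac{3N}{2}-1\right)\frac{(\varepsilon-\varepsilon_{\max})^2}{2\varepsilon_{\max}^2}\right].$$
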
Thus, the probability distribution near the maximum has the form 


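$$dw = \mathrm{const}\; \frac{V^N}{h^{3N} Z}\, \exp[-\varepsilon_{\max}/\theta]\; \varepsilon_{\max}^{\frac{3N}{2}-1} \times \exp\left[-\frac{\left(\frac{3N}{2}-1\right)(\varepsilon-\varepsilon_{\max})^2}{2\varepsilon_{\max}^2}\right] d\varepsilon. \qquad (20.15)$$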


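The dependence of the probability distribution on the distance from the
maximum $(\varepsilon-\varepsilon_{\max})$ is characterized by the second
exponential factor in (20.15). It represents a symmetric function of the type

$$\exp\left[-\frac{(\varepsilon-\varepsilon_{\max})^2}{2\delta^2}\right], \quad \text{where} \quad \delta = \frac{\varepsilon_{\max}}{\left(\frac{3N}{2}-1\right)^{1/2}}.$$

The quantity $\delta$ represents the width of the maximum. At the value
$(\varepsilon-\varepsilon_{\max}) = \delta$ the distribution function is
$\sqrt{e}$ times smaller than at the maximum. The relative width of the
maximum is equal to

$$\frac{\delta}{\varepsilon_{\max}} = \frac{1}{\left(\frac{3N}{2}-1\right)^{1/2}} \approx \left(\frac{2}{3N}\right)^{1/2}. \qquad (20.16)$$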





For values of $N$ corresponding to the number of molecules in a macroscopic
volume ($N \sim 10^{19}$), the width of the maximum of the Gibbs distribution
turns out to be very small. This means that the Gibbs distribution has a very
sharp maximum at $\varepsilon_{\max}$. The gas is, with a high probability, in
the state in which its energy is equal to the mean energy $\bar{\varepsilon}$.
The probability that 1 cm$^3$ of the gas will be found in a state with an
energy differing from $\bar{\varepsilon}$, for example an $\varepsilon$ equal
to 99% of $\bar{\varepsilon}$, can easily be determined from formula (20.15)
or (20.16). Its ratio to the probability of the state
$\varepsilon = \bar{\varepsilon}$ is as

$$1 : \exp\left[\tfrac{1}{2}\left(\tfrac{3N}{2}-1\right)(0.99-1)^2\right] \approx 1 : e^{10^{15}}.$$


Thus, an appreciable deviation of the energy of the gas from the mean 
value does not in practice occur in a gas containing a large number of particles. 
This conclusion is in complete agreement with what was said in the previous 
paragraph, as well as with the general theorem of §5. Comparing the relative 
width of the maximum of (20.16) with the determination of the relative 
energy fluctuation of §5, we convince ourselves of their complete equivalence. 
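
The estimates of this paragraph can be checked numerically. The sketch below is only an illustration: the function names `energy_peak` and `log_ratio` are hypothetical, and the temperature is measured in arbitrary units with $\theta = 1$. It evaluates the most probable energy, the mean energy (20.14) and the relative width (20.16), and compares the exact logarithm of $f(x\varepsilon_{\max})/f(\varepsilon_{\max})$ with the Gaussian estimate that follows from (20.15).

```python
import math

def energy_peak(n_particles, theta=1.0):
    """Peak statistics of f(eps) ~ exp(-eps/theta) * eps**(3N/2 - 1)."""
    a = 1.5 * n_particles - 1.0            # exponent 3N/2 - 1 from (20.10)
    eps_max = a * theta                    # most probable energy, from (20.11)
    eps_mean = 1.5 * n_particles * theta   # mean energy, (20.14)
    rel_width = 1.0 / math.sqrt(a)         # delta / eps_max, (20.16)
    return eps_max, eps_mean, rel_width

def log_ratio(n_particles, x):
    """ln[f(x * eps_max) / f(eps_max)]: exact value and Gaussian estimate."""
    a = 1.5 * n_particles - 1.0
    exact = a * (math.log(x) - x + 1.0)    # from ln f = -eps/theta + a ln eps
    gaussian = -0.5 * a * (x - 1.0) ** 2   # second-order expansion, (20.15)
    return exact, gaussian

if __name__ == "__main__":
    for n in (1, 100, 1e19):
        print(n, energy_peak(n), log_ratio(n, 0.99))
```

For a single molecule the most probable and mean energies differ by a factor of three, whereas for $N \sim 10^{19}$ the two coincide to within a relative error of order $1/N$ and the logarithm of the ratio at 99% of the peak energy is of order $-10^{15}$.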





Statistical and Phenomenological Thermodynamics


§21. The internal energy of a macroscopic system. The first and second 
laws of thermodynamics 


Turning from the apparatus of Gibbs statistics, we now consider the 
formulation of a theory of the thermal properties of matter. Before 
proceeding to carry out this programme, it is necessary to dwell briefly on 
the history of the development of the theory of heat. 

The development of technology and the extensive use of heat engines in 
the first half of the 19th century demanded the development of a theory 
of thermal processes. However, the ideas of the nature of heat were still very 
vague. The physics of the first half of the 19th century was still very far 
from the construction of a theory of thermal processes on the basis of mole- 
cular concepts. Hence the development of the theory proceeded in a very 
unusual way. 

Joule’s experimental establishment of the mechanical equivalent of heat 
and the failure of all attempts to create a perpetual motion machine 
(perpetuum mobile), by means of which it would be possible to obtain use- 
ful work without any changes in surrounding bodies, allowed one to postulate 
a general principle called the first law of thermodynamics. The first law of 
thermodynamics represents a particular case of the energy conservation law as 
applied to thermal processes. 




If the amount of heat absorbed by a system and expressed in mechanical
units is equal to $\delta Q$, then the first law of thermodynamics reads:

$$\delta Q = -\delta W + \delta E, \qquad (21.1)$$

where $(-\delta W)$ is the mechanical work done by the system absorbing heat,
so that $\delta W$ is the work done on it by external forces. The difference
$\delta Q - (-\delta W)$ between the heat absorbed and the work done
represents the part of the heat used in changing the internal state of the
system.

The quantity $E$, representing a function of the internal state of the
system, is called the internal energy. For a so-called cyclic process, in which
the system after all changes comes back into its initial state, the algebraic
sum of all heat absorbed and work done is equal to zero. This means that in a
cyclic process the system obtains the same amount of heat from without as
the mechanical work it gives. Whence it follows that the change in the
internal energy of a system in a cyclic process is equal to zero,
$\oint dE = 0$. This equality means that the internal energy is a
single-valued function of the state of the system. Thus, the relation (21.1)
expresses the energy conservation law. It is usually written in the form

$$dE = \delta Q + \delta W.$$


The second basic proposition of phenomenological thermodynamics, called 
the second law of thermodynamics, also represents a generalization of the re- 
sults of numerous experimental data. The second law of thermodynamics 
says that it is impossible to draw heat systematically from a system and to 
transform it into work without some other changes occurring simultaneously 
in the system or in the bodies surrounding it. 

A machine which, drawing heat from a body, could transform it system- 
atically into work was called a perpetual motion machine of the second kind. 
It is clear that, if such a machine could be constructed, the bodies surround- 
ing us, for example oceans, could serve as a practically inexhaustible reservoir 
of work. However, all attempts to construct such a machine failed. 

Thus, the second law of thermodynamics, like the first, rested upon numer- 
ous reliable experimental data. In what follows, it will be shown how one can 
pass over from the above qualitative formulation of the second law to its 
quantitative formulation. It turned out that, based upon the mathematical 
formulation of the first and second laws of thermodynamics, it was possible 
to construct a phenomenological theory of thermal processes, called
thermodynamics. All deductions of thermodynamics had the same degree of
reliability as the first and second laws on which they were based, which made
them indisputable.

The reliability and general character of the deductions represent the most 
important merit of thermodynamical methods of investigation. Their short- 
comings lie in the fact that they do not disclose the physical, molecular basis 
of thermal processes. Hence the construction of a molecular theory of heat 
and the elucidation of the molecular basis of thermodynamical ideas ap- 
peared to be the most important stage in the development of the theory of 
heat and physics as a whole. 

At present thermodynamics and the molecular theory of thermal processes 
(statistical thermodynamics) make a unified whole. 

In what follows we shall be able to convince ourselves by concrete ex- 
amples that phenomenological and statistical thermodynamics do not con- 
tradict, but supplement each other. 

We shall base the molecular theory of the thermal properties of matter on 
the following very natural supposition: 

“The internal energy of a macroscopic body is identical with the mean
energy $\bar{\varepsilon}$ calculated according to the laws of statistical physics”.

In the following we shall consider the thermal properties of macroscopic 
systems which contain a very large number of particles and which are in a 
state of statistical equilibrium. Since in a system containing a large number 
of particles and in a state of statistical equilibrium the mean energy $\bar{\varepsilon}$ is
practically the same as its actual energy, this assumption can be formulated 
differently: 

“The internal energy of any macroscopic body represents the energy of 
thermal motion of the molecules constituting the body”. 

It should be noted that at present this assumption is so well substantiated 
experimentally and theoretically that the term “assumption” appears to be
inadequate. 

We do not, however, consider it superfluous to stress that the identification 
of the mean energy $\bar{\varepsilon}$ of motion of the molecules with the thermodynamic
energy $E$ is the basis of our further exposition. All other statements, having a
less obvious character, for example the statistical treatment of the second 
law of thermodynamics, which we shall analyse in later paragraphs, do not 
need any new assumptions or references to experiment for their substantia- 
tion, but appear to be a direct consequence of this unique assumption. 

In order actually to calculate the mean energy of the system we have to 
make use of the general rule of §16 (eq. (16.16)) which, as applied to the 
energy, reads 




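$$\bar{\varepsilon} = \frac{\sum_i \varepsilon_i\, e^{-\varepsilon_i/\theta}\, \Omega(\varepsilon_i)}{\sum_i e^{-\varepsilon_i/\theta}\, \Omega(\varepsilon_i)},$$

where the summation is carried out over all energy levels of the system.

The expression for the mean energy $\bar{\varepsilon}$ can be rewritten in a
more compact form. That is, from the obvious identity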


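$$\frac{\partial}{\partial(-\theta^{-1})} \sum_i e^{-\varepsilon_i/\theta}\, \Omega(\varepsilon_i) = \sum_i e^{-\varepsilon_i/\theta}\, \varepsilon_i\, \Omega(\varepsilon_i)$$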


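it follows that $\bar{\varepsilon}$ can be written in the form

$$\bar{\varepsilon} = \frac{\partial \ln Z}{\partial(-\theta^{-1})} = \theta^2\, \frac{\partial \ln Z}{\partial\theta}. \qquad (21.2)$$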


From formula (21.2) it follows that to find the mean energy of the system it 
is sufficient to know its partition function Z. Because of the assumption of 
the identity of the mean and thermodynamic energies of a system we shall 
always write 





$$E = \frac{\partial \ln Z}{\partial(-\theta^{-1})} = \theta^2\, \frac{\partial \ln Z}{\partial\theta}. \qquad (21.3)$$


It follows from the formulae given that the state of a macroscopic system, 
in particular its internal energy, depends on the temperature $\theta$ of the
reservoir. In a state of statistical equilibrium the temperature of the system is
equal to the temperature of its surroundings (reservoir), so that one can 
speak about the dependence of the energy of the body on its proper tem- 
perature. 
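
Formula (21.3) can be illustrated numerically for a system with a few discrete levels. The sketch below is an illustration only: the level energies, the numbers of states and the function names are hypothetical choices, and the derivative of $\ln Z$ is approximated by a central difference. It compares the mean energy obtained by direct averaging over the Gibbs distribution with the value $\theta^2\, \partial \ln Z/\partial\theta$.

```python
import math

def partition_function(levels, weights, theta):
    """Z = sum_i Omega(eps_i) * exp(-eps_i / theta)."""
    return sum(g * math.exp(-e / theta) for e, g in zip(levels, weights))

def mean_energy_direct(levels, weights, theta):
    """Mean energy as the Gibbs-weighted average of the level energies."""
    z = partition_function(levels, weights, theta)
    total = sum(e * g * math.exp(-e / theta) for e, g in zip(levels, weights))
    return total / z

def mean_energy_from_z(levels, weights, theta, h=1e-5):
    """Mean energy from (21.3), E = theta**2 * d(ln Z)/d(theta).

    The derivative is taken by a central difference with step h.
    """
    ln_z_plus = math.log(partition_function(levels, weights, theta + h))
    ln_z_minus = math.log(partition_function(levels, weights, theta - h))
    return theta ** 2 * (ln_z_plus - ln_z_minus) / (2.0 * h)

if __name__ == "__main__":
    levels = [0.0, 1.0, 2.5]   # hypothetical level energies eps_i
    weights = [1, 3, 2]        # hypothetical numbers of states Omega(eps_i)
    print(mean_energy_direct(levels, weights, 0.8))
    print(mean_energy_from_z(levels, weights, 0.8))
```

The two printed values agree up to the discretization error of the finite difference, which shows that knowledge of $Z$ as a function of $\theta$ is sufficient to obtain the energy.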

The internal energy of a macroscopic system possesses the important 
property of additivity: the energy of a complex system is equal to the sum 
of the energies of its macroscopic parts. This statement is, of course, of ap- 
proximate character. It assumes that the energy of interaction between the 
parts can be neglected. In the case of macroscopic parts it usually can be 
neglected, since it has the character of a surface energy (see, for instance, §65).

§ 22. Work and pressure 


The state of a body in statistical equilibrium depends not only on the
temperature but also on the external conditions, which are determined by the
magnitude of external fields acting on the body.

According to what was said at the beginning of §8, the volume of a body 
is also determined by fields of force acting on the surface of the body; the 
walls of the container represent such a field of force as shown in fig. III.S. 

External conditions can be characterized by certain quantities called 
external parameters. These external parameters of a system are determined 
by the fields acting on the body or by the position of bodies surrounding it. 

Imagine, for example, that our system is a gas confined in a container with 
a movable cover (a piston). Then the state of the system depends on the 
position of the piston. This position is an external parameter, since the value 
of the coordinate of the piston does not depend on the nature and properties 
of the system in the container. As a second example one can cite a system 
in an external field of force. If an arbitrary system is in such an external 
field of force, then its particles possess a certain potential energy. Hence the 
energy levels will depend on the properties of the field. In a uniform field 
this dependence is determined only by the position of the system in the 
field. In this case the position of the system serves as an external parameter. 

Thus, the energy levels of a system, in general, depend not only on the
properties of the system itself but also on the values of external parameters,
the set of which we shall denote by $\lambda$. To stress this, we shall sometimes
write $\varepsilon_i(\lambda)$. However, it should not be forgotten that the values $\varepsilon_i$ depend
not only on $\lambda$ but also on the properties of the system itself.

Consider the change $\delta\varepsilon_i$ in the energy of the system for an infinitely small
change $\delta\lambda$ of its external parameters. We confine ourselves from the beginning
to such a change of external parameters for which the probability distribution 
of different states remains unchanged. This means that, when the external 
parameters are changed, a transition of the system from one state into 
another does not occur (see below). 

We then have 


$$\delta\varepsilon_i = \frac{\partial\varepsilon_i}{\partial\lambda}\, \delta\lambda. \qquad (22.1)$$


The quantity $\partial\varepsilon_i/\partial\lambda$ (taken with the opposite sign) can be considered as a
generalized force acting on the system. We denote $\partial\varepsilon_i/\partial\lambda$ by $(-f_i)$, so that
$f_i$ is the generalized force. Then (22.1) can be written in the form

$$\delta\varepsilon_i = -f_i\, \delta\lambda. \qquad (22.2)$$


To find the change in the internal energy we have to find the mean value 
of the change of each of the energy levels of the system. According to the 
rules for averaging we have 


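$$\delta E = \overline{\delta\varepsilon} = \sum_i \delta\varepsilon_i\, w_i = -\sum_i f_i w_i\, \delta\lambda = -\Lambda\, \delta\lambda, \qquad (22.3)$$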


where $\Lambda$ stands for the mean force acting overall on the system as the
parameter $\lambda$ changes, $\Lambda = \left(\sum_i f_i w_i\right)_{w_i}$.

The quantity $(-\Lambda\, d\lambda)$ represents the work done on the system as the
parameter $\lambda$ changes by $d\lambda$. The minus sign shows that the work is done by
external forces on the system.

Let $\delta W$ denote the mean value of the work done on the system as external
parameters $\lambda$ change. Then we have

$$(\delta E)_{w_i} = \delta W. \qquad (22.4)$$


Consider, in particular, the important case where the linear dimension of 
the system, determined by the coordinate x, serves as the generalized coor- 
dinate. In this case instead of the generalized force it is convenient to in- 
troduce the pressure p, which we shall define as the mean force acting on 
1 cm$^2$ normal to the surface of the body (the system), i.e.

$$p = \Lambda/S.$$

We then have

$$(\delta E)_{w_i} = \delta W = -pS\, dx = -p\, \delta V, \qquad (22.5)$$


where $\delta V$ is the change in the volume of the system.

Such a definition of the pressure is not new; we have made use of it in the 
kinetic theory of gases. In §8 we have defined the pressure as the mean force 
acting on unit surface of the wall due to gas molecules impinging on it. In a
system containing a large number of particles the true force always has a value
very close to its mean value. It is this that justifies the introduction of the 
pressure which, to a high degree of accuracy, can be substituted for the actual 
force acting on the surface of the body. 




It is obvious that $(\delta E)_{w_i}$ does not represent the total possible change in
the energy of the system, and is not the total differential of any expression.
Indeed, for a given structure of the system the generalized force
$\Lambda = \sum_i f_i w_i$ represents a function of the external parameters $\lambda$
and the temperature $\theta$. Hence we can write in more detail as follows:

$$(\delta E)_{w_i} = -\Lambda(\lambda, \theta)\, \delta\lambda. \qquad (22.6)$$


The change in the energy as the parameter $\lambda$ changes in the range from $\lambda_1$ to
$\lambda_2$, or the work done on the system, is equal in this case to

$$W = -\int_{\lambda_1}^{\lambda_2} \Lambda(\lambda, \theta)\, \delta\lambda.$$


The value of the integral in the above formula obviously depends on the
path of integration, i.e. on the character of the transition from $\lambda_1$ to
$\lambda_2$. In particular, in the case when $\lambda = V$,

$$W = -\int_{V_1}^{V_2} p(V, \theta)\, \delta V. \qquad (22.7)$$

Since the pressure depends on the volume and temperature, the transition
from a volume $V_1$ to a volume $V_2$ by a different path of integration, i.e. for
a different character of the transition from $V_1$ to $V_2$, leads to a different
value for the work $W$.
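
The path dependence of the work (22.7) can be made concrete with a small numerical sketch. It is an illustration under an explicit assumption that is not derived in this paragraph: the gas is taken to be ideal, with pressure $p = N\theta/V$, and the product $N\theta$ at the initial temperature is set to 1 in arbitrary units. The function name `work_on_gas` and the two paths are hypothetical choices. Both paths lead from the volume $V_1$ to the volume $V_2$: along the first the temperature is kept constant, along the second the pressure is kept equal to its initial value (the subsequent cooling at constant volume back to the initial temperature involves no work).

```python
import math

def work_on_gas(pressure_along_path, v1, v2, steps=10_000):
    """W = -integral of p dV from v1 to v2 (midpoint rule), as in (22.7)."""
    dv = (v2 - v1) / steps
    total = 0.0
    for k in range(steps):
        v = v1 + (k + 0.5) * dv
        total += pressure_along_path(v) * dv
    return -total

N_THETA = 1.0        # assumed value of N*theta at the initial temperature
V1, V2 = 1.0, 2.0    # assumed initial and final volumes

# Path 1: isothermal expansion, p = N*theta / V (ideal-gas assumption).
isothermal = work_on_gas(lambda v: N_THETA / v, V1, V2)

# Path 2: expansion at the constant initial pressure p1 = N*theta / V1.
constant_pressure = work_on_gas(lambda v: N_THETA / V1, V1, V2)

if __name__ == "__main__":
    print(isothermal, -N_THETA * math.log(V2 / V1))
    print(constant_pressure, -N_THETA * (V2 - V1) / V1)
```

The two values of $W$ differ although the initial and final volumes are the same, in agreement with the statement that $\delta W$ is not a total differential.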


§23. The change in the energy of a system in the general case of a
quasi-static process 


Now consider the change in the energy of a subsystem in the more general 
case when it interacts with bodies surrounding it (the medium), exchanging 
energy with them by direct contact. 

In what follows we shall restrict ourselves to processes in which the state 
of statistical equilibrium in the system is not violated. Such processes for 
which the system can be considered to be always in a state of statistical 
equilibrium or, more precisely, during the course of which the system passes 
through a sequence of equilibrium states, will be called quasi-static or rever- 
sible processes. The question as to how much the state of the system can in 
fact change without violating the state of equilibrium, i.e. whether
quasi-static transitions can be realized in the system, will be discussed below.



Since the system is in a state of equilibrium during all the time for which 
the process occurs, the probability distribution is determined by the Gibbs 
equilibrium distribution. 

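The total change in the mean energy can be written

$$\delta E = \delta\left(\sum_i \varepsilon_i w_i\right) = \left(\sum_i w_i\, \delta\varepsilon_i\right) + \left(\sum_i \varepsilon_i\, \delta w_i\right), \qquad (23.1)$$

where $w_i$ is the Gibbs distribution for a temperature equal to that of the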
reservoir. The latter, however, need not remain constant in the course of the 
process. 

The first term of formula (23.1) expresses, as before, the work done on 
the system. 

The second term represents that part of the change in the energy of the 
system through interaction with the medium which is not connected with 
the change of the external parameters. In other words, the second term of 
(23.1) is equal to the change in the mean energy of the system due to the 
direct energy transfer from the particles of the medium to the particles of the 
system, and which is not accompanied by a change of external fields or the 
mutual positions of the bodies. This part of the change in the energy will be 
called the amount of heat absorbed by the system, and will be denoted by
$\delta Q$. We then have

$$\delta E = \delta W + \delta Q. \qquad (23.2)$$


Formula (23.2) represents the energy conservation law for thermal 
processes (the first law of thermodynamics). It is in this form that the energy 
conservation law was first established after Joule’s experiments. 

Statistical physics allows one to reveal the molecular meaning of the quan- 
tities entering into (23.2) and also, at least for the simplest systems, makes 
it possible to calculate them theoretically. 

In order to elucidate the molecular meaning of the amount of heat we 
shall consider an arbitrary unclosed system in which a quasi-static process 
takes place. For this quasi-static process we can write, making use of the 
definition (23.2), 





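$$\delta Q = \left(\sum_i \varepsilon_i\, \delta w_i\right) = \delta E - \sum_i w_i\, \delta\varepsilon_i = \delta E - Z^{-1} \sum_i \exp(-\varepsilon_i/\theta)\, \Omega(\varepsilon_i)\, \delta\varepsilon_i.$$
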


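The second term can be transformed in the following way. We have the
obvious identity *

$$\delta\left(\sum_i e^{-\varepsilon_i/\theta}\, \Omega(\varepsilon_i)\right) = -\frac{1}{\theta} \sum_i e^{-\varepsilon_i/\theta}\, \Omega(\varepsilon_i)\, \delta\varepsilon_i + \frac{\delta\theta}{\theta^2} \sum_i \varepsilon_i\, e^{-\varepsilon_i/\theta}\, \Omega(\varepsilon_i),$$

from which it follows that

$$\sum_i e^{-\varepsilon_i/\theta}\, \Omega(\varepsilon_i)\, \delta\varepsilon_i = -\theta\, \delta\left(\sum_i e^{-\varepsilon_i/\theta}\, \Omega(\varepsilon_i)\right) + \frac{\delta\theta}{\theta} \sum_i \varepsilon_i\, e^{-\varepsilon_i/\theta}\, \Omega(\varepsilon_i), \qquad (23.3)$$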


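whence, dividing (23.3) by $Z$, we find

$$\frac{\sum_i e^{-\varepsilon_i/\theta}\, \Omega(\varepsilon_i)\, \delta\varepsilon_i}{Z} = -\frac{\theta\, \delta Z}{Z} + \frac{\delta\theta}{\theta}\, \frac{\sum_i \varepsilon_i\, e^{-\varepsilon_i/\theta}\, \Omega(\varepsilon_i)}{Z}. \qquad (23.4)$$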

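The first term of the right-hand side of (23.4) can be written in the form

$$-\theta\, \frac{\delta Z}{Z} = -\theta\, \delta \ln Z.$$

In the second term the expression
$Z^{-1} \sum_i \varepsilon_i \exp(-\varepsilon_i/\theta)\, \Omega(\varepsilon_i)$ can be replaced
by $\bar{\varepsilon}$ or $E$. We then have

$$\delta Q = \delta E + \theta\, \delta \ln Z - E\, \frac{\delta\theta}{\theta} = \theta\, \delta\left(\frac{E}{\theta} + \ln Z\right). \qquad (23.5)$$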


Thus, we arrive at the following important conclusion. 

If in a macroscopic system a process takes place, in the course of which 
the system remains at all times in equilibrium with the reservoir, then the 
change in its energy can be written in the form 


$$\delta E = \delta W + \delta Q = -\Lambda\, d\lambda + \theta\, \delta\left(\frac{E}{\theta} + \ln Z\right). \qquad (23.6)$$


* In the differentiation the variable quantities are $\varepsilon_i$ and $\theta$. The number of states
corresponding to a given energy obviously remains constant, and is characteristic of a given
system of particles. 




Formula (23.6), which has a basic significance in what follows, represents a 
general expression for the change of energy in a quasi-static process. 
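
Relations (23.1), (23.2) and (23.5) can be checked numerically for a small model system. The sketch below is only an illustration: the level energies, the numbers of states, the small displacements and the names `gibbs` and `heat_two_ways` are hypothetical. The levels and the temperature are changed by small amounts, and the heat computed directly as $\sum_i \varepsilon_i\, \delta w_i$ is compared with $\theta\, \delta(E/\theta + \ln Z)$. Agreement is expected only to first order in the small changes.

```python
import math

def gibbs(levels, weights, theta):
    """Return (Z, probabilities w_i, mean energy E) of the Gibbs distribution."""
    boltz = [g * math.exp(-e / theta) for e, g in zip(levels, weights)]
    z = sum(boltz)
    w = [b / z for b in boltz]
    energy = sum(e * p for e, p in zip(levels, w))
    return z, w, energy

def heat_two_ways(levels, weights, theta, d_levels, d_theta):
    """Compare sum(eps_i * dw_i) with theta * d(E/theta + ln Z), as in (23.5)."""
    z0, w0, e0 = gibbs(levels, weights, theta)
    new_levels = [e + de for e, de in zip(levels, d_levels)]
    z1, w1, e1 = gibbs(new_levels, weights, theta + d_theta)
    # Heat as the change of the probabilities at fixed levels, second term of (23.1).
    heat_direct = sum(e * (p1 - p0) for e, p0, p1 in zip(levels, w0, w1))
    # Heat as theta times the change of E/theta + ln Z, right-hand side of (23.5).
    sigma0 = e0 / theta + math.log(z0)
    sigma1 = e1 / (theta + d_theta) + math.log(z1)
    heat_from_z = theta * (sigma1 - sigma0)
    # Work as the change of the levels at fixed probabilities, first term of (23.1).
    work = sum(p * de for p, de in zip(w0, d_levels))
    return heat_direct, heat_from_z, work, e1 - e0

if __name__ == "__main__":
    result = heat_two_ways([0.0, 1.0, 2.0], [1, 2, 1], 1.0,
                           [1e-4, -2e-4, 3e-4], 1e-4)
    print(result)
```

The first two returned numbers coincide up to terms of second order in the displacements, and the sum of the work and the heat reproduces the total change of the energy to the same accuracy.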

As is seen from formula (23.6), the change in the energy is resolved into 
two parts: the work $\delta W$ done on the system (or by the system), and the
amount of heat $\delta Q$ absorbed (or given up) by the system.

The work done is connected with a change in the values of the allowed 
energy levels, which, as we have seen in the preceding paragraph [formula 
(22.1)], is due to a change in the external parameters. If, in particular, the 
system consists of individual independent particles and one can speak of the 
energies of the individual particles, then the work done is connected with
a change in the energy of the individual particles.

If the external parameters do not change (the work of external forces is
equal to zero), then the energy levels of the system remain unchanged. In this
case the energy supplied to the system from without contributes to a change 
in the probability distribution. States with a higher energy become more 
probable — the system is heated. If, for example, the system represents an 
ideal gas, then as energy is supplied the number of molecules having relatively 
large energies increases, whereas that of molecules having small energies 
decreases. In the case where the system gives up energy, a reverse
redistribution of probabilities takes place: states with a lower energy become more
probable — the system is cooled. 

We shall now discuss the question as to when a process can be considered 
quasi-static. 

If the external conditions in which the system is found are changed, for 
example if its volume or the fields acting on it are changed, or if it obtains 
a certain amount of energy from outside by means of direct contact, then 
the state of equilibrium in the system will be violated. If then the system is 
isolated, then in the course of time the system will necessarily come into a 
state of statistical equilibrium. Indeed, we have said that the state of a 
complex system does not depend on its initial state. Hence, if the time of 
observation is sufficiently large, then the system spends most of the time in 
a state of statistical equilibrium, independently of the state in which it was 
initially. After the lapse of a relaxation time $\tau$ the system, which was initially
in a non-equilibrium, highly improbable state, passes over into a more 
probable equilibrium state. The question as to how this transition will 
proceed and how much time it will take cannot, in general, be answered. 
Processes going on in the system in this case depend on the nature of the 
system and on the character of its equilibrium state. 

We shall now assume that the change of the external conditions proceeds 
sufficiently slowly. That is to say, we shall assume that an appreciable 




change of external conditions takes place for time intervals which are very 
large in comparison with the relaxation time. Then at every instant the 
system will manage to come into a state of equilibrium corresponding to the 
given external conditions. 

We shall illustrate this by a simple example. Consider a process of com- 
pression and expansion of a gas under a piston. As the piston moves it does 
work on the parts of the gas contiguous to it. The corresponding molecules 
obtain an excessive energy in comparison with the remaining bulk of gas 
molecules, and the gas becomes non-uniform. Owing to collisions between 
molecules the non-uniformity will tend to vanish, and the energy supplied 
will be distributed uniformly between all the molecules of the gas. In order 
that this process may take place, and that the gas which has been disturbed 
from an equilibrium state may come back to it, a certain time, which is the 
characteristic relaxation time for a given process, is needed. If the piston is 
displaced so slowly (for example, by very weak and rare pushes) that the 
time needed for the displacement of the piston over an appreciable distance 
is very large in comparison with the relaxation time, then all perturbations 
of the uniformity of the gas will be resolved. The gas will at all times be 
uniform, i.e. will be in a state of equilibrium. 

Analogously, in the case of heating, in the region adjacent to the source 
of heat (for example, adjacent to one of the walls of the container), a change 
in the velocity distribution of molecules will take place and the percentage 
of molecules with large velocities will increase. The non-uniformities of the 
gas will be smoothed out by collisions in a certain relaxation time. If the 
heating of the gas proceeds so slowly that an appreciable change in the tem- 
perature takes place over a time considerably larger than the relaxation time, 
then the non-uniformities will be resolved and the gas will at all times be in 
a state of equilibrium. 

Thus, the condition of the quasi-static character of a process is the condi- 
tion of its slowness. To every relaxation time there corresponds a proper rate 
of change of external conditions for which the process can be assumed to be 
quasi-static. 

It goes without saying that the quasi-static process represents a certain 
idealization of real processes which always proceed with a finite velocity. 

Every quasi-static process is a reversible process. This means that, if in 
the course of the process the system passed through a given sequence of 
equilibrium states (direct process), then it can also be brought into its initial 
state by passing through the same sequence of states in reverse (reverse 
process). For this the external conditions in which the system is found need 
only be changed in the reverse order. 







It is impossible to do this for non-quasi-static processes. In a non-quasi- 
static process the equilibrium state of the system is violated. The state of a 
non-equilibrium system is not determined by giving the external parameters 
and temperature of the system, but needs the specification of a number of 
other quantities, for example, the temperature distribution or the density 
distribution inside the system. The change of the external conditions in the 
reverse sequence will not mean that the system passes through the same 
states in the reverse order. Hence non-quasi-static processes are irreversible. 

It further goes without saying that a completely reversible process is an 
idealization. Real processes always proceed with a finite velocity and are 
accompanied by a violation of the equilibrium of the system. However, it is 
often possible, to a good enough approximation, to disregard small perturba- 
tions of the equilibrium state of the system and to consider the process, which 
is actually proceeding with a finite velocity, as a reversible process. 


§24. Entropy and the basic thermodynamic equality 


Formula (23.5) shows that for a quasi-static process the amount of heat 
absorbed or released by the system can be written in the form 


$$\delta Q = \theta\, \delta\sigma, \qquad (24.1)$$


where $\delta\sigma$ is the change in a certain function

$$\delta\sigma = \delta\left(\frac{E}{\theta} + \ln Z\right). \qquad (24.2)$$


It is obvious that $\delta\sigma$ represents the total differential of the expression in
the parentheses

$$\sigma = \frac{E}{\theta} + \ln Z + \mathrm{const}, \qquad (24.3)$$

where the constant is arbitrary. The function $\sigma$ is called the entropy of the
system. The physical meaning of this very important quantity will be ex- 
plained somewhat later. 

By means of the entropy the change in the energy of a system for a quasi- 
static process can be written in the form 



$$\delta E = \theta\, \delta\sigma - \Lambda\, d\lambda. \qquad (24.4)$$

The external parameter is most frequently the volume $V$ of the system. Then

$$\delta E = \theta\, \delta\sigma - p\, \delta V. \qquad (24.5)$$


Formula (24.4), or (24.5), expressing the change in the energy of a system 
in the most general case of a quasi-static process, is called the basic thermo- 
dynamic equality. 

We have obtained the basic thermodynamic equality in a purely statistical 
way. However, this equality, as well as the entropy defined by formula (24.1), 
were introduced historically in phenomenological thermodynamics (see below). 

The basic thermodynamic equality shows that the total change in the 
energy of a system in a quasi-static process is determined by the change in 
the external parameter $\delta\lambda$ and the change in the entropy $\delta\sigma$.

Indeed, from formulae (24.4) and (24.5) we find 


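$$\theta = \left(\frac{\partial E}{\partial\sigma}\right)_{\lambda}, \qquad \Lambda = -\left(\frac{\partial E}{\partial\lambda}\right)_{\sigma}, \qquad p = -\left(\frac{\partial E}{\partial V}\right)_{\sigma}, \qquad (24.6)$$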


so that we can write 


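$$\delta E = \theta\, \delta\sigma - \Lambda\, d\lambda = \left(\frac{\partial E}{\partial\sigma}\right)_{\lambda} \delta\sigma + \left(\frac{\partial E}{\partial\lambda}\right)_{\sigma} \delta\lambda. \qquad (24.7)$$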


Thus, the thermodynamic internal energy of a system can be considered 
as a function of the independent variables $\sigma$ and $\lambda$ (or $V$). As is seen from the
equality $\theta = (\partial E/\partial\sigma)_{\lambda}$ and the condition $\theta > 0$, the energy is a monotonic
function of the entropy.

The formula for the change in the energy is similar in its structure to the
formula connecting the change in the potential energy with the generalized
coordinate in mechanics. For this reason the internal energy is called the
thermodynamic potential with respect to the generalized coordinates $\sigma$ and $\lambda$.
The quantities $\theta$ and $\Lambda$ play the role of generalized forces.

The quantity $\delta E$ is a total differential, in contrast to the heat $\delta Q$ and
work $\delta W$, which in the general case do not represent the total differentials of
any expressions. 

Taking the integral over the closed cycle of changes in the state of the 
system, we find 






$$\oint \delta E = 0.$$

This is quite natural, since the internal energy $E$ is a single-valued state
function of the system. When the system comes back into its initial state its
energy will also assume its initial value. 

The work W and the amount of heat Q depend not only on the state but 
also on the character of the processes taking place in the system. Hence it 
makes no sense to speak of the amount of heat in the system in a given 
state. Only the change $\delta Q$ in the heat has a meaning.

Formula (24.1) shows that the ratio $\delta Q/\theta$ is a total differential and,
consequently, the entropy $\sigma$ represents a single-valued state function of the
system. For it the condition

$$\oint \delta\sigma = \oint \frac{\delta Q}{\theta} = 0$$

holds.

The temperature $\theta$ can be considered from the mathematical point of view
as the integrating divisor of the expression $\delta Q$.

In order to explain the physical meaning of the entropy, which will be 
done in the next paragraph, we have also to consider its other properties. 

According to formula (24.3), to calculate the entropy it is necessary to 
know only the partition function Z. In formula (24.3) an integration con- 
stant appears. Thus, the entropy is determined only to within an arbitrary 
constant. It is important, however, as will be shown in § 36, that this con- 
stant is indeed a constant, depending neither on the temperature of the 
system nor on any other parameter characterizing the state of the system (the 
volume, physical and chemical state of the system and so on). This constant 
can be chosen as zero, and formula (24.3) can be written in the form 


$$\sigma = \frac{\bar{E}}{\theta} + \ln Z. \qquad (24.8)$$


We transform the above expression for the entropy, making use of the fact
that in a state of statistical equilibrium the system possesses energies $\varepsilon_i$
close to the mean energy $\bar{\varepsilon}$ for an overwhelmingly large part of the time.

In the partition function of the system, therefore, only the most probable
state with the energy $\bar{\varepsilon}$ gives a significant contribution.
We can write


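$$Z = \sum_i e^{-\varepsilon_i/\theta}\,\Omega(\varepsilon_i) \approx e^{-\bar{\varepsilon}/\theta}\,\Omega(\bar{\varepsilon}). \qquad (24.9)$$


Here we have retained only the largest term in the sum over the states.
Substituting (24.9) into (24.8), and noting that the mean energy $\bar{\varepsilon}$ is
the internal energy $\bar{E}$, we find


$$\sigma = \ln \Omega(\bar{\varepsilon}). \qquad (24.10)$$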


The entropy of a macroscopic quasi-closed system turns out to be equal 
to the logarithm of the number of states corresponding to the mean energy of 
the system, i.e. to the logarithm of the number of states of the system in a 
state of statistical equilibrium. 
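
The closeness of the two expressions for the entropy can be checked numerically. The following is only a sketch under an assumed model that does not appear in the text: a system of N independent two-level units with level spacing equal to one energy unit, for which the number of states with n excited units is the binomial coefficient C(N, n). The function names are illustrative.

```python
import math

def entropy_from_partition(N, theta):
    """sigma = E/theta + ln Z for N independent two-level units (spacing 1)."""
    x = math.exp(-1.0 / theta)
    ln_Z = N * math.log1p(x)          # Z = sum_n C(N, n) e^{-n/theta} = (1 + x)^N
    mean_energy = N * x / (1.0 + x)   # mean number of excited units
    return mean_energy / theta + ln_Z, mean_energy

def entropy_from_state_count(N, mean_energy):
    """sigma = ln Omega(mean energy), with Omega(n) = C(N, n)."""
    n = round(mean_energy)
    return math.lgamma(N + 1) - math.lgamma(n + 1) - math.lgamma(N - n + 1)

N, theta = 10_000, 1.0
sigma_exact, e_bar = entropy_from_partition(N, theta)
sigma_peak = entropy_from_state_count(N, e_bar)
relative_gap = abs(sigma_exact - sigma_peak) / sigma_exact
print(sigma_exact, sigma_peak, relative_gap)
```

For a macroscopic number of units the two values differ only by a term of the order of ln N, which is negligible in comparison with the entropy itself.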

Thus, the entropy $\sigma$ is identical with the function $\sigma$ which we have
introduced in §16 (formula (16.7)). A very important property of the entropy
is its additivity. The entropy of a complex system in an equilibrium state,
consisting of subsystems numbered by the index $n$, is equal to


$$\sigma = \ln \Omega = \sum_n \ln \Omega_n = \sum_n \sigma_n. \qquad (24.11)$$


The additivity of the entropy follows directly from its definition (16.7)
or (24.3).

The degree of accuracy of the statement concerning the additivity of the 
entropy is the same as that of the statement concerning the additivity of the 
energy. 


§25. The law of increase of entropy 


In the preceding paragraphs we have considered a quasi-closed system in a
state of statistical equilibrium or performing a quasi-static (reversible) process.
We have established that a molecular (statistical) interpretation can be given
to a number of macroscopic notions: the internal energy, work, amount of
heat. The macroscopic quantity entropy also has a molecular interpretation.

Formulae (24.8) and (24.10) allow one to calculate the value of the 
entropy, but do not shed any light on the meaning of this quantity. In order 
to elucidate the molecular meaning of entropy one has to consider a system 
with a simpler statistical behaviour than a quasi-closed system, i.e. a closed 
system. The simplicity of a closed system will allow us not only to study the
properties of equilibrium systems, but also to include the treatment of
non-equilibrium systems.

Imagine a closed macroscopic system as a set of a large number of parts. 
Each of the parts has dimensions small in comparison with those of the 
system as a whole, but still contains a large number of particles and is a 
quasi-closed system. Since our division is quite arbitrary, it can always be 
made. Assume that all the parts of our complex system have come into a 
state of statistical equilibrium. Then for each of these one can write the ex- 
pression (24.10) for the entropy: 


$$\sigma_n = \ln \Omega_n(\bar{\varepsilon}_n), \qquad (25.1)$$


where the index $n$ denotes the number of the part.

However, we shall not assume that there is statistical equilibrium between 
the parts of the system. For example, different parts of the system may have 
different temperatures although the proper temperature of each of the parts 
is constant. The entire closed system as a whole will be in a non-equilibrium 
state. 

We shall determine the entropy of the closed non-equilibrium system. 

According to the very meaning of the concept the entropy of a complex 
system should be assumed to be made up additively of the entropies of all 
parts constituting it, i.e. 


$$\sigma = \sum_n \sigma_n. \qquad (25.2)$$


As we have seen above, this formula is trivial for the case of a system in which 
there is equilibrium between the parts. It represents a natural generalization 
of the notion of entropy to the case of a non-equilibrium system. 

For each of the parts of the system the equilibrium value of the entropy 
can be written from formula (24.10). We then have 


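$$\sigma = \sum_n \sigma_n = \sum_n \ln \Omega_n(\bar{\varepsilon}_n) = \ln \prod_n \Omega_n(\bar{\varepsilon}_n) = \ln \Omega, \qquad (25.3)$$


where $\Omega = \prod_{n=1}^{N} \Omega_n(\bar{\varepsilon}_n)$ is the total number of states of a system consisting
of $N$ independent parts.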

We see that the entropy of a closed system turns out to be equal to the
logarithm of the number of states of the system. This may not be the same as
the logarithm of the number of states $\Omega(\bar{\varepsilon})$ of the entire system when it is in
a state of statistical equilibrium (the latter expression always holds for each
of its parts, or for the entire system in a state of statistical equilibrium).

In the closed system considered here the microcanonical distribution
(16.1) holds, connecting the probability $w$ of a state of the closed system with
the number of its states $\Omega$. Expressing $\Omega$ in terms of $w$, we find


$$\sigma = \ln w + \text{const}. \qquad (25.4)$$


Formula (25.4), representing the basis of the statistical treatment of thermo- 
dynamics, is called the Boltzmann formula. 

The Boltzmann formula connects the value of the entropy for a given state 
of a closed system with the probability of that state. The change in the 
entropy as the closed system passes from one state into another is equal to 


$$\sigma_2 - \sigma_1 = \ln \frac{w_2}{w_1}, \qquad (25.5)$$


where $w_1$ and $w_2$, $\sigma_1$ and $\sigma_2$ are the probabilities and entropies of the first
and second state, respectively.
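
As an illustration of formula (25.5), the following sketch evaluates the change in entropy for an assumed toy model that is not taken from the text: N labelled particles distributed between two equal halves of a vessel, the probability of finding n particles in the left half being proportional to the binomial coefficient C(N, n). The function name is illustrative.

```python
import math

def entropy_change(N, n_initial, n_final):
    """sigma_2 - sigma_1 = ln(w_2 / w_1), with w proportional to C(N, n)."""
    w1 = math.comb(N, n_initial)
    w2 = math.comb(N, n_final)
    return math.log(w2) - math.log(w1)

# From "all particles in the left half" to "particles equally divided".
delta_sigma = entropy_change(100, 100, 50)
print(delta_sigma)
```

The passage from the less probable state to the more probable one gives a positive change in the entropy; the reverse passage gives the same change with the opposite sign.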

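The entropy of a quasi-closed system can be expressed in terms of the
probability density $\rho(p,q)$ entering into the classical Gibbs distribution (19.4)
in the following way.

The function $\rho$ satisfies the normalization condition


$$\int \rho \, d\Gamma = 1.$$


Taking into account that $\rho(\varepsilon)$ has a sharp maximum at $\varepsilon = \bar{\varepsilon}$, we have
approximately:


$$\int \rho \, d\Gamma \approx \rho(\bar{\varepsilon})\,\Delta\Gamma = Z^{-1} e^{-\bar{\varepsilon}/\theta}\,\Delta\Gamma = 1.$$


Hence the entropy of the system, on the basis of (24.10), is equal to


$$\sigma = \ln \Omega(\bar{\varepsilon}) = \ln \Delta\Gamma = -\ln \rho(\bar{\varepsilon}) = \ln Z + \frac{\bar{\varepsilon}}{\theta}.$$


On the other hand, one can write


$$\overline{\ln \frac{1}{\rho}} = \int \rho(\varepsilon) \ln \frac{1}{\rho} \, d\Gamma = \int \rho(\varepsilon) \left( \ln Z + \frac{\varepsilon}{\theta} \right) d\Gamma = \ln Z + \frac{\bar{\varepsilon}}{\theta}.$$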





Hence the entropy $\sigma$ of a quasi-closed system can be written in the form


$$\sigma = \overline{\ln \frac{1}{\rho}}. \qquad (25.6)$$


We shall need this formula in what follows. 

We now return to the Boltzmann formula (25.4) and see how, by means of
it, the laws of the change of state of a closed system in time can be
established.

Assume that the closed system was initially in a certain non-equilibrium
state. Then $w_1$ stands for the probability of the initial non-equilibrium state.
After the lapse of the relaxation time the system will pass over from the
non-equilibrium state into a state of statistical equilibrium. This transition takes
place due to a weak, but ever present, interaction between its parts. Without
going into the question of how and in what time the equilibrium is
established (this is a problem of physical kinetics), we can state that this transition
occurs inevitably in any macroscopic system after the lapse of the relaxation
time. By definition, the probability $w_2$ of the state of statistical equilibrium
(in which the macroscopic system spends almost all its time) has the maximum
value, so that $w_2 > w_1$. From formula (25.5) it follows that the entropy
of a closed system increases as the latter passes from a non-equilibrium
state to an equilibrium state. Thus, the increase in the entropy of a closed
system turns out to be connected with its transition from a less probable
state into a more probable one. The entropy of a system in a state of total
statistical equilibrium has the highest value. The result obtained can be
formulated in the following way.

If a closed macroscopic system is initially in a non-equilibrium state, the 
probability of this state and hence its entropy do not have the highest pos- 
sible values. 

The behaviour of the system which is most probable is that in which it will 
pass, after the lapse of the relaxation time, into its most probable state, 
whose entropy is a maximum. 

It can be shown that, on the average, this transition will be performed 
monotonically, i.e. that the system comes into a state of statistical equilib- 
rium by passing successively through a number of more and more probable 
states until it reaches the state of complete equilibrium. The entropy of the 
system increases progressively, attaining its maximum value in the most 
probable equilibrium state. Thus, the change of entropy in time proceeds as
is shown in fig. III.12 by the solid (not the dotted!) curve.
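
The approach of the entropy to its maximum value can be imitated by a simple numerical experiment. The sketch below uses the Ehrenfest urn model, which is an assumption made here for illustration and is not the model considered in the text: N particles are distributed between two halves of a vessel, and at each step one particle chosen at random passes into the other half. The entropy of a state with n particles on the left is taken as ln C(N, n).

```python
import math
import random

def log_states(N, n):
    """ln of the number of ways to place n of N labelled particles on the left."""
    return math.lgamma(N + 1) - math.lgamma(n + 1) - math.lgamma(N - n + 1)

def relax(N=100, steps=2000, seed=0):
    """Ehrenfest urn model: at each step one randomly chosen particle changes halves."""
    rng = random.Random(seed)
    n_left = N                      # non-equilibrium start: all particles on the left
    history = [log_states(N, n_left)]
    for _ in range(steps):
        if rng.random() < n_left / N:
            n_left -= 1             # the chosen particle was on the left
        else:
            n_left += 1             # the chosen particle was on the right
        history.append(log_states(N, n_left))
    return history

sigma = relax()
sigma_max = log_states(100, 50)
print(sigma[0], sigma[-1], sigma_max)
```

The entropy rises from zero and then fluctuates slightly below its maximum value; individual steps may lower it, but the trend on the average is an increase.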

Fig. III.12


Now imagine a case where a closed system is initially already in the state
of total statistical equilibrium, in which its entropy has the maximum value.
Then in the course of a very long time, exceeding the relaxation time, the
system will remain in its equilibrium state, and the entropy will preserve its
maximum value. In general, it can be said that the most probable trend of
processes in a closed macroscopic system is that in which the entropy
increases or remains constant:


$$\Delta\sigma \geq 0, \qquad (25.7)$$


where the inequality sign refers to processes bringing the system nearer to a
state of statistical equilibrium, and the equality sign refers to processes taking
place in a system which is already in a state of equilibrium.

We know, however, that probabilistic predictions, as applied to macroscopic
systems, are for all practical purposes completely reliable. Hence,
leaving a more complete and deeper treatment of the law of
increase of entropy until §36, we shall take into account only the most
probable trend of the entropy and assume that formula (25.7) is not
merely probable but completely reliable.

Then the change or the constancy of the entropy can be considered as a 
criterion of irreversibility or reversibility of the processes taking place in a 
closed system. In irreversible processes, in the course of which the system 
approaches an equilibrium state, the entropy increases, whereas in reversible 
processes it remains constant. 

As an important example of an irreversible process taking place in a closed 
system, we shall consider a process occurring when parts of the system having 
different temperatures are brought into contact. 

If two parts of the system, having temperatures $\theta_1$ and $\theta_2$ (for
concreteness we assume that $\theta_2 > \theta_1$), are in contact, then the change in the
entropy of the closed system is equal to


$$\delta\sigma = \delta\sigma_1 + \delta\sigma_2 = \frac{\partial\sigma_1}{\partial E_1}\,\delta E_1 + \frac{\partial\sigma_2}{\partial E_2}\,\delta E_2 = \frac{\delta E_1}{\theta_1} + \frac{\delta E_2}{\theta_2}.$$


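Since the system is closed, its total energy is conserved, so that


$$\delta E = \delta E_1 + \delta E_2 = 0.$$


Consequently,


$$\delta\sigma = \delta E_1 \left( \frac{1}{\theta_1} - \frac{1}{\theta_2} \right) \geq 0. \qquad (25.8)$$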


Formula (25.8) shows that, if $\theta_2 > \theta_1$, then it follows from the law of
increase of entropy that $\delta E_1 > 0$. This means that the first part, having a
lower temperature, absorbs energy from the second part. In other words,
heat always passes from a hotter body to a colder one.
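
The sign argument based on formula (25.8) can be verified with a few lines of code. This is a minimal sketch; the function name and the numerical values are chosen here for illustration only, with the temperatures expressed in energy units.

```python
def entropy_change_on_contact(dE1, theta1, theta2):
    """delta sigma = dE1 * (1/theta1 - 1/theta2), formula (25.8)."""
    return dE1 * (1.0 / theta1 - 1.0 / theta2)

theta1, theta2 = 1.0, 2.0   # part 1 is the colder one

# Energy passing into the colder part: the entropy increases.
forward = entropy_change_on_contact(+0.1, theta1, theta2)

# Energy passing out of the colder part: the entropy would decrease,
# so this direction is excluded by the law of increase of entropy.
backward = entropy_change_on_contact(-0.1, theta1, theta2)

print(forward, backward)
```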


§26. The basic thermodynamic inequality 


It is natural to generalize the law of increase of entropy to the case of 
open systems. 

Such a generalization can easily be made in the case of systems which are
not closed but are thermally isolated. We shall understand thermally isolated 
systems to be those for which the interaction with surrounding bodies re- 
duces to an action due to external fields, i.e. to a change of the external 
parameters. 

A change in the external fields, as was explained in §22, can lead to a 
change in the energy levels of the system (or to a change in the energy of 
individual particles in the case of gases), but it does not lead to any change 
in the probability distribution. Hence the transitions from less probable 
states to more probable ones in a thermally isolated system obey the same 
laws as in a closed system. The results of the preceding section can be applied 
directly to the case of systems which are thermally isolated but not closed, 
by writing the law of increase of entropy for these as 


$$\delta\sigma \geq 0. \qquad (26.1)$$


In the general case of open systems exchanging energy in an arbitrary way 
with bodies surrounding them the following inequality can be written: 


$$\delta\sigma \geq \frac{\delta Q}{\theta}. \qquad (26.2)$$


For quasi-static processes this goes over into the equality (24.1), while in the 
transition to a thermally isolated system it goes over into the inequality 
(26.1). 

The inequality (26.2) means physically that in irreversible processes the
entropy of the system increases by an amount larger than $\delta Q/\theta$, which is
the amount by which the entropy increases because the system absorbs heat.
This excess of the entropy increase in comparison with $\delta Q/\theta$ is associated
with the transition to a more probable state, i.e. with the approach to
equilibrium.

Combining (26.2) with the basic thermodynamic equality, one can write
the following basic thermodynamic inequality for the general case of arbitrary
processes in open systems:


$$\delta E \leq \theta\,\delta\sigma + \delta W, \qquad (26.3)$$


where the equality sign refers to reversible processes, while the inequality 
sign refers to irreversible processes. 

The basic thermodynamic inequality unifies the concepts of the energy 
conservation law and the entropy increase law, and can be called the unified 
form of the expression of the first and second laws of thermodynamics. 

The relations obtained allow us to point the way to determine the scale of 
the statistical and absolute temperature. 

The statistical temperature is measured in ergs, whereas in practice, for 
the measurement of the temperature, use is made of another system of units: 
degrees. 

Of great importance is the absolute scale of temperature, in which the 
temperature is measured from an absolute zero and which is, essentially, 
identical with the statistical temperature. 

To establish the relationship between the statistical and absolute temper- 
atures it is only necessary to find the numerical expression of the units of 
energy in terms of degrees. Thus, one can write 


$$\theta = kT, \qquad (26.4)$$


where the constant $k$ represents the conversion factor relating ergs to degrees.
It is a universal constant, the numerical value of which can be
obtained only from experiment.

The quantity $k$ is called Boltzmann's constant. Measurements (for example,
of the thermal capacity of gases) show that $k = 1.38 \times 10^{-16}$ erg deg$^{-1}$.
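
The conversion between the two temperature scales amounts to a single multiplication. The following sketch uses the value of $k$ quoted above; the function names are illustrative and not part of the text.

```python
K_BOLTZMANN = 1.38e-16  # erg per degree, the value quoted in the text

def statistical_temperature(T_degrees):
    """theta = k T: temperature in ergs from the absolute temperature in degrees."""
    return K_BOLTZMANN * T_degrees

def entropy_in_erg_per_degree(sigma):
    """S = k sigma: entropy in erg/deg from the dimensionless entropy sigma."""
    return K_BOLTZMANN * sigma

theta_room = statistical_temperature(300.0)
print(theta_room)
```

At room temperature the statistical temperature is thus of the order of $10^{-14}$ erg, which shows why the degree is the more convenient unit in practice.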

Making use of the absolute scale of temperatures and introducing the
entropy expressed in erg deg$^{-1}$, $S = k\sigma$, we can rewrite formulae (25.4), (24.3)
and (24.5) in the form


$$S = k \ln w + \text{const} = \frac{\bar{E}}{T} + k \ln Z + \text{const}, \qquad (26.5)$$


$$\delta E = T\,\delta S - p\,\delta V. \qquad (26.6)$$


§27. The maximum work to be obtained from thermal processes. The impossibility
of constructing a perpetual motion machine of the second kind.
Phenomenological definition of the entropy


We can now consider problems whose study was, historically, a stimulus 
to the development of phenomenological thermodynamics. The objective is 
to calculate the amount of useful work which can be obtained in changing 
the internal energy of a system. In thermodynamics, devices for obtaining
work are called heat engines.

All heat engines can be divided into two types. Engines of the first type
produce useful work by a sequence of closed cycles. These include steam-
engines, steam turbines and gas turbines, compressors, internal-combustion
engines etc. At the end of each cycle the engine comes back to the initial
state.

Hence the engine itself serves as a kind of transmission mechanism
conducive to the transformation of the internal energy of the working substance
into work.

Engines of the second type perform non-cyclic processes and produce 
useful work. Such an engine — a system which is initially in a non-equilibrium 
state — comes into an equilibrium state. The transition to the equilibrium 
state yields useful work. 

All engines of a once-only action belong to this type. In such engines the 
useful work is most often obtained from chemical reactions going on in the 
system. As an example we can cite galvanic cells, rockets etc. 


We shall first consider heat engines performing closed cycles (those of the 
first type). 

Within the framework of this book we cannot, of course, study in detail 
the theory of the action of particular heat engines. This is a problem of 
technical thermodynamics. We shall confine ourselves only to the explana- 
tion of the principal aspect of the problem. 

That is, we shall consider, first of all, the question as to whether the 
internal energy — the energy of thermal motion of the particles constituting 
the body — can be transformed directly into useful work. 

We shall show that the existence of such an engine, which we have called
above a perpetual-motion machine of the second kind, contradicts the law
of increase of entropy and is hence impossible. For this we shall consider an
arbitrary thermally isolated system having an initial energy $E_0$, entropy $S_0$
and external parameters $\lambda_0$. We assume that the system, remaining thermally
isolated, passes over in a non-quasi-static way on account of a change in the
external parameters into a new state with an energy $E'$, entropy $S'$ and
parameters $\lambda'$. Thereafter the system returns in a quasi-static way to the state
with external parameters $\lambda_0$. However, in the final state it will have entropy
$S'$ and energy $E'$ differing from the initial values of the entropy and energy.

According to the law of increase of entropy, $S' > S_0$ for a thermally
isolated system. But from the condition $(\partial E/\partial S)_\lambda = T > 0$, expressing the
monotonic behaviour of the energy as a function of the entropy, it follows
that an increase in the entropy corresponds to an increase in the internal
energy of the body, i.e. $E' > E_0$.

In the course of the process considered the energy of the thermally 
isolated body must increase. An increase in the energy can take place only 
at the expense of work done on the system by external bodies. 

Thus, from the law of increase of entropy it follows that the system
considered not only cannot serve as a source of useful work but that, on the
contrary, work must be done on the system when it goes over irreversibly
into a state with a new energy.

Hence we can state that the law of increase of entropy is equivalent to 
the proposition of the impossibility of creating a perpetual-motion machine 
of the second kind. 

Of course, the reverse statement is also valid: from the impossibility of 
creating a perpetual-motion machine of the second kind there follows un- 
ambiguously the existence of a monotonically increasing state function, the 
entropy, for a closed system (see below). 

Proceeding from this principle, one could introduce into thermodynamics
the entropy and the law of its increase as a quantitative expression of the
second law of thermodynamics. It is in just this way that thermodynamics
developed, and this historical order of developing thermodynamics is also
preserved in contemporary text-books. Thus, the historical development of
thermodynamics was in the reverse of the order in which we develop the
material in this book.

Returning to the consideration of the problem of obtaining work, we note
that to obtain useful work it is necessary to have at least two bodies at
different temperatures $T_1$ and $T_2$, i.e. a system of bodies which are not in
equilibrium.

Before calculating the work obtained, we shall show that the maximum
work is obtained in a reversible (quasi-static) process.

Let the change in the energy in the general case be equal to


$$\delta E = \delta Q + \delta W.$$


For a reversible process the same change in the energy can be written in the
form


$$\delta E = T\,\delta S + \delta W'.$$


When the system does work on external bodies, $\delta W = -|\delta W|$ and
$\delta W' = -|\delta W'|$. Subtracting, we find


$$|\delta W'| - |\delta W| = T\,\delta S - \delta Q.$$


But $T\,\delta S \geq \delta Q$, so that


$$|\delta W'| - |\delta W| \geq 0,$$


or


$$|\delta W'| \geq |\delta W|. \qquad (27.1)$$


The maximum work is obtained in a reversible (quasi-static) transition. To 
calculate this work we first note that the establishment of thermal contact 
between bodies at different temperatures leads to an irreversible heat transfer 
and to no useful work. 

Hence the working engine must contain three elements:

(1) a system with a temperature $T_2$ (a hot reservoir),
(2) an auxiliary system by means of which the energy is transferred from a
warmer body to a colder one without direct contact between them (a working
substance),
(3) a system with a temperature $T_1 < T_2$ (a cold reservoir).


An energy $\delta E_2 = \delta Q_2$ is transferred from the hot reservoir to the working
substance in a reversible way. For this it is necessary that the temperature of
the hot reservoir should be equal to that of the working substance during the
course of the entire process of heat transfer (isothermal process). Then
$\delta Q_2 = T_2\,\delta S_2$.

A part of the energy obtained from the hot reservoir must be transferred
to the cold reservoir (otherwise we would obtain a perpetual-motion machine
of the second kind), while another part of the energy is transformed into
useful work. The energy balance reads:


$$\delta Q_2 + \delta Q_1 = -\delta W.$$


In order that irreversible processes may be avoided the heat $\delta Q_1$ must be
transferred to the cold reservoir in an isothermal way at the temperature $T_1$
of the cold reservoir. Hence the working substance must pass over from a
temperature $T_2$ to a temperature $T_1$ in a thermally isolated and reversible
way, and thereupon it must transfer to the cold reservoir an amount of heat
$\delta Q_1$ in a quasi-static way. In order that the process may be repeated the
working substance must return in a thermally isolated way (adiabatically)
to the temperature $T_2$. This closed cycle is called the Carnot cycle.

Since all processes in the system (hot reservoir + working substance + cold
reservoir) are reversible, the total change in the entropy is


$$\delta S = \delta S_1 + \delta S_2 = 0.$$


For the work done one can write that


$$-\delta W = \delta Q_2 + \delta Q_1 = T_2\,\delta S_2 + T_1\,\delta S_1 = (T_2 - T_1)\,\delta S_2 = \frac{T_2 - T_1}{T_2}\,\delta Q_2 = \frac{T_2 - T_1}{T_2}\,\delta E_2.$$





The ratio of the work done to the amount of energy absorbed from the
hot reservoir is called the efficiency $\eta$. In our case


$$\eta_{\max} = \frac{-\delta W}{\delta Q_2} = \frac{T_2 - T_1}{T_2}. \qquad (27.2)$$




It is clear from the nature of the above derivation that the efficiency obtained
has the maximum possible value. If irreversible processes take place in the
heat engine then we will always have $\eta < \eta_{\max}$.

Thus, the maximum efficiency is possessed by a reversible engine working
according to the closed Carnot cycle. The value of the efficiency does not
depend on the nature of the working substance and is determined solely by
the ratio of the temperature drop $T_2 - T_1$ to the temperature of the hot
reservoir $T_2$.
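
The energy and entropy balance of the reversible cycle can be followed through numerically. The sketch below merely repeats the steps of the derivation above; the function name and the numerical values are chosen for illustration.

```python
def carnot_cycle(Q2, T2, T1):
    """Reversible cycle between a hot reservoir T2 and a cold reservoir T1 < T2.

    Q2 is the heat absorbed from the hot reservoir. Returns the heat Q1
    received from the cold reservoir (negative: heat is given up to it),
    the work done by the engine, and the efficiency.
    """
    if not (T2 > T1 > 0):
        raise ValueError("need T2 > T1 > 0")
    dS2 = Q2 / T2             # entropy received at the temperature T2
    Q1 = -T1 * dS2            # from dS1 + dS2 = 0 with Q1 = T1 * dS1
    work = Q2 + Q1            # energy balance: Q2 + Q1 = -dW
    efficiency = work / Q2
    return Q1, work, efficiency

Q1, work, eta = carnot_cycle(Q2=1000.0, T2=400.0, T1=300.0)
print(Q1, work, eta)
```

With these values three quarters of the heat absorbed must be given up to the cold reservoir, and only one quarter is transformed into work.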

We shall now reproduce briefly the train of reasoning which led to the 
introduction of the notion of entropy into phenomenological thermody- 
namics. It was, to a certain degree, the reverse of ours. 

From the relation (27.2) it follows that


$$\frac{\delta Q_2}{T_2} + \frac{\delta Q_1}{T_1} = 0. \qquad (27.3)$$


The ratio of the amount of heat $\delta Q$ obtained at a certain temperature $T$ to
the value of the temperature, $T^{-1}\delta Q$, was called by Clausius the reduced
heat. Consequently, the algebraic sum of reduced heats for the Carnot cycle 
is equal to zero. We have obtained this result by means of formulae relating 
a change in the amount of heat to the absolute temperature and entropy. 
However, for an ideal gas the amount of heat absorbed or released in an 
isothermal process and the change in the temperature in an adiabatic process 
can be found directly. This allows one to find the efficiency of a reversible 
engine operating according to the Carnot cycle with an ideal gas as the work- 
ing substance, and the result agrees, of course, with (27.2). Thus, formulae 
(27.2) and (27.3) can also be obtained without introducing the entropy. 
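
The direct calculation for an ideal gas mentioned above can be sketched as follows. The sketch assumes one mole of a monatomic ideal gas with the adiabatic relation $T V^{\gamma - 1} = \text{const}$ and uses the molar gas constant in SI units; these choices, and the function name, are assumptions made here and are not taken from the text.

```python
import math

R = 8.314  # J/(mol K), molar gas constant (value supplied here, not in the text)

def ideal_gas_carnot(T2, T1, V1, V2, gamma=5.0 / 3.0):
    """Heats exchanged by one mole of ideal gas in a Carnot cycle.

    The gas expands isothermally from V1 to V2 at T2, cools adiabatically
    to T1, is compressed isothermally at T1, and returns adiabatically.
    """
    ratio = (T2 / T1) ** (1.0 / (gamma - 1.0))   # adiabat: T V^(gamma-1) = const
    V3, V4 = V2 * ratio, V1 * ratio              # end points of the two adiabats
    Q2 = R * T2 * math.log(V2 / V1)              # heat absorbed at T2
    Q1 = R * T1 * math.log(V4 / V3)              # heat absorbed at T1 (negative)
    return Q2, Q1

T2, T1 = 400.0, 300.0
Q2, Q1 = ideal_gas_carnot(T2, T1, V1=1.0, V2=2.0)
eta = (Q2 + Q1) / Q2
reduced_heat_sum = Q2 / T2 + Q1 / T1
print(eta, reduced_heat_sum)
```

The efficiency obtained in this way coincides with $(T_2 - T_1)/T_2$, and the sum of the reduced heats vanishes, without the entropy being used anywhere in the calculation.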

Now let us consider a heat engine performing an arbitrary reversible cycle. 
This cycle can be divided into an infinitely large number of infinitely small 
Carnot cycles. Summing the relation (27.3) over all the elementary cycles, 
we can write 


$$\oint \frac{\delta Q}{T} = 0. \qquad (27.4)$$


Hence it follows that the quantity $T^{-1}\delta Q$ represents the total differential of
a certain state function S of the system. For a cyclic process the total change 
in the function S is equal to zero. The function S is called the entropy. 

The law of the constancy of the entropy for a reversible process in a
closed system (which does not absorb or release any heat) follows directly
from the definition of entropy. To find the change in the entropy in an
irreversible process, the transition from a certain initial state A into a final
state B is considered for two modes, a reversible and an irreversible one. The
change in the internal energy, which is also a state function, is equal to


$$\delta E = E_B - E_A$$


and does not depend on the mode of the transition.


The change in the entropy for a reversible mode is connected with the
heat $\delta Q$ absorbed, by the relation $\delta S = T^{-1}\delta Q$. Hence for a reversible change


$$\delta E = T\,\delta S + \delta W_{\text{rev}} = T\,\delta S - |\delta W_{\text{rev}}|, \qquad (27.5)$$


where $|\delta W_{\text{rev}}|$ is the work done by the system on external bodies. For a
transition in an irreversible mode the work produced will be smaller than in the
case of a reversible mode (otherwise the efficiency of the irreversible closed
cycle would be greater than that of the Carnot cycle). Hence, taking into
account that


$$\delta E = \delta Q + \delta W_{\text{irrev}} = \delta Q - |\delta W_{\text{irrev}}|, \qquad (27.6)$$


and subtracting (27.5) from (27.6), we find


$$T\,\delta S = \delta Q + |\delta W_{\text{rev}}| - |\delta W_{\text{irrev}}|,$$


or, since $|\delta W_{\text{rev}}| > |\delta W_{\text{irrev}}|$,


$$T\,\delta S > \delta Q. \qquad (27.7)$$


Hence it follows that the change in the entropy for a transition A → B in an
irreversible mode is


$$\delta S > \frac{\delta Q}{T}. \qquad (27.8)$$


In a closed system an irreversible transition is accompanied by an increase in 
the entropy: 


$$\delta S > 0. \qquad (27.9)$$


Thus, proceeding from the fact of the impossibility of creating a perpetual- 
motion machine of the second kind, we arrive at the condition (27.3). The 
law of increase of entropy is obtained from (27.3) as a direct consequence, 
and is hence equivalent to the initial premise. 


§28. The maximum work in non-cyclic processes and the thermodynamic
potentials


We shall now consider the problem of the maximum work which can be
done by a system performing a non-cyclic process (a heat engine of the second
kind). Let a certain system (which we shall call the basic system) be in a hot
reservoir in which a constant temperature $T_0$ and pressure $p_0$ are maintained.
There is an interaction between the system and the reservoir, an exchange of
heat and work. In addition to the basic system and the reservoir let there also
be a thermally isolated body on which the system can do mechanical work.
We shall call this body the object of the work, and the work done on it the
useful work.

Let the basic system pass from an initial state into a certain final state,
doing useful work $(-\delta W)$.

If the system did not interact with the reservoir the useful work $(-\delta W)$
would be equal to the decrease $-\delta E$ in its energy. However, the continuous
interaction of the basic system with the reservoir as it does the useful work
changes this relationship in an important way. That is, while the basic system
does useful work the reservoir can in its turn exchange energy with the
system.

Therefore the energy balance in the closed system (basic system + reservoir +
object of the work) should be written in the form


$$\delta E + \delta E_0 = \delta W, \qquad (28.1)$$


where $\delta E_0$ is the change in the energy of the reservoir, which can be written
as


$$\delta E_0 = \delta Q_0 + \delta W_0.$$


Here $\delta Q_0$ is the heat received by the reservoir from the basic system, and
$\delta W_0$ is the work done on the reservoir by the system. The size of the
reservoir is so large that for any interaction with the system an infinitely slow
quasi-static process takes place in it. The reservoir is in an equilibrium state
with temperature $T_0$ and pressure $p_0$, which is not violated for any process
in the basic system.

Hence for the reservoir one can write that


$$\delta E_0 = T_0\,\delta S_0 - p_0\,\delta V_0. \qquad (28.2)$$


Since the volume of a closed system (basic system + reservoir) must remain
constant, we have


$$\delta V_0 + \delta V = 0. \qquad (28.3)$$

From (28.1), (28.2) and (28.3) we find

$$-\delta W = -(\delta E + T_0\,\delta S_0 + p_0\,\delta V). \qquad (28.4)$$


We write the law of increase of entropy in the closed system (basic system +
reservoir) in the form


$$\delta S + \delta S_0 \geq 0.$$

Replacing $\delta S_0$ in (28.4) by its lower bound $-\delta S$, we find

$$-\delta W \leq -(\delta E - T_0\,\delta S + p_0\,\delta V) = -\delta R, \qquad (28.5)$$

where the quantity $R$ is equal to

$$R = E + p_0 V - T_0 S. \qquad (28.6)$$

From what was said in the preceding paragraph, the maximum amount of
useful work can be done on the object of the work in a reversible process, in the
given case a reversible process in the system, since any process in the
reservoir is always a reversible one. In this case in (28.5) only the equality sign
should be retained, and we arrive at the relation

$$(-\delta W)_{\max} = -\{\delta E + p_0\,\delta V - T_0\,\delta S\} = -\delta R. \qquad (28.7)$$

Thus, the maximum useful work is, in its absolute value, equal to the
decrease in $R$. $R$ involves quantities referring to the system (namely $E$, $V$, $S$) as
well as those referring to the reservoir ($p_0$, $T_0$).
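
Relation (28.7) is easily evaluated once the changes in the energy, volume and entropy of the system are known. The following is a minimal sketch; the function name and the numerical values (in arbitrary but mutually consistent units) are hypothetical.

```python
def max_useful_work(dE, dV, dS, T0, p0):
    """(-dW)_max = -(dE + p0*dV - T0*dS) = -dR, formula (28.7)."""
    return -(dE + p0 * dV - T0 * dS)

# Hypothetical change of state of the basic system in a reservoir (T0, p0).
w_max = max_useful_work(dE=-100.0, dV=0.2, dS=-0.1, T0=300.0, p0=10.0)
print(w_max)
```

In this example the energy of the system decreases by 100 units, but only 68 units can be obtained as useful work: part of the decrease goes into work against the pressure of the reservoir and part is given up to the reservoir as heat.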


A concrete expression for $\delta W_{\max}$ containing only the characteristic
parameters of the system can be obtained only for processes of a special type taking
place in the system.

Assume that a system performs an isothermal process $T = T_0 = \text{const}$ and
that the volume of the system does not change. In the case of a homogeneous
system with no external field of force, the state of the system at given $T$ and $V$
is completely determined. If, however, the system is in an external field or is
non-homogeneous, for example represents a mixture of reactants, then for given $T$ and
$V$ the state of the system can change. Then the work obtained is


$$-\delta W \leq -\delta(E - TS) = -\delta F, \qquad (28.8)$$

where

$$F = E - TS. \qquad (28.9)$$

The quantity F, which is a measure of the work that can be obtained in an 
isothermal-isovolumic process taking place in a system interacting with a 
reservoir is called the free energy of the system. We see that only a part of the 
internal energy of the system can be used to obtain useful work. A part 
equal to TS and called the bound energy remains in the system. 

Another important case is a process taking place at a constant temperature
$T = T_0$ and a constant pressure $p = p_0$. In this case


$$-\delta W \leq -\delta(E + pV - TS) = -\delta G, \qquad (28.10)$$

where

$$G = E + pV - TS \qquad (28.11)$$


is called the Gibbs thermodynamic potential.

The thermodynamic potential is a measure of the work done in an iso- 
thermal-isobaric process, just as the free energy is a measure of the work 
done in an isothermal-isovolumic process and the internal energy is a meas- 
ure of the work in a thermally insulated system. 
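
The role of the free energy and of the thermodynamic potential as measures of the work can be illustrated by a short calculation. This is only a sketch: the function names and the numbers are hypothetical, and the units are arbitrary but mutually consistent.

```python
def max_work_isothermal_isochoric(dE, dS, T):
    """Upper bound on the useful work: -dF = -(dE - T*dS), from F = E - TS."""
    return -(dE - T * dS)

def max_work_isothermal_isobaric(dE, dV, dS, T, p):
    """Upper bound on the useful work: -dG = -(dE + p*dV - T*dS), from G = E + pV - TS."""
    return -(dE + p * dV - T * dS)

# Hypothetical changes of state of the system.
dE, dS, dV = -200.0, -0.05, 0.5
T, p = 300.0, 1.0

w_F = max_work_isothermal_isochoric(dE, dS, T)
w_G = max_work_isothermal_isobaric(dE, dV, dS, T, p)
print(w_F, w_G)
```

The two bounds differ by the term $p\,\delta V$, the work done against the constant pressure of the reservoir when the volume of the system changes.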

It can easily be shown that the expressions obtained are valid not only
at constant temperature and pressure or volume but also in the case where
the equalities $T = T_0$ and $p = p_0$, or $T = T_0$ and $\Delta V = 0$, hold only in the
initial and final state of the system. Indeed, for example, for $T_{\text{init}} = T_{\text{fin}} = T_0$
and $V_{\text{init}} = V_{\text{fin}}$ we have


$$-\Delta W \leq -\left[ (E - T_0 S + p_0 V)_{\text{fin}} - (E - T_0 S + p_0 V)_{\text{init}} \right] = -\Delta F.$$


§29. Properties of the thermodynamic potentials 


Consider the case when the work $-\delta W$ done by a system in contact with
a reservoir is equal to zero. Then it follows from (28.5) that


$$\delta R = \delta(E - T_0 S + p_0 V) \leq 0. \qquad (29.1)$$


The equality sign refers to reversible processes, while the inequality sign re- 
fers to irreversible processes. The quantity R does not increase in any pro- 
cesses taking place in a system interacting with its reservoir. 

For particular cases expression (29.1) is simplified. 

For a closed system $\delta E = 0$ and $\delta V = 0$, so that (29.1) goes over into the
previous relation:


$$\delta S \geq 0. \qquad (29.2)$$


Other important cases are the isothermal-isovolumic process and the
isothermal-isobaric process taking place in a system when the temperature or
pressure of the system is equal to the corresponding quantity for the
reservoir. In the first case $T = T_0$ and $\delta V = 0$, so that the inequality

$$ \delta(E - T_0 S) = \delta F \le 0 \tag{29.3} $$

holds. In the second case $T = T_0$ and $p = p_0$. Then

$$ \delta(E - T_0 S + p_0 V) = \delta G \le 0. \tag{29.4} $$


Thus, in an irreversible isothermal-isovolumic process taking place in a system
interacting with the reservoir, its free energy decreases. In a reversible
isothermal-isovolumic process the free energy remains constant. The free energy
is an analogue of the entropy and, like the entropy, serves as a criterion of the
reversibility or irreversibility of a process.

If, for example, a substance dissolves isothermally in a considerable volume 
of a solvent, then the temperature and volume of the system remain constant. 
The free energy of the solution will be smaller than that of the solvent and 
the substance dissolved, so that the process is irreversible. 







Analogous properties are possessed by the thermodynamic potential for 
the isothermal-isobaric process. Isothermal-isobaric processes are in practice 
encountered rather frequently, because from the experimental point of view 
it is always easier to realize conditions for maintaining a constant pressure 
than those for maintaining a constant volume. For example, in the case of 
chemical reactions it is much simpler to preserve a constant pressure in the 
reaction vessel than to maintain a constant volume of the mixture of re- 
agents. 

The free energy and the thermodynamic potential play a fundamental role 
in thermodynamics. From the inequalities (29.3) and (29.4) it is seen that 
they replace the entropy in the case of open systems, while from (28.8) and 
(28.10) it follows that they are at the same time analogues of the internal 
energy. 

Let us write the expressions for the changes in the free energy and the
thermodynamic potential in a reversible process. In the general case formula
(26.6) has the form

$$ \delta E = T\,\delta S - p\,\delta V. \tag{29.5} $$


If we subtract from it $\delta(TS)$, then according to the definition of the free
energy, we find

$$ \delta F = -S\,\delta T - p\,\delta V. \tag{29.6} $$


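Thus, the free energy is a function of the variables T and V (or $\lambda$). From
formula (29.6) we obtain

$$ S = -\left(\frac{\partial F}{\partial T}\right)_V, \tag{29.7} $$
$$ p = -\left(\frac{\partial F}{\partial V}\right)_T. \tag{29.8} $$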


These formulae are analogous to formulae (24.6). 

Hence the free energy is a potential with respect to the variables T, V and
$\lambda$. The quantities S, p and $\Lambda$, obtained from F by differentiation, play the
role of generalized forces.

Formula (29.8) is particularly important. It determines the dependence of 
the pressure on the volume and temperature, i.e. it represents an equation of 
state. 
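The way formulae (29.7) and (29.8) turn a known free energy into an entropy and an equation of state can be sketched numerically. The block below is only an illustration: the functional form of `free_energy` (an ideal monatomic gas, up to an additive term linear in T), the particle number and the finite-difference steps are assumptions introduced here, not taken from the text.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant, J/K


def free_energy(T, V, N):
    """Assumed model: F = -N k T (ln V + 1.5 ln T) for an ideal monatomic gas.

    Additive constants inside the bracket are dropped; they do not affect p.
    """
    return -N * K_B * T * (math.log(V) + 1.5 * math.log(T))


def central_difference(f, x, h):
    """Central finite-difference estimate of df/dx at x."""
    return (f(x + h) - f(x - h)) / (2.0 * h)


def pressure(T, V, N):
    # Formula (29.8): p = -(dF/dV) at constant T.
    return -central_difference(lambda v: free_energy(T, v, N), V, 1e-6 * V)


def entropy(T, V, N):
    # Formula (29.7): S = -(dF/dT) at constant V.
    return -central_difference(lambda t: free_energy(t, V, N), T, 1e-6 * T)


T, V, N = 300.0, 1.0e-3, 1.0e22
p = pressure(T, V, N)
S = entropy(T, V, N)
print("p from F:", p, " N k T / V:", N * K_B * T / V)
print("S from F:", S)
```

Under these assumptions the pressure obtained from F reproduces the ideal-gas equation of state p = NkT/V, which is what "formula (29.8) represents an equation of state" means in practice.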




Adding the total differential $\delta(pV)$ to (29.6) and taking into account the
definition of the thermodynamic potential, we have

$$ \delta G = \delta(E - TS + pV) = -S\,\delta T + V\,\delta p. \tag{29.9} $$


Thus, the Gibbs thermodynamic potential is a potential with respect to the 
variables T and p. The quantities S and V play the role of generalized forces: 


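$$ S = -\left(\frac{\partial G}{\partial T}\right)_p, \tag{29.10} $$
$$ V = \left(\frac{\partial G}{\partial p}\right)_T. \tag{29.11} $$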


Since in practice it is most convenient to vary the temperature and pressure
or to hold them constant, the Gibbs thermodynamic potential is used especially
often and is sometimes called the basic potential.

The quantity 


H=E+pV (29.12) 


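called the enthalpy is the potential with respect to the pair of variables p
and S. From the definition of the enthalpy and (29.5) it is easy to obtain

$$ \delta H = T\,\delta S + V\,\delta p. \tag{29.13} $$

Hence

$$ T = \left(\frac{\partial H}{\partial S}\right)_p, \tag{29.14} $$
$$ V = \left(\frac{\partial H}{\partial p}\right)_S. \tag{29.15} $$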


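If the state of the system depends on other external parameters $\lambda$ besides
the volume, then formulae (29.6), (29.9) and (29.13) can be generalized and
written in the form

$$ \delta F = -S\,\delta T - p\,\delta V - \Lambda\,\delta\lambda, \tag{29.16} $$
$$ \delta G = -S\,\delta T + V\,\delta p - \Lambda\,\delta\lambda, \tag{29.17} $$
$$ \delta H = T\,\delta S + V\,\delta p - \Lambda\,\delta\lambda, \tag{29.18} $$

and, correspondingly,

$$ \Lambda = -\left(\frac{\partial F}{\partial \lambda}\right)_{T,V} = -\left(\frac{\partial G}{\partial \lambda}\right)_{T,p} = -\left(\frac{\partial H}{\partial \lambda}\right)_{S,p}. $$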


As will be seen from what follows, thermodynamic potentials and their 
derivatives completely determine the thermodynamic behaviour of an ar- 
bitrary system. We shall consider below the methods of determining the ther- 
modynamic potentials theoretically and experimentally. However, it is 
necessary to obtain beforehand a number of thermodynamic relations con- 
necting thermodynamic potentials and their derivatives with each other and 
with directly measured quantities. 


§30. Some thermodynamic relations 


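The heat capacity at a constant volume $C_V$ and the heat capacity at a
constant pressure $C_p$, which are defined by the relations

$$ C_V = \left(\frac{\partial E}{\partial T}\right)_V, \tag{30.1} $$
$$ C_p = \left(\frac{\partial H}{\partial T}\right)_p, \tag{30.2} $$

play a very important role in thermodynamics. Making use of formula (26.6),
we find

$$ C_V = \left(\frac{\partial E}{\partial T}\right)_V = T\left(\frac{\partial S}{\partial T}\right)_V. \tag{30.3} $$

Analogously,

$$ C_p = \left(\frac{\partial H}{\partial T}\right)_p = T\left(\frac{\partial S}{\partial T}\right)_p. \tag{30.4} $$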




From the definitions of the heat capacities it is clear that they are additive 
quantities. It is usually convenient to make use of molar heat capacities re- 
lated to 1 gram-mole of a substance. In what follows, if not specified other- 
wise, we shall make use of molar heat capacities. The heat capacities repre- 
sent thermodynamic characteristics of a substance which can be measured 
directly. 

Another important relation is formula (29.8). Since the free energy is a 
function of the independent variables T and V, formula (29.8) can be written 
in the form 


$$ p = -\left(\frac{\partial F}{\partial V}\right)_T = f(T, V). \tag{30.5} $$


It determines the dependence of the pressure on the temperature and volume, 
i.e. it represents the equation of state of the body. 

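We now differentiate formulae (24.6), (29.7), (29.10) and (29.14) for the
second time, forming mixed second derivatives. We have, obviously,

$$ \left(\frac{\partial T}{\partial V}\right)_S = \frac{\partial^2 E}{\partial S\,\partial V} = -\left(\frac{\partial p}{\partial S}\right)_V \tag{30.6} $$

and, analogously,

$$ \left(\frac{\partial S}{\partial V}\right)_T = \left(\frac{\partial p}{\partial T}\right)_V, \tag{30.7} $$
$$ \left(\frac{\partial S}{\partial p}\right)_T = -\left(\frac{\partial V}{\partial T}\right)_p, \tag{30.8} $$
$$ \left(\frac{\partial T}{\partial p}\right)_S = \left(\frac{\partial V}{\partial S}\right)_p. \tag{30.9} $$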


Formulae (30.6)—(30.9) are called Maxwell’s relations. 

The second and third of these relations are particularly important. They
relate the derivatives of the entropy to the quantities $(\partial p/\partial T)_V$ and
$(\partial V/\partial T)_p$, which can be measured directly.
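The Maxwell relation connecting $(\partial S/\partial V)_T$ with $(\partial p/\partial T)_V$ can be checked numerically for any smooth free energy F(T, V). The sketch below is an illustration only: the van der Waals form of `free_energy` (per mole, up to a term linear in T) and the constants `A` and `B` are assumptions chosen for the example, not quantities given in the text.

```python
import math

R = 8.314462618          # molar gas constant, J/(mol K)
A, B = 0.137, 3.87e-5    # illustrative van der Waals constants (SI, per mole)


def free_energy(T, V):
    """Assumed model: van der Waals free energy of one mole."""
    return -R * T * math.log(V - B) - A / V - 1.5 * R * T * math.log(T)


def d_dT(f, T, V, h=1e-2):
    """Central difference in T at fixed V."""
    return (f(T + h, V) - f(T - h, V)) / (2.0 * h)


def d_dV(f, T, V, h=1e-7):
    """Central difference in V at fixed T."""
    return (f(T, V + h) - f(T, V - h)) / (2.0 * h)


def entropy(T, V):
    return -d_dT(free_energy, T, V)      # S = -(dF/dT)_V


def pressure(T, V):
    return -d_dV(free_energy, T, V)      # p = -(dF/dV)_T


T0, V0 = 300.0, 1.0e-3
dS_dV = d_dV(entropy, T0, V0)            # left-hand side of the relation
dp_dT = d_dT(pressure, T0, V0)           # right-hand side of the relation
print(dS_dV, dp_dT, R / (V0 - B))
```

For this model both derivatives equal R/(V - b), so the two numerical estimates and the closed form should agree to within the finite-difference error.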

The thermodynamic potentials E and F, and H and G, are not independent
of each other. It is easy to establish the relations between them if use is made
of their definitions and the definition of the entropy. Thus, from (28.9) and
(29.7) we find




$$ E = F + TS = F - T\left(\frac{\partial F}{\partial T}\right)_V. \tag{30.10} $$

Analogously,

$$ H = G - T\left(\frac{\partial G}{\partial T}\right)_p. \tag{30.11} $$


Formulae (30.10) and (30.11) are called the Gibbs—Helmholtz equations. 
The Gibbs—Helmholtz equations can also be written in the forms 


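$$ \left(\frac{\partial}{\partial T}\,\frac{F}{T}\right)_V = -\frac{E}{T^2}, \tag{30.12} $$
$$ \left(\frac{\partial}{\partial T}\,\frac{G}{T}\right)_p = -\frac{H}{T^2}. \tag{30.13} $$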


If the dependences of the energy and enthalpy on the temperature are known, 
the integration of the Gibbs—Helmholtz equations allows one to find the de- 
pendence of the free energy and thermodynamic potential on the temper- 
ature: 


$$ F = -T\int \frac{E}{T^2}\,dT + \text{const}\cdot T, \tag{30.14} $$

$$ G = -T\int \frac{H}{T^2}\,dT + \text{const}\cdot T. \tag{30.15} $$
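As a brief worked illustration of (30.14), suppose (an assumption made only for this example) that at fixed volume the energy is linear in the temperature, $E = E_0 + C_V T$ with constant $E_0$ and $C_V$. The integration then gives

```latex
\begin{aligned}
F &= -T\int \frac{E_0 + C_V T}{T^2}\,dT + \text{const}\cdot T
   = -T\left(-\frac{E_0}{T} + C_V \ln T\right) + \text{const}\cdot T \\
  &= E_0 - C_V T \ln T + \text{const}\cdot T , \\
F - T\left(\frac{\partial F}{\partial T}\right)_V
  &= E_0 - C_V T\ln T + \text{const}\cdot T
     - T\left(-C_V \ln T - C_V + \text{const}\right) = E_0 + C_V T .
\end{aligned}
```

The second line confirms that the result satisfies the Gibbs-Helmholtz equation (30.10), and it shows explicitly that F is fixed only up to a term linear in T.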


§31. Methods for the transformation of thermodynamic quantities 


In thermodynamics one often has to carry out transformations of
thermodynamic quantities, for example, the transformations of variables or the
replacement of certain quantities maintained constant in the course of a process
by other quantities. Such transformations must be carried out according to
the general rules for the substitution of variables when differentiating with
respect to several variables. We shall present here one of the methods for such
transformations *.




Let three variable quantities x, y, z be given such that each of them can be 
considered to be a single-valued function of the other two variables, i.e. 


$$ z = z(x, y), \qquad y = y(x, z), \qquad x = x(y, z). $$


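We shall find the relation between the derivatives $(\partial z/\partial x)_y$ and $(\partial z/\partial y)_x$. For
this we write the obvious equations

$$ \begin{aligned}
dz &= \left(\frac{\partial z}{\partial x}\right)_y dx + \left(\frac{\partial z}{\partial y}\right)_x dy, \\
dx &= \left(\frac{\partial x}{\partial y}\right)_z dy + \left(\frac{\partial x}{\partial z}\right)_y dz.
\end{aligned} \tag{31.1} $$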


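Substituting dx from the lower equation into the upper one, we have

$$ \begin{aligned}
dz &= \left(\frac{\partial z}{\partial x}\right)_y\left[\left(\frac{\partial x}{\partial y}\right)_z dy + \left(\frac{\partial x}{\partial z}\right)_y dz\right] + \left(\frac{\partial z}{\partial y}\right)_x dy \\
   &= \left[\left(\frac{\partial z}{\partial x}\right)_y\left(\frac{\partial x}{\partial y}\right)_z + \left(\frac{\partial z}{\partial y}\right)_x\right] dy + dz.
\end{aligned} \tag{31.2} $$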


Since dy is an arbitrary infinitesimal quantity, it is necessary for the
fulfilment of (31.2) that

$$ \left(\frac{\partial z}{\partial x}\right)_y\left(\frac{\partial x}{\partial y}\right)_z + \left(\frac{\partial z}{\partial y}\right)_x = 0. $$

Whence the relation sought follows:

$$ \frac{(\partial z/\partial x)_y}{(\partial z/\partial y)_x} = -\left(\frac{\partial y}{\partial x}\right)_z. \tag{31.3} $$

* For another method, based on the use of the properties of Jacobians, see
L. D. Landau and E. M. Lifshitz, Course of theoretical physics, Vol. 5, Statistical Physics
(Pergamon, London, 1958).

Consider further the case when there are four quantities x, y, z, t, each
pair of the quantities being completely determined if the other pair is
defined, i.e.

$$ t = t(x, y) = t(y, z) = t(x, z) $$

and so on.
Writing t as a function of the pair of variables x and y, we have

$$ dt = \left(\frac{\partial t}{\partial x}\right)_y dx + \left(\frac{\partial t}{\partial y}\right)_x dy. \tag{31.4} $$


The same change in the quantity t as a function of y and z can be written in
the form

$$ dt = \left(\frac{\partial t}{\partial y}\right)_z dy + \left(\frac{\partial t}{\partial z}\right)_y dz. \tag{31.5} $$


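Substituting the expression for dx from (31.1) into (31.4), we get

$$ \begin{aligned}
dt &= \left(\frac{\partial t}{\partial x}\right)_y\left[\left(\frac{\partial x}{\partial y}\right)_z dy + \left(\frac{\partial x}{\partial z}\right)_y dz\right] + \left(\frac{\partial t}{\partial y}\right)_x dy \\
   &= \left[\left(\frac{\partial t}{\partial x}\right)_y\left(\frac{\partial x}{\partial y}\right)_z + \left(\frac{\partial t}{\partial y}\right)_x\right] dy + \left(\frac{\partial t}{\partial z}\right)_y dz.
\end{aligned} \tag{31.6} $$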


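Comparing (31.5) and (31.6), we find

$$ \left(\frac{\partial t}{\partial y}\right)_z = \left(\frac{\partial t}{\partial y}\right)_x + \left(\frac{\partial t}{\partial x}\right)_y\left(\frac{\partial x}{\partial y}\right)_z. \tag{31.7} $$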
We shall present several examples of the use of the relations (31.3) and 
(31.7). 
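1. Find the relation between the derivatives $(\partial S/\partial V)_T$ and $(\partial S/\partial T)_V$, and
between the derivatives $(\partial S/\partial p)_T$ and $(\partial S/\partial T)_p$.
According to formula (31.3) we have

$$ \begin{aligned}
\frac{(\partial S/\partial V)_T}{(\partial S/\partial T)_V} &= -\left(\frac{\partial T}{\partial V}\right)_S, \\
\frac{(\partial S/\partial p)_T}{(\partial S/\partial T)_p} &= -\left(\frac{\partial T}{\partial p}\right)_S.
\end{aligned} \tag{31.8} $$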




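2. The following quantities are called thermal coefficients:

$$ \alpha = \frac{1}{V}\left(\frac{\partial V}{\partial T}\right)_p \quad \text{the coefficient of thermal expansion,} $$
$$ \beta = \frac{1}{p}\left(\frac{\partial p}{\partial T}\right)_V \quad \text{the thermal coefficient of pressure,} $$
$$ \kappa_T = -\frac{1}{V}\left(\frac{\partial V}{\partial p}\right)_T \quad \text{the isothermal compressibility.} $$

Find the relation between them.

From the definitions of $\alpha$ and $\beta$ and (31.3) it follows that

$$ \frac{\alpha}{\kappa_T} = -\frac{(\partial V/\partial T)_p}{(\partial V/\partial p)_T} = \left(\frac{\partial p}{\partial T}\right)_V = p\beta. \tag{31.9} $$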





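3. Find the ratio of the adiabatic compressibility to the isothermal
compressibility, expressing it in terms of heat capacities.
The adiabatic compressibility is defined as

$$ \kappa_S = -\frac{1}{V}\left(\frac{\partial V}{\partial p}\right)_S. \tag{31.10} $$

Analogously, the isothermal compressibility is

$$ \kappa_T = -\frac{1}{V}\left(\frac{\partial V}{\partial p}\right)_T. \tag{31.11} $$

According to formula (31.3) we find

$$ \left(\frac{\partial V}{\partial p}\right)_S = -\frac{(\partial S/\partial p)_V}{(\partial S/\partial V)_p}, \qquad \left(\frac{\partial V}{\partial p}\right)_T = -\frac{(\partial T/\partial p)_V}{(\partial T/\partial V)_p}. $$

Dividing the first equality by the second one, we have

$$ \frac{\kappa_S}{\kappa_T} = \frac{(\partial V/\partial p)_S}{(\partial V/\partial p)_T} = \frac{(\partial S/\partial p)_V\,(\partial T/\partial V)_p}{(\partial T/\partial p)_V\,(\partial S/\partial V)_p} = \frac{(\partial S/\partial T)_V}{(\partial S/\partial T)_p} = \frac{C_V}{C_p}. \tag{31.12} $$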










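4. Find the relation between $(\partial H/\partial p)_T$ and $(\partial H/\partial T)_p$.
According to formula (31.3) we have

$$ \frac{(\partial H/\partial p)_T}{(\partial H/\partial T)_p} = -\left(\frac{\partial T}{\partial p}\right)_H. \tag{31.13} $$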
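5. Find the relation between the heat capacities $C_p = T(\partial S/\partial T)_p$ and
$C_V = T(\partial S/\partial T)_V$.

Since the relation between four quantities S, T, p and V is to be
established, use should be made of formula (31.7). It gives

$$ \left(\frac{\partial S}{\partial T}\right)_p = \left(\frac{\partial S}{\partial T}\right)_V + \left(\frac{\partial S}{\partial V}\right)_T\left(\frac{\partial V}{\partial T}\right)_p. $$

From the Maxwell relation (30.7) and the definition of the heat capacities
we find

$$ C_p - C_V = T\left(\frac{\partial p}{\partial T}\right)_V\left(\frac{\partial V}{\partial T}\right)_p. \tag{31.14} $$

By means of (31.9) we can write

$$ C_p - C_V = -\frac{T\,(\partial V/\partial T)_p^2}{(\partial V/\partial p)_T} = \frac{\alpha^2 T V}{\kappa_T}. \tag{31.15} $$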
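Relations (31.9) and (31.15) lend themselves to a quick numerical check once an equation of state is chosen. The sketch below assumes, purely for illustration, one mole of a gas obeying p(V - b) = RT with a hypothetical excluded volume `B_EXCL`; the thermal coefficients are estimated by finite differences.

```python
R = 8.314462618      # molar gas constant, J/(mol K)
B_EXCL = 3.0e-5      # hypothetical excluded volume b, m^3/mol


def volume(T, p):
    """Assumed equation of state solved for V: V = R T / p + b."""
    return R * T / p + B_EXCL


def pressure(T, V):
    """The same equation of state solved for p: p = R T / (V - b)."""
    return R * T / (V - B_EXCL)


def central_difference(f, x, h):
    return (f(x + h) - f(x - h)) / (2.0 * h)


T0, p0 = 300.0, 1.0e5
V0 = volume(T0, p0)

# Thermal coefficients as defined in the text.
alpha = central_difference(lambda T: volume(T, p0), T0, 1e-3) / V0
beta = central_difference(lambda T: pressure(T, V0), T0, 1e-3) / p0
kappa_T = -central_difference(lambda p: volume(T0, p), p0, 1.0) / V0

ratio = alpha / kappa_T                        # left-hand side of (31.9)
cp_minus_cv = alpha ** 2 * T0 * V0 / kappa_T   # right-hand side of (31.15)
print(ratio, p0 * beta, cp_minus_cv)
```

For this particular equation of state the difference of the molar heat capacities comes out equal to R, the same value as for an ideal gas, because the excluded volume shifts V without changing its temperature or pressure derivatives.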


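6. Find the relation between the isothermal compressibility
$\kappa_T = -V^{-1}(\partial V/\partial p)_T$ and the adiabatic compressibility $\kappa_S = -V^{-1}(\partial V/\partial p)_S$.
Making use of (31.7), we have

$$ \left(\frac{\partial V}{\partial p}\right)_T = \left(\frac{\partial V}{\partial p}\right)_S + \left(\frac{\partial V}{\partial S}\right)_p\left(\frac{\partial S}{\partial p}\right)_T. $$

By means of formula (30.8) and the equality

$$ \left(\frac{\partial V}{\partial S}\right)_p = \left(\frac{\partial V}{\partial T}\right)_p\left(\frac{\partial T}{\partial S}\right)_p $$

we obtain

$$ \left(\frac{\partial V}{\partial p}\right)_T = \left(\frac{\partial V}{\partial p}\right)_S - \frac{T}{C_p}\left(\frac{\partial V}{\partial T}\right)_p^2 = \left(\frac{\partial V}{\partial p}\right)_S - \frac{T V^2 \alpha^2}{C_p}, $$

or

$$ \kappa_T = \kappa_S + \frac{T V \alpha^2}{C_p}. \tag{31.16} $$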
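As a consistency check of (31.16), consider one mole of an ideal gas with $pV = RT$ (an assumption introduced only for this illustration). Then

```latex
\begin{aligned}
\alpha &= \frac{1}{V}\left(\frac{\partial V}{\partial T}\right)_p = \frac{1}{T}, \qquad
\kappa_T = -\frac{1}{V}\left(\frac{\partial V}{\partial p}\right)_T = \frac{1}{p}, \qquad
C_p - C_V = \frac{\alpha^2 T V}{\kappa_T} = \frac{pV}{T} = R, \\
\kappa_S &= \kappa_T - \frac{T V \alpha^2}{C_p}
          = \frac{1}{p} - \frac{R}{p\,C_p}
          = \frac{1}{p}\,\frac{C_p - R}{C_p}
          = \frac{1}{p}\,\frac{C_V}{C_p},
\end{aligned}
```

so that $\kappa_S/\kappa_T = C_V/C_p$, in agreement with (31.12).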
p 





§32. The determination of thermodynamic quantities by the methods of 
statistical physics 


For the determination of the thermodynamic quantities which we have 

introduced there are two possibilities: 
(1) to calculate them by the methods of statistical physics, 
(2) to find them on the basis of certain thermal measurements. 

We shall begin with the analysis of methods of calculating thermodynamic 
quantities. 

As can be seen from formula (21.3), the internal energy of any body can 
be found if its partition function Z is known. According to (24.8), it is also 
necessary to calculate the partition function to find the entropy. 

Substituting the value of S using (24.8) into the definition of the free 
energy, we have 


$$ F = E - TS = -kT \ln Z. \tag{32.1} $$
The expression of F in terms of Z turns out to be particularly simple. 


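In addition we write an explicit expression for the pressure. According to
(22.5) and (22.3) we can write

$$ -p\,\delta V = \sum_i w_i\,\delta\varepsilon_i, $$

hence

$$ p = -\sum_i w_i\,\frac{\partial \varepsilon_i}{\partial V} = -\frac{1}{Z}\sum_i \frac{\partial \varepsilon_i}{\partial V}\,e^{-\varepsilon_i/\theta}\,\Omega(\varepsilon_i). \tag{32.2} $$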


We have made use of the fact that the probability distribution remains
unchanged, and have put in its explicit expression. Formula (32.2) can be
rewritten in a standard form, expressing it in terms of the partition function Z.
For this we note that

$$ \frac{\partial Z}{\partial V} = \frac{\partial}{\partial V}\sum_i e^{-\varepsilon_i/\theta}\,\Omega(\varepsilon_i) = -\frac{1}{\theta}\sum_i \frac{\partial \varepsilon_i}{\partial V}\,e^{-\varepsilon_i/\theta}\,\Omega(\varepsilon_i) \tag{32.3} $$


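and, consequently,

$$ \frac{\partial \ln Z}{\partial V} = \frac{1}{Z}\,\frac{\partial Z}{\partial V} = -\frac{1}{\theta Z}\sum_i \frac{\partial \varepsilon_i}{\partial V}\,e^{-\varepsilon_i/\theta}\,\Omega(\varepsilon_i). \tag{32.4} $$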


Comparing (32.4) and (32.2) we see that the pressure can be written in the
form

$$ p = \theta\,\frac{\partial \ln Z}{\partial V} = kT\,\frac{\partial \ln Z}{\partial V}. \tag{32.5} $$


Comparing the expression (32.5) with the expression (32.1) for the free
energy, we find, in correspondence with (29.8),

$$ p = -\left(\frac{\partial F}{\partial V}\right)_T. \tag{32.6} $$


From formula (32.6) it follows that, knowing the partition function Z of 
the system, one can find the equation of state. Indeed, since F=f(V,T), 
formula (32.6) establishes the relation between the pressure, volume and 
temperature of the system. 
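The equivalence of the two routes to the pressure, averaging $-\partial\varepsilon_i/\partial V$ over the distribution and differentiating ln Z with respect to V, can be illustrated on a toy spectrum. Everything specific in the sketch below is an assumption made for the example: the hypothetical levels $\varepsilon_n = c\,n^2 V^{-2/3}$ with unit statistical weight, the cutoff `N_MAX`, and units in which $\theta = kT$ is a plain number.

```python
import math

N_MAX = 200   # cutoff of the hypothetical spectrum
C = 1.0       # energy scale of the hypothetical spectrum


def levels(V):
    """Hypothetical volume-dependent levels eps_n = C n^2 / V^(2/3)."""
    return [C * n * n / V ** (2.0 / 3.0) for n in range(1, N_MAX + 1)]


def log_Z(V, theta):
    """Logarithm of the partition function, each level with weight 1."""
    return math.log(sum(math.exp(-e / theta) for e in levels(V)))


def pressure_from_Z(V, theta, h=1e-5):
    # p = theta * d(ln Z)/dV, estimated by a central difference.
    return theta * (log_Z(V + h, theta) - log_Z(V - h, theta)) / (2.0 * h)


def pressure_from_levels(V, theta):
    # p = -sum_i w_i d(eps_i)/dV, with d(eps_n)/dV = -(2/3) eps_n / V here.
    eps = levels(V)
    boltzmann = [math.exp(-e / theta) for e in eps]
    Z = sum(boltzmann)
    return sum(w * (2.0 / 3.0) * e / V for w, e in zip(boltzmann, eps)) / Z


V, theta = 1.0, 5.0
p_from_Z = pressure_from_Z(V, theta)
p_from_levels = pressure_from_levels(V, theta)
print(p_from_Z, p_from_levels)
```

The two numbers agree to within the finite-difference error, which is the content of the step from the averaged expression to the derivative of ln Z.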

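From the method of proof it can be seen that the mean value of the
derivative $\partial\varepsilon_i/\partial V$, i.e. $\overline{\partial\varepsilon_i/\partial V}$, cannot be identified with the derivative of the
mean energy with respect to the volume $\partial\bar{\varepsilon}/\partial V$, i.e. with $\partial E/\partial V$. The latter
is by virtue of (30.10) and (29.8) obviously equal to

$$ \frac{\partial E}{\partial V} = \frac{\partial}{\partial V}\left[F - T\left(\frac{\partial F}{\partial T}\right)_V\right] = T\,\frac{\partial p}{\partial T} - p. $$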


The Gibbs thermodynamic potential G will play an important role in
further developments. We shall find its statistical expression. From the
definition of the thermodynamic potential it follows that, in contrast to the free
energy, it is a function of the pressure and not of the volume. In other words,
G is a function of the generalized force $\Lambda = p$ and not of the generalized
coordinate $\lambda = V$. It is with this fact that the important role of G in
thermodynamics is associated: in practice it is simpler to maintain a constant value
of the pressure and other generalized forces than to maintain a constant value
of the volume and corresponding generalized coordinates.

To obtain a statistical expression for G it is necessary to find the
dependence of the partition function Z on the generalized force (the pressure p) as
an independent variable, i.e.

$$ Z = Z(T, p). $$

In the course of the foregoing discussion we have assumed the energy levels
of the system to depend on the parameter $\lambda$, i.e. $\varepsilon_i = \varepsilon_i(\lambda)$. We shall now
assume the force $\Lambda$ to be a variable, and $\lambda$ to be a function of $\Lambda$.

Consider as an example an ideal gas in a container having a movable wall. 
If the independent variable is an external parameter (the volume of the 
container), then the system is the gas in the container. But if the independent 
variable is an acting force (an external pressure) then the movable wall should 
be included as a part of the system. 

Thus, the subsystem is formed by N gas molecules and the movable wall
of the container, so that the subsystem has altogether 3N + 1 degrees of
freedom. Its state is characterized by the coordinates and momenta of all the
molecules, as well as by the position and momentum of the movable wall.

The movable wall is acted upon by a pressure p. A change in the pressure 
will lead to a change in the volume of the system. The latter in its turn leads 
to a shift of the energy levels of the system. We write the energy of the system 
(a gas + a movable wall) in the form 


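$$ \varepsilon_i(p) = \varepsilon_i + (\varepsilon_{\text{kin}} + \varepsilon_{\text{pot}}), $$

where $\varepsilon_i$ is the energy of the gas, and $(\varepsilon_{\text{kin}} + \varepsilon_{\text{pot}})$ is the energy of the wall.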

To find the potential energy $\varepsilon_{\text{pot}}$ of the wall we note that the work
produced on the system as the external parameter p changes by an amount
$\delta p$ is equal to

$$ \delta W_p = -V\,\delta p. $$

Hence for $\varepsilon_{\text{pot}}$ we can write

$$ \varepsilon_{\text{pot}} = Vp. $$




The kinetic energy of the thermal motion of the wall $\varepsilon_{\text{kin}}$ can be disregarded
in comparison with the kinetic energy of the gas molecules (since the number
of the latter is very large). Hence finally we have

$$ \varepsilon_i(p) = \varepsilon_i + pV. $$


The partition function of the system has the form

$$ Z(p, T) = \int \sum_i \exp\left(-\frac{\varepsilon_i + pV}{kT}\right)\Omega(\varepsilon_i)\,dV, \tag{32.7} $$

where the summation is carried out over all levels of the system (the value of
$\varepsilon_i$ depends on V), while the integration is carried out over the entire volume
of the system. In analogy with (32.1) we can write

$$ G = -kT \ln Z(p, T). \tag{32.8} $$


Formula (32.8) shows that the logarithm of the partition function
represents the free energy in the broad sense: F in the case of the variable V, and
G in the case of the variable p. Since neither the concrete nature of the system
nor the character of the generalized force has figured in the foregoing
reasoning, the expression obtained for G remains valid for any system and for
any generalized force.
From formula (32.8) one can find the mean volume of the system:

$$ V = \left(\frac{\partial G}{\partial p}\right)_T = -kT\,\frac{\partial \ln Z(p, T)}{\partial p}. \tag{32.9} $$
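The passage from Z(p, T) to the mean volume can be made concrete for a classical ideal gas. The sketch below rests on an assumption not made explicit in the text: the fixed-volume partition function is taken proportional to $V^N$, so that the integral over V becomes $\int_0^\infty V^N e^{-pV/kT}\,dV = N!\,(kT/p)^{N+1}$, with all V-independent factors dropped. The function names and the particle number are illustrative.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant, J/K


def log_Z_pT(p, T, N):
    """ln Z(p, T) for the assumed model Z(V, T) proportional to V**N.

    Uses the closed form ln N! + (N + 1) ln(kT / p); factors independent
    of p are omitted because they do not affect the mean volume.
    """
    return math.lgamma(N + 1) + (N + 1) * math.log(K_B * T / p)


def gibbs(p, T, N):
    # G = -kT ln Z(p, T)
    return -K_B * T * log_Z_pT(p, T, N)


def mean_volume(p, T, N, rel_step=1e-6):
    # V = (dG/dp) at constant T, estimated by a central difference.
    h = rel_step * p
    return (gibbs(p + h, T, N) - gibbs(p - h, T, N)) / (2.0 * h)


p, T, N = 1.0e5, 300.0, 1000
V_mean = mean_volume(p, T, N)
print(V_mean, (N + 1) * K_B * T / p)
```

In this model the mean volume is (N + 1)kT/p, which for large N is indistinguishable from the ideal-gas law; the extra unit reflects the single additional degree of freedom of the movable wall.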





The relations obtained allow one to express thermodynamic functions
(the internal energy of a body, its free energy, entropy and pressure)
directly in terms of the partition function Z. The value of the partition
function is determined by the molecular properties of the system (its
possible energy states) as well as by the temperature T and volume or
pressure. Thus, we arrive at the important conclusion:

Statistical physics allows one to find the values of thermodynamic quanti- 
ties in a purely mathematical way, provided the energy levels of the system 
and the statistical weights of the corresponding states are known. 

However, the role and significance of statistical physics do not reduce to 
this very important but nevertheless partial result. Statistical physics allows 
one to attach a more profound meaning to thermodynamic quantities and 




ideas, and reveals the physical laws underlying the thermodynamic behaviour 
of a system. Thus, we have seen that the ideas of thermodynamic energy, 
work, entropy and amount of heat were given a clear physical interpretation. 
All these concepts have been associated with molecular processes taking place 
in the system. The general statement that heat is a form of motion has found 
its mathematical expression in the formulae obtained. 

True, up to now we have not considered real physical systems and have 
not detailed the character of the molecular motion and quantum states of 
the system. The following chapters, in which the general laws of statistical 
physics will be applied to different physical systems, are devoted to the ap- 
plication of the laws found. 

The general character of our formulation of statistical laws has very
important advantages. Namely, owing to their general character, statistical laws,
which were originally found for systems obeying the laws of classical
mechanics, are still valid, and have undergone only small modifications
associated with the replacement of classical systems by quantum systems. The
general character of the laws of statistical mechanics does not restrict the
sphere of its considerations to purely thermal processes, which constitute
the primary basis of statistics, but includes also the most diverse properties of
matter: electric, magnetic, chemical and so on. These properties of matter
will be considered in Part IV of the book. In the meanwhile we shall
confine ourselves to the study of the thermal properties of matter.


§33. The determination of thermodynamic quantities from experimental
data 


We shall now consider the general methods of finding thermodynamic 
quantities from experimental data. Before passing on to thermodynamic 
potentials, we shall discuss the problem of establishing a scale of absolute 
temperature. To obtain this scale it is necessary to find a relationship between 
the absolute temperature and the empirical temperature measured on an 
arbitrary scale by any thermometer. 

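We find the relation between the quantities $(\partial E/\partial V)_T$ and $(\partial E/\partial V)_S$,
making use of formula (31.7):

$$ \left(\frac{\partial E}{\partial V}\right)_T = \left(\frac{\partial E}{\partial V}\right)_S + \left(\frac{\partial E}{\partial S}\right)_V\left(\frac{\partial S}{\partial V}\right)_T. $$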


Making use of (24.6) and (30.7), we have

$$ \left(\frac{\partial E}{\partial V}\right)_T = -p + T\left(\frac{\partial p}{\partial T}\right)_V. $$


Now let the temperature be measured on an arbitrary empirical scale of
temperatures t, for which one can write that

$$ t = f(T) \quad \text{or} \quad T = T(t), $$

where f is an unknown function. Then, obviously

$$ \left(\frac{\partial E}{\partial V}\right)_T = \left(\frac{\partial E}{\partial V}\right)_t, \qquad \left(\frac{\partial p}{\partial T}\right)_V = \left(\frac{\partial p}{\partial t}\right)_V \frac{dt}{dT}. $$

Consequently,

$$ \left(\frac{\partial E}{\partial V}\right)_t = -p + T\left(\frac{\partial p}{\partial t}\right)_V \frac{dt}{dT}, $$

hence

$$ \frac{d \ln T}{dt} = \frac{(\partial p/\partial t)_V}{(\partial E/\partial V)_t + p} \equiv \Psi(t, V). \tag{33.1} $$

Only quantities which can be measured directly and expressed in terms of
the volume V and empirical temperature t stand on the right-hand side of
eq. (33.1). Hence the dependence of T on t can be found by integrating
(33.1) upon determining the function $\Psi(t, V)$ experimentally. In fact, for the
establishment of the absolute temperature scale, for example, at temperatures
close to the absolute zero or at very high temperatures, use is made not only
of thermal methods of measurement but also of magnetic or optical ones.
We now pass on to methods of finding thermodynamic quantities. The 
determination of the enthalpy is the simplest. 
At constant pressure 


$$ dH = T\,dS = C_p\,dT. $$

On integrating we have

$$ H = \int_{T_1}^{T} C_p(T)\,dT + H_1. \tag{33.2} $$
Tı 




In order to find H(T) experimentally it is necessary to measure the heat
capacity $C_p$ in the entire temperature range and to know $H_1$ at a certain
temperature. If, as is usually the case, we are interested in the change of the
heat content, then

$$ \Delta H = H(T) - H(T_1) = \int_{T_1}^{T} C_p(T)\,dT $$

and the constant $H_1$ drops out.

For the absolute measurement of H it is necessary to make use of the 
properties of H at absolute zero, which are established by the third law of 
thermodynamics (see §35 below). 

In melting and evaporation the enthalpy increases at a fixed temperature 
(see §62). 

The entropy is found from the values of the heat capacity:

$$ S = \int_{T_1}^{T} \frac{C_p\,dT}{T} + S_1 = \int_{T_1}^{T} C_p\,d\ln T + S_1. $$

The value of S is obtained by the graphical integration of the curve
$C_p = f(\ln T)$. The change in the entropy is

$$ \Delta S = S(T) - S(T_1) = \int_{T_1}^{T} C_p\,d\ln T. \tag{33.3} $$

The value of S(0) is determined by the third law of thermodynamics (see
§35).
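The graphical integration of $C_p$ against ln T described above is nowadays usually done numerically. A minimal sketch with the trapezoidal rule follows; the tabulated heat capacity is synthetic (a hypothetical low-temperature law $C_p = aT^3$ with an arbitrary coefficient `A_COEFF`), chosen only because the integral is then known in closed form.

```python
import math


def entropy_change(temps, cps):
    """Trapezoidal estimate of the integral of C_p d(ln T) over a table."""
    total = 0.0
    for i in range(len(temps) - 1):
        d_ln_T = math.log(temps[i + 1]) - math.log(temps[i])
        total += 0.5 * (cps[i] + cps[i + 1]) * d_ln_T
    return total


A_COEFF = 2.0e-4                                  # hypothetical coefficient
temps = [2.0 + 0.01 * i for i in range(1801)]     # synthetic grid, 2 K to 20 K
cps = [A_COEFF * t ** 3 for t in temps]           # synthetic "measurements"

delta_S = entropy_change(temps, cps)
exact = A_COEFF * (temps[-1] ** 3 - temps[0] ** 3) / 3.0
print(delta_S, exact)
```

With real calorimetric data the same function applies unchanged; only the two lists are replaced by the measured temperatures and heat capacities, and any latent-heat contributions have to be added separately.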

To find the entropy of a liquid and vapour it is necessary to take into 
account the change in the entropy as the latent heat of melting and vaporiza- 
tion is absorbed (see §62). 


The dependence of the entropy on the pressure is determined from 
formula (30.8): 


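$$ \left(\frac{\partial S}{\partial p}\right)_T = -\left(\frac{\partial V}{\partial T}\right)_p = -\alpha V. $$

Integrating, we find

$$ S(T, p) = -\int_{p_1}^{p} \alpha V\,dp + S(T, p_1) \tag{33.4} $$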
Pı 




or, for the change in the entropy,

$$ S(T, p) - S(T, p_1) = -\int_{p_1}^{p} \alpha V\,dp. \tag{33.5} $$


Knowing the equation of state V = V(T,p) one can find the coefficient of 
thermal expansion a and, by means of formula (33.5), the dependence of the 
entropy on the pressure. 

Knowing H, S and the equation of state one can find all the other
thermodynamic potentials:

$$ E = H - pV, \qquad F = E - TS, \qquad G = F + pV. $$


Thus, in order to find all the thermodynamic potentials of a given sub- 
stance it is necessary: 

(1) to measure the heat capacity over the entire temperature interval of 
interest, 

(2) to measure the coefficient of thermal expansion, and 

(3) to determine the equation of state. 

H, E and S are determined to within constants representing the value of these 
quantities at a certain temperature, while F and G are determined to within 
a linear function of the temperature. 

In conclusion we should discuss the thermodynamic potentials of chemical
compounds. As a chemical compound is formed changes in the enthalpy $\Delta H$
and the entropy $\Delta S = T^{-1}\Delta H$ take place (§67). Correspondingly, the
thermodynamic potentials of chemical compounds are the sum of their values for
the initial substances and their changes in the course of the reaction.

The introduction of thermodynamic potentials allows one, in principle, to 
determine the thermodynamic potentials of complex substances, if the 
thermodynamic potentials of the elements constituting them and the heats of 
the chemical reactions of formation of the compounds are known. 


§34. Throttling 


The process in which a gas initially occupying a volume $V_1$ at a constant
pressure $p_1$ passes out of its container into a container with a volume $V_2$
at a constant pressure $p_2$ plays an important role in contemporary
technology. This process is called the throttling process or the Joule-Thomson
process.




The throttling process is one of the basic methods of obtaining low
temperatures in contemporary cryogenic techniques. In practice the throttling
process is carried out by slowly forcing the gas from one container into 
another through a system of thin capillary tubes offering a great hydro- 
dynamic impedance to the flow of the gas. This great hydrodynamic im- 
pedance ensures a small macroscopic velocity of motion of the gas. The 
forcing through of the gas is carried out in adiabatic conditions, for which 
the apparatus is enclosed by thermally insulating material. 

Since no heat is supplied to the system, and the energy dissipation due to 
friction can be disregarded (in view of the small velocity of motion of the 
gas), the change in the internal energy of the gas AE is equal to the mechanical 
work done on the gas: 


$$ \Delta E = E_2 - E_1 = W, $$

where $E_1$ and $E_2$ are respectively the energy of the gas in the initial and final
states. The work W is made up of the work of compression done on the gas at a
pressure $p_1$ (from the initial volume $V_1$ to the final volume equal to zero)
and the work of expansion produced by the gas at a pressure $p_2$ (from the
initial volume equal to zero to the final volume $V_2$), i.e.

$$ W = -\left(\int_{V_1}^{0} p_1\,dV + \int_{0}^{V_2} p_2\,dV\right) = -(p_2 V_2 - p_1 V_1). $$


Hence for the throttling process one can write that

$$ E_1 + p_1 V_1 = E_2 + p_2 V_2, $$

or

$$ H_1 = H_2. \tag{34.1} $$


Thus, the throttling process represents a process at constant enthalpy of the 
gas. 

Assuming $p_1$ and $p_2$ to be very close to each other (and $p_2 < p_1$, $\Delta p < 0$),
we find the change in the temperature of the gas as the pressure changes in
the throttling process. This change is characterized by the derivative
$\mu = (\partial T/\partial p)_H$, called the Joule-Thomson coefficient. The value of the latter has
been found above (see (31.13)). We write (31.13) in the form










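$$ \begin{aligned}
\mu = \left(\frac{\partial T}{\partial p}\right)_H
 &= -\frac{(\partial H/\partial p)_T}{(\partial H/\partial T)_p}
  = -\frac{(\partial (G + TS)/\partial p)_T}{(\partial (G + TS)/\partial T)_p} \\
 &= -\frac{V + T(\partial S/\partial p)_T}{C_p}
  = -\frac{-T(\partial V/\partial T)_p + V}{C_p}.
\end{aligned} \tag{34.2} $$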


We apply formula (34.2) first of all to an ideal gas. In this case, obviously, 


$$ \mu = \left(\frac{\partial T}{\partial p}\right)_H = 0. $$


When an ideal gas is throttled no change in its temperature takes place. The 
meaning of this is obvious: the internal energy of an ideal gas does not depend 
on the volume and does not change as the gas expands. The situation is dif- 
ferent in the case of a real gas for which the energy of interaction between 
molecules and, consequently, the internal energy, depends on the volume. 
The theory of a real gas will be expounded in §46—47. 

In the approximation of eq. (46.18) we find

$$ \mu = -\frac{-T(\partial V/\partial T)_p + V}{C_p} = \frac{N[\beta - T(d\beta/dT)]}{2C_p}, $$




where $\beta$ is the constant contained in the Van der Waals equation. Since
$C_p > 0$, the sign of $\mu$ is determined by that of the numerator. From the
definition (46.14) of $\beta$ we find

$$ \beta - T\frac{d\beta}{dT} = \int \left[e^{-u/kT}\left(1 - \frac{u}{kT}\right) - 1\right] dV. \tag{34.3} $$


At high temperatures in the region of attraction $r > d$, $|u(r)| \ll kT$ and
$e^{-u/kT} \approx 1$. In the region of repulsion $r < d$, $u(r) \gg kT$ and $e^{-u/kT} \ll 1$.
In this case the entire integrand in the expression (34.3) is negative, and
$\beta - T(d\beta/dT) < 0$. At low temperatures in the region of attraction $r > d$,
$|u(r)| \gg kT$ and $|u(r)|/kT \gg 1$. In this case the integrand is positive, and
$\beta - T(d\beta/dT) > 0$.

Thus, in throttling at high temperatures $\mu < 0$ and the temperature of the
gas increases, whereas in throttling at low temperatures $\mu > 0$ and the
temperature of the gas decreases. At a certain temperature, called the inversion
point of a given gas, its coefficient $\mu = 0$. Carrying out the throttling at
temperatures lower than the inversion point, one can cool gases to very low
temperatures. Throttling is one of the most widely used methods of cooling gases.
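The change of sign of $\mu$ at the inversion point can be illustrated with a rough estimate. The sketch below does not use the text's $\beta$; it assumes instead the standard molar second virial coefficient of a van der Waals gas, B(T) = b - a/(RT), the low-pressure expansion V = RT/p + B(T), and a constant molar heat capacity. The numerical constants are approximate tabulated van der Waals parameters for nitrogen, used here only as sample input.

```python
R = 8.314462618      # molar gas constant, J/(mol K)
A_VDW = 0.137        # Pa m^6 / mol^2, approximate value for nitrogen
B_VDW = 3.87e-5      # m^3 / mol, approximate value for nitrogen
C_P = 3.5 * R        # assumed constant molar C_p of a diatomic gas


def second_virial(T):
    """Assumed van der Waals form B(T) = b - a / (R T)."""
    return B_VDW - A_VDW / (R * T)


def joule_thomson(T, h=1e-3):
    """Low-pressure estimate mu = (T dB/dT - B) / C_p.

    Follows from mu = [T (dV/dT)_p - V] / C_p with V = R T / p + B(T).
    """
    dB_dT = (second_virial(T + h) - second_virial(T - h)) / (2.0 * h)
    return (T * dB_dT - second_virial(T)) / C_P


# In this model mu vanishes where 2a / (R T) = b.
T_INVERSION = 2.0 * A_VDW / (R * B_VDW)

for T in (300.0, T_INVERSION, 1200.0):
    print(f"T = {T:7.1f} K   mu = {joule_thomson(T):+.3e} K/Pa")
```

Below the model's inversion temperature the coefficient is positive (the gas cools on expansion) and above it negative, as stated in the text. The inversion temperature produced by this crude model, roughly 850 K, should not be expected to match measured values closely.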


§35. The third law of thermodynamics 


Consider the behaviour of a certain macroscopic (thermodynamic) system
at very low temperatures. We assume that the system is in a state of statistical
equilibrium with an energy $\varepsilon$, so that its entropy is determined by the
Boltzmann formula. Let the possible values of the energy of the system (its
energy levels) form a sequence $\varepsilon_0, \varepsilon_1, \varepsilon_2, \varepsilon_3, \ldots$, where $\varepsilon_0$ is the lowest
possible energy (the normal level of the system), and $\varepsilon_1, \varepsilon_2, \varepsilon_3, \ldots$ are excited
energy levels. The energy levels get closer together very rapidly as the
excitation energy increases. However, the fact that the spacing $\varepsilon_1 - \varepsilon_0 = \Delta\varepsilon$
between the normal level and the first excited level is a finite, if only an
extremely small quantity, is of great importance.

If the temperature of the system is sufficiently low, so that the thermal
energy, kT, is considerably smaller than the spacing between the lowest level
and the first excited level, i.e. $kT \ll \Delta\varepsilon$, then the thermal excitations of the
system are insufficient for the system to get into the state $\varepsilon_1$. Hence at a
very low temperature the system must be in the state with the lowest energy
$\varepsilon_0$. The thermodynamic energy of the system is obviously equal to

$$ E_0 = \varepsilon_0 \tag{35.1} $$

and does not depend on the temperature (for $T \ll k^{-1}\Delta\varepsilon$). Hence it follows
that the heat capacity of the system at constant volume is

$$ C_V = \left(\frac{\partial E}{\partial T}\right)_V = \left(\frac{\partial \varepsilon_0}{\partial T}\right)_V = 0. \tag{35.2} $$


Now we find the entropy of the system. According to the Boltzmann
formula the entropy is equal to

$$ S = k \ln \Omega_0, \tag{35.3} $$

where $\Omega_0$ is the number of states of the system with the energy $\varepsilon_0$. But at
absolute zero an equilibrium system is in a quite definite state, the energy of
which is exactly equal to $\varepsilon_0$. We know, however, that if the energy of the
system is exactly defined, then by this very fact the state of the system is
also defined. Hence the number of states with energy $\varepsilon_0$ is simply equal to
unity *. Then from formula (35.3) it follows that the entropy of the system
at absolute zero is equal to zero:

$$ S = 0 \quad \text{as} \quad T \to 0. \tag{35.4} $$
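The way the entropy and the heat capacity die out once kT falls below the level spacing can be seen in the simplest possible model. The sketch below assumes a system with only two non-degenerate levels separated by $\Delta\varepsilon$ (a deliberate simplification of the level sequence considered above) and works with the dimensionless ratio x = $\Delta\varepsilon$/kT.

```python
import math


def two_level(x):
    """Return (S/k, C_V/k) for two non-degenerate levels, x = delta_eps / (k T).

    Z = 1 + exp(-x);  S/k = ln Z + x exp(-x) / Z;  C_V/k = x^2 exp(-x) / Z^2.
    """
    boltzmann = math.exp(-x)
    z = 1.0 + boltzmann
    entropy = math.log1p(boltzmann) + x * boltzmann / z
    heat_capacity = x * x * boltzmann / (z * z)
    return entropy, heat_capacity


for x in (0.1, 1.0, 10.0, 50.0):
    s, c = two_level(x)
    print(f"delta_eps/kT = {x:5.1f}   S/k = {s:.3e}   C_V/k = {c:.3e}")
```

At high temperature (small x) the entropy approaches k ln 2, the value for two equally probable states, while for kT much smaller than the spacing both S and $C_V$ become exponentially small, in line with (35.2) and (35.4).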


The condition (35.4) was first established by Nernst, and is called the
third law of thermodynamics, or the Nernst heat theorem. It was introduced,
not statistically, but on the basis of an analysis of experimental data, in
particular, the thermal effects of chemical reactions at low temperatures.

The condition (35.4) is not a consequence of the first and second laws of 
thermodynamics. Its role in contemporary thermodynamics is very important. 

As we have pointed out already in §33, in order to determine the values 
of thermodynamic potentials from empirical or statistical data it is necessary 
to know their value at absolute zero. The third law of thermodynamics allows 
one to do this. 

It should be stressed that the third law of thermodynamics is closely as- 
sociated with the quantum character of the system. If the system considered 
obeyed the laws of classical mechanics, then its energy would change con- 
tinuously. Hence, however low the temperature T may be, the energy of 
thermal excitation kT of the system would be infinitely large in comparison 
with the infinitesimally small spacing between energy levels in the classical sys-
tem. To a finite energy interval kT there would correspond an infinitely large
number of possible states Ω. The entropy would be large at as low (but
finite) a temperature as one wished. 

The quantum character of real systems, which is shown very weakly at 
high temperatures, assumes, as we see, paramount importance at very low 
temperatures. This is in complete agreement with the general propositions 
stated in §1. 

In addition a few words should be said about the entropy constant.
Although we write S = 0 as T → 0, in fact formula (35.4) should be written
in the form (see §24)


$$S \to \text{const} \quad \text{as} \quad T \to 0 . \qquad (35.5)$$


The constant in formula (35.5) cannot be determined, since it is an arbitrary 
integration constant. 

The value of the constant in (35.5) does not depend on the pressure,
volume and other parameters characterizing the state of the system. Whatever
the state in which the substance may be (in the form of a chemical compound
or a pure element, with a high or low density and so on) the value of
this constant will be the same.

* If the state with energy ε₀ is accidentally degenerate and the energy ε₀ is possessed
by several states, then the situation is not changed. If Ω₀(ε₀) is not a large number, then
k ln Ω₀(ε₀) is practically equal to zero, in view of the extreme smallness of k.

The difference between the entropies of two thermodynamic states of a
system, which differ by different values of the parameters, tends to zero as
T → 0. Thus, for example, the entropy S₁ of a mixture of two moles of
elements A and B is equal to the entropy S₂ of one mole of their compound
AB as T → 0.

Such a statement follows from those experimental data which established 
the third law of thermodynamics. One rather frequently encounters the state- 
ment that by means of the third law of thermodynamics the integration con- 
stant in the formula for the entropy can be determined and that by this very 
fact its absolute value can be determined. The reasoning of this paragraph 
and §24 shows clearly that the absolute value of the entropy has no physical 
meaning: the entropy is determined in essence only to within an arbitrary 
constant. 

The meaning of the third law does not lie in the fact that it allows one to 
find the absolute value of the entropy but in the fact that it establishes the 
constancy of the entropy as T → 0 (in the sense of the independence from the
parameters which characterize the state). The value of this constant can be 
chosen as zero entropy. 

The third law of thermodynamics is often formulated as the principle of 
the unattainability of absolute zero. Such a formulation follows from
condition (35.4). If the isotherm T = 0 could be reached, then from
two adiabatic lines S = S₁ and S = S₂ connected by the isothermal
line T = 0 (on which S = 0) and another arbitrary isothermal line a cycle
could be formed, by means of which a perpetual
motion machine of the second kind can be constructed. However, it should 
be noted that the unattainability of absolute zero does not contradict the 
possibility of obtaining in principle temperatures differing from T= 0 by as 
little as one wishes. 

In the foregoing reasoning it has been assumed that the system is in an 
equilibrium state. However, we have not made any assumptions concerning 
its state of aggregation. The expression (35.4) must be equally valid for solid, 
liquid or gaseous systems. 

There exists only one system which remains liquid near absolute zero: 
helium II. As to gases, all ordinary gases at a sufficient pressure are condensed 
long before those low temperatures at which the entropy tends to zero. Hence 
the pressure of the saturated vapour above the solid body is negligible as 
T → 0. However, there exist systems which can conditionally be assumed to
be gaseous as T → 0. There is, first of all, the electron gas in metals, the
properties of which will be considered in more detail in ch. 7 of Part IV. 


Table 1
The entropy of HCl from experimental calorimetric data

State of aggregation     Temperature      Method of determination             Entropy
                         interval (K)                                         (J mol⁻¹ K⁻¹)

Solid                    0–16             Extrapolation of the curve          1.26
                                          obtained at higher temperatures
                         16–98.36         From measured values of             29.6
                                          the heat capacity
Phase transition         98.36            Latent heat                         12.1
into another
crystal modification
                         98.36–158.91     From the heat capacity              21.1
Melting                  158.91           Latent heat                         12.6
Liquid                   158.91–188.07    From the heat capacity              9.84
Boiling                  188.07           Latent heat                         85.89
Gas                      188.17           Measured entropy (total)            172.3
                         188.17           Entropy calculated theoretically    172.9
                                          with the correction for the fact
                                          that the gas is not an ideal gas


However, all ordinary substances pass over into the solid state at a suffi- 
ciently low temperature. It should be noted that, in addition to true solid 
bodies, i.e. crystals, a rather large number of substances are in the solid, or 
amorphous, state. Although amorphous bodies can possess a number of 
properties very similar to those of true solid bodies, they in reality represent 
supercooled liquids. We shall dwell on the properties of supercooled liquids 
somewhat later. 

The validity of the third law of thermodynamics was checked experi- 
mentally by different methods for a large number of substances. Although 
the experimental data confirming the validity of the third law of thermo- 
dynamics are not yet so numerous and versatile as those confirming the 
second law, they remove all doubts as to its validity. The most accurate 
method of checking it is the investigation of chemical equilibria at low 
temperatures. 




For a number of substances direct measurements of the heat capacity 
have been carried out, on the basis of which the values of the entropy have 
been found. The latter were compared with those calculated theoretically 
on the basis of the assumption that S = 0 as T → 0. Table 1 is an example of
such a direct check. 
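The arithmetic behind this check is easy to reproduce. The sketch below simply adds up the calorimetric contributions listed in Table 1 and compares the sum with the theoretically calculated value; the dictionary keys are abbreviated labels introduced here for readability, and the row at 188.07 K is taken to be the latent heat of boiling.

```python
# Calorimetric contributions to the entropy of HCl in J mol^-1 K^-1,
# copied from Table 1 (labels abbreviated for this sketch).
contributions = {
    "solid, 0-16 K (extrapolated)": 1.26,
    "solid, 16-98.36 K": 29.6,
    "crystal transition at 98.36 K": 12.1,
    "solid, 98.36-158.91 K": 21.1,
    "melting at 158.91 K": 12.6,
    "liquid, 158.91-188.07 K": 9.84,
    "boiling at 188.07 K": 85.89,
}

calorimetric_total = sum(contributions.values())
theoretical = 172.9   # value calculated on the assumption S = 0 as T -> 0

relative_difference = (theoretical - calorimetric_total) / theoretical
print(f"calorimetric total : {calorimetric_total:.2f} J/(mol K)")
print(f"theoretical value  : {theoretical:.2f} J/(mol K)")
print(f"relative difference: {100 * relative_difference:.2f} %")
```

The two values differ by a fraction of one per cent, which is the sense in which the table confirms the third law.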

Finally, an experimental verification of certain consequences of the third 
law of thermodynamics has been carried out for many substances. Thus, for 
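example, from the condition (35.4) it follows also that

$$\left(\frac{\partial S}{\partial p}\right)_T = 0 \quad \text{as} \quad T \to 0 .$$

But

$$\left(\frac{\partial S}{\partial p}\right)_T = -\left(\frac{\partial V}{\partial T}\right)_p .$$

Hence from the third law of thermodynamics it follows that the coefficient
of thermal expansion must reduce to zero as T → 0. The measurements of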
this coefficient for a number of crystals (diamond, HCl, Cu and so on) con- 
firm this conclusion. 
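The chain of reasoning leading to this conclusion can be written out in one line. The sketch below assumes the usual differential of the thermodynamic potential in the variables p and T, denoted here by G, and the usual definition of the expansion coefficient α; both symbols are notation chosen for this illustration.

```latex
dG = -S\,dT + V\,dp
\;\Longrightarrow\;
\left(\frac{\partial S}{\partial p}\right)_T
  = -\frac{\partial^2 G}{\partial p\,\partial T}
  = -\left(\frac{\partial V}{\partial T}\right)_p ,
\qquad
\alpha \equiv \frac{1}{V}\left(\frac{\partial V}{\partial T}\right)_p
  = -\frac{1}{V}\left(\frac{\partial S}{\partial p}\right)_T
  \longrightarrow 0 \quad (T \to 0).
```

The last step uses only the fact that the limiting value of the entropy does not depend on the pressure.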

However, for a number of substances the measurements of the entropy 
showed the condition S = 0 as T → 0 to be unfulfilled. The number of such
substances is relatively large. They are amorphous bodies, alloys, and a number
of chemical compounds: CO, NO, H₂O and others. The existence of so
many exceptions to the third law led to doubts as to its general applicability. 
However the statistical interpretation of the third law allowed one to explain 
the true origin of these violations. 

In deriving the third law we have assumed that our system at absolute zero 
is in an equilibrium state and that it goes over into the state with the lowest 
energy ε₀ as the temperature tends to zero. However, there are systems
which are not in an equilibrium state as T → 0, to which the third law of
thermodynamics is inapplicable. These systems possess an entropy different 
from zero at absolute zero. 

We have already said that the relaxation time can vary within an extremely 
large range, and that in a number of cases it can have very large values. This 
holds particularly at low temperatures, when the thermal energy is not large. 
In this case the establishment of an equilibrium state proceeds particularly 
slowly. For concreteness we imagine that our system is made of diatomic 
molecules (for example, CO molecules) forming a regular crystal lattice.

[Fig. III.13: panels (a) and (b), schematic orientations of CO molecules in the lattice]

In fig. III.13 each molecule is represented schematically by an arrow, one end of
which represents a carbon atom and whose other end represents an oxygen
atom. Two types of orientation of CO molecules in the lattice are possible,
shown schematically in figs. III.13a and b. In one case there is a completely
random orientation of the molecules, and in the other case there is a regular 
orientation of the molecules. All properties of carbon atoms are so similar to 
those of oxygen atoms that there is a negligible difference between the two 
forms of the crystal, with regularly and chaotically oriented molecules. Their 
symmetry and basic properties are completely identical. The difference 
between the energies of the two states is very small. The state with the regular 
orientation of the molecules corresponds to total equilibrium as T → 0, for the
energy of this turns out to be somewhat lower than that of the state with the
random orientation. However, at higher temperatures the random orientation
of the molecules becomes the equilibrium orientation. CO crystallizes at a
relatively high temperature, when the random orientation is the equilibrium
orientation. As the temperature is lowered down to T ≪ k⁻¹Δε the random
orientation becomes non-equilibrium, so that the molecules should pass over 
into the state with the regular orientation. However, the transition of the 
molecules from a random distribution to a regular distribution at low tem- 
peratures proceeds so slowly that even for reasonably large observation times 
total equilibrium does not have sufficient time to be established. The system, 
at as low a temperature as one wishes, will be in a state with a random 
orientation of the molecules. This state will be the one with the lowest 
realizable energy. It is not an equilibrium state, since for a regular orientation 
the energy of the system would be even lower, but at the same time at low 
temperatures, when the velocity of the process of orientation becomes very 
small, the system can be in this state for an extremely long time (practically 
as long as one wants). The state with a random distribution of molecular 
orientations is, thus, a metastable state. The reasoning presented above is in- 
applicable to a system in a metastable state, and hence its entropy does not 
satisfy the condition (35.4). 

Finding the entropy of a system with a completely random arrangement
of molecules presents no difficulty. Each molecule can, with equal probability,
be in two states differing only by their orientation. If the system contains
N molecules, then the total number of states is obviously equal to
Ω₀ = 2ᴺ (in view of the simplicity of the problem there is no need to express
this number in terms of the phase volume). The entropy of the system
at absolute zero (per mole) is equal to

$$S_{T \to 0} = k \ln \Omega_0 = k \ln 2^N = N k \ln 2 = 5.76\ \mathrm{J\,mol^{-1}\,K^{-1}} .$$

The measured value of the entropy of CO as T → 0 turns out to be equal
to 4.6 J mol⁻¹ K⁻¹. We see that the entropy of the system is indeed different
from zero, its value being close to the theoretical one.
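The numerical value quoted above follows directly from N k ln 2 taken for one mole. The short sketch below evaluates it; the function name `residual_entropy` is hypothetical, and the constants are the standard SI values of the Boltzmann and Avogadro constants.

```python
import math

K_B = 1.380649e-23     # Boltzmann constant, J/K
N_A = 6.02214076e23    # Avogadro constant, 1/mol

def residual_entropy(orientations, n_molecules=N_A):
    """Entropy k ln(orientations**N) = N k ln(orientations), in J/K.

    Assumes every molecule independently takes one of `orientations`
    equally probable orientations (the frozen-in random state).
    """
    return n_molecules * K_B * math.log(orientations)

s_random = residual_entropy(2)
s_measured = 4.6       # J mol^-1 K^-1, value quoted in the text for CO

print(f"N k ln 2       = {s_random:.2f} J/(mol K)")
print(f"measured / max = {s_measured / s_random:.2f}")
```

The measured value amounts to about four fifths of the value for a completely random arrangement.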

The third law of thermodynamics is of very great importance for finding 
the values of thermodynamic functions. In practical use it should, however, 
always be borne in mind that it refers only to systems in an equilibrium 
state, and that it is inapplicable to metastable systems. The example presented 
is not the only one. In §54 we shall return to the problem of the behaviour 
of systems at low temperatures and shall present other examples of apparent 
deviations from the third law of thermodynamics. In any case, the possibility 
of applying the third law of thermodynamics is often not obvious and calls 
for great care. Before the appearance of the statistical derivation of the third 
law, when the limits of its applicability were still not understood, an
inconsiderate application of the third law to metastable systems led to the
contradictions mentioned above. 

It should be noted in addition that, although the third law of thermody- 
namics is a very important proposition, the degree of its importance for 
science can scarcely be compared with that of the second law. In this sense 
the term “the third law” appears to be not quite apt. 


§36. The statistical character of the second law of thermodynamics 


In the preceding paragraphs we have established that the law of increase 
of entropy, which in thermodynamics represents a direct generalization of
experimental results, assumes a new, more profound, and at the same time
clearer meaning in the light of the arguments of statistical physics. From the 
point of view of statistical physics the law of increase of entropy represents 
an expression of statistical laws shown in systems consisting of a very large 
number of molecules. In these systems, owing to the forces of intermolecular 
interaction, transitions from less probable states into more probable ones 
always take place until the system comes into the most probable state, i.e.
the state of total statistical equilibrium. This transition from a non-
equilibrium state into an equilibrium state proceeds via complex processes 
in which an enormous number of molecules take part. The mechanism of 
the establishment of equilibrium and the character of the processes involved 
in it depend in many respects on the actual properties of the system. 

The establishment of equilibrium (i.e. molecular disorder) in an ideal gas 
serves as the simplest example. Molecular disorder is established as a result of 
the collisions of molecules with the wall of the container and with one 
another. 

It is not the way in which equilibrium is established in a system that will 
interest us now, but only the fact that equilibrium will, without fail, be 
established eventually. We have seen that the statistical formulation of the 
second law of thermodynamics differs from the thermodynamic one in a 
very important respect: in the former the following words are employed: 
“the most probable trend of processes”, whereas in thermodynamics one 
speaks simply of the trend of processes. The formulation of statistical physics 
has a considerably less categorical character. It does not exclude at all, but, on 
the contrary, envisages the possibility of processes in the course of which the 
system passes over from a more probable state into a less probable one, and 
in which its entropy decreases. The existence of such processes, called fluc- 
tuations, is completely denied in the thermodynamic formulation of the 
second law. 

For instance, imagine that a gas occupies one half of a free volume. 
According to the laws of thermodynamics the gas must expand and occupy 
the entire volume, the expansion being accompanied by an increase in the 
entropy. From the point of view of statistical physics such a behaviour of the 
gas is the most probable one. However, the possibility that the gas will not 
expand but compress is not excluded. In a macroscopic system the latter 
process has a probability which is negligibly small in comparison with that of 
the process of expansion. Hence in practice in macroscopic systems the first 
process will always occur. 

Fluctuations occurring in a system which is already in an equilibrium state 
are of great importance. Imagine the gas occupying the entire volume with a
uniform density and in an equilibrium state. If this gas is not subjected to 
an external action, then from the point of view of thermodynamics it will 
remain in this state for an indefinitely long time. Statistical physics states 
that, although for an overwhelmingly large part of the time the gas will 
occupy the entire volume and be in the equilibrium state, the possibility of 
fluctuations in the course of which the gas will spontaneously go out of the 
equilibrium state is not excluded. In particular, the gas can pass over spontaneously
into a state in which it occupies only a part of the entire volume.
The probability of such a transition is determined by the Boltzmann formula.
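To get a feeling for the orders of magnitude involved, the probability of such a fluctuation can be estimated for an ideal gas. The sketch below assumes independent molecules distributed uniformly over the volume, so that the probability of finding all N molecules in a given fraction of the volume is that fraction raised to the power N; the function names are hypothetical. The same number follows from w = exp(ΔS/k) with the ideal-gas entropy change ΔS = N k ln(fraction).

```python
import math

def log10_probability_in_fraction(n_molecules, fraction=0.5):
    """log10 of the probability that all molecules sit in the given
    fraction of the volume (independent, uniformly distributed molecules)."""
    return n_molecules * math.log10(fraction)

def entropy_change_over_k(n_molecules, fraction=0.5):
    """Entropy change, in units of k, of an ideal gas gathered isothermally
    into the given fraction of its volume: N ln(fraction)."""
    return n_molecules * math.log(fraction)

for n in (10, 100, 10**20):
    print(f"N = {n:.0e}:  log10 w = {log10_probability_in_fraction(n):.3e},"
          f"  ΔS/k = {entropy_change_over_k(n):.3e}")
```

Working with the logarithm avoids numerical underflow: for a macroscopic number of molecules the probability itself is far too small to be represented as an ordinary floating-point number.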

We shall not dwell now on the analysis of concrete examples of fluctua- 
tions, since ch. 8 is devoted to this important phenomenon. We stress only 
that experiment has completely confirmed the predictions of statistics as to 
the existence in nature of such spontaneous processes accompanied by a 
decrease in entropy. But the question naturally arises as to whether or not 
the statistical formulation of the second law contradicts the purely ther- 
modynamic one. Does it follow from the statistical formulation of the second 
law that the construction of a perpetual motion machine of the second kind 
is a difficult but in principle soluble problem? Can fluctuation processes
accompanied by a decrease in the entropy be used to solve it?
This problem was a subject of discussion over the course of a number of years, 
and its solution turned out to be very fruitful for the development of the 
basic propositions of statistical physics. But before giving an answer to this 
question it is necessary to discuss another, no less complicated question, 
logically preceding the former: in fact, how could it happen that, in con- 
sidering molecular processes, we arrived at the idea of irreversibility? 

It is well known that the laws of mechanics are strictly reversible. This is 
seen if only from the fact that the equation of classical mechanics 


$$m \frac{d^2 \mathbf{r}}{dt^2} = \mathbf{F}$$


remains unchanged when the sign of the time is changed. A total reversibility 
reigns also in the realm of intra-atomic processes: from the laws of quantum 
mechanics there follows the principle of microscopic reversibility, stating 
that for all microscopic atomic or molecular processes the probability of 
direct processes is equal to that of reverse processes. Thus, there is no 
irreversibility in the basic laws of molecular processes. All the processes are 
strictly symmetric with respect to the future and past. On the other hand, 
statistical physics, based on molecular laws, leads to the appearance of irre- 
versibility. At first sight it may seem that the laws of statistical physics 
contradict those of molecular motion, on the basis of which they were 
derived. In reality, however, this is not so. An increase in the entropy takes 
place when the system was initially in a certain non-equilibrium state. Then 
the most probable behaviour of the system in the course of time will be its 
transition into an equilibrium state. In fig. III.14 the time is plotted on the 
x-axis, and the entropy is plotted on the y-axis; the solid curve represents this 
most probable transition, while the dotted line represents the improbable
transition into a state with lower entropy *.

[Fig. III.14 and Fig. III.15: entropy S versus time t, for t < 0 and t > 0]

Here the direction of time, the
difference between the initial and final instants of time, appears completely 
obvious. However, if one thinks more carefully over this reasoning, then it 
can be seen that we always begin the consideration deliberately with a non- 
equilibrium state of the system. A certain asymmetry is concealed in the very 
statement of the problem. We say: in the beginning a non-equilibrium state is 
given; what will happen afterwards with the system left to itself? Let us try, 
however, to imagine how this initial non-equilibrium state could arise. It 
could arise either by an intervention in the system from without, or 
spontaneously in a closed system. In the first case it is clear that the absence 
of symmetry in the behaviour of the system with respect to the past in no 
way contradicts the reversibility of the laws of molecular processes. It is 
associated with the asymmetry of the conditions of the problem; in the past 
the system underwent actions from without, while in the future it is left to 
itself. A more important, but less obvious case is that when the system under- 
went no action from without but came into a given state (the initial one in 
our previous consideration) spontaneously, always remaining closed. The 
question arises as to what was the state occupied by the system before it 
came into the given state. It could come into a given non-equilibrium state 
from an even more non-equilibrium state or, on the contrary, from an 
equilibrium state. But every macroscopic system spends the major part of its
time in a state of statistical equilibrium. If we ask what state the system was
in at t < 0, then from the most general considerations it is clear that with a
high probability it was in an equilibrium state. Hence a system comes most
often into a given non-equilibrium state from an equilibrium state. In other
words, in order that a system may come into a given non-equilibrium state
at t = 0, it must undergo a fluctuation at t < 0. In fig. III.15 the solid line
represents the most probable process bringing the system into the state which
was the initial one for the process presented in fig. III.14. Of course, it is not
excluded that the non-equilibrium state given at t = 0 arose from another,
even more non-equilibrium state, as is shown by the dotted line in fig. III.15.
But, since the probability of finding a closed system in a non-equilibrium
state is low, such a case is improbable. Now we match the two drawings, i.e.
consider the whole process in time. Then we obtain the curves shown in fig.
III.16, which are completely symmetric with respect to the future and past.

* It should be noted that the drawings presented here and below should be considered
as a scheme illustrating the properties of the entropy. In reality, in a system consisting
of parts the value of the entropy is strictly defined in the course of a certain finite time
interval, but not at every given instant. Hence the graph S(t) should not be taken
literally [see B.G.Levich, Vvedenie v statisticheskuyu fiziku (Introduction to statistical
physics) (Gostekhizdat, Moscow, 1954) §35; F.Reif, Statistical and thermal physics
(McGraw-Hill, New York, 1965) §15.18].

[Fig. III.16: entropy S versus time t, for t < 0 and t > 0]



The asymmetry of the second law — the indication that the entropy will in- 
crease in the future — thus turns out to be associated with the asymmetry of 
the initial condition — the definition of the system in a non-equilibrium state 
at the initial instant. 

We now imagine that the state of a closed system is defined as the 
equilibrium state, corresponding to the maximum value of the entropy (fig.
III.17). According to the propositions of thermodynamics the system will
remain in an equilibrium state, during the subsequent time, and its entropy
will remain constant. Statistical physics allows the possibility of a spontaneous
transition, a fluctuation, of the system from the equilibrium state. As we have
seen above, the probability of a fluctuation decreases sharply with its mag-
nitude. Therefore, in order that we may notice the fluctuation, it is necessary
to observe the system for a sufficiently long time interval, in any case much
longer than the relaxation time τ. Moreover, the probability of a fluctuation
depends very much on the size of the system (the number of particles in it). 


[Fig. III.17: entropy S of a closed system versus time t; the curve carries the points a, b, c, d, e, f]


If we observe the behaviour of a closed system which is in an equilibrium 
state at the instant t = 0, then we shall see that the entropy of this system
will decrease (the segments ab and de on the curve of fig. III.17) and increase
(the segments bc and ef) equally frequently. This agrees, after all, with the
aforesaid; the process shown in fig. III.16 is, in essence, a particular case of 
the process shown in fig. III.17, and corresponds to one of the cusps in the 
latter. 

Thus, we see that if one renounces asymmetry in the statement of the 
problem, i.e. if an improbable (non-equilibrium) state of the system is not 
given at the initial instant, then the law of increase of entropy loses its one- 
sided meaning and becomes symmetric with respect to the future and past. 
This can also be formulated in the following way: in the course of a suf- 
ficiently long time interval the number of transitions from an equilibrium 
state into a non-equilibrium one in a closed system is equal to the number of 
reverse transitions from a non-equilibrium state into an equilibrium one. This 
equality arises because the number of the former is equal to a large number 
of initial (equilibrium) states multiplied by a low probability of the fluctua- 
tion. The number of the latter transitions is equal to a small number of 
initial (non-equilibrium) states multiplied by a high probability of the transi- 
tion into an equilibrium state (relaxation). In a system which is always closed 
the entropy increases and decreases equally frequently. 
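The balance between the two kinds of transitions can be verified exactly in a simple model. The sketch below assumes the Ehrenfest urn model, which is not used in the text: N molecules are distributed between the two halves of a container, and at every step one molecule chosen at random passes into the other half. In the stationary state the number n of molecules in the left-hand half follows the binomial distribution, and the function name `transition_flux` is hypothetical.

```python
from fractions import Fraction
from math import comb

def transition_flux(n_total, n_left):
    """Stationary fluxes between the states n_left and n_left + 1.

    Returns (forward, reverse): the probability per step of observing the
    jump n_left -> n_left + 1 and the jump n_left + 1 -> n_left.
    Assumes the Ehrenfest urn model with a binomial stationary distribution.
    """
    weight = 2 ** n_total
    p_state = Fraction(comb(n_total, n_left), weight)      # population of n_left
    p_next = Fraction(comb(n_total, n_left + 1), weight)   # population of n_left + 1
    # Jump probability = chance that the randomly chosen molecule
    # sits in the half that has to lose a molecule.
    forward = p_state * Fraction(n_total - n_left, n_total)
    reverse = p_next * Fraction(n_left + 1, n_total)
    return forward, reverse

forward, reverse = transition_flux(20, 15)
print("away from equilibrium:", forward)
print("back to equilibrium  :", reverse)
```

For n above N/2 the state n is the more populated one but seldom moves further from equilibrium, while the state n + 1 is rarely occupied but relaxes with a high probability; the two products coincide exactly.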

In practice, however, one most often has to deal with systems which are, 
at the initial instant, in a given non-equilibrium state. In the case of a system 
which is always closed every non-equilibrium state can be considered as a 
fluctuation. If we observe the successive changes of state of a system during 
the course of a time comparable with or smaller than the relaxation time, 
then the most probable trend of the processes will be an increase of the 
entropy. In this case the apparent asymmetry arises because of the asymmetry 
in the statement of the problem. 

It is also possible that the system came into a non-equilibrium state as a 
result of an external action, which thereupon ceases. In this case the system
is insulated only from a certain instant. Its entropy will henceforward
increase. Here the asymmetric trend of the entropy is associated with the
presence in the past of an action on the system from outside. 

We see that the difference between irreversible and reversible processes 
becomes relative, and that there is no contradiction between the reversibility 
of the laws of mechanics and the existence of irreversible processes in statis- 
tics. In view of the relative character of the ideas of reversibility and
irreversibility the necessity arises for a clearer criterion of irreversible and
reversible molecular processes. In order to arrive at such a formulation, we 
shall consider one more concrete example. 

Imagine a certain volume in a container occupied by a mixture of two 
gases. Let the following non-equilibrium state of the gas be given at a
certain initial instant: in the volume considered the composition of the gas 
deviates by 1% from a homogeneous composition. If the gas is left to itself, 
then with a probability of the order of unity after the lapse of the relaxation 
time the molecules will be mixed up and the system will pass over into a state 
with a uniform density. The entropy of the gas will increase. From the point 
of view of pure thermodynamics we have a classical example of an irre- 
versible process. However, let us analyse this process more carefully from the 
statistical point of view. The system will indeed pass over spontaneously 
from the non-uniform to a uniform distribution of molecules in the mixture. 
But it cannot be stated that the initial state will never again repeat itself, 
and that the system will always remain in a state with a uniform distribution 
of molecules. On the contrary, fluctuations will occur in the system, as a 
result of which the homogeneity of the gas composition will be violated. 
After the lapse of a sufficiently long time, in the system left to itself, there 
will necessarily occur a fluctuation of such a magnitude that the deviation 
from uniformity in the volume singled out will reach 1%, and the system 
will come back into the initial state. The return of the system into the 
initial state shows that the process of mutual diffusion of gases cannot be 
considered as irreversible. The process considered can be called a reversible 
process. 

One would think that we arrive again at a complete contradiction with 
pure thermodynamics. However, the practical importance of this contradic- 
tion depends on the scale of the phenomenon and the time needed for the 
system to come back into the initial state. This time can be assumed, roughly 
speaking, to be inversely proportional to the probability of a fluctuation of 
the corresponding magnitude. Hence it can be assumed that the time needed 
for the system to come back into the initial state is longer, the larger the size 
of the system and the greater the difference between the initial and equilibrium
state. We shall denote this time, usually called the recovery time, by τ*.
Then it is obvious that, if the total observation time T is small in comparison
with τ*, the system will not manage to come back into the initial state during
the observation time. In this case the process accompanied by an increase in
entropy is irreversible. But if, on the contrary, the total observation time T
is large in comparison with the recovery time τ*, then during the observation
time the system will without fail come back into the initial state. In this case
the same process must be considered as reversible. Such a treatment was first
developed by M. Smoluchowski.

Thus, the relation between the times τ* and T is a decisive factor in the
criterion of reversibility and irreversibility.

To get an idea of the order of magnitude of the recovery time τ* we shall
calculate it for a simple system. At the instant t = 0 let an ideal gas, confined
initially in the left-hand half of a container, occupy the entire container.
We shall find the time needed for all N molecules of the gas to gather again
in the left-hand half of the container as a result of molecular motion with a
probability close to unity, for example, equal to 0.9.

The probability, for one measurement, that one of the molecules will be
found in the left-hand half of the container is equal to w⁽¹⁾ = 1/2.
Correspondingly, the probability, for one measurement, that two molecules will be
found in the left-hand half of the container is equal to w⁽²⁾ = (1/2)², while the
probability of finding in one measurement N molecules in the left-hand half
of the container is w⁽ᴺ⁾ = (1/2)ᴺ = 2⁻ᴺ.

The probability that one will not find in one measurement N molecules
in the left-hand half of the container is obviously equal to 1 − 2⁻ᴺ. The
probability that one will not find in n measurements N molecules in the
left-hand half of the container is equal to (1 − 2⁻ᴺ)ⁿ.

The probability that after n measurements N molecules will be found in
the left-hand half of the container is equal to

$$w_n^{(N)} = 1 - \left(1 - 2^{-N}\right)^n .$$

Assuming, by our condition, that this probability is equal to 0.9, we find

$$n \ln\left(1 - 2^{-N}\right) = \ln 0.1 .$$

Since N is large, 2⁻ᴺ ≪ 1, so that ln (1 − 2⁻ᴺ) ≈ −2⁻ᴺ. Then, in order of magnitude, we have

$$n / 2^N \approx 1 .$$
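The quality of this approximation is easy to check numerically. The sketch below solves the exact condition for n and compares it with the approximate form obtained from ln (1 − 2⁻ᴺ) ≈ −2⁻ᴺ, which gives n ≈ 2ᴺ ln 10, i.e. 2ᴺ up to a factor of order unity. The function name `measurements_needed` is hypothetical, and the sketch is meant for moderate N only, since 2⁻ᴺ underflows to zero in floating-point arithmetic for N above about a thousand.

```python
import math

def measurements_needed(n_molecules, target=0.9):
    """Number n of measurements after which all molecules have been found
    in the left-hand half at least once with probability `target`.

    Solves n ln(1 - 2**-N) = ln(1 - target) exactly (moderate N only).
    """
    p_single = 2.0 ** (-n_molecules)          # success probability of one measurement
    return math.log(1.0 - target) / math.log1p(-p_single)

for n_mol in (5, 10, 20, 50):
    exact = measurements_needed(n_mol)
    approx = 2.0 ** n_mol * math.log(10.0)    # from ln(1 - x) ≈ -x
    print(f"N = {n_mol:3d}   exact n = {exact:.6e}   2^N ln 10 = {approx:.6e}")
```

Already for N = 20 the exact and approximate values agree to better than one part in a million.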


If the measurements are carried out every $\Delta t$ sec, then $N$ molecules will
be found with a probability 0.9 to be gathered again in the left-hand half of
the container after the lapse of a time

$$\tau^* = n \Delta t \approx 2^N \Delta t .$$

If, for example, the measurements are carried out every $\Delta t = 1$ sec, then
$\tau^* = 2^N$ sec. The recovery time increases very rapidly with increasing number
$N$ of particles in the gas.


An idea of the numerical values of $\tau^*$ is given by the figures in table 2.


Table 2

$N$              5     10      100                 $10^5$        $10^{10}$
$\tau^*$, sec    32    1024    $\approx 10^{30}$   $2^{10^5}$    $2^{10^{10}}$





We see that, if the number of particles in the system is sufficiently small, 
the time in which the system comes back into the initial state — the recovery 
time — can be observed in the course of measurable time intervals. 

On the contrary, when the number of particles in the system is large, the 
recovery time becomes immense. A system with a large number of particles 
cannot be expected to come back into the initial state in really observable 
time intervals. It should be noted that experiments have confirmed the 
numerical values of recovery times for the gathering of a small number of 
colloidal particles in small volumes (see §56). 
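
The estimate of the recovery time can be checked numerically. The sketch below is only an illustration under stated assumptions: the function names are ours, the measurement interval is taken as 1 sec, and for very large $N$ only the decimal logarithm of $2^N$ is evaluated, since the number itself cannot be stored.

```python
import math

def measurements_needed(n_molecules: int, probability: float = 0.9) -> float:
    """Exact n solving 1 - (1 - 2**-N)**n = probability."""
    q = 2.0 ** (-n_molecules)      # chance that one look finds all N on the left
    return math.log(1.0 - probability) / math.log1p(-q)

def log10_recovery_time(n_molecules: float, dt: float = 1.0) -> float:
    """log10 of the order-of-magnitude estimate tau* = 2**N * dt."""
    return n_molecules * math.log10(2.0) + math.log10(dt)

# Small systems: exact n next to the estimate 2**N (they differ by about ln 10).
for n in (5, 10, 100):
    print(n, measurements_needed(n), 2.0 ** n)

# Large systems: only the exponent of ten is meaningful.
for n in (1e5, 1e10):
    print(n, "tau* ~ 10^%.3g sec" % log10_recovery_time(n))
```

For large $N$ the exact number of measurements exceeds $2^N$ only by the factor $\ln 10$, which does not affect the order of magnitude.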

From the above it is clear that the point of view developed regarding 
reversibility and irreversibility does not in practice contradict the conclusions 
of thermodynamics at all. Recovery times for processes on the macroscopic 
scale turn out to be so large that for any practical observation time the 
inequality $\tau^* \gg T$ is always satisfied. Hence processes which are irreversible
from the thermodynamic point of view can be considered as irreversible also 
from the statistical point of view. 

We now can pass on to the discussion of the second question put at the 
beginning of this section: can a perpetual motion machine of the second kind 
be realized by making use of the phenomenon of fluctuations? 

Imagine that we have a certain mechanism which can be used to produce 
useful work from a fluctuation occurring in a certain system. For concreteness 
we imagine that this mechanism is a piston which is set in a one-way motion 
by the density fluctuations occurring in the volume of the gas under the 
piston. If such a mechanism could be realized in practice, then useful work 
could be obtained systematically at the expense of the thermal energy of the
medium, i.e. a perpetual motion machine of the second kind could be constructed.
However, it is easily shown that the construction of such a machine
is impossible. 

As a matter of fact, whatever the design of the machine, the piston and the 
other parts as well as the gas or another medium consist of atoms or mole- 
cules. Hence the operating mechanism, as well as the medium, will undergo 
fluctuations. The fluctuations of the mechanism and the medium are inde- 
pendent of each other and occur, generally speaking, at different instants and 
in different directions. Let, for example, the piston move and do work as the 
gas expands. But the piston itself also undergoes fluctuations and hence dis- 
placements in the direction opposite to that in which it moves as the gas 
expands. Owing to the independence of fluctuations in the gas and in the 
mechanism the displacement of the piston averaged over time turns out to 
be exactly equal to zero. Consequently, the mean work produced by the 
piston is also equal to zero. 

These qualitative considerations are confirmed by the quantitative calcula- 
tions for different schemes of such operating mechanisms. 

Thus, a systematic production of useful work at the expense of small 
fluctuations occurring in a certain operating mechanism turns out to be in 
principle impossible. Similarly, it is impossible to obtain useful work at the 
expense of single large fluctuations: the probability of large fluctuations 
decreases incomparably more rapidly than the value of the useful effect 
increases. 

This very fact proves that the construction of a perpetual motion machine 
of the second kind producing useful work systematically at the expense of 
fluctuations is in principle impossible. The classical formulation of the second 
law: “it is impossible to construct a perpetual motion machine of the second 
kind, i.e. a device which, in the course of a long time, would use heat at a 
lower temperature and thus serve as a source of useful work” remains valid.

In conclusion we must dwell on the problem of the so-called “thermal 
death” of the universe. 

The theory of the thermal death of the universe put forward by Clausius 
consists of the following: For the present the universe is not in a state of 
thermal equilibrium; there exist in it temperature differences, motion, and 
so on. However, since the universe represents a closed system to which the 
laws of statistics and thermodynamics are applicable, after the lapse of a 
sufficiently long time interval all temperature differences existing in the 
universe will be smoothed out, and the motion will cease. The universe will 
pass over into a state of total rest — thermal death. 




The teaching of Clausius was subjected to criticism by many physicists, 
in the first place Boltzmann. 

At present it is ascertained without doubt that there are no grounds for
applying the laws of statistical physics to the universe changing in time. 
The treatment of gravitational phenomena within the framework of the 
general theory of relativity already shows that the thermodynamic properties 
of systems on a cosmic scale must differ radically from those of ordinary 
closed systems. In thermodynamics based on the general theory of relativity 
it turns out that the entropy of systems on a cosmic scale cannot tend to 
and reach a maximum value, and that thermal equilibrium cannot be 
established in them *. 

A more complete study of the laws of the behaviour of the universe and, 
in particular, its thermodynamic behaviour, is a matter for the future. But 
even now it is clear that they are far more complex than the properties of 
ordinary macroscopic molecular systems, and that the application of the 
laws of ordinary thermodynamics to the universe is inadmissible. 


*See R.Tolman, Relativity, thermodynamics and cosmology (Oxford University 
Press, Oxford, 1934). 













Ideal Gases 


§37. The distribution function for ideal gases 


In this and following chapters we shall consider the application of the
general theory to concrete systems. In the first place ideal gases will be
considered.

The partition function of an ideal monatomic gas has the form


$$Z = \sum_n e^{-\varepsilon_n/kT} \, \Omega(\varepsilon_n) , \qquad (37.1)$$


where $\varepsilon_n$ is the energy of the gas as a whole, and the summation is carried
out over the energy levels of the system.

Since the spacings between the energy levels of the gas as a whole are very 
small in comparison with kT, the summation over the energy levels can be 
replaced by integration. Thus, 


$$Z = \int e^{-\varepsilon/kT} \, d\Omega . \qquad (37.2)$$


Let us find the number of states, $d\Omega$, of a system with given energy. Since
all particles of an ideal gas are independent, one can write that

$$d\Omega = \frac{\prod_i{}' \, dp_i \, dq_i}{h^{3N}} ,$$

where the product is taken over all coordinates and momenta of the particles.
The prime in the product denotes that in its formation one has to include
only those terms which correspond to different states of the system as a
whole.

Substituting $d\Omega$ into (37.2) we find

$$Z = \int e^{-\varepsilon/kT} \, \frac{\prod_i{}' \, dp_i \, dq_i}{h^{3N}} . \qquad (37.3)$$

In order to find the integration range in (37.3), we shall consider the case
of two uniformly moving particles. Let $p_1$ denote the momentum of the
first particle and $p_2$ the momentum of the second particle. The integration
in (37.3) is carried out over all values of the momenta which each of the
molecules can have. We shall consider the two states of the system shown in
fig. III.18 by dotted lines. In the first state the first molecule has a momentum
equal to $a$, and the second molecule has a momentum equal to $b$. In the
second state, on the contrary, the second molecule has momentum $a$, and
the first molecule has momentum $b$. It can be said that the second state differs
from the first by the fact that the momentum of the first particle is
replaced by that of the second particle, and vice versa. In other words, the
representative points of the two molecules, or for brevity, the molecules, are
mutually exchanged in phase space.

[Fig. III.18: the $(p_1, p_2)$ plane, with the two states marked by dotted lines
and the bisector drawn.]










In carrying out the integration in (37.3) we take into account separately
each of these and similar states of the system of two molecules. Integrating
with respect to $p_1$ at a fixed value of $p_2$, and thereupon with respect to $p_2$
at a fixed value of $p_1$, we pass through a sequence of values of the momenta
$(p_1, p_2)$ and $(p_2, p_1)$, differing from each other by the mutual exchange of
the particles in phase space. In other words, we assume that the states “the
first molecule has a momentum $p_1$, and the second molecule has a momentum
$p_2$” and “the first molecule has a momentum $p_2$ and the second molecule has
a momentum $p_1$” are two different states. However, if the two molecules are
identical, then these states are equivalent. But from the proposition of quantum
theory on the complete identity of elementary particles it follows that
the two states correspond to one and the same physical state of the system.
A physical state of the system is characterized by the fact that one particle
has a momentum $p_1$, and the second particle has a momentum $p_2$. It does
not matter which of the particles has the momentum $p_1$ or $p_2$, since the two
particles are completely identical. The states $(p_1, p_2)$ and $(p_2, p_1)$ are not
two equivalent states, but one and the same state of the system. Hence we
would make a mistake by taking the two states $(p_1, p_2)$ and $(p_2, p_1)$ as
independent states in the integration in (37.3). In reality both of these
correspond to only one physical state of the system. In order to avoid this mistake
one should take into account only one of the states: either $(p_1, p_2)$ or $(p_2, p_1)$.

The integration with respect to $p_1$ and $p_2$ in (37.3) should be carried out
not over all their possible values but only over the values corresponding to
physically different states of the system. For this one can, for example,
integrate not over the entire phase area $(p_1, p_2)$ but over one half of it, cut
by the bisector drawn in fig. III.18. However, it is simpler to proceed
differently. One can, as before, carry out the integration over all values of $p_1$
and $p_2$, and reduce by a factor of two the result obtained. This will
compensate for the incorrect doubling of the number of states as they are
calculated. One has to proceed completely analogously also in calculating that
part of the partition function which includes the integration with respect to
the coordinates of the two molecules. States differing from each other only
by the mutual exchange of the molecules in space should be considered not
as different states, but as one and the same state.

In general, in calculating the integral over all states of the system consisting
of two particles one has to carry out the integration over all states, i.e. the
coordinates and momenta $(p_1, q_1)$ of the first particle and $(p_2, q_2)$ of the
second particle, but the result obtained should be reduced by a factor of two.
In this case it will be automatically taken into account that the “exchange”
in phase space of the representative points characterizing the state of each of
the particles does not lead to different states of the entire system.




The result obtained can be generalized to arbitrary states of a gas containing
$N$ molecules. All states of the gas which differ from each other only by
the fact that the values of the coordinates and momenta of one molecule are 
exchanged with those of another molecule are identical physical states. It 
can be said that the set of all states which are obtained by the mutual per- 
mutation of N representative points in the phase space corresponds to only 
one physical state. In calculating the total partition function of the gas each 
of these must be taken into account in the integration only once. Hence in 
carrying out the integration over all possible values of the coordinates and 
momenta of gas molecules the result must be divided by the number of the 
mutual permutations of representative points in the phase space. This number 
is, obviously, equal to N!. 
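
The counting argument can be illustrated by direct enumeration. The following is a minimal sketch, not part of the original derivation: phase space is replaced by a small number of discrete cells, and it is assumed (as in a dilute, non-degenerate gas) that no two molecules occupy the same cell. The function name and the cell count are illustrative choices.

```python
from itertools import combinations, permutations
from math import factorial

def count_states(n_cells: int, n_particles: int):
    """Count placements of particles in distinct cells, at most one per cell."""
    cells = range(n_cells)
    # Molecules treated as labelled: (cell of molecule 1, cell of molecule 2, ...)
    labelled = sum(1 for _ in permutations(cells, n_particles))
    # Identical molecules: only the set of occupied cells defines the state
    physical = sum(1 for _ in combinations(cells, n_particles))
    return labelled, physical

labelled, physical = count_states(n_cells=6, n_particles=3)
print(labelled, physical, labelled // physical, factorial(3))
```

Every physical state is counted $N!$ times when the molecules are treated as labelled, which is the factor by which the phase integral has to be divided.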

It should be noted that the necessity of dividing the phase integral by N! 
was taken into account in classical statistics. Otherwise the additivity of the 
thermodynamic functions obtained would be violated. However, the full 
meaning of this procedure is seen in quantum statistics after taking into 
consideration the principle of the identity of atomic particles. 

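Thus, we can write that

$$d\Omega = \frac{\prod_i dp_i \, dq_i}{h^{3N} N!} . \qquad (37.4)$$

The Gibbs distribution for an ideal gas has the form

$$dw = \frac{1}{Z} \, e^{-\varepsilon/kT} \, \frac{dp_1 \cdots dp_{3N} \, dq_1 \cdots dq_{3N}}{h^{3N} N!} .$$

Correspondingly, for the partition function we obtain

$$Z = \int e^{-\varepsilon/kT} \, \frac{\prod_i dp_i \, dq_i}{h^{3N} N!} ,$$

where the integration is carried out over the entire phase space. Writing the
energy of the gas as a sum over the molecules,

$$\varepsilon = \sum_i \varepsilon_i ,$$

we obtain

$$Z = \frac{1}{N!} \left[ \int e^{-\varepsilon_i/kT} \, \frac{dp \, dq}{h^3} \right]^N = \frac{z^N}{N!} ,$$

where $z$ is the partition integral for one molecule calculated earlier [formula
(19.7)]. Substituting its value, we find that

$$Z = \frac{V^N}{N!} \left( \frac{2\pi mkT}{h^2} \right)^{3N/2} . \qquad (37.5)$$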


We pass on to the calculation of the thermodynamic functions of an ideal 
monatomic gas. 
For the energy of the gas one can write

$$E = kT^2 \, \frac{\partial \ln Z}{\partial T} = \tfrac32 NkT . \qquad (37.6)$$

The energy of the gas is proportional to the temperature and does not depend
on the volume of the gas, in correspondence with formula (20.14).
The heat capacity of a monatomic gas turns out to be equal to

$$C_V = \left( \frac{\partial E}{\partial T} \right)_V = \tfrac32 Nk \approx 12.5 \ \mathrm{J \, mol^{-1} \, K^{-1}} .$$
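
The passage from (37.5) to (37.6) can be verified numerically. The sketch below is an illustration only: it evaluates $\ln Z$ from (37.5), with $\ln N!$ computed through the log-gamma function, and differentiates it by a central difference. The helium mass, the volume and the temperature are assumed sample values, and the function names are ours.

```python
import math

K_B = 1.380649e-23     # Boltzmann constant, J/K
H = 6.62607015e-34     # Planck constant, J s
N_A = 6.02214076e23    # Avogadro number, 1/mol

def ln_Z(T, V, N, m):
    """ln Z from (37.5): Z = V**N / N! * (2 pi m k T / h**2)**(3N/2)."""
    return (N * math.log(V) - math.lgamma(N + 1.0)
            + 1.5 * N * math.log(2.0 * math.pi * m * K_B * T / H**2))

def energy(T, V, N, m, rel_step=1e-4):
    """E = k T**2 d(ln Z)/dT, derivative taken by a central difference."""
    dT = rel_step * T
    dlnZ_dT = (ln_Z(T + dT, V, N, m) - ln_Z(T - dT, V, N, m)) / (2.0 * dT)
    return K_B * T**2 * dlnZ_dT

m_he = 6.6465e-27      # mass of a helium atom in kg (assumed sample gas)
E = energy(T=300.0, V=0.0224, N=N_A, m=m_he)
print("E per mole        :", E, "J")
print("3/2 N k T         :", 1.5 * N_A * K_B * 300.0, "J")
print("C_V = 3/2 N k     :", 1.5 * N_A * K_B, "J/(mol K)")
```

The volume and the factor $N!$ drop out of the derivative, so the energy depends on the temperature only.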


For the free energy of an ideal monatomic gas we find, according to
(32.1),

$$F = -kT \ln Z = -NkT \ln \left[ V \left( \frac{2\pi mkT}{h^2} \right)^{3/2} \right] + kT \ln N! .$$

In order to calculate $N!$ for a large $N$ one can make use of Stirling’s formula
(see Appendix IV). We then have

$$F = -NkT \ln \left[ \frac{eV}{N} \left( \frac{2\pi mkT}{h^2} \right)^{3/2} \right] . \qquad (37.7)$$


From the free energy one can find the equation of state of the gas. In
correspondence with (32.6) we obtain

$$p = -\left( \frac{\partial F}{\partial V} \right)_T = \frac{NkT}{V} . \qquad (37.8)$$

Thus, we arrive at the well-known equation of state of an ideal gas. It should
be noted that we have found this equation in a purely theoretical way,
without any reference to experimental data. Only the numerical value of the
constant $k$ is determined experimentally.
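
As a check of (37.8), the free energy (37.7) can be differentiated numerically with respect to the volume. This is a sketch under assumptions: the gas parameters are sample values and the function names are illustrative.

```python
import math

K_B = 1.380649e-23     # Boltzmann constant, J/K
H = 6.62607015e-34     # Planck constant, J s

def free_energy(T, V, N, m):
    """F from (37.7): F = -N k T ln[(e V / N) (2 pi m k T / h**2)**(3/2)]."""
    return -N * K_B * T * (1.0 + math.log(V / N)
                           + 1.5 * math.log(2.0 * math.pi * m * K_B * T / H**2))

def pressure(T, V, N, m, rel_step=1e-4):
    """p = -(dF/dV)_T, derivative taken by a central difference."""
    dV = rel_step * V
    return -(free_energy(T, V + dV, N, m) - free_energy(T, V - dV, N, m)) / (2.0 * dV)

# Assumed sample state: one mole of helium-like atoms in 22.4 litres at 273.15 K
T, V, N, m = 273.15, 0.0224, 6.02214076e23, 6.6465e-27
print("numerical p :", pressure(T, V, N, m), "Pa")
print("N k T / V   :", N * K_B * T / V, "Pa")
```

Only the term $-NkT \ln V$ of the free energy contributes, so the mass of the atoms and Planck's constant do not enter the equation of state.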

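We shall now calculate the entropy of the gas. According to (29.7) it is
equal to

$$S = -\left( \frac{\partial F}{\partial T} \right)_V = Nk \ln \left[ \frac{eV}{N} \left( \frac{2\pi mkT}{h^2} \right)^{3/2} \right] + \tfrac32 Nk =$$

$$= Nk \ln (V/N) + C_V \ln kT + \tfrac52 Nk + Nkj . \qquad (37.9)$$

For reasons which will be explained later the quantity

$$j = \ln (2\pi m/h^2)^{3/2}$$

is called the chemical constant.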

The entropy defined by formula (37.9) is expressed as a function of the 
number of particles and the volume of the gas. 

The entropy is proportional to the number of particles of the gas as it
should be, by virtue of its additivity. As the volume and number of particles
of the gas are simultaneously changed by an arbitrary factor, the entropy
changes by the same factor.
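
Formula (37.9) is easily evaluated for a real gas. The sketch below is an illustration under assumptions: helium at 298.15 K and a pressure of $10^5$ Pa is taken as the sample state, and the function name is ours. It also checks the additivity just mentioned by doubling $N$ and $V$ together.

```python
import math

K_B = 1.380649e-23     # Boltzmann constant, J/K
H = 6.62607015e-34     # Planck constant, J s
N_A = 6.02214076e23    # Avogadro number, 1/mol

def entropy(T, V, N, m):
    """S from (37.9), grouped as N k {ln[(V/N)(2 pi m k T/h**2)**1.5] + 5/2}."""
    return N * K_B * (math.log(V / N)
                      + 1.5 * math.log(2.0 * math.pi * m * K_B * T / H**2)
                      + 2.5)

# Assumed sample state: one mole of helium at 298.15 K and 1e5 Pa
T, p, m_he = 298.15, 1.0e5, 6.6465e-27
V = N_A * K_B * T / p                      # volume from the equation of state (37.8)

S = entropy(T, V, N_A, m_he)
print("S per mole of helium:", S, "J/K")
print("S(2N, 2V) / S(N, V) :", entropy(T, 2 * V, 2 * N_A, m_he) / S)
```

Without the division by $N!$ the argument of the logarithm would contain $V$ instead of $V/N$, and the doubling test would fail.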

It may seem strange that the entropy which we have calculated does not 
contain any arbitrary constant, while, according to what was said in §24, the 
entropy is determined only to within an indefinite constant. In fact, the 
formula does not contain any arbitrary constant because we based the calculation
of $F$ and $S$ on formulae (29.7) and (32.1), in which the arbitrary
constant of the entropy was chosen as the conditional zero entropy. 

The expression (37.9) for entropy loses its applicability as $T \to 0$. In
deriving (37.9) we have not taken into account the phenomenon of quantum 
degeneration of the gas, which plays a basic role in its behaviour at very low 
temperatures. This phenomenon will be considered in ch. 10. 

One cannot obtain a correct expression for the entropy, satisfying the 
third law of thermodynamics, without taking into account the phenomenon 
of degeneration. However, it should be noted that there are no ordinary 
gases at such low temperatures as those at which degeneration occurs. Long
before these temperatures are reached the liquefaction of gases takes place
for all practical values of the density. 

At very high temperatures formula (37.9) is again inapplicable, since it 
does not take into account the thermal ionization of atoms. Nevertheless, 
the interval of applicability of formula (37.9) is very large: from the lique- 
faction point to several thousand degrees. 

In practice it is often more convenient to express entropy in terms of the 
pressure and temperature. Substituting into (37.9) the value of V expressed 
in terms of the pressure, we find 


$$S = \tfrac52 Nk \ln kT - Nk \ln p + \tfrac52 Nk + Nkj =$$

$$= C_p \ln kT - Nk \ln p + \tfrac52 Nk + Nkj . \qquad (37.10)$$


In addition we calculate the Gibbs thermodynamic potential G. On the 
basis of (28.11) and (37.7), we have for G: 


$$G = F + pV = -\tfrac52 NkT \ln kT + NkT \ln p - NkTj =$$

$$= NkT \ln p - C_p T \ln kT - NkTj . \qquad (37.11)$$

The expression for the heat capacity obtained above can be compared 
with experimental data. The number of monatomic gases is not very large. 
Noble gases and metal vapours are examples. 


Table 3 gives measured values of the heat capacity.


Table 3
Heat capacity (at a constant volume) of monatomic gases

Substance        Temperature (K)    $C_V$ (J mol$^{-1}$ K$^{-1}$)
He               291                12.59
He               93                 12.26
He               26                 12.56
He               18                 12.64
Ar               288                12.85
Na (vapour)      750-920            12.35
Hg (vapour)      548-629            12.43







From the table it is seen that the predictions of theory are well justified 
experimentally: the heat capacity of monatomic gases is constant over a 
large temperature interval and has a value which is almost the same as the 
theoretical value: 


$$C_V \approx 12.5 \ \mathrm{J \, mol^{-1} \, K^{-1}} .$$
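
The agreement can be made quantitative by comparing each entry of table 3 with the theoretical value $\tfrac32 N_0 k$. The short sketch below does this; the data are the values listed in table 3, while the variable and function names are illustrative.

```python
R = 8.314462618              # molar gas constant N0 * k, J mol^-1 K^-1
C_V_THEORY = 1.5 * R         # theoretical heat capacity of a monatomic gas

# (substance, temperature in K, measured C_V in J mol^-1 K^-1), from table 3
TABLE_3 = [
    ("He", "291", 12.59),
    ("He", "93", 12.26),
    ("He", "26", 12.56),
    ("He", "18", 12.64),
    ("Ar", "288", 12.85),
    ("Na (vapour)", "750-920", 12.35),
    ("Hg (vapour)", "548-629", 12.43),
]

def percent_deviation(c_measured: float) -> float:
    """Deviation of a measured value from 3R/2, in per cent."""
    return 100.0 * (c_measured - C_V_THEORY) / C_V_THEORY

for substance, temperature, c_v in TABLE_3:
    print(f"{substance:12s} {temperature:>8s} K  {c_v:6.2f}  {percent_deviation(c_v):+5.1f} %")
```

All the listed values lie within a few per cent of the theoretical one.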


In conclusion we shall consider the important problem of gas mixing. 

For simplicity we assume that gas is put into two chambers with volumes
$V_1$ and $V_2$, which are separated in the beginning by an impenetrable
membrane. Then the membrane is removed, and the molecules of both portions
of the gas begin to interdiffuse. As a result of this process gas mixing takes 
place. We assume that the temperature and pressure of the gases were equal 
before mixing. 

Let us find the change in the entropy as the two portions of the gas mix 
with each other. Here two cases should be distinguished: the mixing of gases 
of a different kind, and the mixing of gases of the same kind. 

We shall begin with the treatment of the first case. 

According to (37.9), the entropies of the different gases before mixing
are given by the expressions

$$S_1^{(0)} = N_1 k \ln \frac{V_1}{N_1} + N_1 f(T) ,$$

$$S_2^{(0)} = N_2 k \ln \frac{V_2}{N_2} + N_2 f(T) ,$$

where $f(T)$ are the terms of the formula for the entropy which do not depend
on the volume. The total entropy of the system before mixing is equal to

$$S^{(0)} = S_1^{(0)} + S_2^{(0)} .$$


After mixing, each of the ideal gases will behave as though there were no
second gas, and will occupy the entire volume $V_1 + V_2$. The temperature of
the mixture will be equal to the initial temperature of the gas. Hence after
the mixing the entropy of each of the gases will be equal to

$$S_1 = N_1 k \ln \frac{V_1 + V_2}{N_1} + N_1 f(T) ,$$

$$S_2 = N_2 k \ln \frac{V_1 + V_2}{N_2} + N_2 f(T) .$$

The entropy of a mixture consisting of two non-interacting ideal gases is
equal to the sum of their entropies, i.e.

$$S = S_1 + S_2 .$$


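The change in the total entropy of the entire system in mixing is equal to

$$\Delta S = S - S^{(0)} = N_1 k \ln \frac{V_1 + V_2}{N_1} + N_2 k \ln \frac{V_1 + V_2}{N_2} - N_1 k \ln \frac{V_1}{N_1} - N_2 k \ln \frac{V_2}{N_2} =$$

$$= N_1 k \ln \frac{V_1 + V_2}{V_1} + N_2 k \ln \frac{V_1 + V_2}{V_2} .$$

At given temperature and pressure

$$\frac{V_1 + V_2}{V_1} = \frac{(N_1 + N_2) kT/p}{N_1 kT/p} = \frac{N_1 + N_2}{N_1} ,$$

and analogously for $(V_1 + V_2)/V_2$, so that the change in the entropy is equal
to

$$\Delta S = N_1 k \ln \frac{N_1 + N_2}{N_1} + N_2 k \ln \frac{N_1 + N_2}{N_2} > 0 .$$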

Thus, the entropy of the mixture is higher than that of the initial gases. 
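
To get a feeling for the magnitude of this entropy of mixing, the final formula can be evaluated directly. The sketch below is illustrative; the function name and the sample particle numbers are our choices.

```python
import math

K_B = 1.380649e-23     # Boltzmann constant, J/K
N_A = 6.02214076e23    # Avogadro number, 1/mol

def mixing_entropy(n1: float, n2: float) -> float:
    """Delta S = k [N1 ln((N1+N2)/N1) + N2 ln((N1+N2)/N2)] at equal T and p."""
    total = n1 + n2
    return K_B * (n1 * math.log(total / n1) + n2 * math.log(total / n2))

# One mole of each gas: Delta S = 2 N0 k ln 2
print("1 mol + 1 mol    :", mixing_entropy(N_A, N_A), "J/K")
# A small admixture still gives a positive entropy change
print("0.01 mol + 1 mol :", mixing_entropy(0.01 * N_A, N_A), "J/K")
```

The result depends only on the numbers of particles, not on the nature of the two gases, provided they are different.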

The process of mixing of two different gases is an irreversible process. 
The origin of this irreversibility is quite understandable. When the membrane 
separating the gases is removed the interdiffusion of the gases takes place. 
Before mixing there existed a “regularity” in the position of molecules: the 
molecules of one gas were in one part of the container, and those of the 
second gas were in the other part of the container. 

When the diffusion completely mixed the two gases, a uniform completely 
random distribution of the molecules occurs, and the probability of the state 
increases. In order to separate the gases it is necessary to do a certain amount 
of work, which in principle can be calculated. 

Now consider the process of mixing two identical gases. Gases can be 
considered as identical in the case where they behave identically in all possible 
external fields. 


The entropy of two portions of one gas before mixing is equal to

$$S_1 = N_1 k \ln \frac{V_1}{N_1} + N_2 k \ln \frac{V_2}{N_2} + (N_1 + N_2) f(T) .$$

The entropy of the entire gas after mixing is equal to

$$S_2 = (N_1 + N_2) k \ln \frac{V_1 + V_2}{N_1 + N_2} + (N_1 + N_2) f(T) .$$

Then for the change in the entropy we obtain

$$\Delta S = S_2 - S_1 = (N_1 + N_2) k \ln \frac{V_1 + V_2}{N_1 + N_2} - N_1 k \ln \frac{V_1}{N_1} - N_2 k \ln \frac{V_2}{N_2} .$$

However, from the equation of state of the gas it follows that, at a constant
pressure and temperature,

$$\frac{V_1}{N_1} = \frac{V_2}{N_2} = \frac{V_1 + V_2}{N_1 + N_2} = \frac{kT}{p} .$$

Hence

$$\Delta S = 0 .$$


Thus, the change in the entropy in mixing two portions of one gas is 
indeed identically equal to zero. This result, which is in complete agreement 
with experiment, is closely connected with the assumption of the mutual 
identity of all particles of the given gas. Owing to this identity their mutual 
mixing is not a physical event. In mixing two portions of one gas at a constant
pressure and temperature the distribution of molecules in the overall volume 
turns out to be uniform and random, and no interdiffusion takes place. It 
should be stressed that molecules or atoms can be considered as belonging to 
one kind and, hence, as identical, only in the case where they have the 
identical chemical structure, mass, and all other characteristics. This means 
that even different isotopes of one element, or atoms in different energy 
states, cannot be considered as identical. Thus, for example, the mixing of 
two different isotopes of a gas, or two portions of a gas consisting of normal 
and excited molecules, represents an irreversible process. This is particularly 
clear from the fact that a spontaneous separation of mixed gases does not 
occur, and that a certain amount of work must be done for their separation. 
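
The difference between the two cases can be seen by evaluating the volume-dependent part of the entropy before and after the membrane is removed. The sketch below is illustrative only: the terms $N f(T)$ are omitted because they cancel in both cases, the volumes are fixed by the equation of state at a common $T$ and $p$, and all names and sample numbers are assumptions.

```python
import math

K_B = 1.380649e-23     # Boltzmann constant, J/K

def volume_entropy(n: float, v: float) -> float:
    """Volume-dependent part of the entropy, N k ln(V/N)."""
    return n * K_B * math.log(v / n)

def delta_s(n1: float, n2: float, T: float, p: float, identical: bool) -> float:
    """Entropy change on removing the membrane at common T and p."""
    v1, v2 = n1 * K_B * T / p, n2 * K_B * T / p
    before = volume_entropy(n1, v1) + volume_entropy(n2, v2)
    if identical:
        # One gas of N1 + N2 indistinguishable molecules in V1 + V2
        after = volume_entropy(n1 + n2, v1 + v2)
    else:
        # Two different gases, each spread over V1 + V2
        after = volume_entropy(n1, v1 + v2) + volume_entropy(n2, v1 + v2)
    return after - before

n1, n2, T, p = 2.0e23, 5.0e23, 300.0, 1.0e5
print("different gases :", delta_s(n1, n2, T, p, identical=False), "J/K")
print("identical gases :", delta_s(n1, n2, T, p, identical=True), "J/K")
```

For identical gases the result vanishes to within rounding error, whatever the proportion of the two portions.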




§38. The Maxwell—Boltzmann distribution and the Boltzmann distribution in 
a uniform field of force 


In practice one often has to deal with a gas in a uniform external field of 
force. The most important example of such a field is the gravitational field. 
Up to now we have not considered the action of the gravitational field on 
the behaviour of a gas. We shall now consider an ideal gas in a uniform field 
of force. In such a field every molecule has a total energy 


$$\varepsilon = \varepsilon_{\rm trans} + u(x, y, z) ,$$

where $\varepsilon_{\rm trans}$ is the kinetic energy of its translational motion, and $u$ is the
potential energy in the external field, depending on the position of the particle.

Substituting this expression for the energy into the Gibbs distribution
(19.6) for a molecule of an ideal gas, we have

$$dw = \frac{1}{z h^3} \exp \left[ -\frac{\varepsilon_{\rm trans} + u}{kT} \right] dp_x \, dp_y \, dp_z \, dV , \qquad (38.1)$$

where the integral over states is, obviously, equal to

$$z = \int \exp \left[ -\frac{\varepsilon_{\rm trans} + u}{kT} \right] \frac{dp_x \, dp_y \, dp_z \, dV}{h^3} . \qquad (38.2)$$

The integration is carried out over all possible values of the variables.
Noting that the integral over states can be written in the form

$$z = \int e^{-\varepsilon_{\rm trans}/kT} \, \frac{dp_x \, dp_y \, dp_z}{h^3} \int e^{-u/kT} \, dV = \left( \frac{2\pi mkT}{h^2} \right)^{3/2} \int e^{-u/kT} \, dV , \qquad (38.3)$$

we find that the Gibbs distribution, normalized to unity, for a molecule of an
ideal gas in the presence of an external field has the form

$$dw = \left[ \frac{1}{(2\pi mkT)^{3/2}} \, e^{-p^2/2mkT} \, dp_x \, dp_y \, dp_z \right] \left[ \frac{e^{-u/kT} \, dV}{\int e^{-u/kT} \, dV} \right] . \qquad (38.4)$$




The probability distribution obtained, characterizing the probability for a 
molecule to have a given momentum and to be in a given volume element is 
called the Maxwell—Boltzmann distribution. 

The first of the factors in (38.4) is the Maxwell distribution, with which 
we are familiar. It characterizes the probability distribution over the compo- 
nents of the momentum. The second factor depends only on the coordinates 
of the molecule and is determined by the form of its potential energy 
u(x,y,z) in an external field of force. It expresses the probability for a mole- 
cule to be found in a given volume dV. In the particular case where there is 
no external field the distribution of molecules over the entire volume of the 
container is uniform and the second factor reduces to the value V~!dV. 

On the basis of the theorem of multiplication of probabilities the Maxwell— 
Boltzmann distribution can be considered as the product of the probabilities 
of two independent events: the probability of a given value of the momentum 
and the probability of a given position of the molecule. The first probability 
represents the Maxwell distribution, while the second probability represents 
the Boltzmann distribution. Each of the distributions is normalized to unity.

The fact that the two distributions are independent expresses an important, 
and at first not obvious physical proposition: the probability of a given value 
of momentum does not depend on the position of the molecule and, con- 
versely, the probability of a given position of the molecule does not depend 
on its momentum. 

Now consider in more detail the Boltzmann distribution for the particular 
case where the gas is in the field of terrestrial gravitation. We direct the 
z-axis vertically up. Then the potential energy of a gas molecule can be 
written in the form 


u=mgz. 


Since the potential energy depends only on the height, molecules are 
distributed uniformly in the plane z = const. Hence only the dependence of 
the probability distribution on the coordinate z is of interest. It has the 
form 


$$dw_B = \frac{e^{-mgz/kT} \, dz}{\int e^{-mgz/kT} \, dz} , \qquad (38.5)$$


where the integral is taken over all possible values of $z$.
Introducing the mean number of particles per cm$^3$ at a given height instead
of the probability distribution, one can rewrite (38.5) in the form

$$dn = n_0 \, e^{-mgz/kT} \, dz , \qquad (38.6)$$

where $n_0$ is the number of particles per cm$^3$ in the plane $z = 0$.

Formula (38.6) shows that the density of a gas in the gravitational field
decreases according to an exponential law. It decreases by a factor $e$ as the
height increases to $\delta = kT/mg$. This quantity can be called the characteristic
length of the distribution of particles in the gravitational field. For hydrogen
$\delta$ amounts to about $3 \times 10^5$ m at room temperature while for air $\delta$ is
correspondingly equal to $10^4$ m. The Maxwell velocity distribution of molecules
with a constant temperature $T$ holds at all heights. However, the number of
molecules at different heights decreases according to the exponential law
(38.6). At first sight the constancy of the temperature at all heights may seem
to contradict the following simple reasoning: if a molecule, which at a height
$z_0$ has a kinetic energy $\tfrac12 mv_0^2$, moves to a height $z$, then its kinetic energy must
decrease to a value $\tfrac12 mv_0^2 - mg(z - z_0)$, where $mg(z - z_0)$ is the work done
against the force of gravity. Hence at a great height a molecule will have
smaller velocity and kinetic energy. But, on the other hand, the temperature
is connected with the mean square velocity by the relation (10.6).
Consequently, the temperature of the gas must decrease with height. The fallacy
of this reasoning lies in the consideration of only one molecule, without
taking into account its collisions with other gas molecules. The Maxwell
velocity distribution is established owing to collisions between molecules.
In the foregoing reasoning the establishment of the Maxwell distribution is
ignored and the “temperature of a molecule”, which has no meaning, is
considered. In fact, those molecules which have a large velocity will
preferentially go up. Hence the Maxwell distribution will be established
automatically at all heights.
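
The characteristic length $\delta = kT/mg$ is easily evaluated. The sketch below is an illustration under assumptions: a temperature of 293 K, standard gravity, and the molecular masses listed in the code are our choices, so the output should be compared with the rounded figures of the text only in order of magnitude.

```python
import math

K_B = 1.380649e-23      # Boltzmann constant, J/K
AMU = 1.66053907e-27    # atomic mass unit, kg
G = 9.80665             # standard acceleration of gravity, m/s^2

def scale_height(mass_amu: float, T: float = 293.0) -> float:
    """Characteristic length delta = k T / (m g), in metres."""
    return K_B * T / (mass_amu * AMU * G)

def relative_density(z: float, mass_amu: float, T: float = 293.0) -> float:
    """n(z) / n(0) from the barometric law (38.6)."""
    return math.exp(-z / scale_height(mass_amu, T))

# Assumed masses: atomic and molecular hydrogen, and a mean value for air
for name, mass in (("H (atomic)", 1.008), ("H2", 2.016), ("air (mean)", 28.97)):
    print(f"{name:12s} delta = {scale_height(mass):.3g} m")
```

By construction the density falls by the factor $e$ at the height $z = \delta$.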

Let us consider certain conclusions which can be derived from the density 
distribution of the gas with height. First of all, we shall dwell on the idea of 
the weight of a gas. Imagine a container of a height $h$, confining a gas. This
gas has a certain weight. It is often said that the weight of a gas is the weight 
of all the molecules constituting it. In reality, however, this is not strictly 
true. The weight of a gas is measured by the difference between the pressures 
exerted by the gas on the bottom and the top of the container. All mole- 
cules of the gas, which are in permanent motion and which in the greater part 
of the container do not collide directly either with the bottom or the top of 
the container, take part in the production of this pressure difference. In this 
lies the difference between the weight of a gas and the weight of a body lying 
on the pan of a balance. 

Let us find the weight of a gas column of height $h$. We can proceed in two
ways. First, it can be determined purely formally, by writing that the weight
of the gas column is equal to the weight of all the molecules constituting it.
Second, it can be found by taking the difference between the pressures
exerted by the gas on the bottom ($z = 0$) and the top ($z = h$) of the container.
In the first case we have:

$$P = mg \int dn = mg n_0 \int_0^h e^{-mgz/kT} \, dz \int dS = S mg n_0 \, \frac{kT}{mg} \left( 1 - e^{-mgh/kT} \right) = SkT (n_0 - n_h) ,$$

where $S$ is the area of the cross section of the container, and $n_0$ and $n_h$ are
the densities of the gas at the heights $z = 0$ and $z = h$.
In the second case we can write that

$$P = S (p_0 - p_h) = S (n_0 - n_h) kT ,$$

where $p_0$ and $p_h$ are the pressures at the heights $z = 0$ and $z = h$ respectively.
Thus, the calculation confirms the validity of the concept of the weight of a 
gas as measured by the difference between the pressures on the bottom and 
top of the container. 
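
The equality of the two expressions for the weight can also be confirmed numerically. In the sketch below the first method sums the weights of the molecules layer by layer with the trapezoidal rule, and the second uses the pressure difference. The gas parameters are assumed sample values (roughly those of air near the ground), and the function names are illustrative.

```python
import math

K_B = 1.380649e-23     # Boltzmann constant, J/K

def weight_by_counting(n0, m, g, T, h, area, steps=10000):
    """m g times the number of molecules in the column (trapezoidal rule)."""
    dz = h / steps
    total = 0.0
    for i in range(steps + 1):
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * n0 * math.exp(-m * g * i * dz / (K_B * T))
    return m * g * area * total * dz

def weight_by_pressure(n0, m, g, T, h, area):
    """S (p_0 - p_h) with p = n k T at the bottom and at the top."""
    n_h = n0 * math.exp(-m * g * h / (K_B * T))
    return area * K_B * T * (n0 - n_h)

# Assumed sample column: 1 m^2 cross section, 1 km high
params = dict(n0=2.5e25, m=4.8e-26, g=9.81, T=300.0, h=1000.0, area=1.0)
print("sum of molecular weights:", weight_by_counting(**params), "N")
print("pressure difference     :", weight_by_pressure(**params), "N")
```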

The molecules of a gas which are in a gravitational field possess a certain 
mean potential energy which exceeds the mean energy of the gas outside the 
field of force. Hence the mean energy of a gas in a gravitational field and, 
consequently, also its heat capacity must be higher than the values which we 
have calculated earlier. Let us find the extra heat capacity of the gas in a 
gravitational field. For this we shall calculate the mean potential energy of a 
gas molecule in a gravitational field. By definition it is equal to 


$$\bar u = mg \bar z = mg \int z \, dw_B , \qquad (38.7)$$

where $dw_B$ is the probability that the molecule will be found at a height
between $z$ and $z + dz$, given by formula (38.5). Substituting the expression for
$dw_B$ into (38.7), we have

$$\bar u = mg \bar z = mg \, \frac{\int z \, e^{-mgz/kT} \, dz}{\int e^{-mgz/kT} \, dz} . \qquad (38.8)$$


In calculating the integrals contained in (38.8) it is essential to know the
height of the gas column. Consider first of all the case of an infinitely high
gas column or, more precisely, a column confined in a container whose
height is considerably larger than the characteristic height $\delta$. Then the range
of integration with respect to $z$ extends from $z = 0$ to $z \to \infty$. The calculation
of the simple integrals (see Appendix IV) gives

$$\bar u = kT .$$


The mean potential energy of one molecule in the infinitely high gas column 
turns out to be proportional to the absolute temperature. The mean potential 
energy of a gram-mole of the gas in the infinitely high column is equal to 


$$U = N_0 \bar u = N_0 kT .$$

Whence we find for the extra heat capacity at a constant volume per gram-mole
due to the potential energy

$$C_V^{\rm pot} = N_0 k .$$


The heat capacity obtained is thus comparable with that due to the kinetic 
energy of the molecules. 

We come to a completely different conclusion in the case of a gas column
confined in a container of a height h ≪ δ. In this case the mean potential
energy of the molecules is equal to


$$\bar{u} = mg\bar{z} = mg\,\frac{\int_0^h e^{-mgz/kT}\,z\,dz}{\int_0^h e^{-mgz/kT}\,dz}.$$


Since in the integration range mgz/kT = z/δ ≪ 1, one can expand the integrand
in a series and confine oneself to the first term of the expansion. This
gives


$$\bar{u} = \tfrac{1}{2}mgh, \qquad U = N_0\bar{u} = \tfrac{1}{2}N_0 mgh.$$


In this approximation the mean potential energy of the gas column does 
not depend on the temperature at all. The corresponding contribution to the 
heat capacity is zero. In the next higher order approximation one can obtain 
a heat capacity which is very small in comparison with that due to the 
presence of kinetic energy. Thus, in practical cases of interest the contribu- 
tion of the potential energy to the heat capacity of a gas can be disregarded. 
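The two limiting cases can be checked numerically. The following Python sketch is an illustration added under stated assumptions, not part of the derivation: it evaluates the ratio of integrals in (38.8) by the midpoint rule, using the dimensionless variable x = mgz/kT. The function name `mean_potential_energy`, the argument `h_scaled` and the step count are our own choices.

```python
import math


def mean_potential_energy(h_scaled, n_steps=200_000):
    """Return u/(kT) for a column of scaled height h_scaled = m*g*h/(k*T).

    The ratio of integrals in (38.8) is evaluated with the midpoint rule
    in the dimensionless variable x = m*g*z/(k*T).
    """
    dx = h_scaled / n_steps
    numerator = 0.0    # approximates the integral of x * exp(-x)
    denominator = 0.0  # approximates the integral of exp(-x)
    for i in range(n_steps):
        x = (i + 0.5) * dx
        weight = math.exp(-x)
        numerator += x * weight * dx
        denominator += weight * dx
    return numerator / denominator


tall = mean_potential_energy(50.0)   # column much higher than kT/(mg)
short = mean_potential_energy(0.01)  # column much lower than kT/(mg)
print(f"mgh/kT = 50:   u/kT = {tall:.6f}")
print(f"mgh/kT = 0.01: u/kT = {short:.6f}, mgh/(2kT) = {0.01 / 2:.6f}")
```

For the tall column the ratio approaches unity, that is, the mean potential energy tends to kT; for the short column it approaches mgh/2kT and carries no temperature dependence to first order.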




Finally, consider the height distribution of molecules possessing different 
masses. From the form of the distribution (38.5) it is clear that the larger the 
mass of the molecule the more rapidly the number of corresponding molecules 
decreases with height. If there are equal numbers of molecules with masses 
$m_1$ and $m_2$ at the height z = 0, then the ratio of the numbers of the molecules
of the two kinds at a height h is equal to


$$\frac{n_1}{n_2} = \exp\left[-\frac{(m_1 - m_2)gh}{kT}\right].$$

If the simple laws of the equilibrium distribution of density were applicable
to the Earth's atmosphere, then a sharp change of the composition of the
atmosphere with height would be observed. In reality, however, measurements
performed on the composition do not confirm this conclusion. It is also well 
known that the temperature decreases with height, which is also in complete 
contradiction with the requirement of the constancy of the temperature in an 
equilibrium gas column. These, and also a number of other facts show that the 
atmosphere is not in a state of statistical equilibrium. 
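To give a feeling for the size of the effect that the equilibrium formula would predict, the ratio can be evaluated for two gases. This is only a sketch: the constant gravitational acceleration, the uniform temperature of 250 K, the choice of O2 against N2 and the function name `number_ratio` are all assumptions made for illustration.

```python
import math

K_B = 1.380649e-23    # Boltzmann constant, J/K
AMU = 1.66053907e-27  # atomic mass unit, kg
G = 9.81              # assumed constant gravitational acceleration, m/s^2


def number_ratio(m1_amu, m2_amu, height_m, temperature_k):
    """n1/n2 at the given height for equal numbers at z = 0.

    Assumes an isothermal column in statistical equilibrium, which, as
    the text stresses, the real atmosphere is not.
    """
    mass_difference = (m1_amu - m2_amu) * AMU
    return math.exp(-mass_difference * G * height_m / (K_B * temperature_k))


# Illustrative case: O2 (32 amu) against N2 (28 amu) at an assumed 250 K.
for h_km in (0, 10, 50, 100):
    ratio = number_ratio(32.0, 28.0, h_km * 1e3, 250.0)
    print(f"h = {h_km:3d} km: n(O2)/n(N2) = {ratio:.3f}")
```

The heavier gas is depleted progressively with height in this idealized picture, which is the sharp change of composition that measurements do not confirm.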


§39. The calculation of the heat capacity of diatomic molecules by means 
of classical statistics. The law of equipartition of energy over degrees of 
freedom 


Most substances in the gaseous state exist in the form of molecules. One 
very often has to deal with diatomic gases. Examples of such gases are H₂, O₂,
N₂, HCl and CO. Our immediate problem is the generalization of the results
previously obtained to the cases of polyatomic gases and, in particular, to 
diatomic gases. 

The basic difference between diatomic and polyatomic gases and mona- 
tomic gases is the presence of rotational and vibrational degrees of freedom 
in the former. We shall begin the treatment with the simplest case, diatomic 
gases. We shall assume at first that the molecule represents a system obeying 
the laws of classical mechanics. Since we shall not be interested in the internal 
motion of electrons in the atom, we shall replace each atom by a material 
point having no extension. Two material points bound into a molecule can be 
likened to a miniature dumb-bell at the ends of which there are two infinitely 
small spheres with masses $m_1$ and $m_2$ (different atomic masses). Since the
link between the atoms in the molecule is not absolutely rigid, the following 
forms of motion are possible: translational motion (three degrees of freedom),
rotation about two axes perpendicular to the axis connecting the two atoms
(two degrees of freedom), and vibration of the atoms along the line connecting
them. We consider the atoms as material points having infinitesimal
dimensions, so that it makes no sense to speak of the rotation about the axis 
of the molecule. 

The energy of the molecule can be written in the form 


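$$\varepsilon = \varepsilon_{\text{transl}} + \varepsilon_{\text{rot}} + \varepsilon_{\text{vib}}, \tag{39.1}$$
where $\varepsilon_{\text{transl}}$, $\varepsilon_{\text{rot}}$ and $\varepsilon_{\text{vib}}$ are respectively the energies of the translational,
rotational and vibrational motion.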

The state of a molecule formed by two atoms bound to each other is 
determined by defining six coordinates and six momenta. The coordinates are: 
the x,y,z coordinates determining the position of the centre of mass in space; 
θ and φ determining the position of the axis of the molecule in space; the
q-coordinate determining the departure of the atoms from the equilibrium
distance. The momenta corresponding to these coordinates are


$$p_x,\quad p_y,\quad p_z,\quad M_1 = I\omega_1,\quad M_2 = I\omega_2,\quad p_q.$$


Here I denotes the moment of inertia, and ω denotes the angular velocity of
rotation of the molecule. The corresponding phase space has 2·(3·2) = 12
dimensions.
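We apply the Gibbs distribution to the molecule, considering it as a
quasi-closed subsystem. Obviously, we have


$$dw = \frac{1}{z}\exp\left[-\frac{\varepsilon_{\text{transl}} + \varepsilon_{\text{rot}} + \varepsilon_{\text{vib}}}{kT}\right]\frac{d\Gamma}{h^6}. \tag{39.2}$$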


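An element of the phase space of a diatomic molecule can be written in the
form


$$d\Gamma = d\Gamma_{\text{transl}}\,d\Gamma_{\text{rot}}\,d\Gamma_{\text{vib}}, \tag{39.3}$$
where the following notations are introduced:


$$d\Gamma_{\text{transl}} = dp_x\,dp_y\,dp_z\,dx\,dy\,dz; \qquad d\Gamma_{\text{rot}} = dM_1\,dM_2\,\sin\theta\,d\theta\,d\varphi; \qquad d\Gamma_{\text{vib}} = dp_q\,dq. \tag{39.4}$$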




The first factor in the expression of the phase volume corresponds to the 
three degrees of freedom of the translational motion, the second factor cor- 
responds to the two degrees of freedom of the rotational motion, and the 
third factor corresponds to the vibrational motion of the molecule. Thus, 


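$$dw = \frac{1}{z}\exp\left[-\frac{\varepsilon_{\text{transl}} + \varepsilon_{\text{rot}} + \varepsilon_{\text{vib}}}{kT}\right]\frac{d\Gamma_{\text{transl}}\,d\Gamma_{\text{rot}}\,d\Gamma_{\text{vib}}}{h^6}. \tag{39.5}$$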





We see that the Gibbs distribution resolves into three independent factors, 
corresponding to the translational, rotational and vibrational motions. Each 
of these forms of motion is independent of the other two. Hence the trans- 
lational motion, rotation and vibration can be considered independently of 
each other. 

The translational motion of diatomic molecules differs in no way from 
that of monatomic molecules, since the translational motion reduces to the 
motion of the centre of mass of the system. 

Now let us consider the rotational motion of a diatomic molecule. 

The rotational energy of a diatomic molecule for a given distance between 
the atoms has the form 


$$\varepsilon_{\text{rot}} = \frac{I\omega_1^2}{2} + \frac{I\omega_2^2}{2} = \frac{M_1^2 + M_2^2}{2I}, \tag{39.6}$$


where $\omega_1$ and $\omega_2$ are the angular velocities of the molecule, and $M_1$ and $M_2$
are the angular momenta. The moment of inertia I of the molecule is equal to


$$I = \frac{m_1 m_2}{m_1 + m_2}\,a^2, \tag{39.7}$$


where a is the distance between the atoms, and $m_1$ and $m_2$ are their masses.

The probability for the molecule to have values of angular momenta lying
between $M_1$ and $M_1 + dM_1$, $M_2$ and $M_2 + dM_2$, and to be oriented in space
in such a way that its axis forms with the coordinate axes angles between θ,
θ + dθ and φ, φ + dφ, has the form


$$dw_{\text{rot}} = \text{const}\cdot\exp\left[-\frac{M_1^2 + M_2^2}{2IkT}\right] dM_1\,dM_2\,\sin\theta\,d\theta\,d\varphi. \tag{39.8}$$


Since we are not interested in the orientation of the molecule in space, it is
more convenient to pass over from the expression (39.8) to the expression
for the probability that the angular momentum has a given value for any
orientation of the axis of the molecule in space. Integrating expression
(39.8) with respect to the angles θ and φ, we obtain


$$dw_{\text{rot}} = \text{const}\cdot\exp\left[-\frac{M_1^2 + M_2^2}{2IkT}\right] dM_1\,dM_2. \tag{39.9}$$


The constant in the expression (39.9) can be found from the normalization
condition.

Instead of the angular momenta with respect to the axes, one can introduce
the more customary quantities, the angular velocities. Then the probability
for the molecule to have angular velocity components lying between $\omega_1$,
$\omega_1 + d\omega_1$ and $\omega_2$, $\omega_2 + d\omega_2$ is given by


$$dw_{\text{rot}} = \text{const}\cdot\exp\left[-\frac{I(\omega_1^2 + \omega_2^2)}{2kT}\right] d\omega_1\,d\omega_2, \tag{39.10}$$


where the constant is again determined from the normalization condition. 

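Integrating formula (39.10) over all values of the angular velocity components
(the integration range can be extended up to ±∞ on the same grounds
as for the translational velocity components), we have


$$1 = \text{const}\int_{-\infty}^{\infty}\int_{-\infty}^{\infty}\exp\left[-\frac{I(\omega_1^2 + \omega_2^2)}{2kT}\right] d\omega_1\,d\omega_2, \tag{39.11}$$
whence
$$\text{const} = \frac{I}{2\pi kT}. \tag{39.12}$$
Hence we have finally
$$dw_{\text{rot}} = \frac{I}{2\pi kT}\exp\left[-\frac{I(\omega_1^2 + \omega_2^2)}{2kT}\right] d\omega_1\,d\omega_2. \tag{39.13}$$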


Proceeding from the condition (39.10), we find the mean rotational energy 


$$\bar{\varepsilon}_{\text{rot}} = \tfrac{1}{2}I\left(\overline{\omega_1^2} + \overline{\omega_2^2}\right). \tag{39.14}$$




For the mean value of the square of the angular velocity component we have 





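$$\overline{\omega_1^2} = \frac{I}{2\pi kT}\int_{-\infty}^{\infty}\omega_1^2\,e^{-I\omega_1^2/2kT}\,d\omega_1\int_{-\infty}^{\infty}e^{-I\omega_2^2/2kT}\,d\omega_2 = \frac{kT}{I}. \tag{39.15}$$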


From consideration of the symmetry it is clear that 


$$\overline{\omega_1^2} = \overline{\omega_2^2}, \tag{39.16}$$
hence the mean value of the rotational energy is equal to
$$\bar{\varepsilon}_{\text{rot}} = kT. \tag{39.17}$$


To each degree of freedom of the rotational motion there corresponds an
energy equal to ½kT.

Passing on to the vibrational motion of a diatomic molecule, we note first 
of all that in the first approximation one can confine oneself to small vibra- 
tions about the equilibrium position, i.e. the equilibrium distance between 
the two atoms. In this case the vibrational energy of a diatomic molecule can 
be written in the form 


$$\varepsilon_{\text{vib}} = \frac{\mu\dot{q}^2}{2} + \frac{\mu\omega^2 q^2}{2} = \frac{p_q^2}{2\mu} + \frac{\mu\omega^2 q^2}{2}, \tag{39.18}$$


where q is the departure of the atoms from the equilibrium position, μ is the
reduced mass, and ω is the vibrational frequency connected with the constant
of the quasi-elastic force κ by the relation $\omega = (\kappa/\mu)^{1/2}$. The momentum of the
vibrating system is $p_q = \mu\dot{q}$.

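The probability for the atoms in the vibrating molecule to be in a position
q and to have a momentum $p_q$ has the form


$$dw_{\text{vib}} = \frac{1}{z}\exp\left[-\frac{p_q^2}{2\mu kT} - \frac{\mu\omega^2 q^2}{2kT}\right]\frac{dp_q\,dq}{h}. \tag{39.19}$$


The partition function z in the expression (39.19) is found from the
normalization condition:


$$z = \frac{1}{h}\int_{-\infty}^{\infty}\exp\left[-\frac{p_q^2}{2\mu kT}\right]dp_q\int_{-\infty}^{\infty}\exp\left[-\frac{\mu\omega^2 q^2}{2kT}\right]dq. \tag{39.20}$$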


Since the integrands decrease very rapidly with increasing arguments and the 
integrals converge rapidly, we have extended the integration range to infinity. 


A simple calculation gives 


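$$z = \frac{2\pi kT}{h\omega}. \tag{39.21}$$


Thus, we have finally


$$dw_{\text{vib}} = \frac{\omega}{2\pi kT}\exp\left[-\frac{p_q^2}{2\mu kT}\right]\exp\left[-\frac{\mu\omega^2 q^2}{2kT}\right]dp_q\,dq. \tag{39.22}$$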


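We find the mean energy of the vibrational motion:


$$\bar{\varepsilon}_{\text{vib}} = \frac{\overline{p_q^2}}{2\mu} + \frac{\mu\omega^2\,\overline{q^2}}{2}. \tag{39.23}$$


Calculating the means, we have


$$\overline{p_q^2} = \mu kT, \qquad \overline{q^2} = \frac{kT}{\mu\omega^2}. \tag{39.24}$$


Substituting the mean values from (39.24) into (39.23), we find
$$\bar{\varepsilon}_{\text{vib}} = kT. \tag{39.25}$$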


To one vibrational degree of freedom there corresponds on the average an 
energy two times larger than that corresponding to one degree of freedom 
of the translational or rotational motion. The meaning of this will become 
clear if one recalls that for the vibrational motion the mean (over one 
period) kinetic energy of the system is equal to the mean potential energy. 
The energy of vibrational motion consists of two components having the 
same structure: quadratic expressions with respect to the independent variables
$p_q$ and q. For other degrees of freedom the energy is expressed by one
quadratic term with respect to the independent variable for each degree of
freedom. The averaging of each quadratic term in the energy gives ½kT, so that
the mean vibrational energy is ½kT + ½kT = kT.

In the general case it can be said that each quadratic term entering into
the energy of the system has a mean value equal to ½kT. We have convinced
ourselves of this by the example of monatomic and diatomic molecules. All 
our reasoning can be applied without any special difficulty to the case of 
polyatomic molecules. 
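The statement that each quadratic term averages to ½kT can be illustrated by sampling. According to the classical distribution for the vibrational motion, the momentum and the coordinate are independent Gaussian variables with variances μkT and kT/(μω²). The sketch below draws such samples and averages the two quadratic terms; the numerical values of μ, ω and kT, the sample size, the seed and the function name are arbitrary choices made only for illustration.

```python
import math
import random


def mean_vibrational_terms(mu, omega, kT, n_samples=200_000, seed=0):
    """Monte Carlo averages of p^2/(2 mu) and mu omega^2 q^2 / 2.

    p and q are drawn from the Gaussian factors of the classical Gibbs
    distribution of a harmonic oscillator at temperature kT.
    """
    rng = random.Random(seed)
    sigma_p = math.sqrt(mu * kT)                # standard deviation of p
    sigma_q = math.sqrt(kT / (mu * omega ** 2))  # standard deviation of q
    kinetic = 0.0
    potential = 0.0
    for _ in range(n_samples):
        p = rng.gauss(0.0, sigma_p)
        q = rng.gauss(0.0, sigma_q)
        kinetic += p * p / (2.0 * mu)
        potential += mu * omega ** 2 * q * q / 2.0
    return kinetic / n_samples, potential / n_samples


kT = 1.5  # arbitrary units
kin, pot = mean_vibrational_terms(mu=2.0, omega=3.0, kT=kT)
print(f"kinetic/kT   = {kin / kT:.3f}")
print(f"potential/kT = {pot / kT:.3f}")
print(f"total/kT     = {(kin + pot) / kT:.3f}")
```

Within the statistical scatter each term comes out close to one half of kT, and the result does not depend on the particular values chosen for μ and ω.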







Consider, for instance, triatomic molecules. A triatomic molecule can have
a structure similar to that of the CO₂ molecule, or of the H₂O and SO₂ molecules
(fig. III.19). In the first case all the atoms are distributed along a line,
and the molecule is called a linear molecule. Molecules of the second type are
called non-linear molecules. In the case of a linear triatomic molecule, having
nine degrees of freedom, the following forms of motion are possible: the
translational motion of the molecule as a whole (three degrees of freedom),
the rotation about two axes perpendicular to the axis of the molecule (two
degrees of freedom), and the vibrational motion (four degrees of freedom).
Possible types of vibrational motion for a linear molecule are shown in
fig. III.19. The directions of motion in a given phase of normal vibrations
are shown by arrows, while $\nu_1$, $\nu_2$ and $\nu_3$ denote vibrational frequencies.
Two vibrations with frequency $\nu_2$ taking place independently of each other
in two perpendicular planes are possible. The mean energy of a linear molecule
is made up of the mean energies of the translational, rotational and
vibrational motions. Each of these forms of motion, as well as each of the
normal vibrations, is independent of the others. Hence to each of these
forms of motion taken separately we can apply the reasoning of the preceding
paragraphs. To each degree of freedom of the translational and rotational
motion there corresponds a mean energy ½kT, while to each vibrational
degree of freedom there corresponds an energy kT. Thus, the mean energy of
a linear triatomic molecule is equal to


$$\bar{\varepsilon} = 3\cdot\tfrac{1}{2}kT + 2\cdot\tfrac{1}{2}kT + 4kT = 6.5\,kT.$$
For a non-linear molecule the mean energy turns out to be different. For
such a molecule the following forms of motion are possible: the translational
motion of the molecule as a whole (three degrees of freedom), the rotation
about three mutually perpendicular axes (three degrees of freedom), and the
vibrations (three degrees of freedom). Possible normal vibrations are shown
in fig. III.19. Everything else said about linear molecules holds also for
non-linear triatomic molecules. Thus, the mean energy of a non-linear triatomic
molecule is equal to


$$\bar{\varepsilon} = 3\cdot\tfrac{1}{2}kT + 3\cdot\tfrac{1}{2}kT + 3kT = 6\,kT.$$


[Fig. III.19. Normal vibrations of the CO₂, H₂O and SO₂ molecules.]


The problem of the mean energy of a polyatomic molecule can be considered
in an analogous way. If the molecule contains n atoms, then out of
3n degrees of freedom there are always three translational degrees of freedom,
three or two (in the case of a linear molecule) rotational degrees of freedom
and respectively 3n − 6 or 3n − 5 vibrational degrees of freedom. Each of the
degrees of freedom gives the corresponding contribution to the mean energy,
the same as in the case of diatomic or triatomic molecules. Thus, the mean
energy of a non-linear n-atomic molecule is equal to


$$\bar{\varepsilon} = 3\cdot\tfrac{1}{2}kT + 3\cdot\tfrac{1}{2}kT + (3n - 6)\,kT,$$


and that of a linear n-atomic molecule is


$$\bar{\varepsilon} = 3\cdot\tfrac{1}{2}kT + 2\cdot\tfrac{1}{2}kT + (3n - 5)\,kT.$$


In the general case one can write


$$\bar{\varepsilon} = \tfrac{1}{2}rkT,$$


where r is the number of quadratic terms entering into the expression for the
energy. Thus it turns out that all degrees of freedom of a molecule are equivalent:
each quadratic term in the energy gives a contribution of ½kT to the
mean energy of the molecule (the law of equipartition over degrees of freedom).
The law of equipartition is a very general law. In deriving it we have
not made any special assumptions, but have only assumed that the laws of 
statistical physics are valid and that the motion of the molecule obeys the 
laws of classical mechanics. 
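The counting of quadratic terms described above is easily mechanized. The following sketch is an illustration only; the function names `quadratic_terms` and `mean_energy_in_kT` are hypothetical and introduced here for convenience. Each translational or rotational degree of freedom is counted as one quadratic term and each vibrational degree of freedom as two.

```python
def quadratic_terms(n_atoms, linear):
    """Number r of quadratic terms in the classical energy of a molecule.

    A single atom has only the three translational terms. For molecules,
    `linear` selects two rotational degrees of freedom instead of three
    (a diatomic molecule is always linear).
    """
    if n_atoms < 1:
        raise ValueError("a molecule needs at least one atom")
    if n_atoms == 1:
        return 3
    rotational = 2 if linear else 3
    vibrational = 3 * n_atoms - 3 - rotational
    return 3 + rotational + 2 * vibrational


def mean_energy_in_kT(n_atoms, linear):
    """Classical mean energy per molecule in units of kT (equipartition)."""
    return 0.5 * quadratic_terms(n_atoms, linear)


for name, n_atoms, linear in (
    ("monatomic gas", 1, False),
    ("diatomic molecule", 2, True),
    ("linear triatomic molecule (CO2 type)", 3, True),
    ("non-linear triatomic molecule (H2O type)", 3, False),
):
    r = quadratic_terms(n_atoms, linear)
    print(f"{name}: r = {r}, mean energy = {mean_energy_in_kT(n_atoms, linear)} kT")
```

The two triatomic cases reproduce the values 6.5 kT and 6 kT found above.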

It should in addition be noted that the formulation of this important law 
is rather inappropriate, since it underlines the difference between vibrational 
and other degrees of freedom. 




Since, however, the formulation presented is generally adopted, we shall 
make use of it in what follows. Moreover, we shall often for brevity call the 
number of quadratic terms in the energy the number of degrees of freedom. 

Knowing the mean energy of a gas molecule, and taking into account that
all molecules in an ideal gas are completely identical and equivalent, we can
easily find the mean energy of the gas as a whole. If the gas has N molecules
then the mean energy of the gas is equal to


$$E = N\bar{\varepsilon} = N\cdot\tfrac{1}{2}rkT. \tag{39.26}$$


The heat capacity of a gas at constant volume $C_V$ is equal to


$$C_V = \left(\frac{\partial E}{\partial T}\right)_V = \tfrac{1}{2}Nkr. \tag{39.27}$$


In particular, for one gram-mole of a gas the heat capacity at constant
volume is equal to


$$C_V = \tfrac{1}{2}rR. \tag{39.28}$$
Correspondingly, the heat capacity at constant pressure is equal to
$$C_p = C_V + R = \tfrac{1}{2}(r + 2)R. \tag{39.29}$$


Thus, the heat capacity of ideal gases turns out to be independent of the
temperature, and is determined solely by the structure of the molecule. For
monatomic gases (r = 3) the heat capacity at constant volume, calculated according
to formula (39.28), is equal to $C_V = \tfrac{3}{2}R = 12.47$ J mol⁻¹ K⁻¹.
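The classical molar heat capacities follow from the number r of quadratic terms alone. As a small numerical illustration (the value of R is the standard molar gas constant; the function name `molar_heat_capacities` is our own), one may tabulate $C_V = \tfrac{1}{2}rR$ and $C_p = C_V + R$ for a few values of r obtained by the counting described earlier:

```python
R = 8.314462618  # molar gas constant, J mol^-1 K^-1


def molar_heat_capacities(r):
    """Return (C_V, C_p) in J mol^-1 K^-1 for r quadratic terms in the energy."""
    c_v = 0.5 * r * R
    c_p = c_v + R
    return c_v, c_p


# r = 3: monatomic gas; r = 7: diatomic molecule with vibration;
# r = 13: linear triatomic molecule; r = 12: non-linear triatomic molecule.
for r in (3, 7, 12, 13):
    c_v, c_p = molar_heat_capacities(r)
    print(f"r = {r:2d}: C_V = {c_v:6.2f}, C_p = {c_p:6.2f}, C_p/C_V = {c_p / c_v:.3f}")
```

These are the values that classical statistics predicts; how well they agree with experiment is a separate question.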

For comparison with experiment, table 3 in §37 presents the measured
values of the heat capacities of certain monatomic gases at constant volume
and different temperatures. From table 3 it is seen that the theoretical
predictions are well justified experimentally: the heat capacity of monatomic
gases is constant over a wide temperature interval and has almost exactly the
theoretical value.

For diatomic gases the situation is quite different. According to theoretical
predictions, the heat capacity of diatomic gases should be equal to


$$C_V = \tfrac{7}{2}R = 29.1\ \text{J mol}^{-1}\,\text{K}^{-1}. \tag{39.30}$$


However, experiment shows that diatomic gases do not in reality possess
such a large heat capacity. Moreover, it turns out that the heat capacity of
diatomic gases depends on the temperature. This dependence is illustrated by 
fig. III.20. The general character of the dependence of the heat capacity on
the temperature can be defined in the following way. Although at very high 
temperatures the heat capacity does not reach the theoretical value (39.30), 
it tends to it. As the temperature decreases the heat capacity also decreases 
and tends to the value 


$$C_V = \tfrac{5}{2}R = 20.8\ \text{J mol}^{-1}\,\text{K}^{-1}. \tag{39.31}$$


This value would be possessed by a diatomic molecule with an absolutely 
rigid link between the atoms, for which any vibrational motion is impossible. 
This vanishing of vibrational motion is completely inexplicable from the point 
of view of classical mechanics. From this standpoint, as we have stressed 
more than once, all degrees of freedom are completely equivalent. The 
vanishing of small vibrations as the temperature drops is in sharp contradiction
with the basic propositions of classical mechanics. An even more striking
example of such a contradiction is provided by the behaviour of hydrogen at
low temperatures. Namely, as is seen from fig. III.21, as the temperature
decreases so does the heat capacity of hydrogen down to the value (3/2)R, equal
to the value of the heat capacity of a monatomic gas. Thus, at low temperatures
not only the vibrational motion but also the rotational motion vanishes
in hydrogen molecules. A diatomic molecule can then perform only a
translational motion.

[Fig. III.20. Heat capacity of diatomic gases, in units of Nk, as a function of the temperature T (K).]

[Fig. III.21. Heat capacity C/Nk of hydrogen as a function of the temperature T (K).]

From the point of view of our usual notions it is quite incomprehensible 
why an extended body, such as a diatomic molecule, can lose the ability to 
rotate. The contradiction between this fact and obvious ideas based on the 
laws of classical mechanics is even more evident than in the case of the vanish- 
ing of vibrations. 

All that has been said about diatomic molecules holds also for polyatomic
molecules. The part of the energy corresponding to vibrational degrees of
freedom is always considerably smaller than that to be expected from the law
of equipartition. For example, in the case of the linear molecule CO₂ the
vibrational heat capacity should be equal to 4R ≈ 33 J mol⁻¹ K⁻¹. In fact, it
amounts to about 3.2 J mol⁻¹ K⁻¹ at room temperature, and increases up to
a value of 24 J mol⁻¹ K⁻¹ at very high temperatures. Analogously, CH₄ molecules,
possessing nine vibrational degrees of freedom to which, according to
the law of equipartition, there should correspond a heat capacity of
9R ≈ 75 J mol⁻¹ K⁻¹, have a vibrational heat capacity which does not exceed
13.8 J mol⁻¹ K⁻¹.


Thus, experiment points to the inapplicability of the law of equipartition
over degrees of freedom. But, as we have stressed, this law is based on only
two assumptions: the assumption of the applicability of general statistical 
laws to simple molecular systems and the assumption of the applicability of 
the laws of classical mechanics to the description of the motion of individual 
molecules. Since the validity of the first assumption is beyond any doubt, the 
disagreement with experiment to which the law of equipartition leads shows 
that the second assumption is incorrect. In reality the motions of individual 
molecules obey the laws of quantum mechanics. The statistics of molecular 
systems moving according to the laws of quantum mechanics will be discussed 
below. 


§40. The thermodynamic functions of a system which can be in two quantum 
states 


Before going on to the consideration of more complex diatomic and poly- 
atomic molecules, we have to consider the general aspect of the properties of 
a system which can be in two quantum states. We shall not specify the nature 
of these quantum levels. In the following paragraphs we shall see that these 
can be the quantum levels of the energy of rotational or vibrational motion. 
Sometimes they may also have a different nature. 

Let us find the partition function of such a system. By definition,


$$z = \sum_i e^{-\varepsilon_i/kT}\,g(\varepsilon_i), \tag{40.1}$$


where $\varepsilon_i$ are the quantum energy levels and $g(\varepsilon_i)$ is the number of states of
the particle whose energy is equal to $\varepsilon_i$. If g is different from unity, so that
to one value of the energy of the system there correspond several different
states, then the latter are called degenerate states, and their number is called
the statistical weight of the energy level $\varepsilon_i$. In our case, where, for simplicity,
we confine ourselves to two energy levels, the index i runs over the values
0, 1. We denote $g(\varepsilon_0)$ by $g_0$, and $g(\varepsilon_1)$ by $g_1$. Then


$$z = g_0 e^{-\varepsilon_0/kT} + g_1 e^{-\varepsilon_1/kT} = g_0 e^{-\varepsilon_0/kT}\left(1 + \frac{g_1}{g_0}\,e^{-(\varepsilon_1 - \varepsilon_0)/kT}\right). \tag{40.2}$$
If the energy is expressed in thermal units of kT, then one can write that


$$\varepsilon_1 - \varepsilon_0 = kT_e, \tag{40.3}$$


where $T_e$ is a certain temperature corresponding to the difference $\varepsilon_1 - \varepsilon_0$.
By means of (40.3) the expression (40.2) can be written in the form


$$z = g_0 e^{-\varepsilon_0/kT}\left(1 + \frac{g_1}{g_0}\,e^{-T_e/T}\right). \tag{40.4}$$


From the expressions (40.2) and (40.4) we see that, if the energy difference
between the excited and ground state is so large that at a temperature
T the inequality $\varepsilon_1 - \varepsilon_0 \gg kT$ or $T_e \gg T$ holds, then the second term in
(40.4) can be disregarded, so that


$$z = g_0 e^{-\varepsilon_0/kT}. \tag{40.5}$$


Physically this means that at a given temperature the probability for the
system to get into an excited state with energy $\varepsilon_1$ is very small. The temperature
is too low for thermal excitation to bring the system into an upper
energy state with an appreciable probability. If, however, only one term
enters into the partition function, so that the system is, with a probability
equal to unity, in a state with an energy $\varepsilon_0$, then its energy is exactly equal
to $\varepsilon_0$. This is confirmed by a direct calculation.

The partition function of a system consisting of N independent identical
particles is


$$Z = \frac{1}{N!}\left(g_0 e^{-\varepsilon_0/kT}\right)^N.$$
According to (21.3), the energy of the system is equal to
$$E = kT^2\,\frac{\partial \ln Z}{\partial T} = N\varepsilon_0.$$
The heat capacity of the system at constant volume is


$$C_V = \left(\frac{\partial E}{\partial T}\right)_V = 0. \tag{40.6}$$


Thus, we see that at $T \ll T_e$ the presence of the second level does not
affect the thermodynamic properties of the system at all, and it behaves as a 
system with constant energy. The heat capacity of the system at a sufficiently 
low temperature is equal to zero. 







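The behaviour, as the temperature rises, of the heat capacity of the system
with two energy levels is of interest. If the inequality $T_e \gg T$ is not fulfilled,
then one has to retain two terms in the partition function, writing


$$z = g_0 e^{-\varepsilon_0/kT}\left(1 + \frac{g_1}{g_0}\,e^{-\Delta\varepsilon/kT}\right),$$


where $\Delta\varepsilon = \varepsilon_1 - \varepsilon_0$. In this case


$$Z = \frac{1}{N!}\left(g_0 e^{-\varepsilon_0/kT}\right)^N\left(1 + \frac{g_1}{g_0}\,e^{-\Delta\varepsilon/kT}\right)^N.$$


The energy of the system is


$$E = kT^2\,\frac{\partial \ln Z}{\partial T} = N\varepsilon_0 + \frac{N(g_1/g_0)\,\Delta\varepsilon\,e^{-\Delta\varepsilon/kT}}{1 + (g_1/g_0)\,e^{-\Delta\varepsilon/kT}}. \tag{40.7}$$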


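Finally, the heat capacity at a constant volume is


$$C_V = \left(\frac{\partial E}{\partial T}\right)_V = Nk\,\frac{g_1}{g_0}\left(\frac{\Delta\varepsilon}{kT}\right)^2\frac{e^{-\Delta\varepsilon/kT}}{\left[1 + (g_1/g_0)\,e^{-\Delta\varepsilon/kT}\right]^2} = Nk\,\frac{g_1}{g_0}\left(\frac{T_e}{T}\right)^2\frac{e^{-T_e/T}}{\left[1 + (g_1/g_0)\,e^{-T_e/T}\right]^2}. \tag{40.8}$$


[Fig. III.22. Heat capacity of the system with two levels as a function of $T/T_e$.]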


The form of the heat capacity is shown in fig. III.22. From fig. III.22 it is
seen that this form is unusual: at T = 0 the heat capacity is zero, in agreement
with what was said earlier. At higher temperatures the heat capacity increases 
and has a characteristic maximum. As the temperature rises further the heat 
capacity again reduces to zero. This latter fact represents a characteristic 
feature of a system with two levels. The reason why Cy reduces to zero is 
understandable from formula (40.7). At a very high temperature the energy 
of the system is equal to 











$$E \approx N\varepsilon_0 + \frac{N(g_1/g_0)(\varepsilon_1 - \varepsilon_0)}{1 + g_1/g_0} = \text{const} \tag{40.9}$$


and does not depend on temperature. Physically this means that at $T \gg T_e$ the
thermal excitation is so large that the system may with equal ease be in the 
ground state or in an excited state. The probability for the system to be in an 
excited state is comparable to the probability of being in the ground state. 
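The shape of the heat capacity of a system with two levels can be explored numerically. The sketch below evaluates the heat capacity per particle in units of k as a function of t = T/T_e, using the standard two-level expression that follows from the partition function above, and locates the maximum on a grid. Equal statistical weights, the grid and the function name `two_level_heat_capacity` are illustrative assumptions.

```python
import math


def two_level_heat_capacity(t, g0=1.0, g1=1.0):
    """C_V/(N k) of a two-level system as a function of t = T/T_e.

    Implements (g1/g0) x^2 exp(-x) / [1 + (g1/g0) exp(-x)]^2 with x = T_e/T.
    """
    if t <= 0.0:
        return 0.0
    x = 1.0 / t
    if x > 700.0:  # exp(-x) underflows; the heat capacity is zero to machine accuracy
        return 0.0
    a = g1 / g0
    boltzmann = math.exp(-x)
    return a * x * x * boltzmann / (1.0 + a * boltzmann) ** 2


grid = [0.01 * i for i in range(1, 501)]  # t from 0.01 to 5.00
t_peak = max(grid, key=two_level_heat_capacity)
c_peak = two_level_heat_capacity(t_peak)

print(f"maximum near T/T_e = {t_peak:.2f}, C_V/Nk = {c_peak:.3f}")
for t in (0.05, 0.2, 1.0, 5.0, 50.0):
    print(f"T/T_e = {t:5.2f}: C_V/Nk = {two_level_heat_capacity(t):.4f}")
```

The heat capacity vanishes on both the low-temperature and the high-temperature side and passes through a single maximum at a temperature of the order of T_e; changing the ratio of the statistical weights shifts the maximum but does not alter this behaviour.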

If the system had other excited levels, ranging up to very large energies, 
this would not hold. Even at a high temperature there would be energy levels 
which the system would only have a small probability of occupying. Hence 
the mean energy of such a system would not be expressed by a formula of 
the type of (40.9) and, at a high temperature, would vary with temperature. 

Correspondingly, the heat capacity of the system would not reduce to 
zero at a high temperature. 

The form of the heat capacity, with a maximum and the reduction to zero
on the low- and high-temperature sides, is typical of a system with levels
lying in a finite energy interval. The presence of two levels simplifies the
calculation, but it is not essential. A similar form of the heat capacity also
occurs in systems with several levels. It is of importance only that they lie
sufficiently close to each other, so that the temperature at which the
condition $kT \gg \varepsilon_i - \varepsilon_0$ is fulfilled may be reached. Typical examples of
atoms with two closely lying levels are atoms of the halogens and the alkali
metals. In the case of halogens the lowest level is four-fold degenerate, $g_0 = 4$,
and the excited level lying closest to it is two-fold degenerate, $g_1 = 2$. The
distance between the levels is $T_e = 582.7$ K for fluorine, $T_e = 1299$ K for
chlorine, and $T_e = 5275$ K for bromine. The next energy level lies much
higher: the corresponding temperature amounts to several tens of thousands
of degrees (for example, for bromine it amounts to about 88 × 10³ K) and
gives practically no contribution to the heat capacity.
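The halogen data quoted above can be put into the two-level formulae directly. The following sketch computes, for an assumed temperature of 300 K, the probability of finding an atom in the excited level and the corresponding contribution to the heat capacity per atom. The probability is taken from the Gibbs distribution with statistical weights, w₁ = g₁e^{-ε₁/kT}/z; the function names and the choice of 300 K are illustrative assumptions.

```python
import math

# Level spacings T_e in kelvin, as quoted in the text.
HALOGENS = {"fluorine": 582.7, "chlorine": 1299.0, "bromine": 5275.0}
G0, G1 = 4, 2  # statistical weights of the ground and excited levels


def excited_fraction(temperature, t_e, g0=G0, g1=G1):
    """Probability of the excited level for a two-level atom (temperature > 0)."""
    boltzmann = (g1 / g0) * math.exp(-t_e / temperature)
    return boltzmann / (1.0 + boltzmann)


def heat_capacity_per_atom(temperature, t_e, g0=G0, g1=G1):
    """Two-level contribution to C_V per atom in units of k (temperature > 0)."""
    x = t_e / temperature
    boltzmann = (g1 / g0) * math.exp(-x)
    return x * x * boltzmann / (1.0 + boltzmann) ** 2


T = 300.0  # assumed temperature, K
for name, t_e in HALOGENS.items():
    w1 = excited_fraction(T, t_e)
    c = heat_capacity_per_atom(T, t_e)
    print(f"{name:9s}: excited fraction = {w1:.3e}, C_V/Nk = {c:.3e}")
```

At this temperature only fluorine, whose levels lie closest together, has an appreciable population in the excited level; for bromine the upper level is practically empty.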


§41. Diatomic molecules 


The simplest molecules are diatomic molecules, representing a stable 
combination of two identical or dissimilar atoms. We cannot here discuss in 
detail the problem of the nature of the forces leading to the formation of 
molecules from free atoms nor describe in detail the motion of atoms in 
molecules. We shall confine ourselves to the most superficial characteristics 
of molecules, presenting only those data which we shall need in what follows. 
(Ch. 10 of Part V is devoted to the theory of molecules.) 

Fig. III.23 shows a typical curve representing the energy of interaction of
the electron shells of atoms as a function of the distance between them. The
potential energy of interaction has a minimum at a certain point denoted by
$r_0$. To the right of it, at large distances, the slope of the tangent to the
curve and, consequently, the force of interaction is positive. This means that
the atoms attract each other. To the left of the point $r_0$, in the region where
the electron shells are overlapping, a strong repulsion between the atoms arises.

[Fig. III.23. Energy of interaction of the atoms as a function of the distance between them.]

Thus there corresponds to the position of stable equilibrium in a molecule
a definite distance between the nuclei of the atoms, which can be called the
diameter of the molecule (see table 4 below). The energy $U(r_0)$ of the electron
shells has a minimum value. If the distance between the nuclei changes
by a small value x, then the energy of the molecule becomes equal to
$U(r_0 + x)$. For small values of x it can be expanded in a series in powers of x
and one can restrict oneself to the first terms of the expansion:


$$U(r_0 + x) \approx U(r_0) + \left(\frac{\partial U}{\partial r}\right)_{r=r_0} x + \frac{1}{2}\left(\frac{\partial^2 U}{\partial r^2}\right)_{r=r_0} x^2 = U(r_0) + \tfrac{1}{2}Kx^2 \tag{41.1}$$


(at the point of the minimum the first derivative is equal to zero, and the 
second derivative is positive). Formula (41.1) shows that when the atoms 
depart from the equilibrium position they are acted upon by a quasi-elastic 
force bringing them back into the equilibrium position. From this it is clear 
that in a molecule, in addition to the motion of electrons in atomic shells, 
the vibration of the atoms about the equilibrium position is also possible. 
Moreover, a molecule as a whole can rotate about two axes perpendicular 
to the straight line connecting the nuclei. 
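The quality of the quadratic approximation can be judged on a model curve. The sketch below uses a Morse potential purely as a stand-in for the interaction energy; this model, its parameters D, A and R0, and all function names are hypothetical choices made for illustration and are not taken from the text. The constant K is estimated as the second derivative at the minimum by a central difference and the harmonic expression is compared with the model curve.

```python
import math

# Hypothetical model parameters (arbitrary units): well depth, range, minimum.
D, A, R0 = 4.5, 2.0, 1.0


def morse(r):
    """Model interaction energy with a minimum of depth D at r = R0."""
    return D * (1.0 - math.exp(-A * (r - R0))) ** 2 - D


def second_derivative(f, r, h=1e-4):
    """Central-difference estimate of f''(r)."""
    return (f(r + h) - 2.0 * f(r) + f(r - h)) / (h * h)


K = second_derivative(morse, R0)  # for this model the exact value is 2*D*A**2


def harmonic(x):
    """Quadratic approximation U(r0) + K x^2 / 2 to the model curve."""
    return morse(R0) + 0.5 * K * x * x


print(f"K (numerical) = {K:.4f}, K (exact for the model) = {2.0 * D * A ** 2:.4f}")
for x in (0.01, 0.05, 0.2, 0.5):
    exact = morse(R0 + x)
    approx = harmonic(x)
    print(f"x = {x:4.2f}: model = {exact:8.5f}, harmonic = {approx:8.5f}")
```

For small displacements the two expressions nearly coincide, while for larger x the anharmonicity of the model curve becomes noticeable, which is why the quasi-elastic picture is restricted to small vibrations.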

Thus, the energy of a molecule can be considered as made up of the energy 
of translational motion of the molecule as a whole in space, the energy of 
motion of the electrons, the vibrational energy and the energy of rotation of 
the molecule. The translational motion of a diatomic molecule differs in no 
way from that of a monatomic molecule. Hence we shall be interested only 
in the internal motion of a diatomic molecule. Its internal energy can be 
written in the form 


$$\varepsilon_{\text{int}} = \varepsilon_{\text{el}} + \varepsilon_{\text{vib}} + \varepsilon_{\text{rot}}, \tag{41.2}$$


where $\varepsilon_{\text{el}}$ is the energy of motion of the electrons, $\varepsilon_{\text{vib}}$ is the energy of
vibrations, and $\varepsilon_{\text{rot}}$ is the energy of rotation. The internal motion of the molecule
turns out to be quantized. The energies $\varepsilon_{\text{el}}$, $\varepsilon_{\text{vib}}$ and $\varepsilon_{\text{rot}}$ take on a discrete
series of values. The spacing $\Delta\varepsilon_{\text{el}}$ between neighbouring energy levels of the
electrons in the molecule turns out to be much larger than the spacing $\Delta\varepsilon_{\text{vib}}$
between neighbouring energy levels of the vibrational motion. In its turn the
spacing $\Delta\varepsilon_{\text{vib}}$ between neighbouring energy levels of the vibrational motion
is very large in comparison with the spacing $\Delta\varepsilon_{\text{rot}}$ between neighbouring
energy levels of the rotational motion. Thus,


$$\Delta\varepsilon_{\text{el}} \gg \Delta\varepsilon_{\text{vib}} \gg \Delta\varepsilon_{\text{rot}}. \tag{41.3}$$


[Fig. III.24. Schematic arrangement of the electronic, vibrational and rotational energy levels of a molecule.]


Hence the energy levels of a molecule are distributed in the way shown 
schematically in fig. III.24. In this drawing the vibrational and rotational 
levels pertaining to two electron levels, l and l′, are presented schematically.
The spacing between the latter is large and is represented by dashed-dotted 
lines to show that there is not sufficient room for it in the scale of the 
drawing. Bold lines correspond to vibrational states with quantum numbers 
v= 0, 1, 2. Fine lines correspond to different rotational levels with quantum 
numbers J = 1, 2, 3, 4. Since the motion of electrons is much faster than that 
of heavy nuclei (for the vibrations and rotation of the molecule as a whole),
it can be assumed in the first approximation that the motion of the nuclei 
does not affect that of the electrons. 

Further, if the amplitude of vibration of the nuclei of the molecule is suf- 
ficiently small, the effect of the vibrational motion on the rotation can be 
disregarded. For a small vibrational amplitude the change in the distances 
between the nuclei is so small that the corresponding change in the moment
of inertia of the molecule is very small and can be ignored. The rotation will
proceed with a constant moment of inertia, as though there were no vibra- 
tions. Thus, to a first approximation all three forms of motion in a molecule 
can be considered to be independent of each other. 

It should be noted that the accuracy of contemporary methods of meas- 
urement is such that for many purposes the calculations based on the con- 
cept of the independent rotational and vibrational motion turn out to be in- 
sufficiently accurate. In contemporary theory one has to take into account 
the change in the moment of inertia of the molecule due to its vibration. 

The spacing between neighbouring energy levels of the electron motion, 
as between energy levels in atoms, is of the order of several electron-volts, 
which corresponds to a temperature of several thousand degrees. In order to 
bring a molecule from one electronic level to another the corresponding 
energy must be imparted to it. This is only possible at very high temperatures 
(or for non-thermal effects on the molecule, for example, when it is irra- 
diated by light, bombarded by fast electrons, etc.). As a rule, however, it can 
be assumed that these sources of excitation of the electronic motion are 
absent and that molecules are in the lowest electron energy level. In what 
follows we shall confine ourselves to the study of this case. Thus, in
considering the thermal motion of molecules it is possible to disregard the
excited electron energy levels.

We shall now consider the vibrational motion of a diatomic molecule. The
oscillations of the two nuclei about the equilibrium distance can be reduced
to the oscillatory motion of one material point with a reduced mass
$\mu = m_1 m_2/(m_1+m_2)$. Such a material point represents a linear oscillator,
considered in §1. For a sufficiently small vibration amplitude it can be
considered to be a harmonic oscillator. The energy of a harmonic oscillator
takes on a discrete series of values given by formula (1.18):

$$\epsilon_{vib} = h\nu\left(n+\tfrac{1}{2}\right),$$

where the quantum number $n$ takes on a series of integer values,
$n = 0, 1, 2, \ldots$; $\nu$ is the classical frequency, connected with the
constant of the quasi-elastic force $\kappa$ and the mass of the oscillator by
the usual relation

$$\nu = \frac{1}{2\pi}\left(\frac{\kappa}{\mu}\right)^{1/2}.$$
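As a rough numerical check of this relation, the sketch below evaluates the vibrational wavenumber of CO from the force constant listed in table 4. The atomic masses (12.000 and 15.995 atomic mass units) and the physical constants are standard values supplied here as assumptions; they are not quoted in the text.

```python
import math

AMU_G = 1.66054e-24      # atomic mass unit in grams (assumed standard value)
C_CM_S = 2.99792458e10   # speed of light in cm/s

def reduced_mass_g(m1_u, m2_u):
    """Reduced mass mu = m1*m2/(m1+m2), converted from atomic mass units to grams."""
    return m1_u * m2_u / (m1_u + m2_u) * AMU_G

def wavenumber_cm(kappa_dyn_cm, mu_g):
    """nu/c in cm^-1, with nu = sqrt(kappa/mu) / (2*pi)."""
    nu_hz = math.sqrt(kappa_dyn_cm / mu_g) / (2.0 * math.pi)
    return nu_hz / C_CM_S

# CO: force constant 18.6e5 dyne/cm (table 4); the atomic masses are assumed inputs.
mu_co = reduced_mass_g(12.000, 15.995)
nu_co = wavenumber_cm(18.6e5, mu_co)
print(f"CO: nu/c = {nu_co:.0f} cm^-1 (table 4 lists 2169 cm^-1)")
```

With these inputs the computed wavenumber agrees with the tabulated value to within a few per cent, which is about what the rounding of the force constant allows.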


All levels of the oscillator are non-degenerate, so that to each value of the
quantum number $n$ there corresponds a definite energy $\epsilon_{vib}$. The
energy difference between neighbouring levels of the vibrational motion is
equal to

$$\Delta\epsilon_{vib} = h\nu\left(n+\tfrac{1}{2}\right) - h\nu\left(n-1+\tfrac{1}{2}\right) = h\nu$$

and does not depend on the quantum number $n$; the energy levels are equally
spaced.

According to the Bohr frequency rule, when a system passes over from
one energy level to another a photon with energy $h\nu$ is emitted or absorbed.
In quantum mechanics it turns out that the change in the quantum number
$n$ obeys the so-called selection rule:

$$\Delta n = \pm 1.$$

Measuring the frequencies of the light absorbed or emitted by the molecules,
one can determine the natural frequency $\nu$ of the molecule and find the
constant of the quasi-elastic force $\kappa$ and the energy difference
$\Delta\epsilon_{vib}$. The values of these quantities for several molecules are
presented in table 4. The radiation emitted or absorbed by molecules as their
vibrational state is changed (for a fixed value of the electron energy) lies in
the infrared spectral region *, as a rule between 100 and 4000 cm$^{-1}$.


Table 4
Basic quantities characterizing the properties of diatomic molecules

Columns: $r$, distance between the atoms ($10^{-8}$ cm); $I$, moment of inertia
($10^{-40}$ g cm$^2$); $\nu/c$, vibrational frequency (cm$^{-1}$);
$B/c = h/8\pi^2 Ic$, rotational constant (cm$^{-1}$); $\kappa$, constant of the
quasi-elastic force ($10^5$ dyne cm$^{-1}$); $D$, dissociation energy (eV).

Molecule    r       I        ν/c     B/c      κ       D
H2          0.74    0.46     4276    59.35    5.1     4.48
N2          1.10    13.84    2360    2.00     22.2    7.38
O2          1.21    19.13    1580    1.45     11.3    5.08
Cl2         1.99    113.5    565     0.24     3.21    2.47
HCl         1.27    2.67     2989    10.6     8.65    4.40
CO          1.13    14.37    2169    1.92     18.6    9.61
NO          1.15    16.43    1906    1.68     15.4    5.29





The formulae given for the energy of a vibrating molecule are valid only
in the approximation of small vibrations. For a high excitation (for example,
at a high temperature) their amplitude becomes large, and one must take
into account anharmonic terms in the potential energy.

* It should be noted that this does not refer to symmetric molecules of the
type of H2 and O2, for which there are no such transitions. For these molecules
$\Delta\epsilon_{vib}$ is determined from transitions with a simultaneous change
of electron states.

We shall now consider the rotational motion of a diatomic molecule. If the
change in the moment of inertia of the molecule due to vibrations is
disregarded, then the molecule can be considered as a rigid rotator with a
moment of inertia $I = m_1 m_2 r^2/(m_1+m_2)$ rotating about the centre of mass.
As is shown in quantum mechanics (Part V, §81), the energy of a rotating
body is expressed by the formula

$$\epsilon_{rot} = \frac{h^2}{8\pi^2 I}J(J+1) = hBJ(J+1), \qquad (41.4)$$

where $J$ is a quantum number taking on integer values, $J = 0, 1, 2, \ldots$,
and $B = h/8\pi^2 I$ is a constant called the rotational constant. It turns out
that the states of a rotator corresponding to a given rotational energy are
$(2J+1)$-fold degenerate (see §30 and §81 of Part V).

When the quantum state of the molecule changes, the quantum number $J$
changes by an amount $\Delta J = \pm 1$. The spacing between the neighbouring
rotational energy levels $J$ and $J+1$ is equal to

$$\Delta\epsilon_{rot} = \frac{2h^2}{8\pi^2 I}(J+1).$$


By observing the radiation in the transition between rotational levels, one can
determine the value of $\Delta\epsilon_{rot}$ and, consequently, the moment of
inertia $I$ of the molecule. The values of these quantities for some molecules
can be calculated by means of table 4. Substituting into the expressions for
$\epsilon_{rot}$ and $\epsilon_{vib}$ the values of the constants given in the
fourth and fifth columns of the table, we verify the validity of the assumption
that $\Delta\epsilon_{vib} \gg \Delta\epsilon_{rot}$ ($\Delta\epsilon_{rot}$ is
smaller than $\Delta\epsilon_{vib}$ by a factor of 800 to 1000). It should be
noted that in practice one seldom manages to observe the transitions of a
molecule between different rotational levels for an unchanged electron state
and an unchanged vibrational state, since $\Delta\epsilon_{rot}$ is so small
that the corresponding frequencies $\nu = \Delta\epsilon_{rot}/h$ lie in the far
infrared spectral region, where the accuracy of measurements is poor. The
emission spectra and absorption spectra of molecules in the visible spectral
region are those most often observed. These spectra arise when the vibrational,
rotational and electronic states of a molecule are changed simultaneously. The
emission (or absorption) spectrum has in this case the character of a group of
close spectral lines merging in a low-resolution spectroscope into continuous
bands (molecular band spectrum). The origin of the bands is easily understood
from fig. III.24.
Let, for example, a transition take place from upper levels to the lowest 
level. The fundamental frequency is emitted in the transition from the level 
2, v= 0, J= 1. Frequencies close to the fundamental frequency are emitted 
in the transition from the levels 2, v= 1, J= 0; v= 2, J = 0 and so on. Thus, 
when the vibrational, rotational and electronic states of a molecule are 
changed simultaneously a number of frequencies lying close to each other 
are emitted (since the inequality (41.3) is fulfilled). The totality of spec- 
troscopic data has made it possible to establish the position of energy levels 
for a very large number of diatomic molecules. 
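The comparison of vibrational and rotational spacings made above can be sketched numerically from the fourth and fifth columns of table 4. The snippet below simply forms the ratio of the vibrational quantum $h\nu$ to the rotational unit $hB$ for each molecule; note that the spacing between rotational levels $J$ and $J+1$ is $2hB(J+1)$, so the ratio of actual spacings is smaller by that factor. The dictionary and function names are illustrative only.

```python
# (nu/c, B/c) in cm^-1, taken from the fourth and fifth columns of table 4.
TABLE_4 = {
    "H2":  (4276.0, 59.35),
    "N2":  (2360.0, 2.00),
    "O2":  (1580.0, 1.45),
    "Cl2": (565.0, 0.24),
    "HCl": (2989.0, 10.6),
    "CO":  (2169.0, 1.92),
    "NO":  (1906.0, 1.68),
}

def spacing_ratio(nu_over_c, b_over_c):
    """Ratio of the vibrational quantum h*nu to the rotational unit h*B."""
    return nu_over_c / b_over_c

ratios = {name: spacing_ratio(nu, b) for name, (nu, b) in TABLE_4.items()}
for name, ratio in ratios.items():
    print(f"{name:>3}: h*nu / h*B = {ratio:7.0f}")
```

For N2, O2, CO and NO the ratio comes out of the order of $10^3$; for the light molecules H2 and HCl it is considerably smaller, although the inequality (41.3) still holds.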


§42. Thermodynamic functions of diatomic gases 


We can now pass on to the consideration of the heat capacities and 
thermodynamic functions of diatomic gases, the calculation of which was 
insuperably difficult for classical statistics. The method of calculation of the 
thermodynamic functions for diatomic gases in no way differs from that 
which we have already considered for monatomic gases. Since the molecules 
are identical, independent particles, the partition function for an entire gas 
containing $N$ molecules can be written in the form


$$Z = \frac{z^N}{N!}. \qquad (42.1)$$


We have to find the partition function for one molecule. The energy of a 
molecule can be divided into the energy of its motion as a whole in space 
and the energy of its internal motion: 


$$\epsilon = \epsilon_{transl} + \epsilon_{int}.$$


Since these two forms of motion are independent, the number of states of the
system corresponding to an energy $\epsilon$ is resolved into the number of
states corresponding to the energy of translational motion and the number
corresponding to the internal motion:

$$\Omega(\epsilon) = \Omega(\epsilon_{transl})\,\Omega(\epsilon_{int}).$$


Accordingly, the partition function can be resolved into the product of two
factors:

$$z = \sum \exp(-\epsilon_{transl}/kT)\,\Omega(\epsilon_{transl}) \cdot \sum \exp(-\epsilon_{int}/kT)\,\Omega(\epsilon_{int}) = z_{transl}\, z_{int}, \qquad (42.2)$$

where $z_{transl}$ is the partition function associated with the translational
motion of the molecule as a whole, and $z_{int}$ is the partition function for
the internal motion.

The partition function for the translational motion of a diatomic molecule
is no different from that of a monatomic molecule, since it moves in space as
a material point with a mass $m = m_1 + m_2$ located at the centre of mass of
the molecule. Hence for the partition function of translational motion one
can write

$$z_{transl} = V\left(\frac{2\pi mkT}{h^2}\right)^{3/2}. \qquad (42.3)$$

The calculation of $z_{int}$ is more complicated. The internal motion of a
diatomic molecule reduces to its rotation with respect to two mutually per- 
pendicular axes and the vibrations of the atoms about the equilibrium 
position. In the first approximation one can neglect the effect of small 
vibrations on the value of the moment of inertia of the molecule and assume 
the vibrational and rotational motions to be independent of each other 
(see §41). One can disregard the electron energy since it remains unchanged. 
Hence, according to (41.2), the energy of internal motion of the molecule 
can be written in the form 


$$\epsilon_{int} = \epsilon_{vib} + \epsilon_{rot}.$$


Accordingly the partition function for the internal motion resolves into the 
product of two factors: 


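$$z_{int} = \sum \exp[-(\epsilon_{vib}+\epsilon_{rot})/kT]\,\Omega(\epsilon_{vib})\,\Omega(\epsilon_{rot}) =$$
$$= \sum \exp(-\epsilon_{vib}/kT)\,\Omega(\epsilon_{vib}) \cdot \sum \exp(-\epsilon_{rot}/kT)\,\Omega(\epsilon_{rot}) =$$
$$= z_{vib}\, z_{rot}. \qquad (42.4)$$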
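The factorization used here holds for any two independent forms of motion: when energies add and statistical weights multiply, the sum over combined states equals the product of the separate sums. A minimal sketch with hypothetical toy levels (arbitrary energy units, not data for any real molecule) illustrates this:

```python
import itertools
import math

def partition_sum(levels, kT):
    """Sum of g * exp(-eps/kT) over (energy, weight) pairs."""
    return sum(g * math.exp(-eps / kT) for eps, g in levels)

# Hypothetical toy levels (energy, statistical weight) in arbitrary units.
vib_levels = [(0.5 + n, 1) for n in range(6)]
rot_levels = [(0.05 * J * (J + 1), 2 * J + 1) for J in range(8)]
kT = 0.7

# Combined system: energies add, weights multiply.
combined = [(ev + er, gv * gr)
            for (ev, gv), (er, gr) in itertools.product(vib_levels, rot_levels)]

z_int = partition_sum(combined, kT)
z_product = partition_sum(vib_levels, kT) * partition_sum(rot_levels, kT)
print(z_int, z_product)
```

The two printed numbers agree to rounding error, whatever finite level lists and temperature are chosen.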


Substituting the expressions for $z$ from (42.2) and (42.4) into (42.1), we
obtain

$$Z = \frac{1}{N!}(z_{transl})^N (z_{vib})^N (z_{rot})^N. \qquad (42.5)$$

From this the expressions for the thermodynamic functions can be found:

$$F = -kT\ln Z = -kT\ln\frac{z_{transl}^N}{N!} - kT\ln z_{rot}^N - kT\ln z_{vib}^N = F_{transl} + F_{vib} + F_{rot}, \qquad (42.6)$$


where $F_{transl}$, $F_{vib}$ and $F_{rot}$ denote the individual components of
the free energy due to the translational, vibrational and rotational motions of
the gas molecules. The expression for $F_{transl}$ is the same as the free
energy of a monatomic gas [formula (37.7)] if in the latter the mass of one atom
is replaced by the total mass of the diatomic molecule. Analogously,

$$E = kT^2\frac{\partial\ln Z}{\partial T} = E_{transl} + E_{vib} + E_{rot}, \qquad (42.7)$$

$$S = -\left(\frac{\partial F}{\partial T}\right)_V = S_{transl} + S_{vib} + S_{rot}, \qquad (42.8)$$

$$C_V = \left(\frac{\partial E}{\partial T}\right)_V = C_{V\,transl} + C_{V\,vib} + C_{V\,rot}. \qquad (42.9)$$


Thus, all the thermodynamic functions are resolved into individual com- 
ponents. Each component corresponds to one of the independent forms of 
motions of the diatomic molecule: the translational, vibrational or rotational 
motion. In order to calculate the thermodynamic functions of a diatomic gas 
it is necessary to find the corresponding partition functions of the internal 
motion. In the following sections we shall consider the partition functions for 
the vibrational and rotational motions. 


§43. The vibrational partition function and the contribution of vibrations to 
the energy and heat capacity 


In the first approximation a vibrating diatomic molecule can be considered
as a quantum harmonic oscillator the energy of which is expressed by formula
(1.18). All energy levels of the oscillator are non-degenerate, i.e. have a
weight $\Omega = 1$. Substituting the expression for the energy (1.18) into the
partition function, we have

$$z_{vib} = \sum_{n=0}^{\infty}\exp\left[-h\nu\left(n+\tfrac{1}{2}\right)/kT\right] = e^{-h\nu/2kT}\sum_{n=0}^{\infty} e^{-h\nu n/kT}. \qquad (43.1)$$


Making use of the well-known formula for the sum of an infinite decreasing
geometric series, we obtain

$$z_{vib} = e^{-h\nu/2kT}\sum_{n=0}^{\infty}\left(e^{-h\nu/kT}\right)^n = \frac{e^{-h\nu/2kT}}{1-e^{-h\nu/kT}}. \qquad (43.2)$$


From formula (43.2) we see that the partition function and, consequently,
the thermodynamic quantities also are determined by the value of the variable
$h\nu/kT$. In the notation of §40 we can, expressing $h\nu$ in energy units,
write $h\nu = kT_c$, where $T_c$ is the so-called characteristic temperature.
Formula (43.2) can be rewritten in the form

$$z_{vib} = \frac{e^{-T_c/2T}}{1-e^{-T_c/T}}.$$
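The closed form can be checked against a direct summation over the oscillator levels. The sketch below works with the dimensionless ratio $T/T_c$; the function names and the cut-off `n_max` are illustrative choices, not part of the text.

```python
import math

def z_vib_closed(t_ratio):
    """Closed form exp(-Tc/2T) / (1 - exp(-Tc/T)), with t_ratio = T/Tc."""
    x = 1.0 / t_ratio
    return math.exp(-x / 2.0) / (1.0 - math.exp(-x))

def z_vib_sum(t_ratio, n_max=2000):
    """Direct summation of exp(-(n + 1/2) Tc/T) over n = 0 .. n_max."""
    x = 1.0 / t_ratio
    return sum(math.exp(-(n + 0.5) * x) for n in range(n_max + 1))

for t_ratio in (0.1, 1.0, 5.0):
    print(t_ratio, z_vib_closed(t_ratio), z_vib_sum(t_ratio))
```

For each temperature the two columns coincide to rounding error, since the neglected tail of the geometric series is vanishingly small for the chosen cut-off.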


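Now we calculate the thermodynamic functions of a diatomic molecule
corresponding to its vibrational motion. We first of all find the mean
vibrational energy:

$$E_{vib} = N\bar{\epsilon}_{vib} = NkT^2\frac{\partial}{\partial T}\ln z_{vib} = NkT^2\frac{\partial}{\partial T}\ln\frac{e^{-T_c/2T}}{1-e^{-T_c/T}} = \frac{NkT_c}{2} + \frac{NkT_c}{e^{T_c/T}-1}. \qquad (43.3)$$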


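For the vibrational heat capacity we find

$$C_{V\,vib} = \left(\frac{\partial E_{vib}}{\partial T}\right)_V = \frac{Nk}{4}\left(\frac{h\nu}{kT}\right)^2\frac{1}{\sinh^2(h\nu/2kT)} = \frac{Nk}{4}\left(\frac{T_c}{T}\right)^2\frac{1}{\sinh^2(T_c/2T)}. \qquad (43.4)$$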





We see that the mean vibrational energy and heat capacity turn out to be
complicated functions of the temperature $T$ and the characteristic temperature
$T_c$ (or the natural frequency $\nu$). Consider the limiting form of these
functions at high temperatures ($T \gg T_c$) and low temperatures ($T \ll T_c$).
In the first case the exponential function can be expanded in a series, and one
can restrict oneself to the first terms of the expansion. This gives

$$E_{vib} \approx NkT, \qquad (43.5)$$

$$C_{V\,vib} \approx Nk. \qquad (43.6)$$

At low temperatures $e^{T_c/T} \gg 1$, so that

$$E_{vib} \approx \tfrac{1}{2}NkT_c + NkT_c\, e^{-T_c/T}, \qquad (43.7)$$

$$C_{V\,vib} \approx Nk\,(T_c/T)^2\, e^{-T_c/T}. \qquad (43.8)$$


Formulae (43.5) and (43.6) agree with the classical formulae of §39. On 
the contrary, at low temperatures the expressions for the energy and heat 
capacity differ fundamentally from the classical ones. As the temperature 
decreases the vibrational energy tends to a constant limit $E_0 = \tfrac{1}{2}NkT_c = \tfrac{1}{2}Nh\nu$.
The latter quantity, representing the vibrational energy of the molecules at 
absolute zero, is called the zero point energy. The existence of the zero point 
energy is a characteristic feature of quantum motion. It is an expression of 
the fact that in quantum theory the concept of a particle at rest is devoid of 
any physical meaning. 
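The limiting forms (43.5) to (43.8) can be compared with the general expressions numerically. The following sketch uses reduced units (energy in units of $NkT_c$, heat capacity in units of $Nk$) and writes (43.3) in the equivalent form $\tfrac{1}{2} + 1/(e^{T_c/T}-1)$ obtained by carrying out the differentiation; the function names are illustrative.

```python
import math

def e_vib(t_ratio):
    """E_vib / (N k Tc) from (43.3): 1/2 + 1/(exp(Tc/T) - 1), t_ratio = T/Tc."""
    x = 1.0 / t_ratio
    return 0.5 + 1.0 / math.expm1(x)

def c_vib(t_ratio):
    """C_V,vib / (N k) from (43.4): (x/2)^2 / sinh^2(x/2), with x = Tc/T."""
    x = 1.0 / t_ratio
    return (x / 2.0) ** 2 / math.sinh(x / 2.0) ** 2

# High temperatures: E -> N k T (ratio close to 1) and C -> N k.
print(e_vib(50.0) / 50.0, c_vib(50.0))

# Low temperatures: compare with the exponential forms (43.7) and (43.8).
x = 10.0  # Tc/T
print(e_vib(1.0 / x), 0.5 + math.exp(-x))
print(c_vib(1.0 / x), x ** 2 * math.exp(-x))
```

At $T = 50\,T_c$ both quantities are within a small fraction of a per cent of their classical values, while at $T = T_c/10$ the energy is indistinguishable from the zero point energy and the heat capacity is already exponentially small.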

The numerical value of $E_0$ can be found from spectroscopic data. For this
it is only necessary to find the value of the natural vibrational frequency
$\nu$ of the molecule. The heat capacity at low temperatures turns out to be a
small quantity decreasing according to an exponential law, i.e. tending to zero
as $T \to 0$. Thus, the general scheme of the variation of the energy and heat
capacity with the temperature reduces to the fact that at high temperatures,
when the thermal energy $kT$ is large in comparison with the spacing
$\Delta\epsilon = h\nu = kT_c$ between the energy levels, the heat capacity and
the energy are given by the classical expressions; at low temperatures the
energy tends to a limiting value, the zero point energy of the quantum
oscillator, while the heat capacity tends to zero. Such behaviour of these
quantities is in agreement with general considerations. At high temperatures
the magnitude of the quantum steps $\Delta\epsilon$ is small in comparison with
the thermal energy, so that the oscillator can be in a large number of excited
quantum states, and its energy can be assumed to vary continuously, as in the
case of a classical oscillator. On the contrary, at low temperatures the
oscillator is almost always in the ground state and thermal excitation is
insufficient to bring it into upper excited states.

The curve showing the dependence of the mean energy of the oscillator
on the ratio $T/T_c$, given by formula (43.3), is shown in fig. III.25. From
fig. III.25 it is seen that, when $T$ approaches $T_c$, a smooth transition
takes place between the limiting values (43.5) and (43.7). The basic difference
between the classical expression for the mean energy of the oscillator (43.5)
and the quantum expression (43.3) lies in the fact that in the latter case the
energy depends on the frequency. Because of this the temperature alone does not
completely characterize the energy of the oscillator. For the same temperature
two oscillators with different natural oscillation frequencies will have
different energies. Fig. III.26 shows the dependence of the energy of the
oscillator on the frequency for a fixed temperature $T$.

The heat capacity decreases smoothly with decreasing temperature from
its classical value (43.6) down to zero. Thus, the vanishing of the vibrational
heat capacity (the “freezing” of oscillations mentioned in §39) appears in
the most direct and natural way in considering the properties of a molecule
as a quantum oscillator.

[Fig. III.25: mean energy of the oscillator as a function of $T/T_c$.]

[Fig. III.26: energy of the oscillator (plotted as $\epsilon_{vib} - \tfrac{1}{2}h\nu$) as a function of the frequency at a fixed temperature.]

At low temperatures the vibrational frequency $\nu$ turns out to be relatively
very large in comparison with $kT/h$. A high frequency corresponds to a
great rigidity of the bond between the two atoms. With decreasing temperature
the increase of the relative rigidity leads to a progressive damping of
oscillations.

So that one may have an idea of the order of magnitude of the quantities
and, in particular, that of the characteristic temperatures of different
molecules, we give in table 5 the corresponding values for a number of
molecules. The values of the natural vibrational frequencies of the molecules
are found from spectroscopic data.

Table 5

Molecule    Characteristic temperature ($10^3$ K)
H2          6.0
N2          3.34
O2          2.23
CO          3.07
HCl         4.14
HBr         3.7
HI          3.2







From table 5 the following conclusion, important from the practical point 
of view, can be drawn directly: since the characteristic temperatures of 
vibrational motion for all the molecules are of the order of several thousand 
degrees, a temperature of the order of 300 K corresponds to the limiting case
$T \ll T_c$. Hence the vibrational heat capacity is very small for most
molecules at room temperature. For example, in the case of HI at 640 K a
calculation according to the general formula (43.4) shows that the vibrational
heat capacity amounts to about 0.33 cal mol$^{-1}$ K$^{-1}$ (about
1.4 J mol$^{-1}$ K$^{-1}$). In most cases of practical
importance at not very high temperatures it can be assumed that the vibrational
motion is damped and that its contribution to the heat capacity is zero. In 
any case it is substantially smaller than that to be expected from the law of 
equipartition. The vibrational part of the heat capacity depends on the tem- 
perature and is different for molecules of different substances. 
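The HI example can be reproduced from formula (43.4). The sketch below assumes $T_c \approx 3.2 \times 10^3$ K for HI (reading the table 5 entry as 3.2) and takes $Nk$ for one mole to be the gas constant $R$; the numerical constants are standard values supplied here, not quoted in the text.

```python
import math

R = 8.314462618  # molar gas constant, J mol^-1 K^-1 (N k for one mole)
CAL_J = 4.184    # joules per thermochemical calorie

def c_vib_molar(temperature, t_c):
    """Molar vibrational heat capacity from (43.4), in J mol^-1 K^-1."""
    x = t_c / temperature
    return R * (x / 2.0) ** 2 / math.sinh(x / 2.0) ** 2

c_hi = c_vib_molar(640.0, 3.2e3)   # Tc of HI as read from table 5 (assumed 3.2e3 K)
print(f"{c_hi:.2f} J/(mol K) = {c_hi / CAL_J:.2f} cal/(mol K)")
print(f"fraction of the classical value R: {c_hi / R:.2f}")
```

Under these assumptions the result is about 1.4 J mol$^{-1}$ K$^{-1}$, that is roughly 0.34 cal mol$^{-1}$ K$^{-1}$, or about 17 per cent of the equipartition value; a figure of 0.33 therefore corresponds to calories rather than joules.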

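We now consider the calculation of other thermodynamic quantities. The
free energy due to the vibrational motion has the form

$$F_{vib} = -NkT\ln z_{vib} = \tfrac{1}{2}Nh\nu + NkT\ln\left(1-e^{-h\nu/kT}\right) = \tfrac{1}{2}NkT_c + NkT\ln\left(1-e^{-T_c/T}\right). \qquad (43.9)$$

Correspondingly, the entropy is

$$S_{vib} = -\frac{\partial F_{vib}}{\partial T} = \frac{Nh\nu}{T}\,\frac{1}{e^{h\nu/kT}-1} - Nk\ln\left(1-e^{-h\nu/kT}\right) = Nk\,\frac{T_c}{T}\,\frac{1}{e^{T_c/T}-1} - Nk\ln\left(1-e^{-T_c/T}\right). \qquad (43.10)$$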


At high temperatures one can carry out the expansion in powers of the
ratio $T_c/T$ and confine oneself to the first terms of the expansion. This
gives

$$F_{vib} \approx E_0 + NkT\ln(T_c/T),$$

$$S_{vib} \approx Nk - Nk\ln(T_c/T).$$

On the contrary, at low temperatures $e^{-T_c/T} \ll 1$, so that

$$F_{vib} \approx E_0, \qquad S_{vib} \approx 0.$$
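As a consistency check on (43.9) and (43.10), the entropy can be compared with a numerical derivative of the free energy, and the energy with $F + TS$. The sketch below uses units in which $Nk = 1$ and $T_c = 1$; the closed form used for the energy is the one obtained by differentiating $\ln z_{vib}$, and the step `h` is an arbitrary small number chosen for illustration.

```python
import math

def f_vib(t, t_c=1.0):
    """F_vib / (N k) from (43.9): Tc/2 + T ln(1 - exp(-Tc/T))."""
    return 0.5 * t_c + t * math.log(-math.expm1(-t_c / t))

def s_vib(t, t_c=1.0):
    """S_vib / (N k) from (43.10)."""
    x = t_c / t
    return x / math.expm1(x) - math.log(-math.expm1(-x))

def e_vib(t, t_c=1.0):
    """E_vib / (N k): Tc/2 + Tc / (exp(Tc/T) - 1)."""
    return 0.5 * t_c + t_c / math.expm1(t_c / t)

t, h = 0.8, 1e-5
s_numeric = -(f_vib(t + h) - f_vib(t - h)) / (2.0 * h)   # S = -dF/dT
print(s_vib(t), s_numeric)
print(e_vib(t), f_vib(t) + t * s_vib(t))                  # E = F + T S
```

Both pairs of numbers agree to within the accuracy of the finite-difference derivative, which confirms that the three expressions are mutually consistent.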


For the practical calculation of the partition function of the vibrational
motion and the thermodynamic quantities it is necessary to know a
characteristic molecular constant: the natural vibrational frequency $\nu$ of
the molecule. Its value for most diatomic molecules is known from spectroscopic
data, in particular from infrared vibrational spectra.

[Fig. III.27: the vibrational thermodynamic functions as functions of $h\nu/kT$, with the high-temperature limit marked.]

The values of the functions involved in formulae (43.3), (43.4), (43.9) 
and (43.10) are tabulated and are found directly from tables. Their depen- 
dence on the ratio hv/kT is shown in fig. III.27. 


§44. The rotational partition function and the contribution of rotation to 
thermodynamic functions 


We shall now consider the partition function for the rotational motion of
a diatomic molecule. The energy of the rotator takes on a discrete series of
values

$$\epsilon_{rot} = \frac{h^2 J(J+1)}{8\pi^2 I}, \qquad (44.1)$$

where $J = 0, 1, 2, 3, \ldots$. Each state with a definite rotation energy,
i.e. with a definite value of the rotation quantum number, turns out to be
$(2J+1)$-fold degenerate. Hence the partition function of the rotational motion
has the form

$$z_{rot} = \sum_{J=0}^{\infty}(2J+1)\exp\left[-\frac{h^2 J(J+1)}{8\pi^2 IkT}\right]. \qquad (44.2)$$

The partition function $z_{rot}$ depends on the ratio
$h^2/8\pi^2 IkT = T_c/T$, where $T_c = h^2/8\pi^2 kI$ is the characteristic
temperature for the rotation. Thus,

$$z_{rot} = \sum_{J=0}^{\infty}(2J+1)\exp\left[-\frac{T_c J(J+1)}{T}\right]. \qquad (44.3)$$


Table 6 gives the values of the characteristic temperatures for the rotation 
of different diatomic molecules. 








Table 6

Molecule    Characteristic temperature (K)
H2          85.4
D2          43
N2          2.85
O2          2.07
HCl         15.1
HI          9.0





From table 6 it is seen that, in contrast to the characteristic temperatures
for the vibrational motion, the characteristic temperatures for the rotational
motion are extremely small and lie considerably below the condensation
point of a gas at normal pressure. Exceptions to the rule are the molecules
H2 and D2, for which the characteristic temperatures are relatively large and
lie above the condensation temperatures. The high characteristic temperatures
of H2 and D2 are due to the smallness of their moments of inertia
($I(\mathrm{H_2}) = \tfrac{1}{2}m_H a^2$ and $I(\mathrm{D_2}) = \tfrac{1}{2}m_D a^2$,
where $m_H$ and $m_D$ are the masses of the proton and deuteron, and $a$ is the
distance between them in the molecules). Hence for all molecules, except H2 and
D2, it can be assumed that the spacing between two successive rotational energy
levels is small in comparison with the thermal energy. In other words, the
temperature is always high with respect to the rotation of heavy molecules.

For high temperatures the summation over individual energy levels in
(44.3) can be replaced by integration over the almost continuous levels:

$$z_{rot} = \sum_{J=0}^{\infty}(2J+1)\exp\left[-\frac{T_c J(J+1)}{T}\right] \approx \int_0^{\infty}(2J+1)\exp\left[-\frac{T_c J(J+1)}{T}\right]dJ. \qquad (44.4)$$

Introducing a new integration variable, $y = J(J+1)$, we find

$$z_{rot} = \int_0^{\infty}\exp\left(-\frac{T_c\, y}{T}\right)dy = \frac{T}{T_c} = \frac{8\pi^2 IkT}{h^2}. \qquad (44.5)$$

We thus arrive at the classical expression for $z_{rot}$.
In the next approximation the partition function can be calculated by
means of the well-known Euler summation formula *:

$$\sum_{J=0}^{\infty} f(J) = \int_0^{\infty} f(x)\,dx + \tfrac{1}{2}f(0) - \tfrac{1}{12}f'(0) + \tfrac{1}{720}f'''(0) - \ldots$$

In the given case $f(J) = (2J+1)\exp[-T_c J(J+1)/T]$, so that

$$f(0) = 1; \qquad f'(0) = 2 - T_c/T; \qquad f'''(0) \approx -12\,T_c/T,$$

and for $z_{rot}$ we obtain

$$z_{rot} = \frac{8\pi^2 IkT}{h^2} + \frac{1}{3} + \frac{1}{15}\,\frac{h^2}{8\pi^2 IkT}. \qquad (44.6)$$

The second and third terms represent the quantum correction to the
classical value of $z_{rot}$. From formula (44.6) it is seen that this
correction is small at a temperature $T$ higher than $T_c$, and that its
relative size decreases rapidly with increasing $T$.
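The quality of the classical value (44.5) and of the corrected value (44.6) can be judged by comparing both with a direct summation of (44.3). The sketch below works with the ratio $T/T_c$; the function names and the cut-off `j_max` are illustrative choices.

```python
import math

def z_rot_sum(t_ratio, j_max=500):
    """Direct summation of (44.3); t_ratio = T/Tc."""
    a = 1.0 / t_ratio
    return sum((2 * j + 1) * math.exp(-a * j * (j + 1)) for j in range(j_max + 1))

def z_rot_classical(t_ratio):
    """Classical value (44.5): T/Tc."""
    return t_ratio

def z_rot_corrected(t_ratio):
    """Value with the quantum corrections of (44.6): T/Tc + 1/3 + Tc/(15 T)."""
    return t_ratio + 1.0 / 3.0 + 1.0 / (15.0 * t_ratio)

for t_ratio in (1.0, 3.0, 10.0, 100.0):
    print(t_ratio, z_rot_sum(t_ratio), z_rot_classical(t_ratio),
          z_rot_corrected(t_ratio))
```

The corrected expression follows the exact sum closely already at a few times $T_c$, while the classical value differs from it by roughly the constant 1/3, which becomes relatively unimportant as $T/T_c$ grows.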

Taking into account the quoted values of the characteristic temperatures,
it can be said that the entire range of realistic temperatures lies much higher
than $T_c$. Hence the quantum corrections to $z_{rot}$, for all except the
lightest molecules, play a negligible role.

* See, for example, A. Gelfond, Ischislenie konechnykh raznostei (Calculation
of finite differences) (Gostekhizdat, Moscow, 1952) p. 343; H. Margenau and
G. M. Murphy, Mathematics of physics and chemistry (D. Van Nostrand, New York,
1948).

For low temperatures ($T \ll T_c$) only the first, that is the largest, terms
of the general expression for $z_{rot}$ should be retained. This gives

$$z_{rot} \approx 1 + 3\exp(-2T_c/T). \qquad (44.7)$$


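Now we find the terms of the thermodynamic functions associated with
the rotational motion. Obviously, we have

$$E_{rot} = NkT^2\frac{\partial\ln z_{rot}}{\partial T} = NkT^2\frac{\partial}{\partial T}\ln\sum_{J=0}^{\infty}(2J+1)\exp\left[-\frac{T_c J(J+1)}{T}\right]. \qquad (44.8)$$

For high temperatures we have

$$E_{rot} = NkT\left(1 - \frac{h^2}{24\pi^2 IkT}\right). \qquad (44.9)$$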


Correspondingly, the rotational heat capacity at high temperatures has the
classical value:

$$C_{V\,rot} \approx Nk. \qquad (44.10)$$


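At low temperatures

$$E_{rot} = NkT^2\frac{\partial}{\partial T}\ln\left[1 + 3\exp\left(-\frac{h^2}{4\pi^2 IkT}\right)\right] \approx \frac{3h^2 N}{4\pi^2 I}\exp\left(-\frac{h^2}{4\pi^2 IkT}\right), \qquad (44.11)$$

and the heat capacity is

$$C_{V\,rot} \approx 3\left(\frac{h^2}{4\pi^2 I}\right)^2\frac{N}{kT^2}\exp\left(-\frac{h^2}{4\pi^2 IkT}\right). \qquad (44.12)$$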













Thus, at very low temperatures the rotation energy and heat capacity turn 
out to decrease exponentially with the temperature. As we have stressed 
before, the decrease of the heat capacity with the temperature according to 
the exponential law can be observed only for the lightest molecules. 
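The behaviour of the rotational heat capacity between the two limits can be obtained directly from the level sum. The sketch below uses the standard energy-fluctuation identity $C_V = N(\overline{\epsilon^2} - \bar{\epsilon}^2)/kT^2$, which is not used in the text but is equivalent to differentiating (44.8); the function names and the cut-off `j_max` are illustrative.

```python
import math

def c_rot(t_ratio, j_max=500):
    """C_V,rot / (N k) from the level sum, via the energy fluctuation formula.

    With a = Tc/T and e_J = J(J+1), C/(N k) = a^2 * (<e^2> - <e>^2).
    """
    a = 1.0 / t_ratio
    z = e1 = e2 = 0.0
    for j in range(j_max + 1):
        e = j * (j + 1)
        w = (2 * j + 1) * math.exp(-a * e)
        z += w
        e1 += w * e
        e2 += w * e * e
    return a * a * (e2 / z - (e1 / z) ** 2)

def c_rot_low(t_ratio):
    """Low-temperature form (44.12) in the same units: 12 (Tc/T)^2 exp(-2 Tc/T)."""
    a = 1.0 / t_ratio
    return 12.0 * a * a * math.exp(-2.0 * a)

print(c_rot(20.0))                  # high temperature: close to the classical value 1
print(c_rot(0.2), c_rot_low(0.2))   # low temperature: exponentially small
```

At $T = 20\,T_c$ the result is practically the classical value $Nk$, and at $T = 0.2\,T_c$ it agrees closely with the exponential law (44.12).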

We write, in addition, the expressions for the free energy and entropy.
At high temperatures

$$F_{rot} = -NkT\ln z_{rot} = -NkT\ln\sum_{J=0}^{\infty}(2J+1)\exp\left[-\frac{T_c J(J+1)}{T}\right] \approx NkT\left(\ln\frac{T_c}{T} - \frac{T_c}{3T}\right) \qquad (44.13)$$

and

$$S_{rot} \approx -Nk\ln\frac{T_c}{T} + Nk. \qquad (44.14)$$

At low temperatures, taking into account (44.7), we obtain

$$F_{rot} \approx -3NkT\exp(-2T_c/T) \qquad (44.15)$$

and

$$S_{rot} \approx 3Nk\exp(-2T_c/T) + \frac{6NkT_c}{T}\exp(-2T_c/T). \qquad (44.16)$$





Thus, the rotational energy and entropy decrease exponentially at very low 
temperatures. The general form of the heat capacity associated with rotation 
has the same character as that of the heat capacity associated with the vibra- 
tions of the molecules: at high temperatures the heat capacity tends to the 
classical value, while at low temperatures, in accordance with the require- 
ments of the third law of thermodynamics, the heat capacity tends to zero. 
However, the notions of high and low temperature for rotations and vibra- 
tions turn out to be fundamentally different: for vibrations room tempera- 
ture, as a rule, should be considered low, whereas for rotations it should be 
considered high. 

As is seen from formulae (44.8), (44.12) and the subsequent formulae, it
is necessary to know only one molecular constant, the moment of inertia $I$ of
the molecule, for the actual calculation of the partition function and
thermodynamic quantities. The value of this quantity is known for most diatomic
molecules from spectroscopic data, in particular from infrared rotational
spectra. If the rotational-vibrational spectrum of the molecule, i.e. the
energy levels of its rotational and vibrational motions, is known, then the
calculation of the partition functions can be carried out by means of a direct
summation *.

In conclusion we note that in the case of hydrogen and deuterium it is 
necessary to take into account the effect of the nuclear spin on the rotational 
motion. It turns out that the nuclear spin fundamentally affects the charac- 
ter of the rotational states of molecules consisting of identical atoms. In 
particular, depending on the value of the nuclear spin the molecules of 
hydrogen can be in rotational states of two types. 

In states of the first type, corresponding to the total spin of the two 
nuclei being equal to zero, the rotational quantum number J runs over a series 
of even values J = 0, 2, 4, 6, .... In states of the second type, corresponding to 
the total spin of the two nuclei being equal to unity, the quantum number 
runs over a series of odd values J = 1, 3, 5,.... 

Molecules of the first type are called parahydrogen, and those of the 
second type are called orthohydrogen. In normal conditions there are no 
transitions between orthohydrogen and parahydrogen, so that the gas as a 
whole must be considered as a mixture of the two different types. 

This fact affects the form of the calculated thermodynamic functions ** 
in an essential way. 
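The separation of the rotational levels of hydrogen into even and odd $J$ can be illustrated by summing the two sets of terms of (44.3) separately. This is only a sketch: it uses $T_c = 85.4$ K from table 6, leaves out the nuclear-spin statistical weights (which the text does not quote), and does not attempt the full treatment of the ortho-para mixture referred to above.

```python
import math

T_C_H2 = 85.4  # rotational characteristic temperature of H2 from table 6, in K

def z_rot_parity(temperature, parity, j_max=60):
    """Sum of (2J+1) exp(-Tc J(J+1)/T) over even (parity=0) or odd (parity=1) J."""
    a = T_C_H2 / temperature
    return sum((2 * j + 1) * math.exp(-a * j * (j + 1))
               for j in range(parity, j_max + 1, 2))

for temperature in (20.0, 85.4, 300.0):
    z_even = z_rot_parity(temperature, 0)   # parahydrogen levels
    z_odd = z_rot_parity(temperature, 1)    # orthohydrogen levels
    print(temperature, z_even, z_odd, temperature / (2.0 * T_C_H2))
```

At 300 K the two sums are nearly equal, each close to half of the full sum, whereas at 20 K the odd-$J$ sum is very small because its lowest level, $J = 1$, is already well above $kT$.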


§45. Polyatomic molecules 


The consideration of polyatomic molecules differs little in principle from 
that of diatomic molecules. The partition function of a polyatomic molecule, 
like that of a diatomic molecule, can be written in the form 


$$z = z_{transl}\, z_{vib}\, z_{rot}, \qquad (45.1)$$


provided the effect of the vibrations on the rotation of the molecule (in
connection with the change in the size of the latter) is disregarded. The
partition function of the translational motion differs in no way from that
calculated before. However, the calculation of the partition function of the
internal motion of polyatomic molecules is incomparably more complicated
than for diatomic molecules.

* For more detail see I. N. Godnev, Vychislenie termodinamicheskikh funktsii po
molekularnym dannym (Calculation of thermodynamic functions from molecular data)
(Gostekhizdat, Moscow, 1956); A. H. Wilson, Thermodynamics and statistical
mechanics (Cambridge University Press, Cambridge, 1957).

** See, for example, L. D. Landau and E. M. Lifshitz, Course in theoretical
physics, Volume 5, Statistical physics (Pergamon Press, London, 1958).

In considering the rotational motion of a molecule it is necessary to
distinguish between three cases: the linear molecule, the symmetric top, and
the asymmetric top. The rotational motion of a linear polyatomic molecule
does not differ from that of a diatomic molecule. For a symmetric top two
principal moments of inertia are equal to each other ($I_1 = I_2 \neq I_3$),
whereas for an asymmetric top all moments of inertia differ from each other
($I_1 \neq I_2 \neq I_3$). For a symmetric top quantum-mechanical considerations
allow one to calculate the rotational energy levels of the molecule, which are
expressed by a formula similar to that for the energy levels of a simple
rotator. However, there is no explicit expression for the energy levels of an
asymmetric top. In the case of molecules of the type of asymmetric tops use is
usually made of certain approximate expressions for the energy levels, the
accuracy of which is not very great. However, the situation is made
substantially simpler by the fact that the characteristic temperatures for the
rotation of polyatomic molecules are, as a rule, even lower than those for
diatomic molecules. Hence the ordinary temperatures at which one can work with
non-condensed polyatomic gases are relatively high, and for the partition
function of the rotational motion use can be made, without any appreciable
error, of the classical expression for $z_{rot}$. Thus, for example, the
difference between the quantum and classical expressions for the rotational
partition function for the HCN molecule at a temperature of 100 K amounts to
about 0.5%, and for the CH3Cl molecule it amounts to about 1%. At a temperature
of 300 K this difference becomes quite negligible, and lies beyond the limits
of accuracy of the measurements. In most calculations of the partition function
of polyatomic molecules use is made of the classical approximation.

A characteristic feature of a large number of polyatomic molecules, in 
particular the molecules of organic compounds, is the presence of a certain 
number of identical atoms in them. The symmetry of a molecule is closely 
associated with the presence of identical atoms in the molecule. Because of 
the presence of symmetry the molecule can be matched with itself in definite 
rotations, just as a diatomic molecule containing two identical atoms matches 
with itself when rotated through 180°. The presence of symmetry in the 
molecule requires the introduction of a symmetry factor y into the rotational 
partition function. A symmetry factor y= 2 should also be introduced for 
diatomic molecules with identical nuclei. It represents the number of 
physically indistinguishable positions of the molecule when it rotates as a 




rigid body. To obtain a correct expression for the rotational partition func- 
tion, in which each physical state would be taken into account only once, the 
partition function obtained in integrating over all values of the angle of 
rotation (in the classical approximation) must be divided by the symmetry 
factor γ. With the introduction of the symmetry factor the rotational
partition function of a polyatomic molecule with three different moments of
inertia in the classical approximation can be written in the following form:

$$Z_{\text{rot}} = \frac{\sqrt{\pi}}{\gamma}\left(\frac{8\pi^{2}kT}{h^{2}}\right)^{3/2}\left(I_{1}I_{2}I_{3}\right)^{1/2}, \qquad (45.2)$$

where the moments of inertia are expressed in g·cm², and the temperature is
measured on the absolute scale. If the structure of the molecule is known,
then the values of the factor γ are found from simple considerations of the
symmetry. In the case of the linear molecule CO2 (see fig. III.19) the symmetry
factor is γ = 2, since the molecule is matched with itself when
rotated through the angle π. The non-linear molecule SO2 (fig. III.19) is also
matched with itself when rotated through the angle π, and so for it γ = 2.
The methane molecule CH4 represents a regular tetrahedron with the carbon
atom in the centre. It is matched with itself when rotated through an angle
of 120° about the vertical axis, and this holds for each of the four vertices
of the tetrahedron; all together there are 12 such rotations, so that γ = 12.
The ammonia molecule NH3 represents a pyramid with the nitrogen atom at the
vertex. It is matched with itself when rotated through an angle of 120° about
the vertical axis, so that γ = 3.
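
The role of the symmetry factor can be illustrated with a short numerical sketch. It assumes the standard classical expression for the rotational partition function of a rigid asymmetric top, written with Planck's constant h and SI units (kg·m² instead of g·cm²). The moments of inertia below are hypothetical values chosen only for illustration, and the function name is not taken from the text.

```python
import math

H = 6.62607015e-34   # Planck constant, J s
K_B = 1.380649e-23   # Boltzmann constant, J/K


def z_rot_classical(i1, i2, i3, temperature, gamma=1):
    """Classical rotational partition function of a rigid asymmetric top.

    i1, i2, i3  -- principal moments of inertia in kg m^2 (SI units)
    temperature -- absolute temperature in K
    gamma       -- symmetry factor (number of indistinguishable orientations)
    """
    prefactor = math.sqrt(math.pi) / gamma
    thermal = (8.0 * math.pi ** 2 * K_B * temperature / H ** 2) ** 1.5
    return prefactor * thermal * math.sqrt(i1 * i2 * i3)


# Hypothetical moments of inertia, for illustration only.
moments = (3.0e-47, 4.0e-47, 5.0e-47)
z_unsymmetric = z_rot_classical(*moments, temperature=300.0, gamma=1)
z_pyramid = z_rot_classical(*moments, temperature=300.0, gamma=3)
```

Dividing by the symmetry factor simply rescales the result, so the ratio of the two values above equals the symmetry factor, and the partition function grows as the 3/2 power of the temperature.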

The vibrational motion of polyatomic molecules is also incomparably
more complex than that of diatomic molecules. The number of vibrational
degrees of freedom amounts to 3n − 6 for non-linear polyatomic molecules
and to 3n − 5 for linear ones, and can be rather large for complex molecules.
For example, the molecule SO2 has three vibrational degrees of freedom, the
molecule NH3 already has six, and for the molecule C6H6 the number of
vibrational degrees of freedom is equal to 30. The study of the vibrations of
such systems is a complex problem. Nevertheless, the vibrational motion of a 
very large number of molecules has been investigated. If the departures of the 
atoms from the equilibrium position are assumed to be small (which is not 
always possible in the case of polyatomic molecules; see below), then the 
motion of the system will represent small vibrations and the vibrational 
motion of the molecule can be resolved into a set of independent normal 







vibrations *. To each degree of freedom there corresponds one normal vibra- 
tion with its natural frequency. The frequencies of the normal vibrations (the 
natural frequencies of the system) are connected with the masses of the 
nuclei and the constants of the quasi-elastic forces by the usual relations. In 
the general case the frequencies of all the normal vibrations are different. 
However, the frequencies of certain normal vibrations are often the same. In 
this case the vibrations are degenerate. 
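
The counting rule for vibrational degrees of freedom quoted above (3n − 6 for non-linear and 3n − 5 for linear molecules) can be written as a small helper. The function name is an illustrative choice, not something taken from the text.

```python
def vibrational_modes(n_atoms, linear=False):
    """Number of vibrational degrees of freedom of a molecule with n_atoms atoms.

    A non-linear molecule has 3 translational and 3 rotational degrees of
    freedom, a linear one has 3 translational and only 2 rotational ones.
    """
    if n_atoms < 2:
        raise ValueError("a molecule needs at least two atoms")
    return 3 * n_atoms - (5 if linear else 6)


so2 = vibrational_modes(3)                 # bent triatomic molecule
nh3 = vibrational_modes(4)                 # pyramidal molecule
c6h6 = vibrational_modes(12)               # benzene, 12 atoms
co2 = vibrational_modes(3, linear=True)    # linear triatomic molecule
```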

The values of the natural frequencies can be found from an analysis of 
infrared spectra as well as the dispersion spectra of the molecules, although 
this is not at all a simple problem. The spectra of a number of simpler mole- 
cules have been investigated in sufficient detail, and their natural frequencies 
have been determined with a high degree of accuracy. Fig. III.19 presents the
normal vibrations of certain typical molecules (H2O, SO2 and CO2). Each
normal vibration with a natural frequency $\nu_j$ gives its contribution to the
partition function $Z_{\text{vib}}$, which, by virtue of the independence of normal
vibrations, can be written in the form of the product of the corresponding
factors:

$$Z_{\text{vib}} = \prod_{j=1}^{3n-6}\frac{\exp(-h\nu_j/2kT)}{1-\exp(-h\nu_j/kT)} = \prod_{j=1}^{3n-6}\frac{\exp\left(-T_c^{(j)}/2T\right)}{1-\exp\left(-T_c^{(j)}/T\right)}, \qquad (45.3)$$

where $T_c^{(j)} = h\nu_j/k$ is the characteristic temperature of the jth normal vibration. The
values of $T_c^{(j)}$ for different normal vibrations can differ considerably from
each other. Thus, for example, the six normal oscillations of the ammonia
molecule have the following characteristic temperatures $T_c^{(j)}$ (in 10² K):
13.6, 23.3, 23.3, 47.8, 48.8, 48.8. We see that characteristic temperatures can 
differ from each other by a factor of three. The difference between charac- 
teristic frequencies is associated with the difference between the constants 
of quasi-elastic forces (the difference in the rigidity of the atomic bonds in the 
molecules). The characteristic temperatures of polyatomic molecules, as well 


* In arbitrary coordinates the potential energy of a system of vibrating points has the
form

$$U = \sum_{j,k} a_{jk}\,\zeta_j\,\zeta_k,$$

where $\zeta_j$ are the displacements. To find the normal vibrations it is necessary to find the
coordinates $\xi_j$ such that the potential energy of the system has the diagonal quadratic
form $U = \sum_j a_j\,\xi_j^2$. The choice of new variables can be made in a purely algebraic way,
but is simplified considerably if use is made of the symmetry properties of the system (see,
for example, L.D.Landau and E.M.Lifshitz, Mechanics (Pergamon Press, Oxford, 1960)).




as those of diatomic molecules, amount to several hundred or thousand 
degrees. Hence the contribution of vibrations to the heat capacity at moderate 
temperatures is relatively small. In any case, the vibrational part of the heat 
capacity is many times smaller than that to be expected from the law of 
equipartition. 
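
How far the vibrational heat capacity falls below the equipartition value can be estimated with a minimal sketch. It assumes the usual harmonic-oscillator result that follows from a partition function of the form (45.3): each normal vibration contributes x² e^(−x) / (1 − e^(−x))² to the heat capacity per molecule in units of k, where x is the ratio of the characteristic temperature to T. The characteristic temperatures are those quoted above for ammonia; the function names are illustrative.

```python
import math

# Characteristic temperatures of the six normal vibrations of NH3 in K,
# taken from the values quoted in the text (given there in units of 10^2 K).
T_CHAR_NH3 = [1360.0, 2330.0, 2330.0, 4780.0, 4880.0, 4880.0]


def ln_z_vib(temperature, t_char=T_CHAR_NH3):
    """Logarithm of the vibrational partition function per molecule, eq. (45.3)."""
    total = 0.0
    for tc in t_char:
        x = tc / temperature
        total += -0.5 * x - math.log1p(-math.exp(-x))
    return total


def c_vib(temperature, t_char=T_CHAR_NH3):
    """Vibrational heat capacity per molecule in units of k (harmonic modes)."""
    total = 0.0
    for tc in t_char:
        x = tc / temperature
        total += x * x * math.exp(-x) / (1.0 - math.exp(-x)) ** 2
    return total


c_room = c_vib(300.0)      # well below the equipartition value of 6
c_hot = c_vib(1.0e6)       # approaches 6 when T greatly exceeds every T_c
```

In this sketch the room-temperature value stays far below 6 because only the lowest-frequency vibration is appreciably excited, while the equipartition limit is recovered only at temperatures far above all the characteristic temperatures.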

As an example one can consider the ammonia molecule. The total heat
capacity of the molecule NH3 amounts to

$$C_V = C_{V\,\text{transl}} + C_{V\,\text{rot}} + C_{V\,\text{vib}} = \tfrac{3}{2} + \tfrac{3}{2} + C_{V\,\text{vib}} \quad (\text{in units of } Nk).$$

Table 7 gives the values (in J mol⁻¹ deg⁻¹) of the vibrational heat capacity
$C_{V\,\text{vib}}$ calculated according to (45.3) and observed experimentally.











Table 7

T, K    C_V vib    C_V vib     T, K    C_V vib    C_V vib
        (calc.)    (exper.)            (calc.)    (exper.)
243     0.50       0.59        423     2.93       3.14
273     0.80       0.92        582     5.48       5.02
303     1.21       1.30        655     7.20       6.28
334     1.55       1.88        796     8.62       7.95
383     2.34       2.51





For T > 240 K the rotational heat capacity can be assumed to have the
classical value 3/2 (in units of Nk). Thus, at a temperature of about 800 K the vibrational heat
capacity amounts to about $ of the total heat capacity of the molecule and 
can in no way be disregarded. Even at room temperature $C_{V\,\text{vib}}$ amounts to
about 7% of the total heat capacity. Nevertheless, it is considerably smaller 
than that to be expected from the law of equipartition (6Nk). Such a situation 
is characteristic of most polyatomic molecules. 

The contribution of different forms of motion to the value of the entropy 
of the ammonia molecule is shown in fig. III.28. 

The vibrational motion of polyatomic molecules has a remarkable feature 
which has no analogue for diatomic molecules. Namely, very often the ampli- 
tude of zero-point vibrations of definite groups contained in the molecule 
turns out to be so large that the corresponding motion ceases to be harmonic 
or loses completely its vibrational character. This is seen most clearly by 
concrete examples. A large number of polyatomic molecules, in particular 










organic ones, contain individual groups or radicals having the character of
independent groups; for example, the ethylene molecule C2H4 represents a
combination of two CH2 groups. Similarly the ethane molecule C2H6 consists
of two CH3 groups. The dimethylacetylene molecule CH3-C≡C-CH3
contains two CH3 groups and a carbon core. Owing to the existence of interaction
between the hydrogen atoms the potential energy of each of the CH2
or CH3 groups has a minimum for a definite orientation of one group with
respect to the other. That is, to the minimum of the potential energy there
corresponds a value of the angle of rotation (measured from the median line)
of one CH3 group with respect to the other equal to 60° or 180°. In other
words, the potential energy has a minimum when the two groups are situated
as an object and its mirror image (fig. III.29). When a displacement from the
equilibrium position occurs (a rotation of one of the groups with respect to 
the other) the potential energy increases and there arises a force tending 
to bring the molecule back to the equilibrium arrangement. Then rotational 
oscillations about the axis of the molecule arise. If, however, the zero-point 
energy of these oscillations is so large that it exceeds the potential barrier 
hindering the group from rotation, the rotational oscillations turn over into 
a free rotation of the group with respect to the axis of the molecule. The 
latter case is encountered relatively seldom. The dimethylacetylene molecule 
quoted above serves as an example of it. In this molecule the two CH3 groups 
are a relatively large distance apart, and their interaction is not very large. 
Hence the height of the barrier preventing a rotation turns out to be relatively 
small, and the CH3 groups rotate freely. However, in most cases the rotation 
of individual groups is hindered. At low temperatures rotational oscillations 




of a large amplitude take place, which turn over into a rotation at very high 
temperatures, when the thermal energy kT proves to be higher than the 
barrier height $U_0$. The existence of free rotation changes the value of the
heat capacity and other thermodynamic quantities in comparison with that 
for molecules without rotations: a part of the vibrational degrees of freedom 
is replaced by rotational ones. If the rotation is assumed to be free, then the 
calculation of the heat capacity and other thermodynamic quantities presents 
no difficulty, since the rotation of a relatively heavy group can be assumed 
to be classical. But if the rotation is hindered, then it is necessary to know for 
the calculation the height of the barrier hindering the rotation. Finding this 
height from spectroscopic data is very difficult. Hence one proceeds in the 
reverse order: one calculates thermodynamic quantities, most often the 
entropy, by assuming different values of the height of the barrier, and com- 
pares the calculated and measured values. The matching of the theoretical 
and experimental curve of the dependence of the entropy on the temperature 
allows one to find the height of the barrier. For different molecules the 
height of the barrier varies within a rather wide range. Thus, for the rotation 
of the CH3 group in the ethane molecule H3C—CH3 the height of the barrier 
amounts to about 1570 K, so that the rotation at a temperature T> 1570 K 
is free. For the ethylene molecule C2H4 the barrier hindering the CH2 groups
from rotation has a height of about 6000 K, so that the rotation at room 
temperature is strongly hindered and there take place rotational oscillations 
of a relatively small amplitude. 










Systems of Interacting Particles 


§46. Interaction between molecules in non-ideal gases 


Up to now we have restricted ourselves to the study of the properties of 
gases so rarefied that the interaction between the molecules may be dis- 
regarded. We now pass on to the consideration of the statistical behaviour of 
systems of interacting particles. 

In §6 we have touched already upon the problem of the character of the 
intermolecular interaction. At large distances between molecules this inter- 
action reduces to a weak attractive force which decreases rapidly with in- 
creasing distance between the centres of the molecules. At small distances, 
when molecules approach each other closely, so that a mutual penetration 
of their electron shells takes place, a very strong repulsion arises. Owing to 
this repulsion the molecules cannot penetrate very much into each other, 
and they cannot get deformed in collisions. In what follows we shall confine 
ourselves to a monatomic gas, and shall assume that the interaction depends 
only on the distance between the atoms. 

The form of the potential energy of the interaction between two molecules
is shown in fig. III.30. We shall assume that the attractive forces are
so weak that the largest value of the potential energy of attraction occurs 
when the molecules approach closely (the distance between the centres is 




equal to the diameter d). But |u(d)| is nevertheless small in comparison with 
the thermal energy kT, 


$$|u(d)| \ll kT. \qquad (46.1)$$


The potential energy decreases so rapidly with increasing distance that it
is practically reduced to zero at distances between the centres of the molecules
amounting to only a few diameters. We formally introduce a certain
distance ρ such that beyond it the interaction can be completely disregarded.
We shall call this distance the radius of interaction. This means that we replace
the true potential energy curve by the simplified curve shown by a dotted line
in fig. III.30 and expressed by the formulae

$$u = \begin{cases} 0, & r \geq \rho, \\ -|u(r)|, & \rho > r > d, \\ \infty, & d \geq r. \end{cases} \qquad (46.2)$$


Here r denotes the distance between the centres of the ith and kth molecules.
This means that the interaction is absent when the distance between
the centres of the molecules exceeds ρ, represents an attraction [negative
value of u(r)] when the distance between the centres is smaller than ρ but
larger than d, so that the molecules do not directly touch each other, and
turns into a very strong repulsion (infinitely strong in our approximation)




when the molecules come into direct contact. The value of ρ is as a rule equal
to three or four molecular diameters. 
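
The simplified potential (46.2) can be sketched as a piecewise function. The text does not specify the shape of the attractive part, so the inverse sixth power tail used below, together with the parameter names d, rho and epsilon, is an assumption made only for illustration.

```python
import math


def model_potential(r, d=1.0, rho=3.0, epsilon=0.1):
    """Simplified pair potential in the spirit of eq. (46.2).

    r       -- distance between the centres of the two molecules
    d       -- molecular diameter (hard-core distance)
    rho     -- radius of interaction, beyond which u = 0
    epsilon -- depth of the attraction at contact (assumed tail, see below)
    """
    if r <= d:
        return math.inf                    # infinitely strong repulsion
    if r >= rho:
        return 0.0                         # interaction disregarded
    # Assumed attractive tail -epsilon * (d / r)**6; not taken from the text.
    return -epsilon * (d / r) ** 6


inside_core = model_potential(0.5)
in_well = model_potential(2.0)
far_away = model_potential(3.5)
```

With rho equal to three diameters the attraction at the cut-off is already about 700 times weaker than at contact, which is why the cut-off changes the results very little.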

If the gas is not very dense, then the mean distance between the molecules 
is very large in comparison with their dimensions. Hence it can as a rule be 
assumed that not more than two molecules approach each other to within the 
distance of interaction at once. In other words, it can be assumed that the 
molecules interact only in pairs. Configurations in which a “cluster” of 
three, four, or more particles are found simultaneously in the sphere of in- 
teraction are seldom encountered, and we shall disregard them. 

We calculate the partition function of the gas in this condition. The energy
ε of the entire gas can be written in the form

$$\varepsilon = \varepsilon_{\text{id}} + U. \qquad (46.3)$$

The first term in (46.3) expresses the sum of the kinetic energies of the molecules.
It is the same as the energy of an ideal gas. The second term represents
the potential energy of interaction of the molecules, depending only on their
mutual distances. Making use of this expression for ε, one can write the partition
function of the gas in the form

$$Z = \frac{1}{N!\,h^{3N}}\left(\int e^{-p^{2}/2mkT}\,dp_x\,dp_y\,dp_z\right)^{N}\left(\int e^{-U/kT}\,dV_1\ldots dV_N\right), \qquad (46.4)$$

where $dV_1 \ldots dV_N$ is the product of the differentials of the space coordinates
dx dy dz for every one of the molecules. The first factor does not differ from
the corresponding quantity for an ideal gas. By virtue of the results of §37
it can be written in the form

$$\int e^{-p^{2}/2mkT}\,dp_x\,dp_y\,dp_z = (2\pi mkT)^{3/2},$$

so that

$$Z = \frac{(2\pi mkT)^{3N/2}}{N!\,h^{3N}}\int e^{-U/kT}\,dV_1\ldots dV_N. \qquad (46.5)$$

Here we have to calculate the second factor, called the configuration integral:

$$I = \int e^{-U/kT}\,dV_1\,dV_2\ldots dV_N. \qquad (46.6)$$




To do this we make use of the assumption that the molecules interact with 
one another only in pairs. Hence the energy of interaction can be written in 
the form of the sum of the energies of interaction of the molecular pairs 


$$U = \sum u(r_{ik}), \qquad (46.7)$$


where a pair is understood to be two molecules which are at a distance smaller
than the radius of interaction ρ. The energy of interaction of each pair is
denoted by u. It is determined by formula (46.2). The number of terms in
the sum (46.7) is equal to the number of pairs formed in a gas of N molecules.
It is equal to the number of combinations of N elements in twos, i.e. to
½N(N−1). For large N this number can be assumed to be equal to ½N². Then

$$\exp\left(-\frac{U}{kT}\right) = \exp\left(-\frac{1}{kT}\sum u(r_{ik})\right) = \prod \exp\left(-\frac{u(r_{ik})}{kT}\right),$$

where the product is taken over all the pairs, i.e.

$$\exp\left(-\frac{U}{kT}\right) = \exp\left(-\frac{u(r_{12})}{kT}\right)\exp\left(-\frac{u(r_{13})}{kT}\right)\cdots \qquad (46.8)$$
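
The replacement of the exact number of pairs by half the square of N can be checked with a few lines. This is a minimal sketch using only the standard library; the function names are illustrative.

```python
import math


def pair_count(n):
    """Exact number of distinct pairs among n molecules: n(n-1)/2."""
    return math.comb(n, 2)


def pair_count_large_n(n):
    """Large-n approximation used in the text: n^2 / 2."""
    return n * n / 2.0


def relative_error(n):
    """Relative error of the large-n approximation; it equals 1/(n-1)."""
    exact = pair_count(n)
    return (pair_count_large_n(n) - exact) / exact


error_small = relative_error(10)        # noticeable for a handful of molecules
error_large = relative_error(10 ** 6)   # about one part in a million
```

For a macroscopic number of molecules the relative error is of the order of 1/N and is entirely negligible.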


This product contains ½N² factors. Each factor in this product tends to unity
when $r_{ik} > \rho$, since then $u(r_{ik}) \to 0$. It is more convenient to introduce a function
$f_{ik}$ defined by

$$f_{ik} = \exp\left(-\frac{u(r_{ik})}{kT}\right) - 1, \qquad (46.9)$$







which tends to zero when $r_{ik} > \rho$ and differs from zero only when $r_{ik} < \rho$.
Then, obviously, $e^{-u(r_{ik})/kT} = 1 + f_{ik}$, and

$$e^{-U/kT} = \prod (1+f_{ik}) = (1+f_{12})(1+f_{13})(1+f_{14})\cdots = 1 + (f_{12}+f_{13}+f_{14}+\ldots) + (f_{12}f_{13}+f_{12}f_{14}+\ldots) + \cdots \qquad (46.10)$$






The double, triple and higher products of the functions $f_{ik}$ are, by
definition of this function and by virtue of the assumption of the absence of
clusters, always very small. Thus, for example, in order that $f_{12}f_{13}$ may
differ substantially from zero it is necessary that $f_{12}$ and $f_{13}$ should at the
same time be different from zero, i.e. that the distances $r_{12}$ and $r_{13}$ should at
the same time be small (smaller than ρ). This means that the first, second
and third molecules simultaneously come into the range of interaction ρ,
forming not a pair but a triplet of molecules. Similarly $f_{12}f_{13}f_{14}$ differs from
zero only if $f_{12}$, $f_{13}$ and $f_{14}$ are all different from zero at the same time. This
happens only when the first, second, third and fourth molecules simultaneously
get into a region of the order of ρ. Hence with a sufficient degree of
accuracy it can be written that

$$e^{-U/kT} \approx 1 + (f_{12}+f_{13}+\ldots) = 1 + \sum f_{ik}. \qquad (46.11)$$
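
The step from the full product (46.10) to the pair approximation (46.11) can be tested on a toy configuration. The sketch below places three particles on a line (a one-dimensional simplification) and uses an assumed attractive tail for the model potential; both choices, and all names, are illustrative and not taken from the text.

```python
import math
from itertools import combinations


def mayer_f(r, d=1.0, rho=3.0, epsilon=0.1, kt=1.0):
    """Function f = exp(-u/kT) - 1 of eq. (46.9) for the model potential (46.2)."""
    if r <= d:
        return -1.0                       # u is infinite, so exp(-u/kT) = 0
    if r >= rho:
        return 0.0                        # no interaction beyond rho
    u = -epsilon * (d / r) ** 6           # assumed attractive tail
    return math.exp(-u / kt) - 1.0


def boltzmann_exact(positions):
    """Full product of (1 + f_ik) over all pairs, eq. (46.10)."""
    product = 1.0
    for a, b in combinations(positions, 2):
        product *= 1.0 + mayer_f(abs(a - b))
    return product


def boltzmann_pairs_only(positions):
    """Pair approximation 1 + sum of f_ik, eq. (46.11)."""
    return 1.0 + sum(mayer_f(abs(a - b)) for a, b in combinations(positions, 2))


isolated_pair = [0.0, 1.5, 10.0]   # only the first two particles interact
triplet = [0.0, 1.5, 2.9]          # all three lie within the radius rho
```

For the isolated pair the two expressions coincide, because every product of two different f functions vanishes. For the triplet they differ by exactly those products, which is the contribution that the no-cluster assumption discards.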


The number of terms in $\sum f_{ik}$ is equal to the number of pairs, i.e. to ½N².
Since all the molecules are identical, it can be assumed that all $f_{ik}$ are the
same, so that

$$e^{-U/kT} \approx 1 + \tfrac{1}{2}N^{2}f(r_{ik}). \qquad (46.12)$$

Substituting the expression for $e^{-U/kT}$ from (46.12) into (46.6), we have

$$I = \int e^{-U/kT}\,dV_1\ldots dV_N = \int\left(1+\tfrac{1}{2}N^{2}f_{ik}\right)dV_1\ldots dV_N = \int dV_1\ldots dV_N + \tfrac{1}{2}N^{2}\int f_{ik}\,dV_1\ldots dV_N. \qquad (46.13)$$


The first integral in (46.13) is obviously equal to $V^{N}$. In the second
integral the integration over all volume elements except the ith and kth
gives

$$\int dV_1\ldots dV_{i-1}\,dV_{i+1}\ldots dV_{k-1}\,dV_{k+1}\ldots dV_N\int f_{ik}\,dV_i\,dV_k = V^{N-2}\int f_{ik}\,dV_i\,dV_k.$$

Thus,

$$I = V^{N} + \tfrac{1}{2}N^{2}V^{N-2}\int f_{ik}\,dV_i\,dV_k.$$




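To carry out the last integration we introduce spherical coordinates with
the centre located at one of the molecules. Then $r_{ik} = r$, and

$$\int f_{ik}\,dV_i\,dV_k = \iint\left[\exp\left(-\frac{u(r)}{kT}\right)-1\right]4\pi r^{2}\,dr\,dV_i = \int dV_i\int\left[\exp\left(-\frac{u(r)}{kT}\right)-1\right]4\pi r^{2}\,dr,$$

where 4π is the result of integration with respect to the angles. Hence, writing

$$4\pi\int_0^{\infty}\left[\exp\left(-\frac{u(r)}{kT}\right)-1\right]r^{2}\,dr = \beta, \qquad (46.14)$$

we obtain

$$\int f_{ik}\,dV_i\,dV_k = V\beta.$$

For I the final expression is the following:

$$I = V^{N}\left(1+\frac{N^{2}\beta}{2V}\right). \qquad (46.15)$$

Substituting the expression (46.15) for I into (46.5), we have

$$Z = \frac{(2\pi mkT)^{3N/2}\,V^{N}}{N!\,h^{3N}}\left(1+\frac{N^{2}\beta}{2V}\right) = Z_{\text{id}}\left(1+\frac{N^{2}\beta}{2V}\right), \qquad (46.16)$$

where $Z_{\text{id}}$ denotes the partition function of the ideal monatomic gas. It
should be noted that the value of $N^{2}\beta/2V = NV^{-1}\cdot\tfrac{1}{2}N\beta$ is small for a small
gas density $NV^{-1}$.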

By means of the partition function (46.16) one can calculate the ther- 
modynamic functions of a gas which departs slightly from an ideal gas. We 
shall restrict ourselves to the calculation of the pressure, since the equation 
of state of a gas is of very great interest. 

The departure of a gas from an ideal gas is taken into account by means of
the van der Waals equation, which for small gas densities can be written in
the form

$$p = \frac{NkT}{V-Nb} - \frac{N^{2}a}{V^{2}} \approx \frac{NkT}{V} + \frac{N^{2}kTb}{V^{2}} - \frac{N^{2}a}{V^{2}}. \qquad (46.17)$$




Since in the course of the calculations of the preceding paragraph we have 
not taken into account “clusters”, the results obtained hold for small gas 
densities. A simple calculation of the pressure, based on the partition func- 
tion (46.16), leads to an expression which is exactly the same as (46.17). 
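Indeed, according to (32.5) the pressure p is equal to

$$p = kT\frac{\partial\ln Z}{\partial V} = kT\frac{\partial\ln Z_{\text{id}}}{\partial V} + kT\frac{\partial}{\partial V}\ln\left(1+\frac{N^{2}\beta}{2V}\right) \approx p_{\text{id}} + kT\frac{\partial}{\partial V}\left(\frac{N^{2}\beta}{2V}\right) = \frac{NkT}{V} - \frac{N^{2}kT\beta}{2V^{2}}. \qquad (46.18)$$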


Here, assuming the density to be small, we have expanded the logarithm in
a series in powers of the quantity $N^{2}\beta/2V$, which is very small in comparison
with unity, and have confined ourselves to the first term of the expansion.
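
The expansion of the logarithm can be checked numerically. The sketch below differentiates the volume-dependent part of ln Z from (46.16) by a central difference and compares the result with the first-order expression (46.18). The numerical values of N, V, kT and of the quantity β are hypothetical, in arbitrary units, and are chosen so that the correction term is much smaller than unity.

```python
import math


def ln_z_volume_part(volume, n, beta):
    """Volume-dependent part of ln Z from eq. (46.16): N ln V + ln(1 + N^2 beta / 2V)."""
    return n * math.log(volume) + math.log1p(n * n * beta / (2.0 * volume))


def pressure_numeric(volume, n, beta, kt, rel_step=1e-6):
    """p = kT d(ln Z)/dV, evaluated with a central finite difference."""
    h = volume * rel_step
    upper = ln_z_volume_part(volume + h, n, beta)
    lower = ln_z_volume_part(volume - h, n, beta)
    return kt * (upper - lower) / (2.0 * h)


def pressure_first_order(volume, n, beta, kt):
    """First-order result of eq. (46.18): NkT/V - N^2 kT beta / (2 V^2)."""
    return n * kt / volume - n * n * kt * beta / (2.0 * volume ** 2)


# Hypothetical parameters in arbitrary units; N^2 beta / 2V = -0.005 here.
N, V, BETA, KT = 100, 1000.0, -1.0e-3, 1.0
p_num = pressure_numeric(V, N, BETA, KT)
p_lin = pressure_first_order(V, N, BETA, KT)
```

The two pressures agree closely, and the deviation from the ideal-gas value NkT/V is reproduced by the first-order term to within the size of the neglected second-order correction.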

Comparing formula (46.18) with (46.17) we convince ourselves of their 
complete identity provided we assume 


$$\tfrac{1}{2}\beta = \frac{a}{kT} - b. \qquad (46.19)$$
Thus, formula (46.18) represents the van der Waals equation, derived theo- 
retically for small gas densities. The foregoing calculation refers to the case of 
monatomic gases. It can, however, also be shown that in the case of complex 
polyatomic gases the qualitative aspect of the derivation is not changed, 
although the explicit form of the quantity β will be more complicated.

To explain the meaning of the constants a and b occurring in the van der
Waals equation, we shall consider in detail the quantity β.

By definition, 


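$$\beta = 4\pi\int_0^{\infty} f(r)\,r^{2}\,dr. \qquad (46.20)$$

Substituting the expression for f(r) into (46.20), we have

$$\beta = 4\pi\int_0^{\infty}\left[\exp\left(-\frac{u(r)}{kT}\right)-1\right]r^{2}\,dr. \qquad (46.21)$$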


We divide the integral (46.21) into two parts: the integral in the interval
0 < r < d and the integral in the interval d < r < ∞, i.e.

$$\beta = 4\pi\int_0^{d}\left[\exp\left(-\frac{u(r)}{kT}\right)-1\right]r^{2}\,dr + 4\pi\int_d^{\infty}\left[\exp\left(-\frac{u(r)}{kT}\right)-1\right]r^{2}\,dr.$$


In the first interval, by virtue of (46.2), $e^{-u/kT} \approx 0$, and thus in the first
integral the exponential term can be dropped. In the second integral the
potential energy of interaction of the molecules is, by virtue of (46.1), small
in comparison with the thermal energy kT, so that it can be written approximately
that

$$\exp\left(-\frac{u(r)}{kT}\right) \approx 1 - \frac{u(r)}{kT} = 1 + \frac{|u(r)|}{kT}.$$

We then have

$$\beta = -4\pi\int_0^{d} r^{2}\,dr + \frac{4\pi}{kT}\int_d^{\infty}|u(r)|\,r^{2}\,dr = -\frac{4\pi}{3}d^{3} + \frac{4\pi}{kT}\int_d^{\infty}|u(r)|\,r^{2}\,dr.$$

Substituting β into (46.19), we obtain

$$-\frac{2\pi}{3}d^{3} + \frac{2\pi}{kT}\int_d^{\infty}|u(r)|\,r^{2}\,dr = \frac{a}{kT} - b. \qquad (46.22)$$


Equating in (46.22) the coefficients of $T^{-1}$ and the constant terms, we find

$$b = \tfrac{2}{3}\pi d^{3} = 4v_0, \qquad (46.23)$$

where $v_0$ is the volume occupied by the molecule. Thus, the constant b in the
van der Waals equation turns out to be equal to the volume of the molecule
multiplied by four. Further,

$$a = \tfrac{1}{2}\cdot 4\pi\int_d^{\infty}|u(r)|\,r^{2}\,dr.$$


The constant a is expressed in terms of the integral of the potential energy of 
interaction of two molecules. Since the function u(r) rapidly decreases with 
increasing distance between the molecules, this integral converges rapidly. 
Thus, 


$$\beta = \frac{2a}{kT} - 8v_0. \qquad (46.24)$$

Depending on the temperature, β can be positive or negative. At a sufficiently
low temperature β > 0, while at a high temperature β < 0.
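
The change of sign with temperature can be illustrated numerically. The sketch below evaluates the integral (46.21) for the model potential (46.2) by Simpson's rule and compares it with the weak-attraction result (46.24). The inverse sixth power attractive tail, the parameter values and the function names are assumptions made only for this illustration.

```python
import math


def beta_integral(kt, d=1.0, rho=3.0, epsilon=0.1, steps=2000):
    """Quantity beta of eq. (46.21) for the model potential, by Simpson's rule.

    steps must be even. Inside the hard core (r < d) the integrand is -r^2,
    which contributes -4*pi*d^3/3 exactly.
    """
    core = -4.0 * math.pi * d ** 3 / 3.0

    def integrand(r):
        u = -epsilon * (d / r) ** 6          # assumed attractive tail
        return math.expm1(-u / kt) * r * r   # [exp(-u/kT) - 1] r^2

    h = (rho - d) / steps
    total = integrand(d) + integrand(rho)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * integrand(d + i * h)
    return core + 4.0 * math.pi * total * h / 3.0


def beta_weak_attraction(kt, d=1.0, rho=3.0, epsilon=0.1):
    """Approximation beta = 2a/kT - 8 v0 of eq. (46.24) for the same potential."""
    # a = 2*pi * integral of |u| r^2 dr from d to rho, done analytically.
    a = 2.0 * math.pi * epsilon * d ** 6 * (d ** -3 - rho ** -3) / 3.0
    v0 = math.pi * d ** 3 / 6.0              # volume of a sphere of diameter d
    return 2.0 * a / kt - 8.0 * v0


beta_hot = beta_integral(kt=1.0)     # thermal energy well above the attraction
beta_cold = beta_integral(kt=0.05)   # thermal energy below the attraction
```

At the higher temperature the hard-core term dominates and the result is negative, in close agreement with (46.24). At the lower temperature the attractive term dominates and the result is positive, although there the condition (46.1) no longer holds and (46.24) is only a rough guide.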
If the expressions which we have found for the constants a and b are substituted
into the van der Waals equation, we then obtain

$$p = \frac{NkT}{V}\left(1 + \frac{4v_0N}{V} - \frac{2\pi N}{VkT}\int_d^{\infty}|u(r)|\,r^{2}\,dr\right). \qquad (46.25)$$


In the first approximation, when the gas density is so low that the probability 
of the simultaneous presence of three or more molecules in the sphere of 
interaction may be disregarded, the pressure in a non-ideal gas differs from 
that in an ideal gas by two terms. The first of these represents the ratio of
four times the volume of all the molecules to the entire volume of the gas. 
The meaning of this (positive) correction to the pressure lies in the fact that 
it takes into account the volume of real molecules. In this approximation we 
cannot consider molecules as material points having no spatial extension. 

The second correction to the pressure is negative and is in absolute value
equal to the ratio

$$\frac{N}{2kT}\left(\frac{4\pi}{V}\int_d^{\infty}|u(r)|\,r^{2}\,dr\right).$$

This ratio also has a simple physical meaning. The quantity

$$\frac{4\pi}{V}\int_d^{\infty}|u(r)|\,r^{2}\,dr = \bar{u}$$

represents the mean value of the potential energy of interaction of a pair of
molecules. This mean value is taken over all possible distances between the
molecules, i.e. over the entire volume available for the motion of the molecules.
Then, obviously, $N^{2}\bar{u}/2V$ is the mean value of the energy of interaction
of all pairs of molecules existing in a unit volume of the gas. The second cor- 
rection thus characterizes the decrease in the pressure of gas molecules on the 
walls of the container due to their mutual attraction. This fact can be ex- 
pressed in another way as follows: in the gas there exists an internal pressure 
due to the attraction of molecules. 

As is well known, the van der Waals equation describes not only the 
properties of gases with a relatively low density but also those of very dense 
gases and even liquids. However, in this case it cannot be derived theoretically, 
and represents a purely empirical equation which should be considered as a 




more or less successful extrapolation from the region of low densities. How 
this extrapolation is to be made is seen from eq. (46.17), if it is rewritten in 
the form 


$$p = \frac{NkT}{V}\left(1+\frac{Nb}{V}\right) - \frac{N^{2}a}{V^{2}}. \qquad (46.26)$$

The equation is valid for Nb/V ≪ 1, i.e. when the volume occupied by
all the molecules is very small in comparison with the volume of the gas.

If, however, the density of the gas increases, which can be characterized 
by a decrease in the volume V for a fixed N, then formula (46.26) loses its 
validity. It is physically clear that, if the gas is compressed to the limit of 
close packing of the molecules with the minimum gap between them, a 
correct formula for the pressure should point to an infinite increase in it. A 
further compression, associated with a deformation of the atoms, would be 
associated with enormous pressures which would be infinitely large in com- 
parison with ordinary pressures in gases or liquids. From geometrical con- 
siderations it is clear that to the close packing of spherical molecules there 
corresponds a volume of the system equal to $4v_0N = Nb$. Consequently, a
correct formula for the pressure should lead to indefinitely increasing values
of p as V → Nb. However, formula (46.26) does not have such a character.
But if the factor $1 + NbV^{-1}$ is considered as a result of the expansion in a
series of the quantity $(1 - NbV^{-1})^{-1}$, then we obtain immediately the following
formula for the pressure:

$$p = \frac{NkT}{V\left(1-NbV^{-1}\right)} - \frac{N^{2}a}{V^{2}}, \qquad (46.27)$$

which satisfies the required conditions:
1) p increases to infinity as V → Nb,
2) for V ≫ Nb formula (46.27) goes over into the theoretical formula (46.17).
Formula (46.27) is the complete van der Waals equation which describes 
the state of gases over a wide range of densities. From the very character of 
the derivation it is clear, however, that this equation cannot have the impor- 
tant theoretical meaning which is possessed by eq. (46.17). For large gas 
densities the constants a and b no longer have an exact meaning, and can only 
approximately be considered as the characteristics of the volume of the mole- 
cules and their interaction. This is seen, in particular, from the fact that in 
order to obtain quantitative agreement of eq. (46.27) with experimental 
data one has to renounce the constancy of the quantities a and b and consider
them as functions of temperature. The inconvenience of this led many
investigators to propose other empirical equations of state. Nevertheless, a 
great merit of the van der Waals equation is the fact that it gives qualitatively, 
in a very correct way, the behaviour of gases, and it contains indications 
about the transition of a gas into the liquid state and critical phenomena. 
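
The relation between the low-density formula (46.26) and the extrapolated formula (46.27) can be seen in a short numerical sketch. The parameter values below are hypothetical and given in arbitrary units; the function names are illustrative.

```python
def pressure_low_density(volume, n, kt, a, b):
    """Eq. (46.26): p = (NkT/V)(1 + Nb/V) - N^2 a / V^2."""
    return n * kt / volume * (1.0 + n * b / volume) - n * n * a / volume ** 2


def pressure_van_der_waals(volume, n, kt, a, b):
    """Eq. (46.27): p = NkT / (V - Nb) - N^2 a / V^2, defined for V > Nb."""
    if volume <= n * b:
        raise ValueError("the volume must exceed N*b")
    return n * kt / (volume - n * b) - n * n * a / volume ** 2


# Hypothetical parameters in arbitrary units.
N, KT, A, B = 1.0, 1.0, 0.2, 0.5

dilute_low = pressure_low_density(1000.0, N, KT, A, B)
dilute_vdw = pressure_van_der_waals(1000.0, N, KT, A, B)

dense_low = pressure_low_density(0.5001, N, KT, A, B)
dense_vdw = pressure_van_der_waals(0.5001, N, KT, A, B)
```

In the dilute case the two formulas are practically indistinguishable. Close to V = Nb only the extrapolated form grows without limit, whereas the low-density expression stays finite.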


§47. The correlation function method and its application to the theory of 
dense gases and liquids 


We have seen that a direct calculation of the configuration integral for a 
system of interacting particles turns out to be a rather complex procedure. 

In connection with the attempt to create a statistical theory of liquids, 
different methods of approach to the consideration of the statistical proper- 
ties of systems of interacting particles have recently been developed. The 
correlation function method developed by N.N.Bogoliubov and independently
by Kirkwood and by Born and Green has proved to be one of the
most effective. 

In the correlation function method, the calculation of the configuration 
integral is replaced by finding a certain set of integro-differential equations 
interrelating a system of functions which characterize the mutual correlation 
in the spatial disposition of the particles. 

We stress from the very beginning that the correlation function method is 
a direct consequence of Gibbs statistics. In what follows we shall be interested 
in the spatial distribution of a system of interacting particles. 

Integrating the Gibbs distribution over all momenta, we find the expression
for the probability of a given configuration of a system of particles:

$$dw_q = \int_p dw = I^{-1}\,e^{-U/kT}\,d\mathbf{r}_1\,d\mathbf{r}_2\ldots d\mathbf{r}_N, \qquad (47.1)$$

where I is the configuration integral:

$$I = \int e^{-U/kT}\,d\mathbf{r}_1\,d\mathbf{r}_2\ldots d\mathbf{r}_N. \qquad (47.2)$$

If $dw_q$ is integrated over the coordinates of all the particles except one, then
we obtain

$$dw_q^{(1)} = \frac{d\mathbf{r}_1}{I}\int e^{-U/kT}\,d\mathbf{r}_2\ldots d\mathbf{r}_N. \qquad (47.3)$$




It is obvious that $dw_q^{(1)}$ represents the probability that particle no. 1 be in the
volume element $d\mathbf{r}_1$ for any positions of all the other N−1 particles. This
probability can be written in the form

$$dw_q^{(1)} = \frac{\rho_1(\mathbf{r}_1)\,d\mathbf{r}_1}{V}, \qquad (47.4)$$

where $\rho_1(\mathbf{r}_1)$ is the density of the probability that the particle be found in
the volume element $d\mathbf{r}_1$, normalized to the volume of the system:

$$\frac{\rho_1(\mathbf{r}_1)}{V} = \frac{1}{I}\int e^{-U/kT}\,d\mathbf{r}_2\ldots d\mathbf{r}_N. \qquad (47.5)$$

We shall call the function $\rho_1(\mathbf{r}_1)$ the ordinary distribution function.
Analogously, integrating the Gibbs distribution (47.1) over the coordinates
of all the particles except the first and second, we obtain

$$dw_q^{(1,2)} = \frac{d\mathbf{r}_1\,d\mathbf{r}_2}{I}\int e^{-U/kT}\,d\mathbf{r}_3\ldots d\mathbf{r}_N = \frac{\rho_{12}(\mathbf{r}_1,\mathbf{r}_2)\,d\mathbf{r}_1\,d\mathbf{r}_2}{V^{2}}, \qquad (47.6)$$

so that

$$\frac{\rho_{12}(\mathbf{r}_1,\mathbf{r}_2)}{V^{2}} = \frac{1}{I}\int e^{-U/kT}\,d\mathbf{r}_3\ldots d\mathbf{r}_N. \qquad (47.7)$$

The function $\rho_{12}$ represents the density of the probability that the first particle
is in the volume element $d\mathbf{r}_1$ and that the second particle is simultaneously
in the volume element $d\mathbf{r}_2$, normalized to the volume of the system.

We shall call $\rho_{12}$ the binary distribution function. Distribution functions
of any order can be determined in an analogous way. For example, the distribution
function of the mth order characterizes the probability that the
first particle be in the volume element $d\mathbf{r}_1$, the second particle in the volume
element $d\mathbf{r}_2$, ..., the mth particle in the volume element $d\mathbf{r}_m$, for any positions of
the other N−m particles:

$$\frac{\rho_{12\ldots m}(\mathbf{r}_1,\ldots,\mathbf{r}_m)}{V^{m}} = \frac{1}{I}\int e^{-U/kT}\,d\mathbf{r}_{m+1}\ldots d\mathbf{r}_N. \qquad (47.8)$$


If we are interested in the properties of a system which depend on the position
not of all but only several particles constituting the system, the distribution
functions $\rho_{12\ldots m}$ play the same role as the Gibbs distribution function
for the system as a whole.

By means of distribution functions one can find the mean values of
quantities which depend on the coordinates of the corresponding particles.
For example

$$\overline{L(\mathbf{r}_1,\mathbf{r}_2,\ldots,\mathbf{r}_m)} = \int L(\mathbf{r}_1,\mathbf{r}_2,\ldots,\mathbf{r}_m)\,\rho_{12\ldots m}(\mathbf{r}_1,\ldots,\mathbf{r}_m)\,\frac{d\mathbf{r}_1\,d\mathbf{r}_2\ldots d\mathbf{r}_m}{V^{m}}.$$
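
The definitions (47.5) and (47.7) can be made concrete on a toy system. The sketch below takes only two particles on a line of given length (a one-dimensional stand-in for the volume), evaluates the configuration integral on a grid and builds the ordinary and binary distribution functions from it. The pair potential with a hard core and an assumed inverse sixth power attraction, the grid resolution and all names are illustrative choices, not part of the text.

```python
import math


def distribution_functions(length=10.0, points=100, kt=1.0, d=1.0, epsilon=0.5):
    """Ordinary and binary distribution functions for two particles on a line.

    Returns (grid, step, rho1, rho12), where rho1[i] approximates the ordinary
    function at grid[i] and rho12[i][j] the binary function at (grid[i], grid[j]).
    """
    step = length / points
    grid = [(i + 0.5) * step for i in range(points)]

    def boltzmann_factor(x1, x2):
        r = abs(x1 - x2)
        if r <= d:
            return 0.0                                   # hard core
        return math.exp(epsilon * (d / r) ** 6 / kt)     # assumed attraction

    weights = [[boltzmann_factor(x1, x2) for x2 in grid] for x1 in grid]
    # Configuration integral I of eq. (47.2), by the midpoint rule.
    config_integral = sum(sum(row) for row in weights) * step * step
    # Eq. (47.7) with N = 2: nothing is left to integrate over.
    rho12 = [[length ** 2 * w / config_integral for w in row] for row in weights]
    # Eq. (47.5): integrate the Boltzmann factor over the second particle.
    rho1 = [length * sum(row) * step / config_integral for row in weights]
    return grid, step, rho1, rho12


grid, step, rho1, rho12 = distribution_functions()
length = 10.0
norm_rho1 = sum(rho1) * step / length
```

Both functions are normalized to the volume, so the average of the ordinary function over the line equals unity, and integrating the binary function over the second coordinate and dividing by the length returns the ordinary function.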


§48. Equations for correlation functions 


At first sight it may seem that finding the distribution function of mth 
order which characterizes the spatial distribution of certain particles of a sys- 
tem must be simpler than finding the Gibbs distribution, the distribution 
function of Nth order characterizing the configuration of all the particles of 
the system. 

However, it is clear that a direct determination of the distribution func- 
tions P1, P2»: Pm>--- is associated with the calculation of the configuration 
integral and hence their application in no way simplifies the problem and is no 
step forward. 

The application of the distribution functions would be of no interest if 
there were not another method of calculating them which is not associated 
with the determination of Z. 

It turns out to be possible to obtain a differential equation which must be
satisfied by the distribution functions $\rho_m$. To find the equation which must be
satisfied by the ordinary distribution function $\rho_1$, we differentiate formula (47.5)
with respect to the coordinates $\mathbf{r}_1$. Obviously, we have


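$$
\frac{\partial \rho_1(\mathbf{r}_1)}{\partial \mathbf{r}_1} = -\frac{V}{IkT}\int e^{-U/kT}\,\frac{\partial U}{\partial \mathbf{r}_1}\,d\mathbf{r}_2\ldots d\mathbf{r}_N . \tag{48.1}
$$

Consider in more detail the derivative

$$
\frac{\partial U}{\partial \mathbf{r}_1} = \frac{\partial}{\partial \mathbf{r}_1}\sum_{1\le i<j\le N} u(|\mathbf{r}_i-\mathbf{r}_j|) = \sum_{j=2}^{N}\frac{\partial u(|\mathbf{r}_1-\mathbf{r}_j|)}{\partial \mathbf{r}_1} \tag{48.2}
$$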


[cf. eq. (46.7)]. Here we have made use of the fact that all terms of the sum
over $i$ except the one referring to particle no. 1 do not depend on $\mathbf{r}_1$ and
reduce to zero on differentiation.

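Substituting (48.2) into (48.1), we obtain on the right-hand side the integral

$$
-\frac{V}{IkT}\int e^{-U/kT}\,\frac{\partial}{\partial \mathbf{r}_1}\sum_{j=2}^{N} u(|\mathbf{r}_1-\mathbf{r}_j|)\,d\mathbf{r}_2\ldots d\mathbf{r}_N
= -\frac{V}{IkT}\sum_{j=2}^{N}\int e^{-U/kT}\,\frac{\partial u(|\mathbf{r}_1-\mathbf{r}_j|)}{\partial \mathbf{r}_1}\,d\mathbf{r}_2\ldots d\mathbf{r}_N .
$$
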
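But, by the definition (47.8),

$$
\frac{1}{I}\int e^{-U/kT}\,d\mathbf{r}_2\ldots d\mathbf{r}_{j-1}\,d\mathbf{r}_{j+1}\ldots d\mathbf{r}_N = \frac{\rho_{1j}(\mathbf{r}_1,\mathbf{r}_j)}{V^2} .
$$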


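Hence the right-hand side can be written in the form

$$
-\frac{1}{VkT}\sum_{j=2}^{N}\int \frac{\partial u(|\mathbf{r}_1-\mathbf{r}_j|)}{\partial \mathbf{r}_1}\,\rho_{1j}(\mathbf{r}_1,\mathbf{r}_j)\,d\mathbf{r}_j .
$$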


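The sum over $j$ contains $N-1$ terms, each of which represents an integral of
the form

$$
\int \frac{\partial u(|\mathbf{r}_1-\mathbf{r}_j|)}{\partial \mathbf{r}_1}\,\rho_{1j}(\mathbf{r}_1,\mathbf{r}_j)\,d\mathbf{r}_j .
$$

Since the system consists of identical particles, $u(|\mathbf{r}_1-\mathbf{r}_j|)$ has one and the
same value for a given $|\mathbf{r}_1-\mathbf{r}_j|$, and the integration is carried out over all
$|\mathbf{r}_1-\mathbf{r}_j|$. All $N-1$ integrals are therefore equal, and one can write

$$
\frac{1}{V}\sum_{j=2}^{N}\int \frac{\partial u(|\mathbf{r}_1-\mathbf{r}_j|)}{\partial \mathbf{r}_1}\,\rho_{1j}(\mathbf{r}_1,\mathbf{r}_j)\,d\mathbf{r}_j
= \frac{N-1}{V}\int \frac{\partial u(|\mathbf{r}_1-\mathbf{r}_2|)}{\partial \mathbf{r}_1}\,\rho_{12}(\mathbf{r}_1,\mathbf{r}_2)\,d\mathbf{r}_2
\approx \frac{N}{V}\int \frac{\partial u(|\mathbf{r}_1-\mathbf{r}_2|)}{\partial \mathbf{r}_1}\,\rho_{12}(\mathbf{r}_1,\mathbf{r}_2)\,d\mathbf{r}_2 .
$$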

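Substituting this expression into formula (48.1), we obtain

$$
\frac{\partial \rho_1(\mathbf{r}_1)}{\partial \mathbf{r}_1} = -\frac{N}{VkT}\int \frac{\partial u(|\mathbf{r}_1-\mathbf{r}_2|)}{\partial \mathbf{r}_1}\,\rho_{12}(\mathbf{r}_1,\mathbf{r}_2)\,d\mathbf{r}_2 . \tag{48.3}
$$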





Formula (48.3) relates the ordinary distribution function $\rho_1$ to the binary
distribution function $\rho_{12}$.

Let us find the equation satisfied by the binary function $\rho_{12}$. Differentiating
(47.7), we obtain


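$$
\frac{\partial \rho_{12}(\mathbf{r}_1,\mathbf{r}_2)}{\partial \mathbf{r}_1}
= -\frac{V^2}{IkT}\int e^{-U/kT}\,\frac{\partial}{\partial \mathbf{r}_1}\Bigl[\sum_{j=2}^{N} u(|\mathbf{r}_1-\mathbf{r}_j|)\Bigr]\,d\mathbf{r}_3\ldots d\mathbf{r}_N
$$

$$
= -\frac{1}{kT}\,\rho_{12}\,\frac{\partial u(|\mathbf{r}_1-\mathbf{r}_2|)}{\partial \mathbf{r}_1}
-\frac{V^2}{IkT}\sum_{j=3}^{N}\int \frac{\partial u(|\mathbf{r}_1-\mathbf{r}_j|)}{\partial \mathbf{r}_1}\,d\mathbf{r}_j\int e^{-U/kT}\,d\mathbf{r}_3\ldots d\mathbf{r}_{j-1}\,d\mathbf{r}_{j+1}\ldots d\mathbf{r}_N
$$

$$
= -\frac{1}{kT}\,\rho_{12}\,\frac{\partial u(|\mathbf{r}_1-\mathbf{r}_2|)}{\partial \mathbf{r}_1}
-\frac{1}{VkT}\sum_{j=3}^{N}\int \frac{\partial u(|\mathbf{r}_1-\mathbf{r}_j|)}{\partial \mathbf{r}_1}\,\rho_{12j}(\mathbf{r}_1,\mathbf{r}_2,\mathbf{r}_j)\,d\mathbf{r}_j .
$$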


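Consequently, we have finally

$$
\frac{\partial \rho_{12}}{\partial \mathbf{r}_1} = -\frac{1}{kT}\,\rho_{12}\,\frac{\partial u(|\mathbf{r}_1-\mathbf{r}_2|)}{\partial \mathbf{r}_1}
-\frac{N}{VkT}\int \frac{\partial u(|\mathbf{r}_1-\mathbf{r}_3|)}{\partial \mathbf{r}_1}\,\rho_{123}\,d\mathbf{r}_3 . \tag{48.4}
$$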





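Eq. (48.4) connects the binary distribution function with the threefold
distribution function. In the same way, one can obtain an equation connecting
the threefold distribution function with the fourfold, the mth distribution
function with the (m+1)th, and so on. As a result one obtains an unclosed
system of equations, each of which expresses the derivative of the distribution
function of a given order in terms of the distribution function of the next
order:

$$
\frac{\partial \rho_{12\ldots m}(\mathbf{r}_1,\ldots,\mathbf{r}_m)}{\partial \mathbf{r}_1}
= -\frac{1}{kT}\,\rho_{12\ldots m}\,\frac{\partial}{\partial \mathbf{r}_1}\Bigl[\sum_{j=2}^{m} u(|\mathbf{r}_1-\mathbf{r}_j|)\Bigr]
-\frac{N}{VkT}\int \frac{\partial u(|\mathbf{r}_1-\mathbf{r}_{m+1}|)}{\partial \mathbf{r}_1}\,\rho_{12\ldots m,m+1}\,d\mathbf{r}_{m+1} . \tag{48.5}
$$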


Continuing this procedure we shall arrive at an equation connecting the
distribution function of the (N−1)th order with the distribution function
of the Nth order, i.e. with the Gibbs distribution. Hence the problem of
finding distribution functions of a lower order again turns out to be
associated with the Gibbs distribution for the entire system. However, in this
fact lies the most important feature of the equations obtained: functions of
a higher order always enter under the integral sign with a coefficient
$\sim (N/VkT)(\partial u/\partial \mathbf{r}_1)$. In the case where the potential energy of interaction
between two particles, $u(|\mathbf{r}_i-\mathbf{r}_j|)$, decreases rapidly with increasing distance
and becomes small at distances which exceed molecular dimensions, the
quantity $\partial u/\partial \mathbf{r}_1$ is very small for $|\mathbf{r}_i-\mathbf{r}_j| > d$, where $d$ is the diameter of the
molecule. Hence, for example, the integral on the right-hand side of
eq. (48.4) can be estimated to order of magnitude in the following way:


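$$
\frac{N}{V}\int \frac{\partial u}{\partial \mathbf{r}_1}\,\rho_{123}\,d\mathbf{r}_3 \sim \frac{d^3}{V/N}\Bigl[\frac{\partial u}{\partial \mathbf{r}_1}\,\rho_{123}\Bigr]_d ,
$$

where $[(\partial u/\partial \mathbf{r}_1)\,\rho_{123}]_d$ is taken for distances between the particles of the
order of $d$.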

If the volume per particle, $V/N$, is large in comparison with the volume $d^3$
of the particle, then the coefficient $d^3/(VN^{-1})$ is small. Hence use can be
made of approximate expressions for the value of the integrand, in particular
for the distribution function of the third order. This conclusion refers not only
to the equation for the binary function, but has a general character.
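
To get a feeling for the size of the coefficient $d^3/(VN^{-1})$, the following sketch evaluates it for a rarefied gas. The molecular diameter, pressure and temperature are round illustrative values chosen here (roughly those of argon under normal conditions); they are assumptions and are not taken from the text. The density $N/V$ is estimated from the ideal-gas law.

```python
# Illustrative estimate of the small parameter d^3 N / V.
# ASSUMPTION: the diameter, pressure and temperature below are round
# illustrative values for a gas such as argon, not data from the text.
K_BOLTZMANN = 1.380649e-23  # Boltzmann constant, J/K


def small_parameter(d_m: float, pressure_pa: float, temperature_k: float) -> float:
    """Return d^3 * N/V, with N/V estimated from the ideal-gas law p = (N/V) k T."""
    number_density = pressure_pa / (K_BOLTZMANN * temperature_k)  # N/V in m^-3
    return d_m ** 3 * number_density


gas = small_parameter(d_m=3.4e-10, pressure_pa=101325.0, temperature_k=273.15)
print(f"d^3 N/V for the gas: {gas:.2e}")
```

Under these assumptions the coefficient is of the order of $10^{-3}$, which is why the term containing the threefold function can be treated approximately in a gas. In a liquid the same quantity is of the order of unity.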


§49. The equation of state and the energy of a system of interacting particles 


The binary distribution function $\rho_{12}(\mathbf{r}_1,\mathbf{r}_2)$, in terms of which the
equation of state can be expressed, is of great importance.
The pressure in a system is determined by formula (32.5):

$$
p = kT\,\frac{\partial \ln Z}{\partial V} .
$$

Since, as we have seen before, $Z$ resolves into two factors, one of which, $Z_{\rm kin}$,
depends only on the kinetic energy but not on the configuration of the
particles and, consequently, not on the volume, and the second of which, $I$,
depends only on the configuration, we have

$$
p = kT\,\frac{\partial \ln I}{\partial V} = \frac{kT}{I}\,\frac{\partial I}{\partial V} . \tag{49.1}
$$


We shall find the derivative of $I$ with respect to the volume of the system.
For this we change all linear dimensions by a factor $\lambda$:

$$
\mathbf{r} \to \lambda\mathbf{r} . \tag{49.2}
$$

In consequence of (49.2) we obtain

$$
V \to \lambda^3 V . \tag{49.3}
$$


Then

$$
I = \lambda^{3N}\int e^{-U/kT}\,d\mathbf{r}_1\ldots d\mathbf{r}_N . \tag{49.4}
$$


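According to (49.3),

$$
\frac{\partial I}{\partial V} = \frac{\partial I}{\partial \lambda}\,\frac{\partial \lambda}{\partial V} = \frac{1}{3\lambda^2 V}\,\frac{\partial I}{\partial \lambda}
$$

and, consequently,

$$
\Bigl(\frac{\partial I}{\partial V}\Bigr)_{\lambda=1} = \frac{1}{3V}\Bigl(\frac{\partial I}{\partial \lambda}\Bigr)_{\lambda=1} . \tag{49.5}
$$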
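Differentiating (49.4) with respect to $\lambda$, we obtain

$$
\frac{\partial I}{\partial \lambda} = 3N\lambda^{3N-1}\int e^{-U/kT}\,d\mathbf{r}_1\ldots d\mathbf{r}_N - \frac{\lambda^{3N}}{kT}\int e^{-U/kT}\,\frac{\partial U}{\partial \lambda}\,d\mathbf{r}_1\ldots d\mathbf{r}_N . \tag{49.6}
$$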


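According to the definition (46.7),

$$
\frac{\partial U}{\partial \lambda} = \frac{\partial}{\partial \lambda}\sum_{i<j} u(\lambda|\mathbf{r}_i-\mathbf{r}_j|) = \sum_{i<j} |\mathbf{r}_i-\mathbf{r}_j|\,u' ,
$$

where $u'$ denotes the derivative of $u$ with respect to its argument.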
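Substituting $\partial U/\partial \lambda$ into (49.6), we find

$$
\Bigl(\frac{\partial I}{\partial \lambda}\Bigr)_{\lambda=1} = 3NI - \frac{1}{kT}\sum_{i<j}\int |\mathbf{r}_i-\mathbf{r}_j|\,u'\,e^{-U/kT}\,d\mathbf{r}_1\ldots d\mathbf{r}_N ,
$$

or

$$
\Bigl(\frac{\partial I}{\partial \lambda}\Bigr)_{\lambda=1} = 3NI - \frac{N^2}{2kT}\int |\mathbf{r}_1-\mathbf{r}_2|\,u'\,e^{-U/kT}\,d\mathbf{r}_1\ldots d\mathbf{r}_N .
$$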


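Here we have made use of the fact that all $\tfrac12 N(N-1) \approx \tfrac12 N^2$ terms (the number of
interacting pairs) in the double sum are identical with one another. By means
of (49.5) and (49.1) we obtain

$$
p = \frac{NkT}{V} - \frac{N^2}{6VI}\int |\mathbf{r}_1-\mathbf{r}_2|\,u'(|\mathbf{r}_1-\mathbf{r}_2|)\,d\mathbf{r}_1\,d\mathbf{r}_2\int e^{-U/kT}\,d\mathbf{r}_3\ldots d\mathbf{r}_N .
$$
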
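Making use of (47.7), we finally find

$$
p = \frac{NkT}{V} - \frac{N^2}{6V^3}\int \rho_{12}(\mathbf{r}_1,\mathbf{r}_2)\,|\mathbf{r}_1-\mathbf{r}_2|\,u'\,d\mathbf{r}_1\,d\mathbf{r}_2 . \tag{49.7}
$$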


Formula (49.7) connects the equation of state with the binary distribution 
function. 

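The binary distribution function $\rho_{12}(\mathbf{r}_1,\mathbf{r}_2)$ characterizes the probability
of a given mutual disposition of two arbitrarily chosen particles in a system.
In isotropic phases (gases and liquids) one more simplification of eq. (49.7)
can be made.

Since in isotropic phases the binary function cannot depend on directions
but only on the distance between the particles, it can be written in the form

$$
\rho_{12}(\mathbf{r}_1,\mathbf{r}_2) = \rho(|\mathbf{r}_1-\mathbf{r}_2|) .
$$

Hence

$$
p = \frac{NkT}{V} - \frac{N^2}{6V^3}\int \rho(|\mathbf{r}_1-\mathbf{r}_2|)\,|\mathbf{r}_1-\mathbf{r}_2|\,u'(|\mathbf{r}_1-\mathbf{r}_2|)\,d\mathbf{r}_1\,d\mathbf{r}_2 .
$$

Introducing a new variable

$$
r = |\mathbf{r}_1-\mathbf{r}_2| ,
$$

one can write

$$
\int \rho(|\mathbf{r}_1-\mathbf{r}_2|)\,|\mathbf{r}_1-\mathbf{r}_2|\,u'(|\mathbf{r}_1-\mathbf{r}_2|)\,d\mathbf{r}_1\,d\mathbf{r}_2 = 4\pi V\int_0^\infty \rho(r)\,u'(r)\,r^3\,dr ,
$$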




so that finally

$$
p = \frac{NkT}{V} - \frac{2\pi N^2}{3V^2}\int_0^\infty \rho(r)\,u'(r)\,r^3\,dr . \tag{49.8}
$$


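In an analogous way one can find an expression for the energy of the system:

$$
E = kT^2\,\frac{\partial \ln Z}{\partial T} = kT^2\,\frac{\partial \ln Z_{\rm kin}}{\partial T} + \frac{kT^2}{I}\,\frac{\partial I}{\partial T} = \frac{3NkT}{2} + \frac{kT^2}{I}\,\frac{\partial I}{\partial T} , \tag{49.9}
$$

where $\tfrac32 NkT$ is the energy of a system of non-interacting particles.
Substituting the value of $I$ from (47.2) and using (47.7), we calculate
$\partial I/\partial T$ to be

$$
\frac{\partial I}{\partial T} = \frac{N^2 I}{2V^2kT^2}\int u(|\mathbf{r}_1-\mathbf{r}_2|)\,\rho_{12}(\mathbf{r}_1,\mathbf{r}_2)\,d\mathbf{r}_1\,d\mathbf{r}_2 .
$$

Hence

$$
E = \frac{3NkT}{2} + \frac{2\pi N^2}{V}\int_0^\infty \rho(r)\,u(r)\,r^2\,dr . \tag{49.10}
$$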
2 V 3 


Thus the energy as well as the pressure are expressed in terms of the binary
distribution function $\rho(r)$.

The binary distribution function $\rho(r)$ can be calculated for gases when
the density of particles in the system is small.

That is, in the case of such gases the equation for the binary function
(48.4) can be solved by a method of successive approximations.

Indeed, in gases with a density which is not too large the mean distance
between particles is large in comparison with their size. The estimates made
at the end of §48 show that the coefficient of the threefold distribution
function on the right-hand side of (48.4) is proportional to $d^3/(VN^{-1})$ and,
consequently, is very small for a sufficiently rarefied gas. This allows one to
substitute an approximate value of $\rho_{123}$ into the integral on the right-hand
side of (48.4) without committing any appreciable error.

To obtain an approximate expression for $\rho_{123}$ we expand all distribution
functions in a series in powers of the small quantity $NV^{-1}$ (in reality this
expansion is carried out in powers of the ratio $d^3/(VN^{-1})$) and we confine
ourselves in this expansion to terms of the lowest order, writing


$$
\rho_{12} = \rho_{12}^{(0)} + \frac{N}{V}\,\rho_{12}^{(1)} + \ldots , \tag{49.11}
$$

$$
\rho_{123} = \rho_{123}^{(0)} + \frac{N}{V}\,\rho_{123}^{(1)} + \ldots \tag{49.12}
$$


Substituting these series into the equation for the correlation functions and
retaining only low powers of small quantities, one can successively determine
the correlation functions, in particular $\rho_{12}$. Then the pressure, according to
(49.8), will appear as a series in powers of $NV^{-1}$:





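$$
pV = NkT - \frac{N^2kT}{2V}\,\beta - \frac{N^3kT}{3V^2}\,\beta_0 , \tag{49.13}
$$

$$
\beta = \int (e^{-u/kT}-1)\,dV , \tag{49.14}
$$

$$
\beta_0 = \iint (e^{-u_{12}/kT}-1)(e^{-u_{13}/kT}-1)(e^{-u_{23}/kT}-1)\,dV\,dV' . \tag{49.15}
$$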


The second term in (49.13) is obviously the same as (46.18), while the third
term gives a correction to the pressure to the next smaller order of magnitude
(in powers of the density $NV^{-1}$). In the case of rarefied gases the correlation
function method has no special advantages over other methods of calculating
corrections for the interaction.
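
The agreement between the two routes to the first density correction can be checked numerically. The sketch below assumes the lowest-order binary function of a rarefied gas, $\rho(r) = e^{-u(r)/kT}$, and a Lennard-Jones potential in reduced units; both are illustrative choices made here, not prescriptions of the text. It compares the coefficient of $N^2/V^2$ in the pressure obtained from the integral in (49.8) with the value $\tfrac12 kT\beta$ implied by (49.13) and (49.14). The two are related by an integration by parts, so they should agree up to the error of the quadrature and of the cutoff `r_max`.

```python
import math


def lj(r: float) -> float:
    """Lennard-Jones pair potential u(r), reduced units (ASSUMED model potential)."""
    return 4.0 * (r ** -12 - r ** -6)


def lj_prime(r: float) -> float:
    """Derivative u'(r) of the potential above."""
    return 4.0 * (-12.0 * r ** -13 + 6.0 * r ** -7)


def midpoint(func, a: float, b: float, n: int = 60000) -> float:
    """Composite midpoint rule for the integral of func over [a, b]."""
    h = (b - a) / n
    return h * sum(func(a + (i + 0.5) * h) for i in range(n))


def beta(kT: float, r_max: float = 30.0) -> float:
    """beta = integral of (exp(-u/kT) - 1) dV, written in spherical coordinates."""
    return midpoint(
        lambda r: (math.exp(-lj(r) / kT) - 1.0) * 4.0 * math.pi * r * r, 0.0, r_max)


def virial_integral(kT: float, r_max: float = 30.0) -> float:
    """(2*pi/3) * integral of rho(r) u'(r) r^3 dr, with rho(r) = exp(-u/kT) ASSUMED."""
    return (2.0 * math.pi / 3.0) * midpoint(
        lambda r: math.exp(-lj(r) / kT) * lj_prime(r) * r ** 3, 0.0, r_max)


kT = 1.5                      # reduced temperature, illustrative value
lhs = virial_integral(kT)     # coefficient of N^2/V^2 from the integral in (49.8)
rhs = 0.5 * kT * beta(kT)     # coefficient of N^2/V^2 from (49.13) and (49.14)
print(lhs, rhs)
```

The small residual difference between the two numbers comes from truncating both integrals at a finite `r_max`; it decreases as the cutoff is increased.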

Of greater importance is the application of this method to the construction 
of a statistical theory of liquids. 

Up to now the statistical theory of liquids has remained at the
initial stage of its development. The cause of this lies in the very character of
thermal motion in liquids.

Thermal motion in liquids differs from that in gases and crystals by the 
fact that for liquids the energy of the interaction of a molecule with its 
neighbours cannot be considered to be small (as in gases) or large (as in 
crystals) in comparison with the energy of thermal motion. For liquids these 
quantities are of the same order of magnitude. 

In a liquid neighbouring molecules oscillate about certain equilibrium 
positions with a relatively large amplitude. The mutual configuration of 
molecules is approximately the same as that in an elementary cell of the 
corresponding crystal. 

However, in contrast to crystals, the amplitude of these oscillations is so
large that neighbouring molecules draw away from each other relatively
easily and leave their equilibrium positions. The mean lifetime $\tau$ of a
molecule in a given equilibrium position is limited (it is about $10^{-8}$ sec).

In the course of time intervals which are small in comparison with this
time $\tau$, the oscillations of molecules in a liquid have approximately the same
character as those in crystals. However, for times $t \gg \tau$ a molecule of a liquid
can find itself at any point of the liquid. In this sense its motion is similar to
that of a gas molecule.

The character of the jumps and the frequency of the oscillations are
determined by the interaction between the molecules in the liquid. This interaction
can vary for different liquids over a very wide range. Therefore the basic
problem of the contemporary theory of liquids reduces to finding the
qualitative characteristics. Such a qualitative characteristic is, in particular,
the binary correlation function $\rho(r)$. The binary correlation function
characterizes the interaction of the closest neighbours. It can be determined
experimentally from the scattering of X-rays.

The correlation function $\rho(r)$ determined in such a way is shown in fig.
III.31 by small circles. We see that for a fixed position of a given molecule its
closest neighbours are situated with highest probability at distances
corresponding to the maxima of the curve $\rho(r)$. It turns out that the positions of
these maxima are similar to the corresponding positions in the crystal lattice
of the same substance. It is said that short-range order is observed in the
disposition of atoms in a liquid and, in this sense, one speaks of the
quasi-crystalline structure of water. It is necessary, however, to stress a basic
difference between a crystal and a liquid. In liquids the regularity in the
disposition of the atoms extends to the closest three or four neighbouring
atoms. In crystals the regular disposition of the atoms extends to distances
which are infinite from the microscopic point of view. In other words, in
crystals there is long-range order extending to arbitrarily large distances. As
we have already stressed, this difference is associated with the different
character of the thermal motion. Hence the analogy between crystals and
liquids can be used only within very restricted limits.

[Fig. III.31. The binary correlation function $\rho(r)$ plotted against the distance $r$ (in Å).]

By means of certain simplifying assumptions one can obtain from the
theory of correlation functions a form of $\rho(r)$ which agrees qualitatively
with the experimental data. That is, first, instead of the true energy
of interaction a simplified expression is introduced,

$$
u(r) = \begin{cases} \infty & \text{if } r < d , \\ 0 & \text{if } r \ge d , \end{cases}
$$

where $d$ is the diameter of the particles. This energy corresponds to the
substitution of hard (impenetrable) spheres of diameter $d$ for the molecules.
Second, the so-called superposition approximation is made in the function
$\rho_{123}$. It consists in the substitution

$$
\rho_{123} \approx \rho_{12}\,\rho_{13}\,\rho_{23} . \tag{49.16}
$$


The meaning of formula (49.16) is that the interaction of particle 3 with
particle 1 is as if particle 2 did not exist at all. In other words, the interaction
of particle 3 with particles 1 and 2 is equal to the sum of the pair interactions
(31) and (32).

Although the superposition approximation cannot be substantiated
theoretically, it is qualitatively clear that it represents an advance over the
assumption of the independence of the mutual positions of the particles in space
(i.e. over the assumption that $\rho_{123} = \rho_1\rho_2\rho_3$).

In the superposition approximation, eq. (48.4) is closed. It contains only
the binary function $\rho_{12} = \rho(r)$.

The solution of the equation for $\rho(r)$ in the approximation mentioned has
been carried out numerically. It depends only on one parameter, which
includes the quantities $NV^{-1}$, $T$ and $d$. The solutions obtained are shown in
fig. III.31 (by a solid line).

We see that the general forms of the calculated and experimental curves
for $\rho(r)$ are very similar. This means that in spite of its schematic character,
the model of molecules as hard spheres in the superposition approximation in
general correctly describes the character of the interaction of molecules in a
liquid.





Crystals 


§50. Crystal structure and thermal motion in the one-dimensional crystal 
model 


The contemporary theory of the crystalline state is based on the propo- 
sition that the structural units (atoms or molecules) of a crystal are placed 
at the points of the crystal lattice. Numerous X-ray investigations of crystals 
and a number of other data have confirmed this proposition and made it 
possible to measure the distance between atoms in a crystal lattice. In what 
follows we shall proceed from this proposition as the basis of the theory of 
the crystalline state. 

The distances between atoms in crystals are very small. They are in general
of the same order of magnitude as the distances between atoms in molecules,
and are sometimes exactly equal to them. For example, the distance between
atoms in diamond ($1.54\times10^{-8}$ cm) is very close to that between carbon
atoms in long chain hydrocarbon compounds (for aliphatic compounds the
distance C-C is equal to $1.51\times10^{-8}$ cm).

The distance between molecules in molecular crystals is only two or three
times larger than intramolecular distances. Owing to the smallness of these
distances the interaction between them is extremely large. In order of
magnitude it corresponds to the interaction between atoms in a molecule. From
this point of view an atomic or ionic crystal can be considered as a gigantic
molecule containing an enormous number of bound atoms. As in molecules,
the energy of interaction between atoms in a crystal is very large in
comparison with the energy of thermal motion. Particles in a crystal turn out to be so
strongly bound to each other that thermal motion cannot break the bonds.

Thus, from the point of view of the interatomic interaction, crystals repre- 
sent the opposite limiting case to gases. It is obvious that the only possible 
form of motion of bound particles in a crystal is the vibrational motion 
about equilibrium positions. We shall assume that the amplitude of vibrations 
is very small in comparison with the distances between the atoms. Below we 
shall discuss in more detail the validity of this assumption. 

We shall calculate first of all the mean energy and heat capacity of a
crystal whose atoms perform small vibrations about equilibrium positions
(lattice points), proceeding from the laws of classical statistics. For this we
can make use of the law of equipartition of energy over the degrees of
freedom. To each degree of freedom of the motion * there corresponds an energy
$kT$. The number of vibrational degrees of freedom for a crystal containing $N$
atoms is equal to $3N - 6 \approx 3N$ (because $N$ is large). Hence from classical
statistics it follows that the mean energy of thermal motion in the crystal
is equal to

$$
E = 3NkT .
$$

The corresponding molar heat capacity (with $N$ equal to the Avogadro number) is equal to

$$
C_V = 3Nk \approx 24.9\ \mathrm{J\,mol^{-1}\,K^{-1}} .
$$


The heat capacity of crystals turns out to be independent of the temperature
and of the concrete properties of the crystals, in agreement with the well-known
empirical Dulong-Petit law for the heat capacity. The Dulong-Petit law yields
relatively accurately the heat capacity of many atomic crystals at high
temperatures. However, it becomes completely useless on going to low
temperatures.

At low temperatures the heat capacity of all crystals decreases with
decreasing temperature, as was to be expected from the third law of
thermodynamics. Moreover, the heat capacity of certain crystals depends on the
temperature even at temperatures which considerably exceed room
temperature. As a characteristic example of the total inapplicability of the
Dulong-Petit law one can quote diamond. Thus in the case of crystals we again
encounter the limited applicability of the law of equipartition, i.e. the limited
applicability of classical statistics.

* It should be stressed that this holds only for vibrational motion with a small
amplitude, when the potential energy is expressed by a quadratic function of the displacement.

The simplest attempt at applying quantum laws to the treatment of the
heat capacity of crystals consists in the following. We consider each atom
vibrating at a crystal lattice point as a quantum oscillator which has three
degrees of freedom. In a crystal made of atoms of one kind all atoms are
completely equivalent and are vibrating with the same frequency $\nu$. If it is
assumed that the atoms oscillate independently of each other, then the
mean energy of thermal motion $E$ of the entire crystal can be written in the
form

$$
E = \sum_{n=1}^{N} \bar\varepsilon_n ,
$$

where $\bar\varepsilon_n$ is the mean energy of the nth oscillator, and the summation is
carried out over all oscillators of the crystal. Since each atom is a
three-dimensional oscillator, $\bar\varepsilon_n = 3\bar\varepsilon$, where $\bar\varepsilon$ is the mean energy of the linear
quantum oscillator given by formula (43.3).

Thus, the mean energy of a crystal has the form

$$
E = \frac{3Nh\nu}{2}\coth\frac{h\nu}{2kT} ,
$$

and the heat capacity

$$
C_V = 3Nk\Bigl(\frac{h\nu}{2kT}\Bigr)^2\frac{1}{\sinh^2(h\nu/2kT)} . \tag{50.1}
$$


The form of the heat capacity given by formula (50.1) was discussed in
§43 in connection with the vibrational heat capacity of molecules. At high
temperatures ($kT \gg h\nu$) the heat capacity tends to the limiting value

$$
C_V \approx 3Nk .
$$

At low temperatures, in correspondence with the requirement of the third
law, $C_V$ tends to zero according to an exponential law:

$$
C_V \approx 3Nk\Bigl(\frac{h\nu}{kT}\Bigr)^2 e^{-h\nu/kT} . \tag{50.2}
$$
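
The two limits of (50.1) are easy to check numerically. The sketch below works with the dimensionless ratio $x = h\nu/kT$ and with the heat capacity in units of $3Nk$; the function names are introduced here for illustration only.

```python
import math


def cv_independent_oscillators(x: float) -> float:
    """C_V / (3Nk) from (50.1), where x = h*nu / (k*T)."""
    half = 0.5 * x
    return (half / math.sinh(half)) ** 2


def cv_low_temperature(x: float) -> float:
    """C_V / (3Nk) from the low-temperature form (50.2)."""
    return x * x * math.exp(-x)


for x in (0.01, 1.0, 5.0, 20.0):
    print(f"h*nu/kT = {x:5.2f}   (50.1): {cv_independent_oscillators(x):.6e}"
          f"   (50.2): {cv_low_temperature(x):.6e}")
```

For $x \ll 1$ the first column tends to unity, i.e. to the classical value $3Nk$, while for $x \gg 1$ the two columns coincide. (For very large $x$, above about 1400, `math.sinh` overflows, so the asymptotic form should be used directly there.)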


The simplest quantum model yields qualitatively the correct behaviour of the
heat capacity of a crystal with temperature. However, a more detailed
comparison of formula (50.2) with experimental data for the heat capacity
shows that (50.2) does not give all the features of the behaviour of the heat
capacity. The heat capacity of crystals decreases with decreasing temperature not
according to the exponential law (50.2) but according to a power law of the
form $C_V \sim T^3$. The disagreement of formula (50.2) with experiment is
associated with the error in the assumption of the independence of the
oscillations of atoms in a crystal, which was taken as the basis of the derivation of
the formula. In reality, atoms in a crystal are so strongly bound with each
other that the idea that the individual motion of one atom is independent of
the motion of other atoms is out of the question. The vibrational motion of
atoms in a crystal has a collective character, and all the atoms in the crystal
simultaneously take part in it.

In order to get a clearer idea of the character of the thermal motion of 
atoms in a crystal, we shall make use of a fictitious crystal model in the form 
of a chain of atoms which are distributed along a line at equal distances from 
each other. Such a chain can be considered as a one-dimensional crystal. 
Although in nature there are no one-dimensional crystals, the consideration 
of the thermal motion in a one-dimensional crystal will allow us to elucidate 
the character of the motion in a real three-dimensional crystal. 

We number the atoms in the chain in such a way that the number $n$ may
run over values from $n = 1$ up to $n = N$ (there are altogether $N$ atoms in the
chain). Assume that a certain atom (ion or molecule) having, say, a number
$n$ goes out of an equilibrium position and is displaced a distance $\xi_n$ to the
right or to the left. Then it will be subject to forces from the neighbouring
atoms: a repulsive force on the part of the neighbour it approached, and an
attractive force on the part of the other neighbour. Since the forces of
intermolecular interaction rapidly decrease with increasing distance, we can take
into account only the interaction of the given atom with its two closest
neighbours, the atoms with numbers $n - 1$ and $n + 1$. Even the next atoms,
with numbers $n + 2$ and $n - 2$, will interact only very weakly with the atom
considered, so that this interaction can be disregarded.

The force acting on the nth atom due to each of its two neighbours can be
written in the form

$$
F = -\frac{\partial u(\xi_n)}{\partial \xi_n} ,
$$

where $u(\xi_n)$ is the potential energy of the nth atom at the point $\xi_n$. For
small displacements the potential energy $u(\xi_n)$ can be expanded in a series
in powers of the small quantity $\xi_n$, and one can restrict oneself to the first terms
of the expansion, as is always done in the theory of small oscillations:

$$
u(\xi_n) = u(0) + \Bigl(\frac{\partial u}{\partial \xi_n}\Bigr)_0 \xi_n + \frac12\Bigl(\frac{\partial^2 u}{\partial \xi_n^2}\Bigr)_0 \xi_n^2 + \ldots
$$

Since at the point $\xi_n = 0$ the potential energy has a minimum, at that
point $\partial u/\partial \xi_n = 0$ and $\partial^2 u/\partial \xi_n^2 = \kappa > 0$. The force acting on the particle is

$$
F = -\kappa\,\xi_n .
$$


Here it has been assumed that the neighbouring atoms having numbers
$n-1$ and $n+1$ remained at rest at their points of the crystal lattice. In
reality, of course, this is not so. A displacement of the nth atom will lead to
a displacement of the (n−1)th and (n+1)th atoms; the (n+1)th atom will be
displaced under the action of a repulsive force to the right a distance $\xi_{n+1}$,
while the (n−1)th atom under the action of an attractive force will follow
the nth atom and will be displaced a distance $\xi_{n-1}$. Hence the distance
between the nth atom and its neighbours will be changed respectively by
$\xi_{n+1} - \xi_n$ and $\xi_{n-1} - \xi_n$. The nth atom will be acted upon by a force

$$
F_n = \kappa(\xi_{n+1} - \xi_n) + \kappa(\xi_{n-1} - \xi_n) = \kappa(\xi_{n+1} + \xi_{n-1} - 2\xi_n) . \tag{50.3}
$$


The displacement of the (n+1)th atom will lead to a displacement of the
(n+2)th, while the displacement of the (n−1)th atom will lead to a
displacement of the (n−2)th. These atoms will in their turn act on the next
neighbours, and as a result the entire chain of atoms will be set in motion. To
investigate this motion it is sufficient to find the motion of an arbitrarily
chosen nth atom. The equation of its motion has the form

$$
m\ddot\xi_n = \kappa(\xi_{n+1} + \xi_{n-1} - 2\xi_n) . \tag{50.4}
$$


We shall assume that the first and Nth atoms of the chain are fixed, so that
their displacements are

$$
\xi_1 = \xi_N = 0 . \tag{50.5}
$$

Of course, we have no special grounds for this assumption. However, from
the general propositions of statistical physics it follows that the motion of a
system consisting of a very large number of particles cannot depend on the
character of the initial and boundary conditions. Hence the condition which
we have imposed upon two atoms of a chain containing $N$ atoms (for $N \gg 1$)
cannot have an essential effect on the motion of the entire chain. If, instead
of the condition (50.5), we introduced the conditions

$$
\frac{\partial u(\xi_1)}{\partial \xi_1} = \frac{\partial u(\xi_N)}{\partial \xi_N} = 0 ,
$$

which would express the fact that the first and last atoms of the chain remain
free and that the force acting on them reduces to zero, the final result for
$N \gg 1$ would not be changed at all.

As we have already said, the departure of any atom from an equilibrium 
position gives rise to a perturbation which propagates along the chain. This 
perturbation moves from atom to atom until it reaches the last atom which 
is fixed at the end of the chain. Here the perturbation will not vanish but will 
be reflected and will propagate in the opposite direction to the other end of 
the chain. At the other end it will again be reflected, and so on. In the chain 
of atoms there will arise waves travelling in opposite directions. The super- 
position of these waves will lead to the formation of standing waves similar 
to those arising in an elastic string with fixed ends. We shall seek the particular 
solution of (50.4) in the form 


$$
\xi_n = A\,e^{i\omega t}\,e^{ifan} , \tag{50.6}
$$


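where $A$ is the complex amplitude of the wave, and $a$ is the distance between
neighbouring equilibrium positions. The substitution of $\xi_n$ into (50.4) gives

$$
m\omega^2 = \kappa\,(2 - e^{ifa} - e^{-ifa}) ,
$$

whence for $\omega$ we find

$$
\omega = \Bigl(\frac{\kappa}{m}\Bigr)^{1/2}\,[2(1-\cos fa)]^{1/2} = 2\Bigl(\frac{\kappa}{m}\Bigr)^{1/2}\sin\tfrac12 fa . \tag{50.7}
$$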


This formula, which connects the frequency $\omega$ and the wave number $f$, is the
law of dispersion of waves in the chain. The wave numbers must be
determined from the boundary conditions (50.5), which correspond to standing
waves. The latter can be written in the form of a linear combination of
expressions of the form (50.6):

$$
\xi_n = A \sin fx\,\sin(\omega t + \alpha) = A \sin(fan)\,\sin(\omega t + \alpha) , \tag{50.8}
$$

where $a$ is the distance between equilibrium positions, $A$ is the amplitude of
the wave, and $\alpha$ is the phase. The conditions (50.5) at the ends of the chain
will be satisfied if one sets $\sin(afN) = 0$, or

$$
f_k = \frac{\pi k}{aN} , \tag{50.9}
$$

where the number $k$ runs over a series of integers: $k = 1, 2, 3, \ldots, N$. Thus, the
displacement $\xi_n$ can be written in the form of a superposition of waves of
the form

$$
\xi_{nk} = A_k \sin\frac{\pi k n}{N}\,\sin(\omega_k t + \alpha_k) . \tag{50.10}
$$

It follows from formula (50.10) that the appearance of $N$ standing waves
with different frequencies $\omega_k$ ($k = 1, 2, \ldots, N$) is possible in a chain of $N$
atoms. To these $N$ frequencies there correspond $N$ wave numbers (50.9) or
$N$ different wavelengths:

$$
\lambda_k = \frac{2\pi}{f_k} = \frac{2Na}{k} . \tag{50.11}
$$
The longest standing wave (for $k = 1$) has a wavelength $2Na$, i.e. one
half-wave can be fitted into the entire chain. To the values $k = 2, 3, \ldots$ there
correspond ever shorter waves such that an integer number of half-waves is
fitted into the length of the chain. It is remarkable that in each standing wave
all atoms of the chain vibrate with the same frequency (i.e. $\omega_k$ depends only
on $k$ but not on the ordinal number of the atom).
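
That a standing wave with the wave number (50.9) and the frequency (50.7) really satisfies the equation of motion (50.4) can be verified directly. The sketch below is an illustration under one labeled assumption: for convenience the fixed points are taken at $n = 0$ and $n = N$, where $\sin(f_k a n)$ vanishes identically, and units with $a = \kappa = m = 1$ are used by default. Since $\ddot\xi_n = -\omega^2\xi_n$ for the time dependence $\sin(\omega t + \alpha)$, the quantity $\kappa(\xi_{n+1} + \xi_{n-1} - 2\xi_n) + m\omega^2\xi_n$ must vanish for every interior atom.

```python
import math


def omega(f: float, a: float = 1.0, kappa: float = 1.0, m: float = 1.0) -> float:
    """Dispersion law (50.7): omega = 2 sqrt(kappa/m) sin(f a / 2)."""
    return 2.0 * math.sqrt(kappa / m) * math.sin(0.5 * f * a)


def max_residual(N: int, k: int, a: float = 1.0, kappa: float = 1.0, m: float = 1.0) -> float:
    """Largest violation of (50.4) by the k-th standing wave over the interior atoms.

    ASSUMPTION: the fixed ends sit at n = 0 and n = N, so xi[0] = xi[N] = 0.
    """
    f = math.pi * k / (a * N)                           # allowed wave number (50.9)
    w2 = omega(f, a, kappa, m) ** 2
    xi = [math.sin(f * a * n) for n in range(N + 1)]    # spatial part of (50.8)
    return max(
        abs(kappa * (xi[n + 1] + xi[n - 1] - 2.0 * xi[n]) + m * w2 * xi[n])
        for n in range(1, N)
    )


worst = max(max_residual(50, k) for k in range(1, 50))
print(f"largest residual over all modes: {worst:.1e}")
```

The residual is at the level of floating-point rounding for every mode, which confirms that (50.7) holds for all allowed wave numbers and not only for long waves.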

The velocity of propagation of the waves is equal to

$$
v_k = \frac{\omega_k}{f_k} = 2\Bigl(\frac{\kappa}{m}\Bigr)^{1/2}\frac{\sin\tfrac12 f_k a}{f_k} , \tag{50.12}
$$

i.e. turns out to be different for different waves. For small wave numbers $f_k$,
i.e. for long waves, $\sin\tfrac12 f_k a$ can be expanded in a series and we can write
$\sin\tfrac12 f_k a \approx \tfrac12 f_k a$. In this case

$$
\omega_k \approx \Bigl(\frac{\kappa}{m}\Bigr)^{1/2} a f_k , \tag{50.13}
$$

$$
v = v_0 = a\Bigl(\frac{\kappa}{m}\Bigr)^{1/2} = \text{const} . \tag{50.14}
$$


In the case of very long waves, for which the inequality $f_k a \ll 1$ or $\lambda \gg a$ is
fulfilled, a very large number of atoms are oscillating almost in phase; a
change of the phase takes place in a half wavelength which covers a large
number of atoms. Hence the atomic structure of the lattice has no effect on
its properties. The lattice behaves with respect to long waves as a continuous
elastic medium. Standing waves in the lattice are equivalent to those in an
elastic medium. The velocity of propagation of waves in the lattice, given by
(50.14), is the same as that of elastic waves (the velocity of sound) in a
continuous medium.

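To the shortest possible waves, $\lambda_N = 2Na/N = 2a$, there corresponds the
highest frequency:

$$
\omega_N = 2\Bigl(\frac{\kappa}{m}\Bigr)^{1/2} \tag{50.15}
$$

and the velocity of propagation

$$
v_N = \frac{\omega_N}{f_N} = \frac{2a}{\pi}\Bigl(\frac{\kappa}{m}\Bigr)^{1/2} . \tag{50.16}
$$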


Comparison of formulae (50.14) and (50.16) shows that long waves propagate
with a velocity which is somewhat larger than that of short waves, i.e. the
phenomenon of dispersion takes place. In the intermediate frequency range
the velocity of propagation of waves, determined by formula (50.12), depends
on the wave number $f_k$ or the frequency $\omega_k$.
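
The size of the dispersion can be made explicit by evaluating the phase velocity (50.12) at the two ends of the spectrum. The short sketch below uses units with $a = \kappa = m = 1$ (an arbitrary choice made here) and compares the long-wave limit (50.14) with the value (50.16) at the shortest wavelength, $f = \pi/a$.

```python
import math


def phase_velocity(f: float, a: float = 1.0, kappa: float = 1.0, m: float = 1.0) -> float:
    """Phase velocity omega/f from (50.12)."""
    return 2.0 * math.sqrt(kappa / m) * math.sin(0.5 * f * a) / f


v_long = phase_velocity(1e-6)       # f*a << 1: tends to a*sqrt(kappa/m), eq. (50.14)
v_short = phase_velocity(math.pi)   # f = pi/a: equals (2a/pi)*sqrt(kappa/m), eq. (50.16)
ratio = v_long / v_short
print(f"v_0 = {v_long:.6f}, v_N = {v_short:.6f}, v_0/v_N = {ratio:.6f}")
```

The ratio $v_0/v_N$ equals $\pi/2 \approx 1.57$ independently of $a$, $\kappa$ and $m$, so the velocity changes by a factor of about one and a half across the whole range of wave numbers.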

In addition to the boundary condition (50.5) use is often made of the
so-called conditions of periodicity. If an entire infinite chain of atoms is divided
into segments of length $L$ each containing $N$ atoms ($L = aN$), then the motion
of all the segments must be the same. This means that the following
conditions must be fulfilled:

$$
\xi_n = \xi_{n+N}
$$

or, upon substitution of $\xi_n$ from (50.6),

$$
e^{ifaN} = 1 .
$$





Hence

$$
f = \pm\frac{\pi k}{aN} , \tag{50.9'}
$$

where $k$ takes on even integer values $k = 2, 4, \ldots, N$. Here $f$ has positive as well
as negative values in the interval $-\pi/a < f < \pi/a$. The interval between
neighbouring values of $f$ doubles. The total number of waves remains
the same.

The function $\omega(f)$ is shown in fig. III.32.

The displacement of an arbitrary nth atom in the chain is given in the form
of a superposition of displacements of the form (50.10), i.e.

$$
\xi_n = \sum_k \xi_{nk} = \sum_k A_k \sin(\omega_k t + \alpha_k)\,\sin f_k a n , \tag{50.17}
$$

where the summation is carried out over all possible values of the wave number
$f_k$. Instead of an arbitrary amplitude $A_k$ we introduce the amplitude

$$
C_k = \Bigl(\frac{N-1}{2}\Bigr)^{1/2} A_k
$$

and let

$$
C_k \sin(\omega_k t + \alpha_k) = q_k . \tag{50.18}
$$

[Fig. III.32. The function $\omega(f)$.]

Then we have

$$
\xi_n = \Bigl(\frac{2}{N-1}\Bigr)^{1/2}\sum_k q_k \sin f_k a n . \tag{50.19}
$$
k 


If the values of the amplitudes $q_k$ are given, then it follows from formula
(50.19) that the displacement of an arbitrary atom in the chain will be
completely determined. Hence the amplitudes $q_k$ can be considered as the
generalized coordinates of the system.

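We now find the energy of the entire chain expressed in terms of the
generalized coordinates $q_k$. The kinetic energy of the chain is obviously
equal to

$$
T = \tfrac12 m\sum_n \dot\xi_n^2 ,
$$

where the summation is carried out over all atoms of the chain. We have

$$
\dot\xi_n^2 = \frac{2}{N-1}\Bigl(\sum_k \dot q_k \sin f_k a n\Bigr)^2 = \frac{2}{N-1}\sum_k\sum_{k'} \dot q_k \dot q_{k'} \sin f_k a n\,\sin f_{k'} a n ,
$$

hence

$$
T = \frac{m}{2}\sum_n \frac{2}{N-1}\sum_k\sum_{k'} \dot q_k \dot q_{k'} \sin f_k a n\,\sin f_{k'} a n .
$$

Changing the order of summation, we obtain

$$
T = \frac{m}{2}\,\frac{2}{N-1}\sum_k\sum_{k'} \dot q_k \dot q_{k'} \sum_n \sin f_k a n\,\sin f_{k'} a n .
$$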

By virtue of the orthogonality property of sine functions, the sum
$\sum_n \sin f_k a n\,\sin f_{k'} a n$ is equal to zero, provided $f_k \ne f_{k'}$, i.e. $k \ne k'$. If $f_k = f_{k'}$,
then $k = k'$ and

$$
\sum_{n=1}^{N-1} \sin f_k a n\,\sin f_{k'} a n = \frac{N-1}{2} .
$$

Hence

$$
T = \frac{m}{2}\sum_k \dot q_k^2 . \tag{50.20}
$$

The kinetic energy of the chain is expressed by the quadratic form of the
derivatives of the coordinates $q_k$.
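
The orthogonality of the sines, on which the diagonal form of the kinetic energy rests, can be checked by direct summation. In the sketch below the wave numbers are taken as $f_k = \pi k/(aN)$ and the sum runs over $n = 1, \ldots, N-1$; the function name is introduced here for illustration. With these exact wave numbers the diagonal sum equals $N/2$, which for $N \gg 1$ coincides with the value $(N-1)/2$ used in the text to within a relative correction of order $1/N$.

```python
import math


def sine_overlap(N: int, k: int, kp: int) -> float:
    """Sum over n = 1..N-1 of sin(f_k a n) * sin(f_k' a n), with f_k a = pi k / N."""
    return sum(
        math.sin(math.pi * k * n / N) * math.sin(math.pi * kp * n / N)
        for n in range(1, N)
    )


N = 20
print("k != k':", sine_overlap(N, 3, 7))   # vanishes up to rounding
print("k == k':", sine_overlap(N, 5, 5))   # equals N/2 up to rounding
```

Because all cross terms vanish, the double sum over $k$ and $k'$ collapses to a single sum, and no products $\dot q_k \dot q_{k'}$ with $k \ne k'$ survive in the kinetic energy.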

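We now find the potential energy of the entire chain. From formula (50.3)
it follows that the potential energy of the entire crystal is equal to

$$U = \sum_n u_n = \tfrac12 \kappa \sum_n (\xi_{n+1} - \xi_n)^2, \tag{50.21}$$

where $u_n$ is the potential energy of the $n$th atom.
We can convince ourselves of this by differentiating eq. (50.21). Indeed,
the force $F_n$ acting on the $n$th atom is

$$F_n = -\frac{\partial U}{\partial \xi_n}
= -\frac{\kappa}{2}\frac{\partial}{\partial \xi_n}\left[(\xi_2-\xi_1)^2 + \ldots + (\xi_{n+1}-\xi_n)^2
+ (\xi_n-\xi_{n-1})^2 + \ldots + (\xi_N-\xi_{N-1})^2\right]
= \kappa(\xi_{n+1} + \xi_{n-1} - 2\xi_n).$$

From formula (50.19) we find

$$\xi_{n+1} - \xi_n = \left(\frac{2}{N-1}\right)^{1/2} \sum_k q_k \left\{\sin[f_k a(n+1)] - \sin f_k a n\right\}
= \left(\frac{2}{N-1}\right)^{1/2} \sum_k q_k\, 2\cos[f_k a(n+\tfrac12)]\,\sin\tfrac12 f_k a.$$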


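Proceeding in the same way as in calculating the kinetic energy, we have

$$U = \frac{\kappa}{2}\,\frac{2}{N-1}\,4 \sum_n \sum_k \sum_{k'} q_k q_{k'}
\cos[f_k a(n+\tfrac12)]\cos[f_{k'} a(n+\tfrac12)]\,\sin\tfrac12 f_k a\,\sin\tfrac12 f_{k'} a$$

$$\phantom{U} = \frac{4\kappa}{N-1} \sum_k \sum_{k'} q_k q_{k'} \sin\tfrac12 f_k a\,\sin\tfrac12 f_{k'} a
\sum_n \cos[f_k a(n+\tfrac12)]\cos[f_{k'} a(n+\tfrac12)].$$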

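But for cosines the orthogonality condition holds:

$$\sum_n \cos[f_k a(n+\tfrac12)]\cos[f_{k'} a(n+\tfrac12)] =
\begin{cases} \dfrac{N-1}{2} & \text{for } k = k', \\[4pt] 0 & \text{for } k \neq k', \end{cases}$$

hence

$$U = 2\kappa \sum_k q_k^2 \sin^2\tfrac12 f_k a.$$

Taking into account formula (50.7), we find finally

$$U = \tfrac12 m \sum_k \omega_k^2 q_k^2. \tag{50.22}$$

Thus, the total energy of the crystal is equal to

$$E = \tfrac12 m \sum_k \left(\dot q_k^2 + \omega_k^2 q_k^2\right). \tag{50.23}$$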
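The reduction of the energy to a sum of squares can be verified numerically. The sketch below is only an illustration under stated assumptions: atoms are numbered $n = 0, \ldots, N-1$ with fixed ends, the wave numbers are $f_k a = \pi k/(N-1)$ with $k = 1, \ldots, N-2$, and the values of $\kappa$, $m$ and the random coordinates are arbitrary. It compares the potential and kinetic energies computed directly from the displacements with the expressions in normal coordinates.

```python
import math
import random

def displacements(q, n_atoms):
    """xi_n = sqrt(2/(N-1)) * sum_k q_k sin(pi k n/(N-1)), n = 0..N-1 (assumed fixed ends)."""
    cells = n_atoms - 1
    amplitude = math.sqrt(2.0 / cells)
    return [
        amplitude * sum(qk * math.sin(math.pi * k * n / cells)
                        for k, qk in enumerate(q, start=1))
        for n in range(n_atoms)
    ]

def potential_from_displacements(xi, kappa):
    """U = (kappa/2) * sum_n (xi_{n+1} - xi_n)^2."""
    return 0.5 * kappa * sum((xi[n + 1] - xi[n]) ** 2 for n in range(len(xi) - 1))

def potential_from_normal_coordinates(q, n_atoms, kappa):
    """U = 2 kappa * sum_k q_k^2 sin^2(f_k a / 2)."""
    cells = n_atoms - 1
    return 2.0 * kappa * sum(qk ** 2 * math.sin(0.5 * math.pi * k / cells) ** 2
                             for k, qk in enumerate(q, start=1))

def kinetic_from_velocities(xi_dot, mass):
    """T = (m/2) * sum_n xi_dot_n^2."""
    return 0.5 * mass * sum(v * v for v in xi_dot)

random.seed(0)
n_atoms, kappa, mass = 10, 1.7, 2.3          # illustrative values
q = [random.uniform(-1.0, 1.0) for _ in range(n_atoms - 2)]
q_dot = [random.uniform(-1.0, 1.0) for _ in range(n_atoms - 2)]

u_direct = potential_from_displacements(displacements(q, n_atoms), kappa)
u_normal = potential_from_normal_coordinates(q, n_atoms, kappa)
t_direct = kinetic_from_velocities(displacements(q_dot, n_atoms), mass)
t_normal = 0.5 * mass * sum(v * v for v in q_dot)
print(u_direct, u_normal, t_direct, t_normal)
```

Both pairs agree to rounding error, which is the numerical counterpart of the statement that no products $q_k q_{k'}$ survive in the energy.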


Formula (50.23) has an important physical meaning: the total energy of the
crystal is expressed in a quadratic form containing only the squares of the
quantities $q_k$ and $\dot q_k$ (but not their products of the form $q_k q_{k'}$). Hence the
quantities $q_k$ are the normal coordinates of the vibrating crystal. Each term
in (50.23) has the form

$$\varepsilon_k = \tfrac12 m\left(\dot q_k^2 + \omega_k^2 q_k^2\right) = \tfrac12 m\left(\dot q_k^2 + 4\pi^2\nu_k^2 q_k^2\right), \tag{50.24}$$


i.e. represents the energy of a linear harmonic oscillator with a mass equal to
that of an atom, oscillating with a frequency $\nu_k$. The energy $E$ is equal to the
sum of the energies of such oscillators, which have different frequencies $\nu_k$.
The energy of a crystal of $N$ atoms which perform coupled oscillations turns
out to be equal to the energy of $N$ independent harmonic oscillators with a
set of frequencies $\nu_k$ determined by formula (50.7). In this sense a system of
$N$ atoms which perform coupled oscillations is equivalent to a set of $N$
independent oscillators with frequencies $\nu_k$. Instead of finding the mean energy
of a complex system of $N$ coupled atoms, we can seek the mean energy of a
much simpler equivalent system of $N$ independent oscillators. It should be
stressed that linear oscillators with an energy given by formula (50.24) have
nothing in common with real atoms (with the exception of the same mass).
Each oscillator represents one of the normal oscillations of the crystal as a
whole. All the atoms take part in a given normal oscillation of the crystal,
oscillating with one and the same frequency $\nu_k$.

The transition which we have made from displacements $\xi_n$ to normal
coordinates $q_k$ represents a transformation which is typical for wave
processes, and is not connected with the properties of the linear chain. The
possibility of the transition to normal coordinates, in which the energy has
the quadratic form (50.23), represents a general algebraic theorem.

Real crystals made of atoms (or molecules) of the same mass $m$ are very
seldom encountered. Usually the crystal lattice contains particles with
different masses and different chemical natures. The latter leads to a change in
the law of interaction between neighbouring atoms, so that the quantity $\kappa$
has different values at different points of the crystal.

Without taking into account the above effect we shall consider only the 
effect of the difference in the masses of the atoms on the character of the 
motion in the case of the one-dimensional-chain model. 

Let atoms with masses $m_1$ and $m_2$ having one and the same physical nature
(i.e. with the same values of the quasi-elastic constant) be placed alternately
at equal distances in a chain. The equations of motion of the chain will now
be written in the form

$$m_1 \ddot\xi_n = -\kappa\,(2\xi_n - \eta_{n+1} - \eta_{n-1}), \tag{50.25}$$
$$m_2 \ddot\eta_n = -\kappa\,(2\eta_n - \xi_{n+1} - \xi_{n-1}), \tag{50.26}$$

where $\xi_n$ and $\eta_n$ are the displacements of the atoms with masses $m_1$ and $m_2$
respectively.
Analogously to (50.6), we can seek the solution of eqs. (50.25) and (50.26)
which satisfies the boundary conditions (50.5) in the form

$$\xi_{nf} = A\,e^{ifan}\,e^{i\omega t}, \qquad \eta_{nf} = B\,e^{ifan}\,e^{i\omega t}.$$


The substitution of these expressions into (50.25) and (50.26) leads to the
system of algebraic equations

$$(-\omega^2 m_1 + 2\kappa)A = 2\kappa\cos(fa)\,B,$$
$$(-\omega^2 m_2 + 2\kappa)B = 2\kappa\cos(fa)\,A.$$


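Eliminating the amplitudes, one can obtain the following expression for $\omega^2$:

$$\omega^2 = \kappa\left(\frac{1}{m_1} + \frac{1}{m_2}\right)
\pm \kappa\left[\left(\frac{1}{m_1} + \frac{1}{m_2}\right)^2 - \frac{4}{m_1 m_2}\sin^2 fa\right]^{1/2}. \tag{50.27}$$

If $\omega$ is written as a function of $f$, then depending on the sign in front of the
root one obtains the two branches shown in fig. III.33. For small $f$, expanding
the square root in (50.27) in a series in powers of $f$, we obtain

$$\omega_-^2 \approx \frac{2\kappa}{m_1 + m_2}\,(fa)^2, \tag{50.28}$$

$$\omega_+^2 \approx 2\kappa\left(\frac{1}{m_1} + \frac{1}{m_2}\right). \tag{50.29}$$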


The branch $\omega_-$, called the acoustic branch, corresponds to $\omega \to 0$ as $f \to 0$.
Its behaviour does not differ significantly from that of the dispersion curve
(50.13) for a chain of identical atoms.

The second branch of frequencies, which is absent in a chain of identical
atoms, shows a completely different behaviour (the upper curve in fig. III.33).
As $f \to 0$, $\omega_+$ tends to the constant limit (50.29). This branch of frequencies is
called the optical branch. For given $f$ the frequency of the waves of the
optical branch is much higher than that of the acoustic branch.

Fig. III.33. The optical (upper) and acoustic (lower) branches of $\omega(f)$ for $-\pi/2a \le f \le \pi/2a$.

For small $f$ it is easy to determine the ratio of the wave amplitudes of the
optical branch:

$$\frac{A}{B} = -\frac{m_2}{m_1}. \tag{50.30}$$

The minus sign in the above formula shows that particles with masses $m_1$
and $m_2$ which are involved in optical waves are moving towards each other.

In acoustic waves neighbouring particles with different masses are moving 
in the same direction. 
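The behaviour of the two branches and of the amplitude ratio at small $f$ can be checked with a few lines of code. This is a sketch only: it evaluates the two roots of the quadratic for $\omega^2$ that follows from the algebraic system above and takes $A/B$ from the first equation of that system; the function names and the values $\kappa = 1$, $m_1 = 1$, $m_2 = 3$ (arbitrary units) are assumptions made for illustration.

```python
import math

def branch_frequencies_squared(f_a, kappa, m1, m2):
    """Return (omega_minus^2, omega_plus^2) for the alternating chain at wave number f*a."""
    s = 1.0 / m1 + 1.0 / m2
    root = math.sqrt(s * s - 4.0 * math.sin(f_a) ** 2 / (m1 * m2))
    return kappa * (s - root), kappa * (s + root)

def amplitude_ratio(omega_sq, f_a, kappa, m1):
    """A/B from (-omega^2 m1 + 2 kappa) A = 2 kappa cos(f a) B."""
    return 2.0 * kappa * math.cos(f_a) / (2.0 * kappa - omega_sq * m1)

kappa, m1, m2 = 1.0, 1.0, 3.0   # illustrative values, arbitrary units
f_a = 1.0e-3                    # small wave number (long waves)

acoustic_sq, optical_sq = branch_frequencies_squared(f_a, kappa, m1, m2)
ratio_acoustic = amplitude_ratio(acoustic_sq, f_a, kappa, m1)
ratio_optical = amplitude_ratio(optical_sq, f_a, kappa, m1)

print("acoustic omega^2:", acoustic_sq, "small-f estimate:", 2.0 * kappa * f_a ** 2 / (m1 + m2))
print("optical  omega^2:", optical_sq, "small-f estimate:", 2.0 * kappa * (1.0 / m1 + 1.0 / m2))
print("A/B acoustic:", ratio_acoustic, " A/B optical:", ratio_optical)
```

For long waves the acoustic ratio comes out close to $+1$ (neighbours move together), while the optical ratio is close to $-m_2/m_1$ (neighbours move against each other).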

We see that the difference in the masses of the atoms changes fundamentally
the character of thermal motion in the crystal.

Consider the case when at the points of the crystal lattice there are 
charged ions instead of neutral atoms. The optical branch corresponds to 
motion of the ions towards each other. Hence optical waves give rise to 
variations in the polarization of the crystal, and it is optical waves that have 
an effect on electromagnetic processes in crystals. Their name is associated 
with this fact. 


§51. Long waves in a three-dimensional crystal 


Thermal motion in a three-dimensional crystal has, in general, the same 
character as in the one-dimensional model. A displacement of an arbitrary 
atom from the equilibrium position in the lattice is transmitted to its closest 
neighbours in three dimensions. Their displacements will, in their turn, give 
rise to displacements of other atoms, and an elastic wave propagating in three 
dimensions will arise in the crystal. As a result of the reflection of the elastic 
waves from the faces of the crystal a system of standing waves will be estab- 
lished in the crystal. In the same approximation as for the one-dimensional 
model (disregarding the third powers of displacements) the energy of a 
crystal can be written in normal coordinates in the form 


$$E = \sum_k \left(\frac{m\dot q_k^2}{2} + \frac{4\pi^2 m \nu_k^2 q_k^2}{2}\right), \tag{51.1}$$

where the summation is carried out over all possible wave numbers. The
coupled oscillations of atoms in the three-dimensional crystal are equivalent to
a set of $3N$ independent linear oscillators with the natural frequencies $\nu_k$.




The determination of the natural frequencies for the three-dimensional
crystal presents very great mathematical difficulties. Hence in order to find
the thermodynamic functions of a crystal it is necessary to make further
simplifying assumptions in addition to the proportionality of the forces to
the first power of the displacements. First, in considering thermal waves in
a crystal we shall confine ourselves to the case of long waves ($\lambda \gg a$). As we
have seen in the preceding section, in the case of long waves very large
groups of atoms are oscillating in phase and one can disregard the discrete
atomic structure of the crystal. Similarly in the case of long waves in a
three-dimensional crystal one can abstract from the discrete structure and consider
the crystal as a continuous elastic medium. Second, we shall disregard the
anisotropy of the crystal and shall consider it as an isotropic elastic medium.
In the isotropic elastic medium the thermal perturbations of the crystal form
a system of standing waves. In a three-dimensional elastic medium the wave
number $f$ is equal to

$$f = \left(f_1^2 + f_2^2 + f_3^2\right)^{1/2}, \tag{51.2}$$

where $f_1$, $f_2$ and $f_3$ are quantities which characterize respectively the
propagation of the wave in three mutually perpendicular directions.

In order that a condition of the type (50.5), i.e. the condition of reflec- 
tion of elastic waves from the faces of the crystal may be fulfilled, the wave 
numbers must satisfy the following conditions: 


Tki nka 1tk3 
=a ae aay Bee san (51.3) 


where $k_1$, $k_2$ and $k_3$ are integers (1, 2, ..., $N$). The frequency of long waves is
connected with the wave numbers by a relation which represents a direct
generalization of formula (50.13).

In contrast to the one-dimensional model, in a three-dimensional isotropic
elastic medium the propagation of three elastic waves is possible: one
longitudinal wave (in which displacements take place in the direction of
propagation of the wave) and two transverse waves (in which displacements are
perpendicular to the direction of propagation). The velocities of propagation
$c_l$ and $c_t$ of the longitudinal and transverse waves are different. Hence,
instead of (50.12), in the three-dimensional case one must write

$$2\pi\nu_l = c_l f, \qquad 2\pi\nu_t = c_t f,$$

where $\nu_l$ and $\nu_t$ are the frequencies of the longitudinal and transverse elastic
waves, and $c_l$ and $c_t$ are their velocities.

In what follows we shall need to know the number of elastic waves whose
frequency lies in the interval between $\nu_l$ and $\nu_l + d\nu_l$, and $\nu_t$ and $\nu_t + d\nu_t$,
respectively. The calculation of this quantity in no way differs from the
calculations in §38 of Part I.

Indeed, in §38 of Part I the number of travelling waves in a cavity of
volume $V$ has been calculated. The number of standing waves in a crystal can
be found in a completely analogous way. The only difference from the
calculation carried out in §38 of Part I lies in the fact that in a crystal wave
vectors are determined by formula (51.3) which differs from (38.11) of Part I
by the absence of the factor 2. On the other hand, here the integers $k_1$, $k_2$, $k_3$
take on only positive values, whereas in §38 of Part I they take on negative as
well as positive values.
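The counting argument can be illustrated by brute force. In the sketch below it is assumed, for illustration only, that the crystal is a cube and that the allowed wave numbers form a simple cubic grid labelled by positive integers $(k_1, k_2, k_3)$; waves with frequency below a given value then correspond to triples inside one octant of a sphere of some radius $R$ in $k$-space. The function names and the radii are hypothetical.

```python
import math

def count_standing_waves(radius):
    """Count triples of positive integers with k1^2 + k2^2 + k3^2 <= radius^2."""
    limit = int(radius)
    radius_sq = radius * radius
    count = 0
    for k1 in range(1, limit + 1):
        for k2 in range(1, limit + 1):
            rest = radius_sq - k1 * k1 - k2 * k2
            if rest < 1:
                continue
            # number of integers k3 >= 1 with k3^2 <= rest
            count += math.isqrt(int(rest))
    return count

def octant_volume(radius):
    """Volume of one eighth of a sphere: (1/8) * (4/3) * pi * R^3."""
    return math.pi * radius ** 3 / 6.0

for radius in (40.0, 100.0):
    exact = count_standing_waves(radius)
    estimate = octant_volume(radius)
    print(radius, exact, round(estimate), exact / estimate)
```

The ratio of the exact count to the octant volume approaches unity as the radius grows, which is why the number of waves below a frequency $\nu$ grows as $\nu^3$ and the number per unit frequency interval as $\nu^2$.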

As a result, for the number of longitudinal waves with a frequency
between $\nu_l$ and $\nu_l + d\nu_l$ one obtains the formula

$$g(\nu_l)\,d\nu_l = \frac{4\pi V}{c_l^3}\,\nu_l^2\,d\nu_l, \tag{51.4}$$

which is identical with formula (38.22) of Part I *.
For the number of transverse waves with a frequency between $\nu_t$ and
$\nu_t + d\nu_t$ we find analogously

$$g(\nu_t)\,d\nu_t = 2\,\frac{4\pi V}{c_t^3}\,\nu_t^2\,d\nu_t,$$

where the factor 2 appears because in an elastic medium there are two
transverse waves with one and the same frequency $\nu_t$. The total number of elastic
waves whose frequency lies between $\nu$ and $\nu + d\nu$ is obviously equal to


* It should be noted for what follows that the displacement represented in formula
(50.18) in the form of a standing wave can be written in the form of a superposition of
two travelling waves. Reproducing the calculations of §38 of Part I one can write $\xi$ in
the form


3 
r=N72 D hi elrsqr eif). (51.5) 
i jp 




$$g(\nu)\,d\nu = 4\pi V\left(\frac{1}{c_l^3} + \frac{2}{c_t^3}\right)\nu^2\,d\nu. \tag{51.6}$$


Since we have compared each wave with a frequency $\nu_k$ to an oscillator
oscillating with the same frequency, the quantity $g(\nu)\,d\nu$ which we have
calculated represents the number of oscillators whose frequency lies between $\nu$
and $\nu + d\nu$. If the crystal were of infinite size and contained an infinitely
large number of atoms, the number of possible frequencies or oscillators
would also be infinite. In reality, however, it is equal to $3N$. Therefore we
can write

$$3N = \sum g(\nu), \tag{51.7}$$

where the summation is carried out over all possible frequencies.

We have established the form of the function $g(\nu)$ in the range of long
waves or small frequencies, in which it can be assumed to be a continuous
function of the argument $\nu$ given by formula (51.6). However, in the range
of high frequencies the form of the spectral function is unknown and depends
on the specific structure of a given crystal.

Debye proposed a method of calculating the thermodynamic functions
of crystals which is, in essence, based on a certain interpolation.

Namely, the spectral function $g(\nu)$ is assumed to have the form (51.6)
over the entire frequency range, and in the entire frequency range the
summation is replaced by integration. However, the integration is carried out up
to a certain limiting frequency $\nu_{\max}$ which is expressed in terms of the
number of particles in the crystal by means of the condition (51.7). Thus,
the spectral function $g(\nu)$ is assumed to have the form

$$g(\nu) = \begin{cases} 4\pi V\left(\dfrac{1}{c_l^3} + \dfrac{2}{c_t^3}\right)\nu^2, & \nu \le \nu_{\max}, \\[6pt] 0, & \nu > \nu_{\max}. \end{cases} \tag{51.8}$$

This gives

$$3N = \int_0^{\nu_{\max}} 4\pi V\left(\frac{1}{c_l^3} + \frac{2}{c_t^3}\right)\nu^2\,d\nu
= 4\pi V\left(\frac{1}{c_l^3} + \frac{2}{c_t^3}\right)\frac{\nu_{\max}^3}{3}, \tag{51.9}$$

whence

$$\nu_{\max} = \left(\frac{9N}{4\pi V}\,\frac{c_l^3 c_t^3}{2c_l^3 + c_t^3}\right)^{1/3}. \tag{51.10}$$


The largest, or limiting, frequency $\nu_{\max}$ turns out to depend only on
quantities which can be measured experimentally (the velocities of sound
$c_l$ and $c_t$) and to be proportional to the cube root of the number density of
the crystal, $(N/V)^{1/3}$.
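As an illustration of how (51.10) is used, the sketch below evaluates $\nu_{\max}$ and the corresponding temperature $h\nu_{\max}/k$ from a number density and two sound velocities. The input numbers are rough, copper-like values chosen here purely for illustration (they are not taken from the text), and the function names are hypothetical.

```python
import math

PLANCK_H = 6.62607015e-34      # J s
BOLTZMANN_K = 1.380649e-23     # J / K

def limiting_frequency(number_density, c_l, c_t):
    """nu_max from (51.10): [9N/(4 pi V) * c_l^3 c_t^3 / (2 c_l^3 + c_t^3)]^(1/3)."""
    velocity_factor = (c_l ** 3 * c_t ** 3) / (2.0 * c_l ** 3 + c_t ** 3)
    return (9.0 * number_density / (4.0 * math.pi) * velocity_factor) ** (1.0 / 3.0)

def limiting_temperature(number_density, c_l, c_t):
    """Temperature h * nu_max / k corresponding to the limiting frequency."""
    return PLANCK_H * limiting_frequency(number_density, c_l, c_t) / BOLTZMANN_K

# Illustrative, roughly copper-like inputs (assumed values):
n = 8.5e28     # atoms per m^3
c_l = 4760.0   # longitudinal sound velocity, m/s
c_t = 2320.0   # transverse sound velocity, m/s

nu_max = limiting_frequency(n, c_l, c_t)
print("nu_max [Hz]:", nu_max)
print("h nu_max / k [K]:", limiting_temperature(n, c_l, c_t))

# Consistency check with (51.9): the cut-off spectrum must contain 3N oscillators.
modes_per_volume = 4.0 * math.pi * (1.0 / c_l ** 3 + 2.0 / c_t ** 3) * nu_max ** 3 / 3.0
print("modes per atom:", modes_per_volume / n)
```

Because the transverse velocity enters through $2/c_t^3$ and is usually the smaller of the two, it dominates the result.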

The introduction of such a “cut-off” spectral function (51.8) leads to 
expressions for thermodynamic functions which in the limiting cases of low 
and high temperatures go over into accurate expressions, and in the range of 
intermediate temperatures have the character of interpolation formulae. 

By means of the expression for the limiting frequency $\nu_{\max}$ determined
by formula (51.10) the spectral function $g(\nu)$ can be rewritten in the more
compact form

$$g(\nu)\,d\nu = \frac{9N}{\nu_{\max}^3}\,\nu^2\,d\nu. \tag{51.11}$$


§52. The partition function of a crystal 


We have established in the preceding sections that the thermal motion
in a crystal containing $N$ atoms is described by a set of $3N$ independent
oscillators whose frequencies lie between zero and $\nu_{\max}$. In order to find the
partition function of the entire crystal it is necessary to find the partition
function of a system consisting of $3N$ independent oscillators. Since they are
independent, we can, obviously, write this function in the form of the
product of the partition functions of all the oscillators, i.e.

$$Z = \prod_{k=1}^{3N} z_k, \tag{52.1}$$


where $Z$ is the partition function of the crystal, and $z_k$ is the partition
function of the individual $k$th oscillator. It should be stressed that the oscillator is
not an individual atom but characterizes a definite oscillation of the crystal
as a whole. Hence in (52.1) one should not carry out the division by $(3N)!$,
as should be done in the case of $3N$ identical independent particles. We have
calculated the partition function $z_k$ of a quantum oscillator in §43. Taking
the logarithm of (52.1) and substituting (43.2) into it, we find

$$\ln Z = \sum_{k=1}^{3N} \ln z_k = \sum_{k=1}^{3N} \ln\frac{\exp(-h\nu_k/2kT)}{1 - \exp(-h\nu_k/kT)}. \tag{52.2}$$


To calculate the sum (52.2) it is necessary to know all the possible
frequencies $\nu_k$ of the crystal.

However, as we have already mentioned, this problem is still unsolved.
Therefore we shall confine ourselves to the Debye approximation and shall
replace the summation by the integration over the 'cut-off' spectrum (51.11).
This gives

$$\ln Z = \int_0^{\nu_{\max}} \ln\frac{\exp(-h\nu/2kT)}{1 - \exp(-h\nu/kT)}\,g(\nu)\,d\nu
= -\frac{9N}{\nu_{\max}^3}\int_0^{\nu_{\max}} \frac{h\nu}{2kT}\,\nu^2\,d\nu
- \frac{9N}{\nu_{\max}^3}\int_0^{\nu_{\max}} \ln\left(1 - e^{-h\nu/kT}\right)\nu^2\,d\nu. \tag{52.3}$$


To calculate the integrals in formula (52.3) we introduce a new variable

$$x = h\nu/kT, \tag{52.4}$$

and the characteristic temperature $\theta_c$ of the crystal:

$$\theta_c = h\nu_{\max}/k,$$

which is analogous to the characteristic temperature introduced in §43. We
obtain

$$\ln Z = -\frac{9N}{8}\,\frac{\theta_c}{T} - 9N\left(\frac{T}{\theta_c}\right)^3 \int_0^{\theta_c/T} x^2 \ln\left(1 - e^{-x}\right)dx. \tag{52.5}$$
0 


The calculation of the above integral can be carried out only for low and
high temperatures. We shall understand low temperatures to be temperatures
which are considerably lower than the characteristic temperature $\theta_c$ of the
crystal. For $T \ll \theta_c$ the limit of the integral can be replaced by infinity, since
the integrand is very small for any large values of the argument $x$. This gives

$$\int_0^{\theta_c/T} x^2 \ln\left(1 - e^{-x}\right)dx \approx \int_0^{\infty} x^2 \ln\left(1 - e^{-x}\right)dx. \tag{52.6}$$

This integral (52.6) is calculated in Appendix IV. Substituting its value into
(52.5), we have

$$\ln Z = -\frac{9N\theta_c}{8T} + \frac{\pi^4 N}{5}\left(\frac{T}{\theta_c}\right)^3. \tag{52.7}$$
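The numerical value of the integral in (52.6) is not quoted here, but the coefficient $\pi^4/5$ in (52.7) corresponds to $\int_0^\infty x^2\ln(1-e^{-x})\,dx = -\pi^4/45$. The sketch below checks this value in two independent ways: by a simple midpoint quadrature and by term-by-term integration of the series $\ln(1-e^{-x}) = -\sum_n e^{-nx}/n$, which gives $-2\sum_n n^{-4}$. The quadrature routine, its step count and the cut-off at $x = 60$ are choices made for this illustration.

```python
import math

def integrand(x):
    """x^2 * ln(1 - exp(-x)), written with log1p for accuracy at large x."""
    return x * x * math.log1p(-math.exp(-x))

def midpoint_integral(func, lower, upper, steps):
    """Composite midpoint rule; never evaluates func at the end points."""
    h = (upper - lower) / steps
    return h * sum(func(lower + (i + 0.5) * h) for i in range(steps))

numeric = midpoint_integral(integrand, 0.0, 60.0, 120000)
series = -2.0 * sum(1.0 / n ** 4 for n in range(1, 2001))
closed_form = -math.pi ** 4 / 45.0

print(numeric, series, closed_form)
print("coefficient of (T/theta_c)^3 in ln Z per atom:", -9.0 * closed_form, math.pi ** 4 / 5.0)
```

All three numbers agree to several decimal places, and multiplying by the prefactor $-9N$ of (52.5) reproduces the term $(\pi^4 N/5)(T/\theta_c)^3$.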
The substitution of infinity for the limit of the integral (52.6) has an
important physical meaning. It shows that at $T \ll \theta_c$ only vibrations with small
frequencies $\nu$ are excited in the crystal (i.e. the small values of $x$ are essential).
For large frequencies (large $x$) the integrand reduces to zero, and the
corresponding frequencies give no contribution to the value of $Z$. This justifies the
approximation made in the preceding section: the replacement of the
discrete crystal by a continuous elastic medium in which only oscillations with
small values of $\nu$ are excited.
For high temperatures $T \gg \theta_c$ the limit of the integral will be a small
quantity. Hence $x$ in the integrand is small, and the integrand can be expanded
in a series in powers of $x$. In this case we have

$$\ln\left(1 - e^{-x}\right) \approx \ln x,$$

so that

$$\int_0^{\theta_c/T} x^2 \ln\left(1 - e^{-x}\right)dx \approx \int_0^{\theta_c/T} x^2 \ln x\,dx
= \frac13\left(\frac{\theta_c}{T}\right)^3\left[\ln\frac{\theta_c}{T} - \frac13\right]. \tag{52.8}$$

Substituting formula (52.8) into (52.5), we find

$$\ln Z = -3N \ln\frac{\theta_c}{T} + N - \frac{9N}{8}\,\frac{\theta_c}{T}. \tag{52.9}$$


By means of the expressions (52.7) and (52.9) one can find the thermo- 
dynamic functions of a crystal at high and low temperatures. 


§53. The thermodynamic functions of a crystal 


We shall first of all calculate the energy and heat capacity of a crystal at
low temperatures. At low temperatures ($T \ll \theta_c$) the energy is equal to

$$E = kT^2\,\frac{\partial \ln Z}{\partial T} = \frac98 Nk\theta_c + \frac{3\pi^4}{5}\,\frac{NkT^4}{\theta_c^3}. \tag{53.1}$$

The first term in formula (53.1) represents the energy of the crystal as $T \to 0$,
i.e. the zero-point energy. The second term shows that the energy of the
crystal increases rapidly (as $T^4$) with increasing temperature.

The heat capacity of the crystal at low temperatures is given by the
formula

$$C_V = \left(\frac{\partial E}{\partial T}\right)_V = \frac{12\pi^4 Nk}{5}\left(\frac{T}{\theta_c}\right)^3. \tag{53.2}$$

It turns out to be proportional to the cube of the absolute temperature.
A characteristic feature of the expressions (53.1) and (53.2) is the fact that
they involve a material constant of the crystal: its characteristic temperature
$\theta_c$. Hence at low temperatures different crystals possess different heat
capacities (which are smaller the higher $\theta_c$).
At high temperatures ($T \gg \theta_c$) the energy and heat capacity are equal
respectively to

$$E = kT^2\,\frac{\partial \ln Z}{\partial T} = 3NkT + \frac98 Nk\theta_c \approx 3NkT, \tag{53.3}$$

$$C_V = \left(\frac{\partial E}{\partial T}\right)_V = 3Nk. \tag{53.4}$$

As was to be expected, the values of $E$ and $C_V$ agree with those obtained
from the law of equipartition of energy over degrees of freedom. They do not
depend on the material constants of the crystal and are universal quantities.
The independence of the energy and heat capacity of the material constants
is due to the fact that at sufficiently high temperatures the energy of the
crystal turns out to be independent of the vibrational frequencies of the
crystal. This fact allows one to understand why formulae (53.3) and (53.4)
turn out to be correct in spite of the fact that they are derived on the basis
of an undoubtedly incorrect assumption. Indeed, in deriving them it was
assumed that the basic role is played by small frequencies (long waves), for
which the discrete crystal can be considered as a continuous elastic medium.
For $T \gg \theta_c$, in addition to low frequencies there must also be excited in the
crystal high frequencies, which will give a considerable contribution to the
partition function. Hence the approximate law of frequency distribution
(51.8) will no longer be applicable. However, in the classical approximation,
which is valid for sufficiently high temperatures, the value of $Z$ and,
consequently, also that of the energy of the crystal depend neither on the
frequencies themselves nor on the character of their distribution.

In the intermediate range of temperatures $T \sim \theta_c$ the energy and heat
capacity are expressed by more complex formulae, which are obtained by
numerical integration of formula (52.5). Since the width of the transitional
range is not large, they are of no special interest.
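For readers who wish to see the intermediate range explicitly, a short numerical sketch follows. It does not integrate (52.5) itself; instead it uses the equivalent standard form obtained by differentiating that expression twice, $C_V/3Nk = 3\,(T/\theta_c)^3\int_0^{\theta_c/T} x^4 e^{x}(e^{x}-1)^{-2}\,dx$, which is stated here as an assumption rather than derived. The function name, the midpoint rule and the step count are illustrative choices.

```python
import math

def heat_capacity_ratio(t_over_theta, steps=4000):
    """C_V / (3 N k) in the cut-off (Debye) spectrum at reduced temperature T/theta_c.

    Uses C_V/(3Nk) = 3 t^3 * integral_0^{1/t} x^4 e^x / (e^x - 1)^2 dx, midpoint rule.
    """
    upper = 1.0 / t_over_theta
    h = upper / steps
    total = 0.0
    for i in range(steps):
        x = (i + 0.5) * h
        one_minus_exp = -math.expm1(-x)           # 1 - exp(-x), accurate for small x
        total += x ** 4 * math.exp(-x) / one_minus_exp ** 2
    return 3.0 * t_over_theta ** 3 * total * h

for t in (0.02, 0.1, 0.25, 0.5, 1.0, 2.0, 20.0):
    low_t_law = (4.0 * math.pi ** 4 / 5.0) * t ** 3   # (53.2) divided by 3Nk
    print(f"T/theta_c = {t:5.2f}   C_V/3Nk = {heat_capacity_ratio(t):.4f}   T^3 law = {low_t_law:.4f}")
```

At the smallest reduced temperatures the numerical values follow the $T^3$ law (53.2), and at the largest they approach the classical value (53.4); the transition between the two is confined to roughly one decade in $T/\theta_c$.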

It should be noted that, although we have calculated the heat capacity at 
constant volume, the expression obtained agrees to a high degree of accuracy 
with the heat capacity at constant pressure, because in a solid body the two 
heat capacities are practically the same. 

We now find the entropy $S$ of the crystal. For low temperatures ($T \ll \theta_c$)
from (53.1) and (52.7) we get

$$S = \frac{E}{T} + k\ln Z = \frac{4\pi^4}{5}\,Nk\left(\frac{T}{\theta_c}\right)^3. \tag{53.5}$$

For high temperatures ($T \gg \theta_c$) from (53.3) and (52.9) we obtain

$$S = \frac{E}{T} + k\ln Z = 3Nk\ln\frac{T}{\theta_c} + 4Nk. \tag{53.6}$$

Formula (53.6) agrees with the purely thermodynamic expression for the
entropy and differs from the latter only in the fact that it does not contain
an undetermined additive constant. Formula (53.5) shows that as $T \to 0$ the
entropy of an atomic crystal tends to zero as the cube of the temperature,
which is in complete agreement with the requirements of the third law of
thermodynamics, just as is the decrease of the heat capacity as $T \to 0$.

Finally, we find the free energy of the crystal. 

From the expressions (52.7) and (52.9) we obtain

$$F = -kT\ln Z \approx -\frac{\pi^4 NkT}{5}\left(\frac{T}{\theta_c}\right)^3 + \frac{9Nk\theta_c}{8} \quad \text{for } T \ll \theta_c, \tag{53.7}$$

$$F \approx -3NkT\ln\frac{T}{\theta_c} - NkT \quad \text{for } T \gg \theta_c. \tag{53.8}$$

Proceeding from the free energy one can obtain the equation of state of
the crystal, which has the following form for $T \gg \theta_c$:

$$p = -\left(\frac{\partial F}{\partial V}\right)_T = -\frac{3NkT}{\theta_c}\,\frac{\partial \theta_c}{\partial V}.$$

Since $p$ turns out to be expressed in terms of the quantity $\partial\theta_c/\partial V$ and the
latter cannot be calculated theoretically or found from sufficiently simple
experiments, this equation of state is of no great practical importance.

It is interesting to find the law of distribution of energy levels in a solid
body or, more precisely, the spacing between the energy levels. If $\Omega(\varepsilon)$ is the
number of levels per unit interval of energy, then the spacing between
neighbouring energy levels is obviously equal to

$$D(\varepsilon) = \Omega^{-1}(\varepsilon).$$

By means of formulae (24.10) and (53.5) one can write

$$D(\varepsilon) = e^{-\sigma} = e^{-S/k} = \exp\left[-\tfrac45\pi^4 N\,(T/\theta_c)^3\right].$$

The spacing between levels decreases with increasing temperature as well as
with increasing number of particles. For temperatures which are very close to
the absolute zero it turns out, in accordance with what was said in §35, to
be equal to $kT$ even for a macroscopic body.
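To get a feeling for the magnitudes involved, the exponent can be evaluated for a macroscopic sample. The sketch below assumes the exponential factor in the form $\exp[-\tfrac45\pi^4 N (T/\theta_c)^3]$, i.e. $e^{-S/k}$ with $S$ from (53.5), and returns its decimal logarithm, since the factor itself underflows any floating-point type. The function name and the sample values of $N$ and $T/\theta_c$ are illustrative assumptions.

```python
import math

def log10_spacing_factor(n_atoms, t_over_theta):
    """Decimal logarithm of exp(-S/k) with S/k = (4 pi^4 / 5) N (T/theta_c)^3."""
    entropy_over_k = (4.0 / 5.0) * math.pi ** 4 * n_atoms * t_over_theta ** 3
    return -entropy_over_k / math.log(10.0)

n_atoms = 1.0e20   # illustrative macroscopic sample
for t in (1.0e-6, 1.0e-3, 1.0e-2):
    print(f"T/theta_c = {t:.0e}   log10 of spacing factor = {log10_spacing_factor(n_atoms, t):.3e}")
```

Even at one hundredth of the characteristic temperature the factor is of order ten to the power of minus several times $10^{15}$; it becomes comparable to unity only when $N (T/\theta_c)^3$ is itself of order unity.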


§54. Comparison of theory with experiment 


For the practical use of the expressions obtained and the comparison of
calculated values with experimental data it is necessary to know the
characteristic temperature $\theta_c$ of the crystal.

From the definition and formula (51.10) we find

$$\theta_c = \frac{h\nu_{\max}}{k} = \frac{h}{k}\left(\frac{9N}{4\pi V}\right)^{1/3}\left(\frac{c_l^3 c_t^3}{2c_l^3 + c_t^3}\right)^{1/3}. \tag{54.1}$$


Formula (54.1) contains, in addition to numerical values and universal
constants, two quantities which are determined by the properties of the
crystal: the density $N/V$ and the velocity of sound in the crystal. These
quantities have been measured for many crystals. Since the velocity of sound,
according to (50.14), is inversely proportional to the square root of the mass
of the atoms, it is particularly large for very hard crystals made of light
atoms. For such crystals the characteristic temperature is particularly high.

Table 8 gives the characteristic temperatures of certain crystals. It follows
from this table that for crystals such as lead and common salt, room
temperature (300 K) and higher temperatures (up to 1000 K) are relatively
high compared with $\theta_c$, though not very high. Hence for such crystals the
departures from classical laws and, in particular, from the classical value of
the heat capacity are not very large in this region. Without much error, it can
be assumed that their heat capacity is equal to $3R \approx 25$ J/(mol K). However, at
temperatures which are lower than 100 K the departure from this value becomes
very appreciable for all these crystals.








Table 8

Crystal     $\theta_c$ (K)     Crystal     $\theta_c$ (K)
Pb             88          Cu            315
I             106          Be           1000
benzene       150          Al            398
Na            172          Fe            453
Ag            215          diamond      1860
NaCl          281








The situation is different for crystals with a high characteristic
temperature, particularly in the case of diamond. For the latter room temperature
appears to be low, and the applicability of classical laws is out of the
question. The heat capacity of diamond already follows the $T^3$ law at room
temperature.
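The contrast between the crystals of Table 8 can be made quantitative. The sketch below takes the characteristic temperatures from Table 8 and evaluates the ratio of the heat capacity to its classical value at 300 K, using the standard integral form of the heat capacity for the cut-off spectrum (an assumed form, not written out in the text). The function name and the numerical quadrature are illustrative.

```python
import math

# Characteristic temperatures in kelvin, as listed in Table 8.
TABLE_8 = {
    "Pb": 88.0, "I": 106.0, "benzene": 150.0, "Na": 172.0, "Ag": 215.0,
    "NaCl": 281.0, "Cu": 315.0, "Al": 398.0, "Fe": 453.0, "Be": 1000.0,
    "diamond": 1860.0,
}

def classical_fraction(temperature, theta_c, steps=4000):
    """C_V / (3 N k) at the given temperature for a crystal with characteristic temperature theta_c."""
    upper = theta_c / temperature
    h = upper / steps
    total = 0.0
    for i in range(steps):
        x = (i + 0.5) * h
        total += x ** 4 * math.exp(-x) / math.expm1(-x) ** 2
    return 3.0 * total * h / upper ** 3

ROOM_TEMPERATURE = 300.0
fractions = {name: classical_fraction(ROOM_TEMPERATURE, theta)
             for name, theta in TABLE_8.items()}
for name, value in sorted(fractions.items(), key=lambda item: item[1], reverse=True):
    print(f"{name:8s} theta_c = {TABLE_8[name]:6.0f} K   C_V/3Nk at 300 K = {value:.3f}")
```

Crystals at the top of the list are within a few per cent of the classical value at room temperature, whereas diamond reaches only a small fraction of it.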

Measurements of the variation of the heat capacity with temperature,
which have been carried out for a very large number of crystals, show that
the theory is in very good agreement with experiment. At temperatures
$T \ll \theta_c$ the heat capacities indeed follow the $T^3$ law and tend to zero as
$T \to 0$. At $T \sim \theta_c$ a gradual transition to the classical (constant) value of the
heat capacity takes place. The complete form of the curve of the heat
capacity of some crystals is shown in fig. III.34. The small circles on
the curve show the measured values of the heat capacity of different crystals.
They are all presented on the universal scale $T/\theta_c$. The agreement of theory
with experiment proves to be very good.

Fig. III.34. Heat capacity of Ag, Al, C (graphite), Al$_2$O$_3$ and KCl as a function of $T/\theta_c$.

It should, however, be kept in mind that the rough theory developed holds
only for crystals made of particles whose internal structure can be
disregarded. This means that one can disregard the effect of the temperature
on the state of the particles. In most cases this condition is satisfied by
crystals made of atoms. In atoms the spacing between the ground state and
the first excited state is as a rule large in comparison with $kT$, and thermal
motion cannot have an effect on the state of atoms. Hence their internal
energy does not depend on the temperature and they give no contribution
to the heat capacity. However, for certain atoms the lowest electronic levels
lie very close to each other. Thus, for example, for the ions of gadolinium
which enter into crystalline gadolinium sulphate the lowest energy level
consists of eight sublevels which are spaced at distances corresponding in the
temperature scale to the characteristic temperature 1.6 K. At very low
temperatures, $T \approx 7$ K, in accordance with the results of §40, there appears an
additional heat capacity superposed on the heat capacity of the crystal
lattice. Since at such low temperatures the heat capacity of the crystal
lattice is already very small, the increase of the heat capacity is very sharp.
At $T = 1.6$ K the additional heat capacity exceeds that of the crystal lattice
by a factor of almost 500. As the temperature decreases further the heat
capacity of the system falls to zero.

In the case of crystals made of complex molecules the internal structure of
the particles cannot, as a rule, be disregarded. In the first approximation one
can disregard the effect of the vibration of a molecule in the crystal lattice
on its internal thermal motion. The molecule as a whole is vibrating in the
crystal lattice, while vibrations of individual atomic groups take place inside
it. In certain cases the internal motion of the molecule represents a free
rotation. Thus, for example, H$_2$ molecules must be considered as rotating freely
in the hydrogen crystal. The rotating H$_2$ molecules take part simultaneously
in the thermal vibrations of the lattice. Disregarding the interaction of the
internal motion in molecules with the motion of the molecule as a whole, the
total energy $E$ of the crystal can be written in the form

$$E = E_{\rm lat} + E_{\rm intern},$$

where $E_{\rm lat}$ is the mean energy of the vibrations of the crystal lattice, and
$E_{\rm intern}$ is the mean internal energy of the molecule, which we have calculated
in ch. 5. Correspondingly, the heat capacity of the crystal can be written in
the form

$$C_V = C_V^{\rm lat} + C_V^{\rm intern}.$$

The contribution of the internal motion to the heat capacity can be very
substantial in certain cases. Thus, for example, the heat capacity of
intramolecular vibrations in benzene amounts to about 20% of the heat capacity
of the lattice at $T \approx 150$ K and reaches about 80% of the latter at $T \approx 270$ K.
Hence in calculating the heat capacity of complex crystals it is necessary to
take into account the contribution of the internal motion, particularly at high
temperatures. Good agreement of the theory with experiment justifies the
simplifications made.

However, in a number of cases disagreement, though not considerable, is
observed between theory and experiment. At high temperatures vibrational
amplitudes become large, and it is no longer justified to disregard the squares
of displacements in the expression for quasi-elastic forces *. The vibrational
motion loses its simple harmonic character. The error which arises as a result
of the assumption of the isotropy of the elastic medium is relatively negligible.
The error arising because the discrete character of the crystal is disregarded
turns out to be more considerable. The effect of the distribution of high
frequencies, which we have disregarded in determining the number of the
normal vibrations of the crystal, shows up in the fact that for hard crystals
the characteristic temperature $\theta_c$ turns out not to be a constant but a
function of the temperature of the crystal. For example, in the case of lithium its
value changes from 330 K at a temperature of the crystal equal to 20 K to
410 K at a temperature of 120 K. A similar phenomenon occurs in the case of
diamond. All the errors mentioned are so small that they have only been
noticed relatively recently because of the increased accuracy of measurements.


* In this connection it should be noted that, as can be shown by calculation, a crystal
for which the forces were exactly proportional to the displacements would have a
coefficient of thermal expansion equal to zero.

It should also be noted that the existence of anomalies at low
temperatures, which are similar to those existing for gadolinium, can lead to seeming
contradictions with the third law of thermodynamics. Usually the value of
the heat capacity as $T \to 0$ is found by an extrapolation from values which are
measured at higher temperatures. If this extrapolation is carried out from a
certain temperature which is higher than that at which the increase of the heat
capacity occurs, then a considerable error arises. The value of $\int_0^T T^{-1} C_V\,dT$
obtained from the extrapolation can differ appreciably from the experimental
value of the entropy at a high temperature found from other data. Hence in
order to find the true values of thermodynamic functions it is advisable to
carry out the measurements of the heat capacity at temperatures as low as
possible.







The Theory of Fluctuations 


§55. Small fluctuations in macroscopic systems 


In the preceding discussions we have more than once pointed to a differ- 
ence between the statistical and purely thermodynamic ideas on the nature 
of thermal processes. From the laws of statistical physics the existence of 
fluctuations necessarily follows. A system undergoing a fluctuation can pass 
spontaneously over from a more probable state into one of the less probable 
states. In this case the trend of the process is the reverse of that for which an 
increase in entropy occurs. 

The probability of fluctuations in a closed system can be calculated by 
means of the Boltzmann formula. Simple estimations carried out by means of 
this formula, as well as the general considerations discussed in §36, show that 
the probability of any appreciable fluctuations in a system which contains a 
large number of particles is extremely low. The phenomenon of fluctuations 
can in practice be observed in two cases: (1) when the dimensions of the 
system are sufficiently small; in this case fluctuations will occur often and 
their scale will be relatively large; (2) when the dimensions of the system are 
not small, but rather small fluctuations occur. Such small fluctuations may 
occur often, but the departure of the system from an equilibrium state will 
be relatively small. In this chapter we shall consider both of these cases of
fluctuations.



In order to estimate correctly the role which the investigations of fluc- 
tuations played in the development of molecular-statistical concepts, it 
should be kept in mind that the existence of fluctuations was predicted 
theoretically at the time when the second law of thermodynamics seemed 
to many to be one of the dogmas of physics. The representatives of the so- 
called school of energeticians denied the existence of material atoms and 
molecules. Statistical physics, in which the laws of classical mechanics were 
unified with statistical laws, seemed to be internally inconsistent, and was 
accepted with distrust by many physicists. Hence the discovery of numerous 
examples of fluctuation processes was a brilliant confirmation of the laws of 
statistical physics and was one of the most important events in the final 
establishment of molecular theory. In the studies by Einstein and Smolu- 
chowski it was shown that a number of physical processes, which had been 
known for a long time, are due to fluctuation phenomena, and a quantitative 
theory of these processes was developed which proved to be in excellent 
agreement with experimental facts. The significance of these discoveries can 
best be expressed in the words of Smoluchowski himself *: 

“At present we do not regard the dogmas of physics with the same esteem 
as before. Great changes concerning the problem of the significance of kinetic 
atomistics and thermodynamics have taken place. They are associated with 
the fact that we have only recently managed to interpret, on the basis of 
kinetic theory, facts which were known a long time ago; for example, 
Brownian motion discovered as far back as 1827, the phenomenon of critical 
opalescence discovered more than 20 years ago, the well-known blue colora- 
tion of the sky, and so on. The new thing which we encounter in these inter- 
pretations, and which is in contradiction with every-day established notions, 
lies in the fact that they were the first to take seriously into account the 
Maxwell velocity distribution law. As a result they were the first to consider 
heat as a process of motion, whereas before this the concept of the nature of 
heat was usually considered as a kind of poetic simile.” 

We shall begin the consideration of fluctuation processes with the second 
case, i.e. the case of systems whose dimensions are large. 

We shall below expound the general theory of small fluctuations occurring 
in an arbitrary macroscopic system. We take a closed system which is in a 
state of statistical equilibrium having an entropy $S_0$. We now assume that the
state of the system changes in such a way that it passes over into a non- 
equilibrium state in which its entropy is equal to S. We assume that the change 
in the state of the system can be characterized by a change in a certain internal 


* M.Smoluchowski, Phys. Z. 13 (1912) 1059. 




parameter $\xi$ whose value depends on the state of the entire system. In an
equilibrium state the parameter $\xi$ has a value $\xi = \xi_0$, whereas in a
non-equilibrium state its value differs from $\xi_0$.

As an example of the parameter $\xi$ one can take the density $\rho$ of a gas
confined in a closed, thermally insulated container. In an equilibrium state
the density is constant over the entire volume of the container, i.e. $\xi_0 = \rho_0 =$
const. As a result of a fluctuation the system can spontaneously go over into a
non-equilibrium state with a variable density $\xi = \rho(x)$. Other examples will be
discussed later.

The entropy of the system will be a function of the parameter $\xi$, so that
one can write that $S = S(\xi)$. In an equilibrium state $S_0 = S(\xi_0)$. The probability
that the closed system considered will get into a state characterized by a
value of the parameter $\xi$ which lies in the interval between $\xi$ and $\xi + d\xi$ can be
found by means of the Boltzmann formula:

$$dw = \mathrm{const}\cdot\exp\left(\frac{S(\xi)-S_0}{k}\right)d\xi = \mathrm{const}\cdot\exp\left(\frac{\Delta S}{k}\right)d\xi, \qquad (55.1)$$


where the constant is determined by the normalization condition *. The value 
of the change in the entropy is obviously negative. 

The applications of formula (55.1) to actual cases of fluctuations will be 
considered in the following section. Formula (55.1) is applicable to fluctua- 
tions in a system with a constant energy. 

However, one very often has to consider fluctuations occurring not in a 
closed, but a quasi-closed system which constitutes a small part of a closed 
system. Such a quasi-closed system can be considered as a subsystem placed 
in a reservoir with a constant temperature $T_0$. We shall assume that fluctuations
occur only in the subsystem, while the reservoir is always in an
equilibrium state. The state of the subsystem will be characterized by the
value of a certain external parameter $\lambda$. In going over from an equilibrium
state to a non-equilibrium state the parameter $\lambda$ changes from $\lambda_0$ to $\lambda$. As $\lambda$
changes, the values of the thermodynamic quantities characterizing the subsystem
also change. We shall assume that the variations of the macroscopic
parameter $\lambda$ are sufficiently slow, so that at every instant an equilibrium
statistical distribution exists in the subsystem. Then we can consider that the 
thermodynamic quantities in the subsystem are interrelated by the usual 


* Strictly speaking, the constant in (55.1) also depends on the parameter $\xi$. It can,
however, be shown that for a system containing a sufficiently large number of particles
the dependence on $\xi$ of the factor which stands in front of the exponential is insignificant
in comparison with the dependence of the exponential itself.




equilibrium relations. The process of the transition from an equilibrium state 
into a non-equilibrium state in a subsystem placed in a reservoir can be con- 
sidered as a transition performed under the action of a certain external 
source of work. As the parameter $\lambda$ changes by an amount $\Delta\lambda = \lambda - \lambda_0$ the
source does work $\Delta W(\lambda)$ on the subsystem.

We now write an expression for the probability that the subsystem will go 
over into a state with a value of $\lambda$ between $\lambda$ and $\lambda + d\lambda$ while the reservoir
remains in an equilibrium state. Since the reservoir and subsystem together
constitute a closed system, formula (55.1) is applicable to them. However, in
it the change in the entropy must be written in the form

$$\Delta S = \Delta S_0 + \Delta S',$$

where $\Delta S_0$ is the change in the entropy of the reservoir and $\Delta S'$ is the
change in the entropy of the subsystem. Then the probability
that the subsystem will go over into a state with $\lambda$ lying in the interval $\lambda$,
$\lambda + d\lambda$ under the action of the external source of work is given by the
formula

$$dw = \mathrm{const}\cdot\exp\left(\frac{\Delta S_0 + \Delta S'}{k}\right)d\lambda. \qquad (55.2)$$


But, by virtue of our assumption of the slowness of the variation of the 


macroscopic parameters, one can write for $\Delta S'$ the usual equilibrium expression:

$$\Delta S' = \frac{\Delta E' + p_0\,\Delta V' - \Delta W}{T_0}, \qquad (55.3)$$

where $T_0$ and $p_0$ are the equilibrium temperature and equilibrium pressure
of the system (which are equal to the corresponding quantities of the reservoir),
and $E'$ and $V'$ are the energy and volume of the subsystem. (In the last
formula it is seen clearly that $\Delta W$ represents the work done by an external
source and not by the reservoir. The work done by the reservoir is equal to
$-p_0\,\Delta V'$.) Further,

$$\Delta S_0 = \frac{\Delta E_0 + p_0\,\Delta V_0}{T_0}.$$


But by virtue of the fact that the system is closed (reservoir+subsystem) the 
total volume of the system remains constant, so that 


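$$\Delta V_0 = -\Delta V'.$$

The energy conservation law gives

$$\Delta E' + \Delta E_0 = 0,$$

hence

$$\Delta S = \Delta S_0 + \Delta S' = -\frac{\Delta W(\lambda)}{T_0}. \qquad (55.4)$$

Substituting (55.4) into (55.2), we find

$$dw = \mathrm{const}\cdot\exp\left(-\frac{\Delta W(\lambda)}{kT_0}\right)d\lambda. \qquad (55.5)$$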
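
The step leading to (55.4) can be written out explicitly. The following is only a restatement of the relations already given: the reservoir entropy change and (55.3) are added, and the constancy of the total volume and of the total energy is then used.

```latex
\begin{aligned}
\Delta S &= \Delta S_0 + \Delta S'
          = \frac{\Delta E_0 + p_0\,\Delta V_0}{T_0}
          + \frac{\Delta E' + p_0\,\Delta V' - \Delta W}{T_0} \\
         &= \frac{(\Delta E_0 + \Delta E') + p_0\,(\Delta V_0 + \Delta V') - \Delta W}{T_0}
          = -\frac{\Delta W(\lambda)}{T_0},
\end{aligned}
```

since $\Delta E_0 + \Delta E' = 0$ and $\Delta V_0 + \Delta V' = 0$.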


Thus, in the most general case, it can be said that the work which must be 
done on a macroscopic system in order to change the parameter $\lambda$, characterizing
the state of the system, by an amount $\Delta\lambda$ is a measure of the probability
of small fluctuations in the system. This does not mean, however, that a
system can undergo a fluctuation only when real work from without is done
on it. This is particularly clearly seen from the example of a closed system on
which no work is done. The work $\Delta W$ is only a quantitative characteristic of a
fluctuation. The work $\Delta W$ can be written as the change in the potential energy
as the system is displaced in a certain imaginary (and sometimes also real)
field of force. Denoting the potential of this field by $u(\lambda)$, we have

$$\Delta W = u(\lambda) - u(\lambda_0) = u(\lambda),$$

if $u(\lambda_0)$ is chosen as the zero-point potential energy. Then formula (55.5)
can be written in the form

$$dw = \mathrm{const}\cdot\exp\left(-\frac{u(\lambda)}{kT_0}\right)d\lambda = w(\lambda)\,d\lambda. \qquad (55.6)$$


We arrive thus at a formula which is an analogue of the Boltzmann formula. 
In what follows we shall see that this analogy has a completely clear meaning. 

To calculate the probability of a fluctuation according to formulae (55.5) 
or (55.6) it is necessary in each individual case to find the work done or the 
change in the potential energy which takes place in the fluctuation process. 


By virtue of the smallness of the fluctuations the expression for $u(\lambda)$ can be
expanded in a power series of the small parameter $\lambda - \lambda_0$, and one can confine
oneself to the first terms of the expansion:

$$u(\lambda) = u'(\lambda_0)(\lambda-\lambda_0) + \tfrac{1}{2}\,u''(\lambda_0)(\lambda-\lambda_0)^2 + \dots,$$

where the primes denote the derivatives with respect to $\lambda$. In an equilibrium
state the potential energy of the field must have a minimum, so that

$$u'(\lambda_0) = 0 \quad\text{and}\quad u''(\lambda_0) > 0.$$

Hence the probability distribution (55.6) can be written in the form

$$dw = \mathrm{const}\cdot\exp\left(-\frac{u''(\lambda_0)(\lambda-\lambda_0)^2}{2kT_0}\right)d\lambda. \qquad (55.7)$$


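The probability distribution (55.7) is called the Gaussian distribution. The
value of the constant $u''(\lambda_0)$ depends on the nature of that real or fictitious
field of force in which the system is “displaced” from the position $\lambda_0$ to the
position $\lambda$. By means of the probability distribution of small fluctuations
(55.7) one can find the mean square value of the fluctuation of the parameter $\lambda$:

$$\Delta^2 = \overline{(\lambda-\lambda_0)^2} = \mathrm{const}\int(\lambda-\lambda_0)^2\exp\left(-\frac{u''(\lambda_0)(\lambda-\lambda_0)^2}{2kT_0}\right)d\lambda.$$

The constant in (55.7) is determined by the normalization condition

$$\mathrm{const} = \left[\int\exp\left(-\frac{u''(\lambda_0)(\lambda-\lambda_0)^2}{2kT_0}\right)d\lambda\right]^{-1}.$$

Thus,

$$\Delta^2 = \frac{\displaystyle\int(\lambda-\lambda_0)^2\exp\left(-\frac{u''(\lambda_0)(\lambda-\lambda_0)^2}{2kT_0}\right)d\lambda}{\displaystyle\int\exp\left(-\frac{u''(\lambda_0)(\lambda-\lambda_0)^2}{2kT_0}\right)d\lambda}. \qquad (55.8)$$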


The fluctuations of the parameter $\lambda$ occur in both directions from its value
in an equilibrium state. Since the integrand in the integrals in the numerator
and denominator of the expression (55.8) rapidly decreases with increasing
absolute value of the difference $\lambda - \lambda_0$, the integration can be carried out in
the range from $-\infty$ to $+\infty$, as we have done in normalizing the Maxwell
distribution. Thus, finally,

$$\Delta^2 = \frac{\displaystyle\int_{-\infty}^{\infty}(\lambda-\lambda_0)^2\exp\left(-\frac{u''(\lambda_0)(\lambda-\lambda_0)^2}{2kT_0}\right)d\lambda}{\displaystyle\int_{-\infty}^{\infty}\exp\left(-\frac{u''(\lambda_0)(\lambda-\lambda_0)^2}{2kT_0}\right)d\lambda} = \frac{kT_0}{u''(\lambda_0)}. \qquad (55.9)$$

By means of formula (55.9) the probability distribution (55.7) can be written
in the form

$$dw = (2\pi\Delta^2)^{-1/2}\exp\left(-\frac{(\lambda-\lambda_0)^2}{2\Delta^2}\right)d\lambda.$$

The probability of a given fluctuation decreases sharply with its increasing
value, as well as with decreasing $\Delta^2$. The latter quantity is proportional to the
absolute temperature. Hence it can be stated that the intensity of fluctuations
decreases with decreasing temperature *.
In the following sections the general relations obtained will be applied to 
concrete cases of small fluctuations in macroscopic systems. 
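
The result (55.9) is easily checked numerically. The following is a minimal sketch which evaluates the ratio of integrals (55.8) by the trapezoidal rule and compares it with $kT_0/u''(\lambda_0)$. The function name, the stiffness `u2` and the temperature `T0` are hypothetical values chosen only for illustration; the infinite range of integration is replaced by an interval of ten standard deviations on each side.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant, J/K


def mean_square_fluctuation(u2, temperature, half_width=10.0, steps=20001):
    """Evaluate the ratio of integrals (55.8) by the trapezoidal rule.

    u2 stands for u''(lambda_0). The deviation x = lambda - lambda_0 runs over
    +/- half_width standard deviations, which replaces (-inf, +inf).
    """
    sigma = math.sqrt(K_B * temperature / u2)  # used only to set the grid scale
    h = 2.0 * half_width * sigma / (steps - 1)
    numerator = 0.0
    denominator = 0.0
    for i in range(steps):
        x = -half_width * sigma + i * h
        end_weight = 0.5 if i in (0, steps - 1) else 1.0
        w = math.exp(-u2 * x * x / (2.0 * K_B * temperature))
        numerator += end_weight * x * x * w * h
        denominator += end_weight * w * h
    return numerator / denominator


u2 = 1.0e-6   # hypothetical u''(lambda_0), J/m^2
T0 = 300.0    # hypothetical reservoir temperature, K

numeric = mean_square_fluctuation(u2, T0)
analytic = K_B * T0 / u2          # right-hand side of (55.9)
print(numeric, analytic)
```

Doubling `T0` in this sketch doubles the computed mean square fluctuation, in accordance with the proportionality to the absolute temperature noted above.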


§56. Brownian motion 


As the first case where fluctuations turn out to be easily observable we 
shall consider the so-called Brownian motion. Brownian motion is the con- 
tinuous chaotic motion of small particles suspended in a liquid or a gas which 
may be observed under a microscope. 

A complete quantitative theory of Brownian motion, which not only ex- 
plained its nature but also allowed one to predict a number of its character- 
istic features, was developed in the studies of Einstein and Smoluchowski 
(1905—1906). The investigations of Brownian motion played a major role in 
the triumph of the molecular-kinetic theory because Brownian motion was 
the first physical process in which the existence of molecules was detected in 
a direct and obvious way. The significance and importance of the theory of 
Brownian motion are not confined to the historical aspect. On the contrary, it 


* For an exception to this rule see B.G.Levich, Vvedenie v statisticheskuyu fiziku 
(Introduction to statistical physics) (Gostekhizdat, Moscow, 1954) § 63.




is only relatively recently that a number of cases of Brownian motion have 
become of particular interest in association with the creation of new, very 
accurate measuring devices (see §58). 

Passing on to the analysis of the theory of Brownian motion, we shall 
consider a macroscopic particle suspended in a volume of liquid or gas, and 
shall seek the forces acting on it due to the molecules of the medium. The 
molecules of the medium are in continuous thermal motion. Hence the mole- 
cules of the liquid or gas in which the particle is suspended will continually 
collide with the particle and transfer momentum to it in each collision. In 
other words, the molecules of the medium will exert a pressure on the surface 
of the particle. The collisions of the molecules with the surface of the particle 
are completely random, from all directions. If the surface of the particle is 
sufficiently large, so that a large number of molecules impinge on it in a very 
short time interval, then it can be assumed that the momenta which are 
transferred to the particle from all directions are, on the average, balanced. 
The situation is different in the case of very small particles (with a size of the 
order of $10^{-4}$ cm). Such particles still contain an enormous number of molecules
and are macroscopic bodies. Nevertheless, the surface of such particles
is so small that in a short time a relatively small number of molecules impinge 
on it. The resultant of the forces exerted by the molecules of the medium on 
the surface of the particle turns out to be different from zero. As a result the 
particle will be given a random motion whose direction and velocity will vary 
at a very high frequency (of the order of $10^{12}$ times per second). The
character of the displacements is shown in fig. III.35, in which the position
of the particle every 30 seconds is drawn. The side of each square corresponds
to a distance of $3\times10^{-4}$ cm.

Fig. III.35

The number of molecules striking the particle and the momentum which 
they transfer to the particle will undergo large fluctuations. Thus, the 
Brownian motion is due to the fluctuations of the pressure exerted by the 
molecules of the medium on the particle suspended in it. Although its 
motion is not directly a molecular motion, it serves as a kind of indicator of 
molecular motion. As we have more than once stressed, the phenomena of 
fluctuations contradict the propositions of pure thermodynamics. This can 
be illustrated particularly clearly by the example of Brownian motion. 

The very fact of the persistence of Brownian motion and the impossibility 
of abolishing it points to the continuous violation of the requirements of the 
second law of thermodynamics. Indeed, if a particle suspended in a medium 
obtained a single momentum from any external source, then its motion would 
rapidly be slowed down as a result of the energy loss due to viscous friction. 
Hence the impossibility of abolishing Brownian motion is indicative of the 
existence of processes which are the reverse of the processes of viscous friction 
and which are accompanied by a decrease in the entropy. To maintain its 
motion the particle continuously draws energy from the medium surrounding 
it, which directly contradicts the second law of thermodynamics. 

In order to construct a quantitative theory of Brownian motion we can 
make use of the general relations introduced in the preceding paragraph. 

Let a very small but macroscopic particle of mass $m$ be suspended in a
certain medium, a liquid or a gas. We assume that the position of this particle
is characterized by a certain parameter (a generalized coordinate) $\lambda$. Such a
parameter can be, for example, the distance of the particle from a certain
plane of the container which is chosen as the origin (other examples will be
given below). The particle will be acted upon by the medium with a
force which varies rapidly and randomly in time and is due to the fluctuations
of the thermal motion of the molecules of the medium. Under the action of
this fluctuating force, which we shall call for brevity the Brownian force, the
particle will undergo very small displacements, so that the value of its
parameter $\lambda$ will vary continually by very small amounts $\Delta\lambda$.

Instead of observing the motion of one particle in time one can, following 
Einstein, consider a great number of identical particles undergoing Brownian 
displacements, and find the number of particles passing through a certain 
imaginary surface in the medium. 

We denote by $c(\lambda)$ the number of particles per unit volume which are at a
distance between $\lambda$ and $\lambda + d\lambda$ from the surface $\lambda = 0$.

Let $\Delta = [\overline{(\Delta\lambda)^2}]^{1/2}$ denote the root-mean-square displacement of particles in a
certain short time $\tau$. Then on the average $[\tfrac{1}{2}c(\lambda-\tfrac{1}{2}\Delta)]\,\Delta$ particles moving
from left to right will pass through 1 cm$^2$ of an imaginary surface in the
solution during the time $\tau$. Analogously, $[\tfrac{1}{2}c(\lambda+\tfrac{1}{2}\Delta)]\,\Delta$ particles moving in the
opposite direction will pass through this surface in the same time. As a result
the following net number of particles will pass through 1 cm$^2$ of the imaginary
surface:

$$N = j\tau = \left[\tfrac{1}{2}c(\lambda-\tfrac{1}{2}\Delta) - \tfrac{1}{2}c(\lambda+\tfrac{1}{2}\Delta)\right]\Delta, \qquad (56.1)$$

where $j$ is the particle flux.

Assuming $\Delta$ to be small and $c(\lambda)$ to be a slowly varying function of the
coordinate $\lambda$, we can write

$$j\tau = -\frac{\Delta^2}{2}\,\frac{\partial c}{\partial\lambda}$$

or

$$j = -\frac{\Delta^2}{2\tau}\,\frac{\partial c}{\partial\lambda}. \qquad (56.2)$$

The flux of particles is proportional to the gradient of their concentration
and is directed in the direction of the decrease of the concentration. The
factor of proportionality $\Delta^2/2\tau$ is called the diffusion coefficient $D$:

$$D = \frac{\Delta^2}{2\tau}. \qquad (56.3)$$

Thus, the mean square displacement of the particle is equal to

$$\Delta^2 = \overline{(\Delta\lambda)^2} = 2D\tau. \qquad (56.4)$$

The mean path traversed by the particle turns out to be proportional to the
square root of the observation time $\tau$.
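
The relation (56.4) can be illustrated by a simple simulation. The sketch below follows many independent particles, each of which receives in every interval $\tau$ a random displacement with root-mean-square value $\Delta$. It is assumed here (this is not stated in the text) that the successive displacements are independent and Gaussian, so that their mean squares add. The function name and all numerical values are hypothetical.

```python
import random


def simulate_msd(n_particles, n_steps, step_rms, seed=0):
    """Mean square displacement of independent one-dimensional random walkers."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_particles):
        x = 0.0
        for _ in range(n_steps):
            x += rng.gauss(0.0, step_rms)  # one Brownian displacement per interval tau
        total += x * x
    return total / n_particles


tau = 1.0                      # hypothetical elementary time interval
delta = 1.0                    # hypothetical r.m.s. displacement during tau
D = delta ** 2 / (2.0 * tau)   # diffusion coefficient according to (56.3)

n_steps = 100
msd = simulate_msd(n_particles=4000, n_steps=n_steps, step_rms=delta)
predicted = 2.0 * D * n_steps * tau   # (56.4) applied to the total time n_steps * tau
print(msd, predicted)
```

Within the statistical scatter of a few per cent the simulated mean square displacement grows linearly with the observation time, so that the mean path grows as its square root.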

The diffusion coefficient $D$ can be expressed in terms of the temperature
and physicochemical constants of the medium. To do this, we assume that the
flux of particles is due not only to the gradient of concentration but also to an
external force $f$ acting on every one of the particles.

Under the action of a force f a small particle in a viscous medium moves 
with a velocity u which for steady motion is equal to 





$$u = bf = \frac{f}{C\eta a}, \qquad (56.5)$$

where $b = (C\eta a)^{-1}$ is a quantity called the mobility of the particle, $a$ is its
radius, $\eta$ is the viscosity of the medium, and $C$ is a numerical coefficient
equal to $6\pi$ for spherical particles. Formula (56.5) is called the Stokes formula.
A simple calculation shows that the time required to establish steady motion 
is very short for small particles. 
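
The estimate of the time required to establish steady motion can be sketched as follows. It is assumed here (the equation is not written out in the text) that the particle obeys $m\,du/dt = f - u/b$, so that the velocity approaches $bf$ with the time constant $mb$. The radius, density and viscosity are hypothetical values for a particle of the size mentioned above suspended in water.

```python
import math


def relaxation_time(radius, particle_density, viscosity):
    """Time constant m*b for a sphere obeying m du/dt = f - u/b (assumed equation)."""
    mass = 4.0 / 3.0 * math.pi * radius ** 3 * particle_density
    mobility = 1.0 / (6.0 * math.pi * viscosity * radius)  # b = (C eta a)^-1, C = 6 pi
    return mass * mobility


# Hypothetical values in SI units: radius 0.5e-6 m, density 1e3 kg/m^3, viscosity 1e-3 Pa s.
t_relax = relaxation_time(radius=0.5e-6, particle_density=1.0e3, viscosity=1.0e-3)
print(t_relax)  # of the order of 1e-8 to 1e-7 s
```

For such a particle the steady velocity is thus established in a time which is negligible in comparison with any practical observation time.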

The total flux of particles can be written in the form 


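$$j = -D\,\frac{\partial c}{\partial\lambda} + uc \qquad (56.6)$$

or

$$j = -D\,\frac{\partial c}{\partial\lambda} + bfc = -D\,\frac{\partial c}{\partial\lambda} - bc\,\frac{\partial U}{\partial\lambda}, \qquad (56.7)$$

where $U$ is the potential energy corresponding to the force $f$.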

We now assume that the flux of particles caused by the external field has 
the same value as that produced by the gradient of concentration but the 
opposite direction. Then the total flux of particles j reduces to zero. In this 
case the particle concentration distribution is determined by the condition 


$$-D\,\frac{\partial c}{\partial\lambda} - bc\,\frac{\partial U}{\partial\lambda} = 0$$

or

$$c = c_0\,e^{-bU/D}. \qquad (56.8)$$

On the other hand, we know that particles which do not interact with each
other follow, in an external field, the Boltzmann distribution, and

$$c = c_0\,e^{-U/kT}. \qquad (56.9)$$

Comparing these last expressions, we find

$$D = bkT. \qquad (56.10)$$
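
That the relation (56.10) is exactly the condition for the vanishing of the total flux can be verified numerically. In the sketch below the Boltzmann distribution (56.9) is inserted into the flux (56.7), with the derivatives replaced by central differences. The constant force field $U = f_0\lambda$ and all numerical values are hypothetical and serve only as an illustration.

```python
import math

K_B = 1.380649e-23   # Boltzmann constant, J/K

T = 300.0            # hypothetical temperature, K
b = 1.0e8            # hypothetical mobility, m/(N s)
f0 = 1.0e-15         # hypothetical constant force magnitude, N
c0 = 1.0e15          # hypothetical concentration at lambda = 0, m^-3


def U(lam):
    """Potential energy of a particle in a uniform field, U = f0 * lambda."""
    return f0 * lam


def c(lam):
    """Boltzmann distribution (56.9)."""
    return c0 * math.exp(-U(lam) / (K_B * T))


def total_flux(lam, D, h=1.0e-8):
    """Flux (56.7) with central-difference derivatives."""
    dc = (c(lam + h) - c(lam - h)) / (2.0 * h)
    dU = (U(lam + h) - U(lam - h)) / (2.0 * h)
    return -D * dc - b * c(lam) * dU


lam = 2.0e-6
drift = b * c(lam) * f0                              # magnitude of the field term
j_einstein = total_flux(lam, D=b * K_B * T)          # D chosen according to (56.10)
j_wrong = total_flux(lam, D=2.0 * b * K_B * T)       # deliberately wrong D
print(j_einstein / drift, j_wrong / drift)
```

With $D = bkT$ the two contributions cancel to within the discretization error, whereas any other value of $D$ leaves a flux of the same order as each contribution taken separately.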




Thus, the diffusion coefficient of particles is connected with their mobility 
by the universal formula (56.10), which for spherical particles assumes the 
form 


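$$D = \frac{kT}{6\pi\eta a}. \qquad (56.11)$$

Substituting the value of $D$ from (56.11) into (56.4), we find

$$\Delta = [\overline{(\Delta\lambda)^2}]^{1/2} = \left(\frac{kT}{3\pi\eta a}\right)^{1/2}\tau^{1/2}. \qquad (56.12)$$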


Thus, the mean path traversed by particles increases with the temperature of 
the medium. 

All the quantities contained in formula (56.12) are known or can be 
measured. It should be noted that the value of the Boltzmann constant k was 
at one time determined from formula (56.12). The degree of agreement of 
these formulae with experiment can be estimated from the fact that in 1910–1915
the value of Avogadro’s number $N = R/k$ found from measurements of
Brownian motion ($N = 6.44\times10^{23}$) was considered to be one of the most
accurate values of this quantity. 
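
The use of formulae (56.11) and (56.12) can be sketched as follows. The temperature, viscosity, radius and observation time are hypothetical values (a particle of radius $0.5\times10^{-4}$ cm in water, observed every 30 seconds); the function names are likewise introduced only for this illustration. The last function inverts (56.12) and shows how the Boltzmann constant, and hence Avogadro’s number $N = R/k$, follows from a measured displacement.

```python
import math

K_B = 1.380649e-23    # Boltzmann constant, J/K
R_GAS = 8.314462618   # gas constant, J/(mol K)


def diffusion_coefficient(T, eta, a):
    """Formula (56.11) for a spherical particle."""
    return K_B * T / (6.0 * math.pi * eta * a)


def rms_displacement(T, eta, a, tau):
    """Formula (56.12)."""
    return math.sqrt(K_B * T * tau / (3.0 * math.pi * eta * a))


def boltzmann_from_displacement(delta, T, eta, a, tau):
    """Formula (56.12) solved for k, given a measured r.m.s. displacement."""
    return 3.0 * math.pi * eta * a * delta ** 2 / (T * tau)


# Hypothetical conditions in SI units.
T, eta, a, tau = 293.0, 1.0e-3, 0.5e-6, 30.0

D = diffusion_coefficient(T, eta, a)
delta = rms_displacement(T, eta, a, tau)
k_est = boltzmann_from_displacement(delta, T, eta, a, tau)
avogadro_est = R_GAS / k_est
print(D, delta, avogadro_est)
```

Under these assumed conditions the displacement in 30 seconds is a few times $10^{-4}$ cm, i.e. of the order of the squares in fig. III.35. In a real determination `delta` would of course be taken from the observations and not computed from the known value of $k$.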

Experiments on Brownian motion demonstrated in a direct and obvious
way one more important inference of statistical mechanics, namely the
statement on the reversibility of molecular processes. The experiments
consisted of observing the number of Brownian particles in a sharply 
limited (for example, by the corresponding illumination) field of view under 
a microscope. Owing to the Brownian motion, particles enter and leave the 
field of observation from the non-illuminated part of the solution. Let us 
assume that at a certain instant of time the concentration of Brownian 
particles in the field of observation was higher than in the remaining solution. 
According to the second law of thermodynamics equalization of concentra- 
tion must then take place by means of the diffusion of particles from the 
illuminated volume to the non-illuminated one. After the final equalization of 
concentrations in the system a total equilibrium must be established which 
should not subsequently be violated. From the point of view of statistical 
physics the phenomenon should proceed in a completely different way. The 
number of particles in a sufficiently small volume should increase and de- 
crease equally frequently, so that the notions of diffusion and equalization of 
concentrations would lose any meaning. After the lapse of a recovery time $\tau^*$
the number of particles which was initially equal, say, to $n$, should come back
to the same value. The length of the recovery time was calculated by
Smoluchowski. As we have mentioned already, it increases sharply with the
size of the system, in the given case with the value of the number $n$. The
results of the observations are given in tables 9 and 10.

















Table 9
Observed frequency of variation of the number of particles n → m in the viewing field

 n     m=0    m=1    m=2    m=3    m=4    m=5    m=6
 0     210    126     35      7      0      1      -
 1     134    281    117     29      1      1      -
 2      27    138    108     63     16      3      -
 3      10     20     76     38     24      6      0
 4       2      2     14     22     13     11      3
 5       -      0      2     10     10      1      3

Table 10
Mean recovery time

 Observation time    τ*(obs.)    τ*(cal.)
 0                     6.1         5.5
 1                     3.1         3.2
 2                     4.1         4.0
 3                     7.8         8.1
 4                    18.6        20.9




In the first table the frequency of variation of the number of particles 
in the field of observation is given. The number of particles in this field in the 
first observation is denoted by n, while m denotes the number of particles in 
a subsequent observation. The frequency of a transition $n \to m$ means the
number of cases in which $n$ was replaced by $m$. For example, the number 27
in the third row of the second column of table 9 means that in 27 cases the
number of particles which in the first observation was equal to two in the
second observation reduced to zero. The mean number of particles which
should have been found in the field of view amounted to $\bar n = 1.43$. The
measurements were carried out in time intervals $\Delta t = 1.39$ sec.

An analysis of the numbers of table 9 indicates immediately the correctness
of the statistical point of view, and serves as a direct illustration of the
reasoning of §25. Indeed, according to the propositions of thermodynamics,
we should expect a continuous decrease of the number of particles ($m < n$) in
the case where the initial value $n > \bar n$, and an increase of the number of
particles in the reverse case. Nothing of the sort is found in table 9. On the
contrary, for $n > \bar n$, in subsequent observations an even larger number of
particles is very often observed. Thus, for $n = 3$ a smaller number of particles
($m = 0, 1, 2$) is observed in 106 cases in the second observation, and a larger or
the same number of particles ($m = 3, 4, 5$) is observed in 68 cases.

From table 9 it is seen that for a small number of particles the numbers 
which stand on the two sides of the major diagonal are practically equal to 
each other. For example, the frequency of the transition from n=3 to 
m= 0 amounts to 10. The frequency of the transition from n = 0 to m= 3 is 
equal to 7. The frequency of the transition from n = 2 to m = 4 is equal to 16, 
while that from n = 4 to m = 2 is equal to 14, and so on. This means that the 
process of Brownian motion has a strictly reversible character. Fluctuations 
occur so frequently that no systematic trend of them with time is observed. 
If, however, the number of particles $n$ is considerable, so that the scale of
fluctuations is large, then in correspondence with the reasoning of §25 a
decay of the fluctuations can be expected. In this case the most probable
trend of the process is the same as that predicted by thermodynamics:
particles will most often diffuse out of the observation zone, and the
number of particles in it should in most cases decrease.

From table 9 it is seen that for $n = 5$ (such an $n$ already exceeds $\bar n$ quite
substantially) the number of particles decreases in 22 cases and only in
4 cases does it increase or remain constant. If the number of particles $n$ were
very large and very much exceeded the mean number $\bar n$, then it would decrease
in an overwhelming majority of cases, and the process would become irreversible.
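
The counts quoted above can be reproduced directly from table 9. In the following sketch the entries of the table are stored row by row; entries printed as a dash are represented by `None` and are counted as zero in the sums (an assumption made here for convenience). The function names are introduced only for this illustration.

```python
# TABLE[n][m] is the number of cases in which n particles were followed by m particles.
# None marks an entry printed as a dash in table 9.
TABLE = [
    [210, 126, 35, 7, 0, 1, None],
    [134, 281, 117, 29, 1, 1, None],
    [27, 138, 108, 63, 16, 3, None],
    [10, 20, 76, 38, 24, 6, 0],
    [2, 2, 14, 22, 13, 11, 3],
    [None, 0, 2, 10, 10, 1, 3],
]


def count(n, ms):
    """Total number of transitions from n to any m in ms (dashes count as zero)."""
    return sum(TABLE[n][m] or 0 for m in ms)


def decreases(n):
    return count(n, range(0, n))


def stays_or_increases(n):
    return count(n, range(n, 7))


print(decreases(3), stays_or_increases(3))   # 106 68
print(decreases(5), stays_or_increases(5))   # 22 4

# Pairs of entries on the two sides of the major diagonal: (n, m, n->m, m->n).
pairs = [
    (n, m, TABLE[n][m], TABLE[m][n])
    for n in range(6)
    for m in range(n + 1, 6)
    if TABLE[n][m] is not None and TABLE[m][n] is not None
]
print(pairs)
```

The list `pairs` makes it easy to compare the frequencies of each transition and of its reverse, for example 7 and 10 for the pair 0, 3 and 16 and 14 for the pair 2, 4.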

The data of table 10 are no less convincing. This table gives the calculated 
and observed times of recovery of the number of particles in the observation 
field (in units of $\Delta t = 1.39$ sec) for a suspension with a mean number of
particles $\bar n = 1.55$. From the table it is seen that after the lapse of time intervals
$\tau^*$, which are in good agreement with those calculated theoretically, the
number of particles initially found in the field of view again recurs. The
recovery time increases sharply with the value of the departure of $n$ from $\bar n$,
so that large fluctuations are very seldom repeated (see also table 9). All
these facts are convincingly indicative of the validity of the molecular- 
statistical point of view. 
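
The degree of agreement in table 10 can be expressed numerically. The sketch below simply takes the three columns of the table as printed (the first column is treated here only as a row label), computes the relative deviation of the observed from the calculated recovery time, and converts the observed times into seconds by means of $\Delta t = 1.39$ sec. The variable names are hypothetical.

```python
DT = 1.39  # time unit of table 10, in seconds

# (row label, observed recovery time, calculated recovery time), in units of DT.
ROWS = [
    (0, 6.1, 5.5),
    (1, 3.1, 3.2),
    (2, 4.1, 4.0),
    (3, 7.8, 8.1),
    (4, 18.6, 20.9),
]

relative_deviation = {label: (obs - cal) / cal for label, obs, cal in ROWS}
observed_seconds = {label: obs * DT for label, obs, _ in ROWS}

worst = max(relative_deviation, key=lambda label: abs(relative_deviation[label]))
print(relative_deviation)
print(observed_seconds)
print(worst, relative_deviation[worst])
```

The deviations do not exceed about 11 per cent, the largest ones belonging to the first and last rows of the table.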








§57. Fluctuations of thermodynamic quantities in a homogeneous system 


We shall now consider the fluctuations of thermodynamic quantities with 
reference to a system in a reservoir. 

A quantitative measure of the probability of a fluctuation is the work 
which must be done on the subsystem in order to bring it from the initial 
equilibrium state into the final fluctuation state. 

Because of the smallness of the fluctuations the transitions can be assumed 
to be reversible. 

The work done in a reversible transition for a system in a medium is ex- 
pressed by the general thermodynamic formula (28.7): 


$$\Delta W = \Delta E - T_0\,\Delta S + p_0\,\Delta V, \qquad (57.1)$$


where $\Delta E$, $\Delta S$ and $\Delta V$ are the changes in the corresponding quantities for the
transition from the initial to the final state. A concrete expression for the
work $\Delta W$ can be obtained for particular cases of the process.

We shall confine ourselves to the calculation of this work for fluctuations
of the volume at a constant temperature and fluctuations of the temperature
at a constant volume.

We shall first of all consider the fluctuations of the volume at a constant
temperature ($T = T_0 = \mathrm{const}$).

The work done in an isothermal change in the volume is equal to

$$\Delta W = \Delta E - \Delta(TS) + p_0\,\Delta V = \Delta F + p_0\,\Delta V. \qquad (57.2)$$


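It should be stressed that formula (57.2) shows that the work $\Delta W$ represents
the work done on the subsystem by an external source of work (but not by
the medium).

For a small isothermal change in the volume $\Delta V$ the free energy in formula
(57.2) can be expanded in a series in powers of $\Delta V$, and rewritten in the form

$$\Delta W = p_0\,\Delta V + \left(\frac{\partial F}{\partial V}\right)_T\Delta V + \left(\frac{\partial^2 F}{\partial V^2}\right)_T\frac{(\Delta V)^2}{2} + \dots = p_0\,\Delta V - p\,\Delta V - \left(\frac{\partial p}{\partial V}\right)_T\frac{(\Delta V)^2}{2} + \dots \qquad (57.3)$$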


Since the process can be considered as quasi-static, in the process of fluctuation
the equilibrium pressure in the subsystem can be assumed to be equal to
the pressure in the medium. Hence we find finally

$$\Delta W = -\left(\frac{\partial p}{\partial V}\right)_T\frac{(\Delta V)^2}{2}. \qquad (57.4)$$


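Substituting (57.4) into (55.5), we find the probability for the volume $V$
to lie between $V$ and $V + dV$:

$$dw = \mathrm{const}\cdot\exp\left[\left(\frac{\partial p}{\partial V}\right)_T\frac{(\Delta V)^2}{2kT}\right]dV. \qquad (57.5)$$

The constant is found from the normalization condition:

$$\mathrm{const}\int\exp\left[\left(\frac{\partial p}{\partial V}\right)_T\frac{(\Delta V)^2}{2kT}\right]dV = 1. \qquad (57.6)$$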


From formulae (57.5) and (57.6) it follows that the derivative $(\partial p/\partial V)_T$
must be negative. If this condition turned out to be unfulfilled, the probability
of fluctuation would not decrease but increase with its magnitude. In
such a substance there would occur volume fluctuations as a result of which
the volume of the system would indefinitely increase or decrease down to
zero. The substance would be in an unstable state. Thus, the condition of
stability of the states of a homogeneous substance is given by the formula

$$\left(\frac{\partial p}{\partial V}\right)_T < 0. \qquad (57.7)$$


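If condition (57.7) is fulfilled, the integral (57.6) can easily be calculated.
Then

$$\mathrm{const} = \left(\frac{|(\partial p/\partial V)_T|}{2\pi kT}\right)^{1/2}.$$

The normalized probability distribution of isothermal volume fluctuations
has the form

$$dw = \left(\frac{|(\partial p/\partial V)_T|}{2\pi kT}\right)^{1/2}\exp\left(-\frac{|(\partial p/\partial V)_T|\,(V-V_0)^2}{2kT}\right)dV, \qquad (57.8)$$

where $V_0$ is the equilibrium volume, so that $\Delta V = V - V_0$.
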


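By means of the probability distribution (57.8) we find the mean square
fluctuation of the volume $\overline{(\Delta V)^2} = \overline{(V-V_0)^2}$. Obviously, we have

$$\overline{(\Delta V)^2} = \overline{(V-V_0)^2} = \left(\frac{|(\partial p/\partial V)_T|}{2\pi kT}\right)^{1/2}\int(V-V_0)^2\exp\left(-\frac{|(\partial p/\partial V)_T|\,(V-V_0)^2}{2kT}\right)dV = \frac{kT}{|(\partial p/\partial V)_T|}. \qquad (57.9)$$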


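Introducing the value of $\overline{(\Delta V)^2}$ into the distribution (57.8), one can rewrite
it in a more compact form:

$$dw = \left[2\pi\,\overline{(\Delta V)^2}\right]^{-1/2}\exp\left(-\frac{(V-V_0)^2}{2\,\overline{(\Delta V)^2}}\right)dV. \qquad (57.10)$$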


It follows from formulae (57.9) and (57.10) that the scale and probability of 
fluctuations increase with increasing temperature of the substance, as well as 
with increasing isothermal compressibility. 

We apply formula (57.9) to the case of an ideal gas: 


$$\overline{(\Delta V)^2} = \frac{kT}{|(\partial p/\partial V)_T|} = \frac{V^2 kT}{NkT} = \frac{V^2}{N}. \qquad (57.11)$$





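In what follows we shall be interested in the value of the mean square
fluctuation of the density $\rho = mV^{-1}$ (where m is the mass contained in
the volume V in which the fluctuation takes place). We have

$$\overline{(\Delta\rho)^2} = m^2\,\overline{\left(\Delta\frac{1}{V}\right)^2} = \frac{m^2}{V^4}\,\overline{(\Delta V)^2} = \frac{m^2}{V^4}\frac{kT}{|(\partial p/\partial V)_T|} = \frac{\rho^2}{V}\,kT\gamma_T,$$

where $\gamma_T = -V^{-1}(\partial V/\partial p)_T$ is the isothermal compressibility. The relative density fluctuation
in the volume V is equal to

$$\overline{\left(\frac{\Delta\rho}{\rho}\right)^2} = \frac{kT\gamma_T}{V}. \tag{57.12}$$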


We shall also find the fluctuation of the number of particles confined in a
given volume. The quantity $\overline{(\Delta V)^2}$ represents the mean square fluctuation of
the volume V of the system which contains N particles. The mean square
fluctuation of the volume per particle V/N is equal to

$$\overline{\left(\Delta\frac{V}{N}\right)^2} = \frac{1}{N^2}\frac{kT}{|(\partial p/\partial V)_T|}.$$

Assuming the volume V to be fixed, so that $\Delta(V/N) = -V\,\Delta N/N^2$, we find
the fluctuation of the number of particles in this volume:

$$\overline{(\Delta N)^2} = \frac{N^2}{V^2}\frac{kT}{|(\partial p/\partial V)_T|}. \tag{57.13}$$


In particular, for an ideal gas 





$$\overline{(\Delta N)^2} = N. \tag{57.14}$$


The fluctuation of the number of particles in a given
volume is independent of the temperature of the ideal gas because in
an ideal gas the motion of each particle is independent of the motion of other 
particles. With increasing temperature in an ideal gas only the mean square 
velocity increases, but the character of the motion does not change. 
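
The result (57.14) can be checked with a small simulation, sketched here under the assumption that the particles are placed independently and uniformly in a large volume and counted in a small part of it. The parameter values and function name are illustrative. For independent placement the exact variance is binomial, Np(1-p), which approaches the mean Np when the subvolume is a small fraction p of the whole.

```python
import random
import statistics


def count_in_subvolume(rng, n_total, fraction):
    """Number of independently, uniformly placed particles that land in the subvolume."""
    return sum(1 for _ in range(n_total) if rng.random() < fraction)


rng = random.Random(0)                        # fixed seed for reproducibility
n_total, fraction, trials = 2000, 0.05, 1000  # illustrative values (assumptions)

counts = [count_in_subvolume(rng, n_total, fraction) for _ in range(trials)]
mean_n = statistics.mean(counts)
var_n = statistics.pvariance(counts)
print(mean_n, var_n, var_n / mean_n)          # the ratio should be close to 1 - fraction
```

No temperature enters the simulation at all, in line with the remark that only the independence of the particles matters.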

Formulae (57.8) and (57.9) lose their meaning in the case where the
isothermal compressibility becomes infinite (and the derivative $(\partial p/\partial V)_T$
reduces to zero). A formal application of formula (57.9) leads to an absurd
result: an infinitely large fluctuation of the volume. In reality, however, when
the derivative $(\partial p/\partial V)_T$ reduces to zero the expression (57.4) for the work
$\Delta W$, on which the derivation of formula (57.9) is based, changes. The
expansion of (57.3) must be continued, so that instead of (57.3) we must write


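$$\Delta W \approx p_0\,\Delta V + \left(\frac{\partial F}{\partial V}\right)_T \Delta V + \left(\frac{\partial^2 F}{\partial V^2}\right)_T \frac{(\Delta V)^2}{2} + \left(\frac{\partial^3 F}{\partial V^3}\right)_T \frac{(\Delta V)^3}{6}$$

$$\approx p_0\,\Delta V - p\,\Delta V - \left(\frac{\partial p}{\partial V}\right)_T \frac{(\Delta V)^2}{2} - \left(\frac{\partial^2 p}{\partial V^2}\right)_T \frac{(\Delta V)^3}{6}. \tag{57.15}$$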


We assume that the second derivative $(\partial^2 p/\partial V^2)_T$ is different from zero.
Then, dropping infinitesimal quantities of higher order in (57.15), one can
write

$$\Delta W = -\left(\frac{\partial^2 p}{\partial V^2}\right)_T \frac{(\Delta V)^3}{6}. \tag{57.16}$$







Substituting (57.16) into the normalization condition (57.6), we see that it
cannot be satisfied for any nonzero value of $(\partial^2 p/\partial V^2)_T$: the exponent is
then cubic in $\Delta V$, and the integral diverges. This means
that the assumption that the condition $(\partial^2 p/\partial V^2)_T \neq 0$ can be fulfilled for
$(\partial p/\partial V)_T = 0$ leads to a contradiction.

Hence it is seen that if $(\partial p/\partial V)_T = 0$, then at the same time the condition

$$\left(\frac{\partial^2 p}{\partial V^2}\right)_T = 0 \tag{57.17}$$

must be fulfilled.

These two conditions determine the position of the critical point (see § 64). 

At the critical point the probability of a density fluctuation turns out to 
be considerably higher than in the ordinary state of substances, because here 
the work done in an isothermal change in the volume is very small. 

It should, however, be stressed that for finding a quantitative expression 
for the probability distribution of fluctuations at the critical point the use of 
formula (55.5) with the substitution of the expansion (57.15) in it would be 
incorrect. 

In the critical state of a substance its compressibility is so large that small 
forces have a great effect. Because of this, fluctuations here are not only large 
but, what is most important, lose their local character. This means that it is 
senseless to make a statement about the volume fluctuation at a given point 
of the substance *. 

Now consider the fluctuations of the temperature of a subsystem at
constant volume. The work which should be done on the subsystem in order
to bring it from an equilibrium state with temperature $T_0$ into a
non-equilibrium state with temperature T is equal to

$$\Delta W = \Delta E - T_0\,\Delta S.$$

We expand the energy change $\Delta E$ in a series in powers of $\Delta S$ and confine
ourselves to the first terms of the expansion. Since the energy is a potential
with respect to the entropy and volume, we have


* The theory of fluctuations for a substance at the critical point cannot be expounded 
in this book. The reader is referred to the following book: L.D.Landau and E.M.Lifshitz, 
Course of theoretical physics, Vol. 5: Statistical physics (Pergamon Press, London, 
1958). 


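$$\Delta E \approx \left(\frac{\partial E}{\partial S}\right)_V \Delta S + \left(\frac{\partial^2 E}{\partial S^2}\right)_V \frac{(\Delta S)^2}{2} = T_0\,\Delta S + \left(\frac{\partial T}{\partial S}\right)_V \frac{(\Delta S)^2}{2}.$$

But, to the same accuracy,

$$\Delta S = \left(\frac{\partial S}{\partial T}\right)_V \Delta T.$$

Hence, finally,

$$\Delta W = T_0\,\Delta S + \left(\frac{\partial T}{\partial S}\right)_V \frac{(\Delta S)^2}{2} - T_0\,\Delta S = \left(\frac{\partial T}{\partial S}\right)_V \left(\frac{\partial S}{\partial T}\right)_V^2 \frac{(\Delta T)^2}{2} = \left(\frac{\partial S}{\partial T}\right)_V \frac{(\Delta T)^2}{2} = \frac{C_V}{2T_0}(\Delta T)^2.$$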
as), \aT/, 2 Oy aeons 





The probability that the temperature of the subsystem will undergo a
fluctuation and that its temperature will lie between T and T + dT is equal to

$$dw = \mathrm{const}\cdot\exp\left[-\frac{C_V (T-T_0)^2}{2kT_0^2}\right] dT. \tag{57.18}$$

Normalizing the distribution (57.18), we find

$$dw = \left(\frac{C_V}{2\pi kT_0^2}\right)^{1/2} \exp\left[-\frac{C_V (T-T_0)^2}{2kT_0^2}\right] dT. \tag{57.19}$$

From the distribution (57.19) it follows that the heat capacity of a 
homogeneous substance at constant volume must be an essentially positive 
quantity. Otherwise the substance would be in an unstable state. Thus, in 
addition to (57.7) we obtain the second condition of stability of the states of 
a homogeneous substance: 


$$C_V > 0. \tag{57.20}$$


If the heat capacity of the body were negative, then the body could be heated
by taking heat away from it. In other words, a perpetual motion machine of the
second kind could be constructed.
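
The Gaussian (57.19) has variance $kT_0^2/C_V$, so the relative root-mean-square temperature fluctuation is $(k/C_V)^{1/2}$. The sketch below evaluates it under the additional assumption, not made in the text at this point, that the subsystem is a monatomic ideal gas with $C_V = \tfrac{3}{2}Nk$; the function name and the value of N are illustrative.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant, J/K


def rms_temperature_fluctuation(T0, c_v):
    """Square root of the variance k*T0^2/C_V read off from (57.19); needs C_V > 0 (57.20)."""
    if c_v <= 0:
        raise ValueError("stability condition (57.20) is violated")
    return T0 * math.sqrt(K_B / c_v)


N = 1.0e20                      # illustrative number of atoms (assumption)
c_v = 1.5 * N * K_B             # monatomic ideal gas (assumption)
T0 = 300.0
relative = rms_temperature_fluctuation(T0, c_v) / T0
print(relative)                 # equals sqrt(2 / (3 N)) for this choice of C_V
```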

It can be shown that the fluctuations of the volume and temperature are 
independent. Without dwelling on a strict proof of this statement, we note 




only that it also follows from general physical reasoning. The state of a homo- 
geneous body is completely determined by three thermodynamic parameters 
which are interconnected by one relation: the equation of state. Hence 
changes in two thermodynamic parameters in a homogeneous body can 
always take place independently of each other. In a homogeneous substance 
the conditions (57.7) and (57.20) are sufficient for the stability of the states 
of the system. It should be noted that the necessary conditions for stability 
are the constancy of the temperature and pressure in a homogeneous system. 

In conclusion we note that the conditions of stability which we have 
obtained are not necessarily fulfilled in a non-homogeneous system, for 
example in a system placed in a field of force, or in a system consisting of 
several phases. In this case the state of the system depends, for example, on the 
strength of the external field as well as the parameters p, T, S and V, and 
other quantities. Hence the expressions for the work done in a fluctuation 
and the conditions of stability will be changed. 


§58. Effect of fluctuations on the sensitivity of measuring devices 


Fluctuations play an important role in the operation of modern sensitive 
instruments: balances, galvanometers, etc. The sensitivity of such devices is so 
high that they allow one to observe phenomena of the same scale as fluctua- 
tions caused by the thermal motion of molecules in the device itself. This 
leads to an important consequence: in a single direct measurement of a 
physical quantity whose value is smaller than the fluctuations of the device 
itself, the latter records its proper thermal motion (background) and not the 
quantity to be measured. In this sense it is said that the thermal motion 
imposes a limit of sensitivity upon a given form of the device (for a single 
measurement). 

A further increase of the sensitivity and the measurements of quantities 
which lie below the thermal motion background are associated with carrying 
out repeated measurements (or with a change in the construction of the 
device). 

Indeed, if a device records only its proper thermal motion, then its mean 
deflection will be equal to zero. But if an external action is superposed on the 
background, then the device will fluctuate about a new position, and its mean 
deflection will be different from zero. The larger the number of measurements 
carried out, i.e. the longer the observation time, the lower the values of a 
physical quantity (lying below the background) which can be measured. 

We shall illustrate this by the analysis of several simple examples. 





§58 SENSITIVITY OF MEASURING DEVICES 291 


A suspended small mirror. One of the simplest and most sensitive devices is 
a light small mirror suspended on a fine filament, usually of quartz. The 
sensitivity of the device is determined by the possibility of recording very 
small angles of rotation of the mirror on the filament. The limit of sensitivity, 
i.e. the smallest angles of rotation which can be recorded in single measure- 
ments are determined by the fact that they must be larger than the oscilla- 
tions of the mirror caused by the thermal motion of the molecules of the 
mirror and filament. This thermal motion leads to random rotations of the 
suspended mirror through angles whose order of magnitude is determined by 
the value of the mean square angle of rotation. We shall calculate this 
quantity. 

In order that the mirror “randomly”, i.e. under the action of the
molecular thermal motion, be deflected from the equilibrium position $\varphi = 0$
through a certain angle $\varphi$, it is necessary that work be done against the elastic
forces of the filament. This work is produced on account of the energy of the
thermal motion. The angle $\varphi$ plays the role of the parameter determining the
deflection of the system from the equilibrium position. The probability of the
deflection of the system from the equilibrium position $\varphi = 0$ through an angle
$\varphi$ is determined by formula (55.6) in which the potential energy of the
torsion of the filament will stand for the potential energy. For small angles of
rotation

$$u(\varphi) = \tfrac{1}{2}a\varphi^2,$$

where $a = \pi r^4 G/2l$ (here r is the radius of the filament, l is its length, and G
is the shear modulus of the material of the filament).
Thus,

$$dw = \mathrm{const}\cdot e^{-a\varphi^2/2kT}\,d\varphi = \frac{e^{-a\varphi^2/2kT}\,d\varphi}{\int_{-\infty}^{\infty} e^{-a\varphi^2/2kT}\,d\varphi}. \tag{58.1}$$


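Here the constant is determined from the normalization condition.
The mean square angle of the deflection is equal to

$$\overline{\varphi^2} = \frac{\int_{-\infty}^{\infty} \varphi^2 e^{-a\varphi^2/2kT}\,d\varphi}{\int_{-\infty}^{\infty} e^{-a\varphi^2/2kT}\,d\varphi} = \frac{kT}{a}.$$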


This result has a simple meaning: the mean potential energy of our system
with one degree of freedom is equal to

$$\bar{u} = \tfrac{1}{2}a\,\overline{\varphi^2} = \tfrac{1}{2}kT \tag{58.2}$$

in accordance with the law of equipartition. For T = 300 K and $a = 10^{-6}$ erg
(such a value of a is possessed by very fine quartz filaments) we have
$(\overline{\varphi^2})^{1/2} = 2\times 10^{-4}$. This quantity determines the angle through which the
small mirror rotates, on the average, “by itself”. If a quantity measured from a
deflection of the mirror causes a rotation through a smaller angle, then in a single
measurement it is the proper deflection due to thermal motion which is
recorded.
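
The numerical estimate quoted above can be reproduced directly from $\overline{\varphi^2} = kT/a$. The short sketch below does this in CGS units with the values of T and a given in the text; the function name is an illustrative choice.

```python
import math

K_B_ERG = 1.380649e-16  # Boltzmann constant, erg/K


def rms_angle(T, a):
    """Root-mean-square thermal deflection angle (radians) from mean(phi^2) = kT/a."""
    return math.sqrt(K_B_ERG * T / a)


T = 300.0     # K, as in the text
a = 1.0e-6    # erg, torsion constant of a very fine quartz filament, as in the text

phi_rms = rms_angle(T, a)
mean_potential_energy = 0.5 * a * phi_rms**2   # should equal kT/2 by (58.2)
print(phi_rms, mean_potential_energy)
```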

It is clear, however, that in the absence of a systematic deflection force the
mean deflection of the small mirror will be equal to zero, while in the
presence of such a force the mirror will undergo oscillations about a displaced
position of equilibrium. Carrying out repeated measurements of the
oscillations of the small mirror one can find the mean position about which the
oscillations take place. By this means one can determine a quantity whose
values lie below the thermal background or sensitivity for a single
measurement.
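
How many repeated measurements are needed can be estimated under the assumption, which the text does not state explicitly, that successive readings are statistically independent; the error of the mean then falls as the inverse square root of their number. The sketch below is illustrative only: the helper names, the hypothetical systematic deflection and the Gaussian model of the background are assumptions.

```python
import math
import random


def measurements_needed(rms_background, signal, snr=1.0):
    """Smallest n for which rms_background / sqrt(n) <= signal / snr (independent readings)."""
    return math.ceil((snr * rms_background / signal) ** 2)


def mean_of_readings(rng, signal, rms_background, n):
    """Average of n simulated readings: constant deflection plus Gaussian background."""
    return sum(signal + rng.gauss(0.0, rms_background) for _ in range(n)) / n


rng = random.Random(1)       # fixed seed for reproducibility
rms_background = 2.0e-4      # thermal rms angle of the mirror, from the text
signal = 2.0e-5              # hypothetical deflection, ten times below the background
n = 10_000

estimate = mean_of_readings(rng, signal, rms_background, n)
print(measurements_needed(rms_background, signal, snr=3.0), estimate)
```

With ten thousand readings the standard error is a hundred times smaller than the single-measurement background, so the averaged estimate resolves a deflection that no single reading could.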

Spring-balance. Completely analogous results can be obtained for the
spring-balance. The pressure fluctuations of the air surrounding it and the thermal
motion of the mechanism of the balance will lead to random changes in the
load of the balance. This change in the load will be compensated for by a
quasi-elastic force $\kappa\Delta x$. The change in the potential energy of the system for
a displacement by $\Delta x$ is equal to

$$u = \tfrac{1}{2}\kappa(\Delta x)^2.$$

The mean potential energy, according to the law of equipartition, is equal to
$\tfrac{1}{2}kT$. Hence the root-mean-square change in the length of the spring is equal to

$$\left[\overline{(\Delta x)^2}\right]^{1/2} = (kT/\kappa)^{1/2}. \tag{58.3}$$

The measurement of a mass m by the balance is possible if the extension of
the spring caused by it is larger than the length fluctuation $[\overline{(\Delta x)^2}]^{1/2}$ of the
spring. The extension of the spring by a load m is equal to $\Delta x = mg/\kappa$. Hence
the limiting small mass which can be found in a single measurement is equal to

$$m = \kappa g^{-1}\left[\overline{(\Delta x)^2}\right]^{1/2} = (\kappa kT)^{1/2} g^{-1}.$$
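
To get a feeling for the size of this limiting mass, the following sketch evaluates $(\kappa kT)^{1/2} g^{-1}$ in CGS units. The spring constant used is a hypothetical value chosen only for illustration, and the function name is likewise an assumption.

```python
import math

K_B_ERG = 1.380649e-16   # Boltzmann constant, erg/K
G_CGS = 980.665          # standard acceleration of gravity, cm/s^2


def minimal_mass(kappa, T):
    """Limiting mass (grams) for a single measurement: sqrt(kappa * k * T) / g."""
    return math.sqrt(kappa * K_B_ERG * T) / G_CGS


kappa = 1.0    # dyn/cm, hypothetical spring constant (assumption)
T = 300.0      # K

m_min = minimal_mass(kappa, T)
extension = m_min * G_CGS / kappa               # extension produced by m_min
rms_length = math.sqrt(K_B_ERG * T / kappa)     # thermal fluctuation (58.3)
print(m_min, extension, rms_length)
```

By construction the extension produced by the limiting mass coincides with the thermal length fluctuation, and a softer spring (smaller $\kappa$) lowers the limit.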


Gas thermometer. Suppose that we are measuring a temperature using a gas
thermometer filled with an ideal gas. The temperature measured by the
thermometer will not remain constant but will continuously undergo fluctuations
just as other thermodynamic quantities.

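In an ideal gas the fluctuation of the temperature can easily be expressed
in terms of the fluctuation of the volume. From the Clapeyron equation (at
constant pressure) it follows that

$$\Delta T = T V^{-1}\,\Delta V,$$

where $\Delta T$ and $\Delta V$ denote small changes in the temperature and volume. If
small changes in the volume are understood to be changes due to fluctuations,
then one can write

$$\Delta V = \left[\overline{(\Delta V)^2}\right]^{1/2} = \left(\frac{kT}{(-\partial p/\partial V)_T}\right)^{1/2} = \frac{V}{N^{1/2}},$$

so that

$$\Delta T = \left[\overline{(\Delta T)^2}\right]^{1/2} = T V^{-1}\,\Delta V = T N^{-1/2}.$$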


Changes in the temperature which are smaller than $\Delta T$ cannot be measured
by means of a gas thermometer. If the thermometer contains altogether $10^{-4}$
mole of a gas (i.e. if its volume is 0.02 litre), then $N = 6\times 10^{23}\times 10^{-4} =
6\times 10^{19}$, so that the minimal measurable change in the temperature is

$$\Delta T \approx 10^{-10}\,T.$$


This is so small that all really measurable changes in the temperature are 
extremely large in comparison with the limit of sensitivity. 
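
The order of magnitude quoted for the gas thermometer follows from $\Delta T/T = N^{-1/2}$. A minimal check, with an illustrative function name and the amount of gas taken from the text:

```python
import math

AVOGADRO = 6.02214076e23   # particles per mole


def relative_temperature_limit(moles):
    """Relative sensitivity limit Delta T / T = N**-0.5 of an ideal-gas thermometer."""
    n_particles = moles * AVOGADRO
    return 1.0 / math.sqrt(n_particles)


limit = relative_temperature_limit(1.0e-4)   # amount of gas quoted in the text
print(limit, limit * 300.0)                  # relative limit, and absolute limit at 300 K
```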

Thus, the sensitivity of a gas thermometer is not in practice limited by 
fluctuations in the temperature. The examples given show that the effect of 
fluctuations on the sensitivity of devices varies widely depending on the 
character of the device. 











Systems with a Variable Number 


of Particles 


§59. The Gibbs grand canonical distribution 


In considering in §13 the interaction of a subsystem with the bodies 
surrounding it (the reservoir) we have assumed that this interaction consists 
only of an energy exchange. In reality, however, the interaction of a sub- 
system with its surroundings amounts not only to an exchange of energy but 
very often also involves an exchange of particles. In the process of interaction 
the subsystem exchanges particles with the medium surrounding it. Particles 
which are going out and coming in carry energy with them, so that the 
energy exchange and particle exchange take place simultaneously. In this case 
not only the energy but also the number of particles in the system is 
variable. In order to characterize the state of such a system it is insufficient 
to indicate the total energy of the system; it is also necessary to indicate how 
many particles are contained in the system. Owing to the interaction with the 
surroundings, the subsystem singled out can be in different quantum states 
which differ by the number of particles contained in the system. Before 
passing over to the derivation of the statistical distribution for this case, we 
shall present certain examples of subsystems with a variable number of 
particles. 

We assume that our subsystem represents a macroscopic drop or crystal 




which is in equilibrium with the vapour or melt respectively. The latter play 
the role of the surroundings (the reservoir). Molecules from the surface of 
the liquid go to the vapour, while molecules from the vapour condense on the 
surface of the liquid. The same occurs with molecules on the surface of the 
crystal. If there is no systematic transfer of particles from the vapour to the 
liquid or vice versa, then in the system an equilibrium state will be established 
in which the number of particles going in the two directions is equal. 

As another example of a system with a variable number of particles we 
consider one in which an equilibrium chemical reaction occurs. In the course 
of the chemical reaction the number of particles in the subsystem which has 
been selected changes (for example, the molecules of a compound AB): it 
decreases owing to the decomposition AB > A+B and increases owing to the 
synthesis A+B > AB. 

In an equilibrium state a continuous exchange of energy and particles 
takes place between the subsystem and the reservoir. Equilibrium conditions 
for the energy exchange were found to be the equality of the temperatures 
and pressures. An additional equilibrium condition for the exchange of 
particles will now be found. 

We consider the derivation of the statistical distribution of a system with
a variable number of particles, i.e. the distribution of the probabilities $w_{in}$
that the subsystem will be found in the ith state and contain n particles.
Finding the statistical distribution in this case differs from the case considered
in §16 only by the fact that the number of states of the subsystem with
given energy $\Omega(\varepsilon_i)$ must be replaced by the number of states with given
energy and given number of particles $\Omega(\varepsilon_i, n)$. Correspondingly, the number
of states of the reservoir will be $\Omega_0(E_0, N_0)$. The sum of the number of
particles in the subsystem and reservoir remains constant:

$$N = n + N_0 = \mathrm{const}.$$

Then instead of formula (16.5) we obtain

$$w_{in} \propto \Omega_0(E-\varepsilon_i, N-n)\,\Omega(\varepsilon_i, n),$$

and, instead of (16.7),

$$\Omega_0(E-\varepsilon_i, N-n) = \exp[\sigma(E-\varepsilon_i, N-n)]. \tag{59.1}$$


Since the dimensions of the subsystem are small, its energy and the number
of particles contained in it are small in comparison with the energy and the
number of particles in the entire closed system: $\varepsilon_i \ll E$ and $n \ll N$. Hence, as
in §16, we can expand the function $\sigma(E-\varepsilon_i, N-n)$ in a series in powers of
$\varepsilon_i$ and n and confine ourselves to the first terms of the expansion. This gives

$$w_{in} \propto \exp\left[\sigma(E,N) - \left(\frac{\partial\sigma}{\partial E}\right)_{\varepsilon_i=0}\varepsilon_i - \left(\frac{\partial\sigma}{\partial N}\right)_{n=0} n\right]\Omega(\varepsilon_i,n), \tag{59.2}$$

or

$$w_{in} = \mathrm{const}\cdot\exp\left[\frac{\mu n-\varepsilon_i}{\theta}\right]\Omega(\varepsilon_i,n), \tag{59.3}$$





where the symbol “const” denotes the constant quantity $e^{\sigma(E,N)}$ which does
not depend on $\varepsilon_i$ and n; $\theta$ denotes as before the statistical temperature,
$\theta = (\partial E/\partial\sigma)_{\varepsilon_i=0}$, and

$$\mu = -\theta\left(\frac{\partial\sigma}{\partial N}\right)_{n=0}. \tag{59.4}$$

The derivative in formula (59.4) is taken at a constant value of the energy and
the external parameters. The molecular meaning of the quantity $\mu$ will be
obtained in the next section.

It should be stressed that, in contrast to $\theta$, $\mu$ can have any sign. Indeed,
in formula (59.5) the summation is carried out over a finite number of
particles, in contrast to the summation over an infinite number of levels
in formula (16.12).

The value of the constant can be found from the normalization condition:

$$\sum_i\sum_n w_{in} = 1,$$

where the summation is carried out over all energy levels and all possible
numbers of particles in the system. Obviously, we have

$$\mathrm{const}\cdot\sum_i\sum_n \exp\left[\frac{\mu n-\varepsilon_i}{\theta}\right]\Omega(\varepsilon_i,n) = 1,$$

hence

$$\mathrm{const} = \left\{\sum_i\sum_n \exp\left[\frac{\mu n-\varepsilon_i}{\theta}\right]\Omega(\varepsilon_i,n)\right\}^{-1}. \tag{59.5}$$


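The probability distribution of the states of a system with a variable number
of particles can finally be written in the form

$$w_{in} = \frac{\exp\left[\dfrac{\mu n-\varepsilon_i}{\theta}\right]\Omega(\varepsilon_i,n)}{\sum_i\sum_n \exp\left[\dfrac{\mu n-\varepsilon_i}{\theta}\right]\Omega(\varepsilon_i,n)}. \tag{59.6}$$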


Formula (59.6) differs from formula (16.13) only by the fact that instead
of one variable characterizing the state of the system, the energy, it contains
two variables: the energy $\varepsilon_i$ and the number of particles n in the system. We
shall call the probability distribution (59.6) the grand canonical distribution.

We introduce the notation

$$Z = \sum_i\sum_n \exp\left[\frac{\mu n-\varepsilon_i}{\theta}\right]\Omega(\varepsilon_i,n).$$


For a constant number of particles in the system  =7 the quantity Z is the 
same as the ordinary partition function. 


The probability distribution (59.6) can be written by means of Z in the
standard form:

$$w_{in} = Z^{-1}\exp\left[\frac{\mu n-\varepsilon_i}{\theta}\right]\Omega(\varepsilon_i,n).$$


The number of states of the system $\Omega(\varepsilon_i,n)$ can (in the quasi-classical
approximation) be expressed in terms of the volume $\Delta\Gamma$ of phase space
according to formula (1.26):

$$\Omega(\varepsilon_i,n) = \frac{\Delta\Gamma_n}{h^{3n}}, \tag{59.7}$$

where $\Delta\Gamma_n$ is the volume of the phase space of a system containing n particles.
It is obvious that with a change in the number of particles the number of
degrees of freedom 3n and the value of the phase volume also change:

$$\Delta\Gamma_n = \Delta q_1\,\Delta q_2\cdots\Delta q_{3n}\,\Delta p_1\,\Delta p_2\cdots\Delta p_{3n}. \tag{59.8}$$

Then for the probability that the system be in the energy state corresponding
to the phase volume element $d\Gamma_n$ and contain n particles we obtain

$$dw_{in} = \frac{\exp[(\mu n-\varepsilon_i)/\theta]\,d\Gamma_n}{Z\,h^{3n}}. \tag{59.9}$$

Knowing the probability distribution (59.6) or (59.9), one can find the 
mean values of all quantities characterizing a state of a system with a 
variable number of particles. 

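According to the general formula for obtaining mean values, we find the
mean value of any quantity L which depends on the state of the subsystem
and on the number of particles:

$$\bar{L} = \frac{\sum_i\sum_n L_{in}\exp\left[\dfrac{\mu n-\varepsilon_i}{\theta}\right]\Omega(\varepsilon_i,n)}{\sum_i\sum_n \exp\left[\dfrac{\mu n-\varepsilon_i}{\theta}\right]\Omega(\varepsilon_i,n)}. \tag{59.10}$$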








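In particular, the mean value of the number of particles for an arbitrary
value of the energy of the system is equal to

$$\bar{n} = \frac{\sum_i\sum_n n\exp\left[\dfrac{\mu n-\varepsilon_i}{\theta}\right]\Omega(\varepsilon_i,n)}{\sum_i\sum_n \exp\left[\dfrac{\mu n-\varepsilon_i}{\theta}\right]\Omega(\varepsilon_i,n)} = \theta\,\frac{\partial}{\partial\mu}\ln\sum_i\sum_n \exp\left[\frac{\mu n-\varepsilon_i}{\theta}\right]\Omega(\varepsilon_i,n). \tag{59.11}$$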


For a system with a variable number of particles it is natural to call Z the 
grand partition function or the grand sum (or integral) over states. 
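
The two expressions for the mean number of particles in (59.11) can be compared on a toy model. The sketch below is an illustration under assumptions that are not in the text: the subsystem consists of M sites, each either empty or occupied by one particle of energy eps0, so that the number of states with n particles is the binomial coefficient and their energy is n*eps0. The function names and parameter values are hypothetical.

```python
import math


def grand_sum(mu, theta, states):
    """Z as the sum over (energy, n, multiplicity) of multiplicity * exp((mu*n - energy)/theta)."""
    return sum(omega * math.exp((mu * n - eps) / theta) for eps, n, omega in states)


def mean_particle_number(mu, theta, states):
    """Direct evaluation of the ratio of sums in (59.11)."""
    z = grand_sum(mu, theta, states)
    return sum(n * omega * math.exp((mu * n - eps) / theta) for eps, n, omega in states) / z


M, eps0, theta, mu = 10, 1.0, 0.5, 0.8   # toy-model parameters (assumptions)
states = [(n * eps0, n, math.comb(M, n)) for n in range(M + 1)]

direct = mean_particle_number(mu, theta, states)

# Right-hand side of (59.11): theta * d(ln Z)/d(mu), by a central difference.
h = 1.0e-6
derivative = theta * (math.log(grand_sum(mu + h, theta, states))
                      - math.log(grand_sum(mu - h, theta, states))) / (2.0 * h)

closed_form = M / (math.exp((eps0 - mu) / theta) + 1.0)   # known result for this toy model
print(direct, derivative, closed_form)
```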

It is convenient to introduce a quantity z which is called the activity and
is by definition equal to $z = e^{\mu/\theta}$. By means of the activity, Z can be written
in the form

$$Z = \sum_{n=1}^{N} z^n Z_n, \tag{59.12}$$

where $Z_n$ is the statistical sum for n particles.
In the classical approximation we can write

$$Z = \sum_n e^{\mu n/\theta} Z_n = \sum_n e^{\mu n/\theta} e^{-F_n/\theta} \approx e^{(\mu N-F)/\theta}.$$


In the next section it will be shown (see formula (60.4)) that $\mu N = \Phi$,
where $\Phi$ is the Gibbs thermodynamic potential. Hence for Z we find

$$Z = e^{(\Phi-F)/\theta} = e^{pV/\theta}, \tag{59.13}$$

$$p = (\theta/V)\ln Z. \tag{59.14}$$

Analogously, (59.11) can be written in the form

$$\bar{n} = N = \theta\,\frac{\partial\ln Z}{\partial\mu} = \frac{\partial\ln Z}{\partial\ln z}. \tag{59.15}$$
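
Formulae (59.13) to (59.15) can be illustrated for a classical ideal gas, assuming (this is not derived in the text at this point) that the n-particle statistical sum has the Boltzmann form $Z_n = Z_1^n/n!$ and that the empty state n = 0, which contributes 1, is included in the sum. The grand sum is then $\exp(zZ_1)$, so that $\ln Z = pV/\theta = \bar{n}$, which is the ideal-gas equation of state. The function name and numbers are illustrative.

```python
import math


def grand_sum_ideal(z, z1, n_max=200):
    """Sum over n of z**n * z1**n / n!, built term by term to avoid large factorials."""
    term, total = 1.0, 1.0           # n = 0 term (assumption: empty state included)
    for n in range(1, n_max + 1):
        term *= z * z1 / n
        total += term
    return total


z, z1 = 0.02, 500.0                  # hypothetical activity and one-particle sum
ln_Z = math.log(grand_sum_ideal(z, z1))          # plays the role of p V / theta in (59.14)

# Mean particle number from (59.15): d(ln Z)/d(ln z), by a central difference.
h = 1.0e-5
n_bar = (math.log(grand_sum_ideal(z * math.exp(h), z1))
         - math.log(grand_sum_ideal(z * math.exp(-h), z1))) / (2.0 * h)
print(ln_Z, n_bar)
```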


We now have to consider the establishment of the physical meaning of the
parameter $\mu$.

In §17 the physical meaning of the quantity $\theta$ introduced formally has
been elucidated, and it has been shown that it represents the statistical
temperature. The condition of statistical equilibrium between
quasi-independent subsystems which can interact weakly with each other and exchange
energy was the equality of their temperatures. It is remarkable that the
formally introduced quantity $\mu$ also turns out to have an important physical
meaning which can be revealed by means of reasoning which is completely
analogous to that of §17.

Consider a certain system which is in a state of statistical equilibrium. We 
single out from it two subsystems which are also in a state of statistical 
equilibrium and are weakly interacting with each other. This interaction 
consists of a mutual exchange of energies and particles between the two 
subsystems. For each of them one can write the probability distribution of 
states in the form 


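$$w_1 = A_1\exp\left[\frac{\mu_1 n_1-\varepsilon_{i1}}{\theta_1}\right]\Omega_1 \quad\text{and}\quad w_2 = A_2\exp\left[\frac{\mu_2 n_2-\varepsilon_{k2}}{\theta_2}\right]\Omega_2,$$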




where the index 1 marks the quantities referring to the first subsystem, and 
the index 2 marks the quantities referring to the second subsystem. 

Since the subsystems are quasi-independent, one can apply the theorem
of multiplication of probabilities to them, and for the probability of finding
simultaneously the first subsystem in the ith state and the second subsystem
in the kth state one can write

$$w_{12} = w_1 w_2 = A_1\exp\left[\frac{\mu_1 n_1-\varepsilon_{i1}}{\theta_1}\right] A_2\exp\left[\frac{\mu_2 n_2-\varepsilon_{k2}}{\theta_2}\right]\Omega_1\Omega_2. \tag{59.16}$$


On the other hand, the two subsystems together can be considered as one
subsystem with an energy equal to the sum $\varepsilon_{i1}+\varepsilon_{k2}$ and with a number of
particles equal to $n_1+n_2$. Since this subsystem is in an equilibrium state, one
can also write the grand statistical distribution for it in the form

$$w_{12} = A\exp\left[\frac{\mu(n_1+n_2)-(\varepsilon_{i1}+\varepsilon_{k2})}{\theta}\right]\Omega_1\Omega_2. \tag{59.17}$$


If the subsystems are in equilibrium with each other, then their states 
should not change when an interaction is established between them. This 
means that the probability distribution of states in a system which is formed 
of two subsystems must remain unchanged. For this it is necessary that the 
expressions (59.16) and (59.17) should be identical. The latter condition 
requires, however, that the following equalities should be fulfilled: 


$$\theta = \theta_1 = \theta_2, \tag{59.18}$$
$$\mu = \mu_1 = \mu_2. \tag{59.19}$$


The first of these is the well-known condition of equality of temperatures
in all quasi-independent subsystems constituting an equilibrium system. This
condition has been obtained in §17 for subsystems whose interaction reduced
to an energy exchange. The second equality is essentially new. It shows that
the quantity $\mu$ which refers, like $\theta$, to the reservoir (see §17), in a state of
statistical equilibrium must have the same value in all parts of the system.

In addition to the conditions of the constancy of the temperature and
pressure, the constancy of $\mu$ is a necessary condition for statistical equilibrium
in a system. The appearance of the additional condition of equilibrium is
associated with the fact that we are now considering subsystems which can
mutually exchange not only energy but also particles.
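
The condition (59.19) can be illustrated by letting two vessels of ideal gas at the same temperature exchange particles until their values of $\mu$ coincide. The sketch below assumes the ideal-gas form $\mu/\theta = \ln(n\lambda^3/V)$, which corresponds to formula (60.9) of the next section, with $\lambda^3$ set to an arbitrary unit; the function names, the bisection procedure and the numbers are illustrative choices.

```python
import math


def mu_over_theta(n, volume, lambda_cubed=1.0):
    """mu/theta = ln(n * lambda^3 / V) for an ideal gas (lambda^3 in arbitrary units)."""
    return math.log(n * lambda_cubed / volume)


def equilibrium_split(n_total, v1, v2, iterations=200):
    """Bisection for the n1 at which mu_1(n1) = mu_2(n_total - n1)."""
    lo, hi = 0.0, n_total
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if mu_over_theta(mid, v1) > mu_over_theta(n_total - mid, v2):
            hi = mid      # too many particles in vessel 1
        else:
            lo = mid      # too few particles in vessel 1
    return 0.5 * (lo + hi)


n_total, v1, v2 = 1.0e6, 1.0, 3.0    # hypothetical particle number and volumes
n1 = equilibrium_split(n_total, v1, v2)
n2 = n_total - n1
print(n1 / v1, n2 / v2)              # equal chemical potentials give equal densities
```

For an ideal gas at a common temperature, equality of the chemical potentials therefore reduces to equality of the particle densities, and hence of the pressures.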




We shall call the quantity $\mu$ the chemical potential of the reservoir. In the
case where the subsystem which we have singled out is itself a macroscopic
system, the conditions of equilibrium allow one to refer $\mu$ to the system
itself and not to the reservoir. Indeed, in an equilibrium state the chemical
potentials of the reservoir and the macroscopic subsystem must be equal.
However, it makes no sense to speak of the chemical potential of a
microscopic subsystem, for example, a molecule. It should be noted that the same
also holds for the statistical temperature $\theta$. It also represents the temperature
of the reservoir, but for a macroscopic subsystem it can be identified with the
temperature of the latter. However, one cannot speak of the temperature of
an individual molecule.

From the point of view of molecular concepts the condition (59.18)
expresses the requirement that the amounts of energy which are given away and
obtained by the subsystem should be equal to each other. The condition (59.19)
states that in the exchange of particles not only the numbers of particles
coming in and going out of the subsystem must be equal to each other, but
also the mean energies carried by the particles must be the same. If that were
not so (for example, if only fast particles went out and only slow particles
came in) then the equilibrium state would be violated.


§60. The basic thermodynamic equality and the calculation of chemical 
potentials 


In order to elucidate the thermodynamic properties of a system with a 
variable number of particles it is first of all necessary to find the basic 
thermodynamic equality for such systems. The latter can be obtained most 
simply in the following way. 

Since the quantities involved in the basic thermodynamic equality (24.5)
(the energy, entropy and volume) possess additive properties, this equality
can be written not only for the quantities E, $\sigma$ and V but also for the specific
values of these quantities related to unit mass or to one particle.

Let a system contain N particles. Then the energy, entropy and volume
per particle can be written in the form E/N, $\sigma$/N and V/N. Writing the basic
thermodynamic equality for the specific values related to one particle, we
have

$$d\left(\frac{E}{N}\right) = \theta\,d\left(\frac{\sigma}{N}\right) - p\,d\left(\frac{V}{N}\right).$$

It is obvious that this equality will hold irrespective of the cause of the change
in the specific value of the energy and the other quantities, i.e. irrespective of
whether this change is due to a change in the quantities themselves or to a
change in the number of particles in the system. Hence in the equality one can
consider N to be a variable quantity and write

$$\frac{dE}{N} - \frac{E}{N^2}\,dN = \theta\left(\frac{d\sigma}{N} - \frac{\sigma}{N^2}\,dN\right) - p\left(\frac{dV}{N} - \frac{V}{N^2}\,dN\right),$$

hence

$$dE = \theta\,d\sigma - p\,dV + \frac{E-\theta\sigma+pV}{N}\,dN.$$

Setting

$$\mu = \frac{E-\theta\sigma+pV}{N}, \tag{60.1}$$

we obtain

$$dE = \theta\,d\sigma - p\,dV + \mu\,dN. \tag{60.2}$$


Whence it follows that the following equalities hold:

$$\mu = \left(\frac{\partial E}{\partial N}\right)_{\sigma,V} = -\theta\left(\frac{\partial\sigma}{\partial N}\right)_{E,V}. \tag{60.3}$$

The comparison of formulae (60.3) and (59.4) convinces us of the identity
of the quantity $\mu$ determined by formula (60.1) and the chemical potential.
The first of the equalities (60.3) shows that the chemical potential is equal
to the derivative of the energy with respect to the number of particles. From
the point of view of the practical calculation of chemical potentials the
equality (60.1) is of special importance. It shows that the chemical potential
represents the Gibbs thermodynamic potential related to one particle:

$$\mu = \frac{E-\theta\sigma+pV}{N} = \frac{\Phi}{N}. \tag{60.4}$$





The latter justifies the term chemical potential. The chemical potential $\mu$ is
most conveniently expressed as a function of the pressure and temperature
according to formula (60.4).

Formula (60.2) represents the basic thermodynamic equality written for 
a system with a variable number of particles. 

We note in addition that, passing over in (60.2) from the energy to the
free energy in the usual way, i.e. by subtracting the differential d(TS) from
both sides of (60.2), we can write

$$dF = -S\,dT - p\,dV + \mu\,dN, \tag{60.2'}$$

whence

$$\mu = \left(\frac{\partial F}{\partial N}\right)_{T,V}. \tag{60.3'}$$


The relations obtained can easily be generalized to the case of systems 
containing particles of different kinds. In what follows we shall consider the 
statistical properties of systems with a variable number of particles, and we 
shall need concrete expressions for chemical potentials. They can be obtained 
for gases and crystals. 

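By means of (37.11) we find for the chemical potential of an ideal
monatomic gas

$$\mu = -\tfrac{5}{2}kT\ln kT + kT\ln p - kTj, \tag{60.5}$$

where the quantity j, which is often called the chemical constant, is equal to

$$j = \ln\left(\frac{2\pi m}{h^2}\right)^{3/2}.$$

For a diatomic gas one can analogously obtain

$$\mu = -\tfrac{7}{2}kT\ln kT + kT\ln p - kTj + kT\ln\left(1-e^{-h\nu/kT}\right) + \varepsilon_0, \tag{60.6}$$

where the chemical constant j and the zero point energy $\varepsilon_0$ are equal to

$$j = \ln\left[\left(\frac{2\pi m}{h^2}\right)^{3/2}\frac{8\pi^2 I}{h^2}\right], \qquad \varepsilon_0 = \tfrac{1}{2}h\nu.$$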


In the case of crystals, by the definition of the chemical potential we have

$$\mu = \frac{F+pV}{N} = \frac{F}{N} + pv,$$

where F is given by formulae (53.7) and (53.8), and v denotes the volume
per particle in the crystal. At a low temperature $T \ll \theta_c$,

$$\mu = -\frac{\pi^4}{5}\,kT\left(\frac{T}{\theta_c}\right)^3 + pv. \tag{60.7}$$

Analogously at a high temperature $T \gg \theta_c$,

$$\mu = -3kT\ln\frac{T}{\theta_c} - kT + pv. \tag{60.8}$$


The last formulae involve the product pv which is contained in the 
equation of state of the crystal expressed in terms of quantities which are 
difficult to measure. However, in view of the smallness of the volume v per 
particle, one can in most cases drop the small term pv. 

In conclusion we shall make use of the value found for the chemical
potential in order to write the Maxwell distribution in the form of the Gibbs
distribution with a variable number of particles. For this we express $\mu$ not in
terms of the pressure but in terms of the volume of the system. We obtain

$$\mu = kT\ln\left[\frac{N}{V}\left(\frac{h^2}{2\pi mkT}\right)^{3/2}\right], \tag{60.9}$$

whence

$$\frac{N}{V}\left(\frac{1}{2\pi mkT}\right)^{3/2} = \frac{e^{\mu/kT}}{h^3}.$$

Substituting this into the Maxwell distribution (9.3), we obtain

$$dn = e^{(\mu-\varepsilon)/kT}\,\frac{d\Gamma_1}{h^3}. \tag{60.10}$$


We note that the chemical potential of an ideal gas is a very large negative 
quantity. 
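
The statement that the chemical potential of an ideal gas is large and negative can be made concrete with formula (60.9). The sketch below evaluates $\mu/kT$ for helium at room temperature and atmospheric pressure; the choice of gas and conditions is an illustrative assumption, the constants are standard SI values, and the function name is hypothetical.

```python
import math

K_B = 1.380649e-23        # Boltzmann constant, J/K
H = 6.62607015e-34        # Planck constant, J s
M_HELIUM = 6.6464731e-27  # mass of a helium-4 atom, kg


def mu_over_kT(number_density, mass, T):
    """Formula (60.9) divided by kT: ln[(N/V) * (h^2 / (2 pi m k T))**1.5]."""
    return math.log(number_density * (H**2 / (2.0 * math.pi * mass * K_B * T)) ** 1.5)


T = 300.0                       # K (assumption)
pressure = 101325.0             # Pa (assumption)
density = pressure / (K_B * T)  # N/V from the ideal-gas law

ratio = mu_over_kT(density, M_HELIUM, T)
print(ratio, ratio * K_B * T)   # mu in units of kT, and mu in joules
```

In these conditions $\mu$ comes out at roughly minus a dozen kT, so that the factor $e^{\mu/kT}$ in (60.10) is very small, as expected for a classical (non-degenerate) gas.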


§61. Conditions of phase equilibrium 


One of the most important cases of statistical equilibrium in a system with 
a variable number of particles is that of phase equilibrium. Imagine a homo- 
geneous quasi-closed macroscopic system, which is separated from other 
bodies by an interface and is in an equilibrium state. We shall call such a 
system a phase of the substance. 

The concept of phase is a generalization and a more precise definition of 
the concept of state of aggregation. As an example of a phase one can point to
a vapour which is in equilibrium with its condensate. In this case the exchange 
of energy and particles between the vapour and its surroundings, the reservoir, 
takes place through the interface vapour—liquid or vapour—solid. As other 
examples we mention: a crystal in equilibrium with its melt; a crystal modifi- 
cation in equilibrium with another; an electron gas in vacuum in equilibrium 
with an electron gas in a metal. In what follows we shall quote examples of 
other, more complex phase equilibria. 

In all cases of phase equilibrium the existence of an interface between 
different phases is characteristic. In a state of statistical equilibrium the 
number of particles passing over from one phase into another, and the energy 
carried by them are exactly equal to the corresponding quantities in the 
reverse direction. If there were no interface sharply separating the phases, it 
would be senseless to speak of a particular quasi-closed subsystem as a phase. 
We shall illustrate this by an example. Imagine that two phases, representing 
uniform isotropic states of a substance, are in equilibrium, and are connected 
by a certain interface. Then we shall call the less dense phase the vapour or 
gas phase, and the more dense one the liquid phase. If however, we have a 
uniform system, then, as will be explained in detail in §64, the notion of 
liquid or gas is inapplicable to it: by changing the physical conditions of 
the system it can be transformed continuously from a state with a high 
density to one with a low density. The state with the high density cannot 
be called the liquid, and the state with the low density cannot be called the 
gas. They both represent cases of a uniform state of the substance. Thus, the 
presence of an interface between phases is the necessary condition for one to 
speak of the existence of phases and phase equilibria. 

Let us write down the condition for statistical equilibrium between phases,
confining ourselves in the beginning to two phases of a substance. Each of
the phases can be considered as a quasi-closed subsystem, and the set can be
considered as a closed system in a state of statistical equilibrium. Hence the
conditions of equilibrium between two phases can be written in the form of
(59.14) and (59.15):

$$T_1 = T_2 = T, \tag{61.1}$$

$$\mu_1(p_1, T) = \mu_2(p_2, T). \tag{61.2}$$


In addition to these conditions it is necessary that the forces applied to the
interface by the two equilibrium phases be equal to each other. Otherwise the
interface between the phases would move and the equilibrium in the system
would be violated.

It is convenient to refer this condition to unit area of the interface,
replacing forces by pressures. Thus, in addition to the conditions of
equality of temperatures and chemical potentials one has the condition of
equal pressures in the two phases.

This simple reasoning can be strictly substantiated by considering the
condition for mechanical equilibrium, which is the requirement of minimum
free energy of the closed system at T = const and V = const. The condition
of mechanical equilibrium in a system consisting of two phases at T = const
can be written in the form

$$dF = dF_1 + dF_2 = -p_1\,dV_1 - p_2\,dV_2 = 0.$$

Since the volume of the entire system remains unchanged,

$$dV_1 = -dV_2$$

and

$$p_1 = p_2. \tag{61.3}$$


Thus, the pressures in the two phases must be equal to each other. Taking 
into account the conditions (61.1) and (61.3), formula (61.2) can be written 
as 


$$\mu_1(p, T) = \mu_2(p, T). \tag{61.4}$$


Since in a state of equilibrium T and p have equal values in the two phases,
on the basis of eq. (61.4) one of these quantities can be expressed in terms of
the other. As a result we obtain the equation

$$p = p(T) \tag{61.5}$$

for the dependence of the equilibrium pressure on the equilibrium temperature.
Eq. (61.5) represents a certain curve in the (p,T)-plane, called the
phase equilibrium curve. All points of this curve correspond to the contact of
equilibrium phases. At pressures larger or smaller than the equilibrium
pressure at a given temperature, only one of the phases is stable, namely the
one which has the lower thermodynamic potential. If, for example, one of the
phases is a liquid, and the second is its vapour, then the region in the
(p,T)-plane which lies above the curve corresponds to the liquid phase, while
the region lying below the curve corresponds to the gas phase. The equilibrium
liquid-gas phase transition takes place along the curve.

Correspondingly, in the crystal-melt equilibrium the region above the
phase equilibrium curve corresponds to the crystal phase, melting points lie
on the curve, and the stable phase below the curve is liquid. In a phase
transition the liberation or absorption of latent heat takes place. The latent
heat for the transition of a molecule from one phase into the other is equal
to (since the process is a reversible and equilibrium process)

$$l = \int T\,ds,$$

where s is the entropy related to one molecule. Since in the phase transition
the temperature is constant, it can be taken out of the integral sign and one
can write

$$l = T\,\Delta s = T(s_2 - s_1).$$


Thus, the latent heat of a phase transition is equal to the difference between 
the entropies multiplied by the temperature of the transformation. The latent 
heat is taken to be positive if heat is absorbed in the phase transition. Latent 
heat which is liberated is taken to be negative. 


§62. The equation of the phase equilibrium curve. Equilibrium between the 
vapour and the condensed phase 


The dependence of the chemical potential on the temperature and pressure 
is known for only a few simple systems. In most cases the concrete form of 
the function $\mu(p, T)$ is unknown. Hence the equation of the equilibrium curve
(61.5) cannot be written in an explicit form. However, it turns out that the 
differential equation of the equilibrium curve has a simpler form and contains 
only quantities which can easily be measured. 




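To obtain the differential equation of the equilibrium curve we differentiate
the condition (61.4). We have

$$d\mu_1 = d\mu_2 \tag{62.1}$$

or

$$\frac{\partial\mu_1}{\partial p}\,dp + \frac{\partial\mu_1}{\partial T}\,dT = \frac{\partial\mu_2}{\partial p}\,dp + \frac{\partial\mu_2}{\partial T}\,dT. \tag{62.2}$$

From formula (62.2) we find the slope of the equilibrium curve

$$\frac{dp}{dT} = -\frac{\partial\mu_1/\partial T - \partial\mu_2/\partial T}{\partial\mu_1/\partial p - \partial\mu_2/\partial p}. \tag{62.3}$$

Eq. (62.3) is just the differential equation of the equilibrium curve
sought. In order to bring it into final form it is necessary to express the
quantities appearing in it in terms of those measured directly. According to
(29.10) and (29.11) we have

$$S = -\frac{\partial\Phi}{\partial T}, \qquad V = \frac{\partial\Phi}{\partial p}.$$

Hence

$$\frac{\partial\mu}{\partial T} = -\frac{S}{N} = -s, \qquad \frac{\partial\mu}{\partial p} = \frac{V}{N} = v \tag{62.4}$$

and

$$\frac{dp}{dT} = \frac{s_2 - s_1}{v_2 - v_1}. \tag{62.5}$$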


Replacing the difference between the entropies by the heat of transformation
l, we obtain

$$\frac{dp}{dT} = \frac{l}{T(v_2 - v_1)}. \tag{62.6}$$

Formula (62.6) is usually referred to one mole of the gas phase.
Denoting the latent heat of the phase transition of one mole of a substance,
lN, by L, and the change in the molar volume by $\Delta V$, we find finally

$$\frac{dp}{dT} = \frac{L}{T\,\Delta V}. \tag{62.7}$$

Formula (62.6) is called the Clapeyron—Clausius equation. The Clapeyron— 
Clausius formula relates the change in the equilibrium pressure p for an 
infinitesimal change in the equilibrium temperature T to directly measured 
quantities. We shall discuss it for concrete cases of phase equilibria in 
following sections. 
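
As a numerical illustration of the Clapeyron-Clausius formula, the Python sketch below evaluates dp/dT = L/(T dV) for the boiling of water at atmospheric pressure. The input numbers are approximate handbook values introduced here as assumptions (they do not appear in the text), and `clapeyron_slope` is a hypothetical helper name.

```python
def clapeyron_slope(latent_heat_J_per_mol, temperature_K, delta_molar_volume_m3):
    """Slope dp/dT of the equilibrium curve in Pa/K: L / (T * delta V)."""
    return latent_heat_J_per_mol / (temperature_K * delta_molar_volume_m3)


# Approximate values for water at its normal boiling point (assumed inputs).
L_VAP = 40.66e3       # molar latent heat of evaporation, J/mol
T_BOIL = 373.15       # boiling temperature at 1 atm, K
V_VAPOUR = 3.01e-2    # molar volume of the vapour, m^3/mol
V_LIQUID = 1.88e-5    # molar volume of the liquid, m^3/mol

slope = clapeyron_slope(L_VAP, T_BOIL, V_VAPOUR - V_LIQUID)
print(f"dp/dT = {slope:.0f} Pa/K")
```

With these inputs the slope is a few kilopascals per kelvin, so a pressure change of a few per cent of an atmosphere shifts the boiling point by roughly one degree.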

It is easily seen that if a phase transition takes place as the temperature
increases, then the latent heat is always absorbed, i.e. L > 0. Indeed,

$$L = NT(s_2 - s_1) = NT\left(\frac{\partial\mu_1}{\partial T} - \frac{\partial\mu_2}{\partial T}\right). \tag{62.8}$$


The character of the temperature variation of chemical potentials in a phase
transition taking place with an increase in the temperature is shown in
fig. III.36. Up to point 1 the stable phase is the first one, whose chemical
potential $\mu_1$ is smaller than that of the second phase, $\mu_2$. After point 1
the situation is reversed. At point 1 phase equilibrium occurs. At it the
chemical potentials of the two phases are equal to each other. Its ordinate
represents the temperature of the phase transition (at a given pressure). From
fig. III.36 it is seen that at point 1 the slope of curve $\mu_1$ must be larger
than that of curve $\mu_2$. Otherwise above this point $\mu_1$ will not become
larger than $\mu_2$.





Fig. III.36

Hence at point 1 we have

$$\frac{\partial\mu_1}{\partial T} > \frac{\partial\mu_2}{\partial T}.$$


Then it follows from formula (62.8) that if the phase transition takes place 
with an increase in temperature its latent heat is always positive. The 
numerical value of the latter cannot be found theoretically, since it is ex- 
pressed in terms of the entropies of the phases, and the explicit form of these 
functions is unknown in most cases. 

The theorem proved above allows one to establish the sign of the temperature
coefficient of the equilibrium pressure dp/dT for different phase transitions.
If a phase transition takes place with an increase in the temperature
(melting, boiling, sublimation), so that L is positive, then, according to the
Clapeyron-Clausius formula, the sign of dp/dT is determined by the sign of
the quantity $\Delta V$, the change in volume in the phase transition. In
evaporation and sublimation the volume of the phase increases sharply, so that
$\Delta V > 0$ always. Hence for these phase transitions dp/dT is also positive,
i.e. the equilibrium pressure increases with increasing temperature or,
conversely, the equilibrium temperature increases with increasing pressure. As
the pressure is lowered the temperatures of the boiling point and of the
sublimation point decrease. Such a relation between the equilibrium pressure
and the equilibrium temperature is in agreement with well-known experimental
facts (increase in the boiling point in high-pressure boilers, decrease in the
boiling point with altitude, and so on).

In melting two cases are encountered: when $\Delta V$ is positive, so that the
density of the liquid phase is smaller than that of the solid phase, and when
$\Delta V$ is negative, so that the liquid phase is more dense. For bodies of the
first type dp/dT > 0, so that the melting point increases with increasing
pressure.

The number of bodies which are more dense in the liquid phase is relatively
small. Examples of such bodies are water, cast iron, bismuth, and a
number of alloys. For these dp/dT < 0, i.e. the melting point decreases with
increasing pressure. This feature of the melting of ice and other substances is
well known.
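
The negative slope of the melting curve of ice can be estimated in the same way. The Python sketch below applies dp/dT = L/(T dV) to the melting of ice; the latent heat and the densities are approximate handbook values introduced here as assumptions, and `clapeyron_slope` is a hypothetical helper name.

```python
def clapeyron_slope(latent_heat_J_per_mol, temperature_K, delta_molar_volume_m3):
    """Slope dp/dT of the equilibrium curve in Pa/K: L / (T * delta V)."""
    return latent_heat_J_per_mol / (temperature_K * delta_molar_volume_m3)


# Approximate values for the melting of ice at 1 atm (assumed inputs).
MOLAR_MASS = 0.018015     # kg/mol
RHO_ICE = 916.7           # density of ice, kg/m^3
RHO_WATER = 999.8         # density of liquid water, kg/m^3
L_FUSION = 6.01e3         # molar latent heat of melting, J/mol
T_MELT = 273.15           # melting temperature, K

# Change in molar volume on melting; negative because water is denser than ice.
delta_v = MOLAR_MASS / RHO_WATER - MOLAR_MASS / RHO_ICE
slope = clapeyron_slope(L_FUSION, T_MELT, delta_v)

# Shift of the melting point per atmosphere of applied pressure.
shift_K_per_atm = 101325.0 / slope
print(f"dp/dT = {slope:.3e} Pa/K, dT/dp = {shift_K_per_atm:.4f} K/atm")
```

With these inputs the slope is about -1.3e7 Pa/K, i.e. more than a hundred atmospheres are needed to lower the melting point of ice by one degree.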

It is interesting to note that in the vicinity of absolute zero the temperature
coefficient dp/dT tends to zero, so that the equilibrium pressure at the
melting point ceases to depend on temperature. Indeed, from the third law of
thermodynamics it follows that in melting the change in the entropy
$\Delta S \to 0$ as $T \to 0$. Consequently, the latent heat of melting reduces to
zero, and with it, by virtue of (62.6), also dp/dT. Such behaviour of dp/dT as
a function of temperature does indeed take place for liquid helium II, which is
the stable phase as $T \to 0$ at pressures below 30 atm. At pressures above
~30 atm the stable phase is solid helium. The phase equilibrium curve (solid
helium ⇌ liquid helium II) is almost horizontal. Its slope $dp/dT \to 0$ as
$T \to 0$ (fig. III.37).

Fig. III.37. Helium: p (atm) versus T (K).

As we have already pointed out, the explicit form of the equilibrium curve 
cannot in general be found. If the dependence of the latent heat of trans- 
formation and of the change in the molar volume on temperature and pressure 
is known, then the Clapeyron—Clausius equation can be integrated. Then the 
dependence of the equilibrium pressure of the phase transition on temper- 
ature, i.e. the form of the equilibrium curve, can be found. The dependence of 
the quantities mentioned on temperature and pressure is usually complex, and 
the integration is carried out numerically. The situation is essentially simpli- 
fied if one of the equilibrium phases is vapour, i.e. in the case of boiling or 
sublimation. 

In the case of equilibrium between a condensed phase and vapour it can
be assumed that the molar volume of the vapour is considerably larger than
the molar volume of the condensed phase (a liquid or a crystal). Hence the
change in the volume in the phase transition can be equated with the volume
of the gas phase (related to the corresponding number of particles):

$$\Delta V = V_{\mathrm{vap}} - V_{\mathrm{cond.phase}} \approx V_{\mathrm{vap}}.$$

In this case the Clapeyron-Clausius equation assumes the form

$$\frac{dp}{dT} = \frac{L}{T\,V_{\mathrm{vap}}}. \tag{62.9}$$




If the vapour which is in equilibrium with the condensed phase is sufficiently
rarefied, so that it can be considered as an ideal gas, then

$$V_{\mathrm{vap}} = NkT/p. \tag{62.10}$$

It should be noted that this assumption is only fulfilled with sufficient
accuracy at relatively low pressures. Substituting (62.10) into (62.9), we
have

$$\frac{dp}{p} = \frac{L}{NkT^2}\,dT. \tag{62.11}$$


The dependence of L on the temperature can be found by means of a method 
which is completely analogous to that applied in deriving the Clapeyron— 
Clausius equation. 

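Differentiating $L = T\Delta S$ with respect to T along the equilibrium curve,
we have

$$\frac{dL}{dT} = T\left[\left(\frac{\partial\Delta S}{\partial T}\right)_p + \left(\frac{\partial\Delta S}{\partial p}\right)_T\frac{dp}{dT}\right] + \Delta S,$$

or, since $T(\partial\Delta S/\partial T)_p = \Delta C_p$,
$(\partial\Delta S/\partial p)_T = -(\partial\Delta V/\partial T)_p$ and, by (62.9) and
(62.10), $dp/dT = \Delta S/V_{\mathrm{vap}}$ and
$(\partial V_{\mathrm{vap}}/\partial T)_p = V_{\mathrm{vap}}/T$,

$$\frac{dL}{dT} = \Delta C_p - T\left(\frac{\partial\Delta V}{\partial T}\right)_p\frac{dp}{dT} + \Delta S = \Delta C_p - \frac{T\,\Delta S}{V_{\mathrm{vap}}}\left(\frac{\partial V_{\mathrm{vap}}}{\partial T}\right)_p + \Delta S = \Delta C_p.$$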


Thus, the latent heat of transformation at a temperature T is equal to

$$L = L_0 + \int_0^T \Delta C_p\,dT, \tag{62.12}$$


where $L_0$ is the latent heat at T = 0. This quantity represents the work which
must be done at absolute zero in order to break the bonds existing between
the molecules in the condensed phase and to transform them into
non-interacting molecules. The latent heat of transformation from the condensed
phase into the gas assumes a particularly clear meaning: it is equal to the
work done in order to overcome the bonds plus the energy which must be
transferred to the system in order to compensate for the difference between
the energies of thermal motion in the condensed phase and the gas.

Substituting formula (62.12) into (62.11), we have

$$\frac{dp}{p} = \frac{L_0\,dT}{NkT^2} + \frac{dT}{NkT^2}\int_0^T \Delta C_p\,dT'. \tag{62.13}$$

Integrating (62.13), we obtain

$$\ln p = -\frac{L_0}{NkT} + \int\frac{dT}{NkT^2}\int_0^T \Delta C_p\,dT' + i$$

or

$$p = \exp\left[-\frac{L_0}{NkT} + \int\frac{dT}{NkT^2}\int_0^T \Delta C_p\,dT' + i\right], \tag{62.14}$$

where i is a constant which is usually called the vapour pressure constant.
Formula (62.14) shows that the pressure of a saturated equilibrium vapour
decreases rapidly with decreasing temperature.
In the case of evaporation the main part of the latent heat of transformation
usually corresponds to the first term of (62.12). Hence formula (62.14)
is often approximately written in the form

$$p \sim \exp\left(-\frac{L_0}{NkT}\right). \tag{62.15}$$
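
When the temperature dependence of the latent heat is neglected, the ideal-vapour form of the Clapeyron-Clausius equation, dp/p = L dT/(NkT^2), integrates to a simple exponential in -L/(NkT). The Python sketch below compares this closed form with a direct Runge-Kutta integration for one mole (Nk = R). The reference point (water boiling at 1 atm), the latent heat and the function names are assumptions made for illustration only.

```python
import math

R = 8.314462618  # molar gas constant N*k, J/(mol K)


def vapour_pressure_closed_form(T, T_ref, p_ref, latent_heat):
    """p(T) for constant L: p_ref * exp[-(L/R) * (1/T - 1/T_ref)]."""
    return p_ref * math.exp(-(latent_heat / R) * (1.0 / T - 1.0 / T_ref))


def vapour_pressure_rk4(T, T_ref, p_ref, latent_heat, steps=1000):
    """Integrate dp/dT = L p / (R T^2) from T_ref to T with classical RK4."""
    def rhs(temp, pres):
        return latent_heat * pres / (R * temp * temp)

    h = (T - T_ref) / steps
    temp, pres = T_ref, p_ref
    for _ in range(steps):
        k1 = rhs(temp, pres)
        k2 = rhs(temp + 0.5 * h, pres + 0.5 * h * k1)
        k3 = rhs(temp + 0.5 * h, pres + 0.5 * h * k2)
        k4 = rhs(temp + h, pres + h * k3)
        pres += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
        temp += h
    return pres


# Assumed inputs: water, L = 40.66 kJ/mol, reference point 373.15 K at 1 atm.
p_exact = vapour_pressure_closed_form(363.15, 373.15, 101325.0, 40.66e3)
p_numeric = vapour_pressure_rk4(363.15, 373.15, 101325.0, 40.66e3)
print(f"closed form: {p_exact:.0f} Pa, RK4: {p_numeric:.0f} Pa")
```

A drop of ten degrees below the boiling point lowers the estimated vapour pressure by roughly 30 per cent, which illustrates how steeply the saturated vapour pressure depends on temperature.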


Formula (62.14) involves the unknown vapour pressure constant i, the
latent heat of transformation at absolute zero, $L_0$, and the difference
between the heat capacities of the two equilibrium phases.

By means of statistical methods the value of all these quantities, except
$L_0$, can be obtained mathematically, provided the condensed phase is a
crystal. The smallness of the pressure of the saturated vapour allows one to
consider the vapour as an ideal gas and to make use of the chemical potential
determined by formula (60.5).

Equating the chemical potential of the gas (for simplicity of the formulae,
a monatomic gas) and that of the crystal, we obtain the equation of the
sublimation curve. In this case we choose the zero of energy in such a way that
the energy of a motionless molecule of the gas is equal to zero. The energy of
a crystal molecule, measured from this level, is negative (since the molecule
is bound in the crystal lattice) and will be denoted by
$\varepsilon_0 = -L_0/N$.

It is obvious that $-\varepsilon_0$ is equal to the work which must be done at
absolute zero to tear a molecule away from its neighbours in the crystal
lattice and to bring it into the gas phase in a state in which it will also be
at rest. Thus, $\varepsilon_0$ represents the heat of sublimation at absolute zero
referred to one molecule and taken with the opposite sign. Equating
$\mu_{\mathrm{gas}}$ and $\mu_{\mathrm{cryst}}$, we find the conditions of
equilibrium in a system (crystal ⇌ gas) at low temperature:

$$\ln p = \tfrac{5}{2}\ln kT + j + \frac{\varepsilon_0}{kT} - \frac{\pi^4}{5}\left(\frac{T}{\theta_c}\right)^3$$

or

$$p = (kT)^{5/2}\exp\left[-\frac{L_0}{NkT} - \frac{\pi^4}{5}\left(\frac{T}{\theta_c}\right)^3\right]e^{j}. \tag{62.16}$$


The basic term in (62.16) is the one which contains the latent heat of trans- 
formation at absolute zero. 

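Let us compare formula (62.16) with the general formula (62.14). For this
we have to calculate the double integral in (62.14). Using for
$C_{p\,\mathrm{cryst}}$ its value from formula (53.2), we have

$$\Delta C_p = C_{p\,\mathrm{gas}} - C_{p\,\mathrm{cryst}} = \tfrac{5}{2}Nk - \frac{12\pi^4 Nk}{5}\left(\frac{T}{\theta_c}\right)^3$$

and

$$\int\frac{dT}{NkT^2}\int_0^T \Delta C_p\,dT' = \int\frac{dT}{NkT^2}\int_0^T\left[\tfrac{5}{2}Nk - \frac{12\pi^4 Nk}{5}\left(\frac{T'}{\theta_c}\right)^3\right]dT' = \int\frac{dT}{T^2}\left(\tfrac{5}{2}T - \frac{3\pi^4 T^4}{5\theta_c^3}\right) = \tfrac{5}{2}\ln T - \frac{\pi^4}{5}\left(\frac{T}{\theta_c}\right)^3. \tag{62.17}$$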


Substituting (62.17) into the general expression for the pressure (62.14), we
obtain an expression which is the same as (62.16) provided the constant i
contained in (62.14) is identified with the chemical constant j (an additive
constant $\tfrac{5}{2}\ln k$ being absorbed into i). Thus, the constant of the
pressure of a saturated vapour can be calculated by
means of statistical considerations.
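
The double integral entering the vapour pressure formula can be checked numerically. The Python sketch below takes the heat-capacity difference divided by Nk as 5/2 - (12 pi^4/5)(T/theta)^3 (ideal monatomic gas minus the low-temperature Debye law), integrates it twice with Simpson's rule between two temperatures, and compares the result with the closed form (5/2) ln T - (pi^4/5)(T/theta)^3. The Debye temperature, the temperature limits and all function names are assumptions chosen for illustration.

```python
import math


def simpson(f, a, b, n=200):
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0


def delta_cp_over_Nk(T, theta):
    """(C_p,gas - C_p,cryst) / (N k) with the Debye T^3 law for the crystal."""
    return 2.5 - (12.0 * math.pi ** 4 / 5.0) * (T / theta) ** 3


def outer_integrand(T, theta):
    """(1/T^2) times the inner integral of delta C_p / (N k) from 0 to T."""
    inner = simpson(lambda t: delta_cp_over_Nk(t, theta), 0.0, T)
    return inner / T ** 2


def closed_form(T, theta):
    """Antiderivative (5/2) ln T - (pi^4/5) (T/theta)^3."""
    return 2.5 * math.log(T) - (math.pi ** 4 / 5.0) * (T / theta) ** 3


THETA = 100.0        # assumed characteristic temperature, K
T1, T2 = 5.0, 10.0   # assumed temperatures well below THETA, K

numeric = simpson(lambda T: outer_integrand(T, THETA), T1, T2)
analytic = closed_form(T2, THETA) - closed_form(T1, THETA)
print(f"numeric = {numeric:.8f}, analytic = {analytic:.8f}")
```

Taking the difference between two temperatures removes the arbitrary integration constant, so the two numbers can be compared directly.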
Formula (62.16) is in good quantitative agreement with experiment. 
The equilibrium curve at high temperatures can be obtained in exactly 
the same way. 


Equating the chemical potentials (60.5) and (60.8), we find

$$\ln p = \tfrac{5}{2}\ln kT + j + \frac{\varepsilon_0}{kT} - 3\ln\frac{T}{\theta_c} - 1. \tag{62.18}$$


The same result can also be obtained from the general formula (62.14). 

It should be noted that, as is seen from (62.16), the pressure of a saturated
vapour increases very rapidly with increasing temperature. If the characteristic
temperature $\theta_c$ of the crystal is relatively large, so that the condition
$T \gg \theta_c$ is fulfilled at high temperatures, then the corresponding density
of the saturated vapour will be too large for the vapour to be considered as an
ideal gas.

In this case it is necessary in formula (62.16) to make use of the chemical 
potential for a van der Waals gas. In practice empirical formulae for the 
vapour pressure curve are more often used. 

It is useful to compare, for the example given, the practical potentialities 
of the thermodynamic and statistical methods. 

We have obtained formula (62.14) which is of very general character and 
which establishes the equilibrium pressure of the vapour above any condensed 
phase by the thermodynamic method. However, this general formula involves 
quantities whose numerical value can be determined only from experimental 
data. 

The expression for the vapour pressure has been obtained by statistical 
methods in the presence of strong restricting assumptions, but within this 
framework the quantitative values of all the quantities have been obtained 
and their molecular meaning has been elucidated. 


§63. Theory of phase transitions 


Up to now we have confined ourselves to thermodynamic reasoning,
assuming as an experimental fact the existence of phases and the possibility
of phase transitions.

We now have to discuss the phenomena of phase transitions from the
statistical point of view. A phase transition is always associated with a
break in the continuity of certain thermodynamic quantities. In the example
of a phase transition considered above, which is called a phase
transition of the first kind, the thermodynamic potentials remain continuous,
whereas their first derivatives, the entropy $S = -(\partial\Phi/\partial T)_p$ and
the specific volume $v = V/N = N^{-1}(\partial\Phi/\partial p)_T$, undergo a finite
jump.

In addition to phase transitions of the first kind there are also the
so-called phase transitions of the second kind, in which the second derivatives
of the thermodynamic potentials, namely the heat capacity
$C_p = T(\partial S/\partial T)_p$ and the coefficient of thermal expansion
$\alpha = V^{-1}(\partial V/\partial T)_p$, undergo a break, changing discontinuously
by amounts $\Delta C_p$ and $\Delta\alpha$.

We shall in what follows encounter examples of phase transitions
of the second kind (see Part IV §20 and §21). It should be stressed that the
very existence of phase transitions appears to be rather unexpected from the
point of view of statistical physics. One would think that the statistical sum
(or integral) determines thermodynamic potentials as continuous functions of
the parameters which characterize the state of the systems, for example, the
temperature and volume.

Indeed, we can write the equation of state of a phase of a system with a
given number of particles N occupying a volume V. It follows from
(59.14) that

$$p = \frac{kT}{V}\ln\mathcal{Z}(z,V,T) = \frac{kT}{V}\ln\left[\sum_n z^n Z_n(V,T,n)\right], \tag{63.1}$$

$$\frac{1}{v} = \frac{N}{V} = \frac{1}{V}\frac{\partial\ln\mathcal{Z}}{\partial\ln z} = \frac{1}{kT}\frac{\partial p}{\partial\ln z}. \tag{63.2}$$

Eliminating z from (63.1) and (63.2), one can find the function f(p, V/N, T),
i.e. the equation of state. If the law of interaction between molecules is
taken in the form (46.2) and if the phase volume is assumed to have a finite
fixed value V, then in $Z_n$ one can carry out the integration over the
momenta and write (in the classical approximation)

$$Z_n = \left(\frac{2\pi mkT}{h^2}\right)^{3n/2}\frac{1}{n!}\int\exp\left[-\frac{\sum u(r_{ij})}{kT}\right]dV_1\cdots dV_n. \tag{63.3}$$


* In this section we follow the book by K.Huang, Statistical mechanics (John Wiley, 
New York, 1963). For the details of the theory we refer the reader to this book and to 
original studies by Yang and Lee, Phys. Rev. 87 (1952) 410. 




It is obvious that all statistical integrals $Z_n$ are essentially positive
quantities which depend on the volume V and temperature T as parameters. We
write the expression for $\mathcal{Z}$ in more detail:

$$\mathcal{Z} = \sum_{n=1}^{N} z^n Z_n = zZ_1 + z^2 Z_2 + \dots + z^N Z_N. \tag{63.4}$$

The grand partition function $\mathcal{Z}$ is a polynomial of the Nth degree in
z with essentially positive coefficients. Hence $\mathcal{Z}$ is a monotonically
increasing function of the activity z. By virtue of the continuity of the
functions $Z_n(V,T,n)$, it is also a continuous function of the temperature and
volume (or the density $\rho = N/V$).

From the form of the function $\mathcal{Z}$ and formula (63.1) it is seen that
p(z) is a monotonically increasing function of the activity z, as is shown in
fig. III.38a. The inverse specific volume 1/v is also, by virtue of (63.2), a
monotonically increasing function of z (fig. III.38b). According to formula
(57.7), the necessary condition for a stable existence of any phase is the
requirement $(\partial p/\partial v)_T < 0$, i.e. the requirement of a monotonic
decrease of the pressure with increasing volume per particle. Therefore the
curve p(v) has the form shown in fig. III.38c.

We see that at given temperature the pressure is a monotonic function of
the volume and that there is no tendency for the appearance of discontinuities
on the curve which expresses the equation of state of an arbitrary
phase.

If, however, the function $\mathcal{Z}$ reduces to zero, then according to
(63.1) and (63.2) the pressure p and the volume v will become indefinite, and
our reasoning will lose its validity. Therefore it is necessary to discuss in
more detail the behaviour of $\mathcal{Z}$ as a function of the activity z and the
volume v.

Fig. III.38. (a) p(z); (b) 1/v(z); (c) p(v).




The very existence of phase transitions (discontinuities in thermodynamic
quantities) is associated with the behaviour of the system as
$\mathcal{Z} \to 0$, where the functions p and v have singularities.

In order to find the values of z for which the grand partition function
$\mathcal{Z}$ reduces to zero, it is necessary to know its explicit expression as a
function of z and V. Finding the grand partition function for real systems
appears, for the present, to be an impracticable problem.

It turns out, however, that in the limiting case where the number of
particles N in the system and its volume V increase indefinitely
($N \to \infty$, $V \to \infty$), but in such a way that the specific volume v
remains bounded, $v < v_0$, it is possible to investigate the behaviour of
$\mathcal{Z}(z, v)$ without determining its explicit form. In this case it can be
shown that from the behaviour of $\mathcal{Z}(z)$ there follows the possibility of
the existence of phase transitions.

We write the expression (63.4) in the form

$$\mathcal{Z}(z,V,T) = \prod_{i=1}^{N}(z - z_i), \tag{63.5}$$

where $z_i$ are the roots of the polynomial (63.4). Since all coefficients of the
polynomial (63.4) are positive, the roots $z_i$ cannot be positive. They are
either negative, or complex and occurring in complex-conjugate pairs, or equal
to zero.
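
The statement about the roots can be checked on a toy polynomial. The Python sketch below assumes, purely for illustration, the coefficients Z_n = C(N, n) with N = 8 (this choice is not taken from the text), so that the sum over n from 1 to N of z^n Z_n equals (1 + z)^N - 1 and its roots are known in closed form, z_k = exp(2 pi i k/N) - 1. It verifies that these are roots and that none of them is a positive real number.

```python
import cmath
from math import comb

N_SITES = 8  # assumed size of the toy system

# Toy coefficients Z_1 .. Z_N (assumption: Z_n = C(N, n), all positive).
coefficients = [comb(N_SITES, n) for n in range(1, N_SITES + 1)]


def grand_sum(z):
    """Evaluate sum_{n=1}^{N} z^n Z_n by Horner's scheme."""
    value = 0
    for c in reversed(coefficients):
        value = (value + c) * z
    return value


# Closed-form roots of (1 + z)^N - 1 for this toy choice of coefficients.
roots = [cmath.exp(2j * cmath.pi * k / N_SITES) - 1 for k in range(N_SITES)]

residuals = [abs(grand_sum(r)) for r in roots]
positive_real_roots = [
    r for r in roots if abs(r.imag) < 1e-12 and r.real > 1e-12
]

print("largest residual:", max(residuals))
print("positive real roots:", positive_real_roots)
```

In this toy case the roots are z = 0, z = -2 and three complex-conjugate pairs with negative real parts, in line with the three possibilities listed above.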

Although only the positive (and, as a limiting case, zero) values of z have a
real meaning, from the mathematical point of view it is convenient to
introduce into the treatment the complex values of z, and to consider the
function $\mathcal{Z}(z)$ of a complex variable. Then we pass over to the limit in
formulae (63.1), (63.2) and (63.5), writing

$$\frac{p}{kT} = \lim_{V\to\infty}\left[\frac{1}{V}\ln\mathcal{Z}(z,V,T)\right], \tag{63.6}$$

$$\frac{1}{v} = \lim_{V\to\infty}\left[\frac{1}{V}\frac{\partial\ln\mathcal{Z}(z,V,T)}{\partial\ln z}\right], \tag{63.7}$$

$$\mathcal{Z}(z,V,T) = \lim_{N\to\infty}\prod_{i=1}^{N}(z - z_i). \tag{63.8}$$


The number of roots $z_i$ increases indefinitely, and they are distributed in
the complex plane. A mathematical study carried out by Yang and Lee showed
that the limits in formulae (63.6) and (63.7) exist and that for
$V \to \infty$ the function $V^{-1}\ln\mathcal{Z}(z)$ is an analytic function of z
in any region R of the complex plane which includes a segment of the real axis
and contains no zeros $z_i$.

This means that in a system with $V \to \infty$ for all values of z in these
regions the pressure has no singularities (i.e. there are no phase
transitions). In fig. III.39 these regions are denoted by $R_1$ and $R_2$. The
zeros $z_i$ are denoted by solid dots.

However, in contrast to a system with a finite value of N (a finite number
of zeros), in a system with $N \to \infty$ and $V \to \infty$ the number of zeros
$z_i$ is indefinitely large. Therefore, filling the plane of the complex
variable, they can at certain points approach arbitrarily close to the real
axis. Let $z_0$ be such a point on the real axis (fig. III.39). A zero of the
function $\mathcal{Z}$ is located in an arbitrarily small region about the point
$z_0$. The pressure p must nevertheless remain continuous at the point
$z = z_0$ (fig. III.40a). However, the inverse specific volume 1/v which,
according to (63.2), is the derivative of the pressure can have a
discontinuity at the point $z_0$:

$$\Delta\left(\frac{1}{v}\right) = \Delta\left(\frac{1}{V}\frac{\partial\ln\mathcal{Z}}{\partial\ln z}\right).$$














Fig. III.39. Zeros of the function $\mathcal{Z}$.

Fig. III.40. (a) p(z); (b) 1/v(z).

Fig. III.41. (a) p(z); (b) 1/v(z); (c) p(v).


If such a discontinuity takes place as is shown in fig. III.40b, then the
equation of state [the function p(v)] assumes the form shown in fig. III.40c.
This is a typical curve of a phase transition of the first kind. When the
specific volume changes from $v_1$ to $v_2$, the pressure remains unchanged.

According to the above, a discontinuity in the derivative is possible but
not obligatory. If, however, the pressure and its first derivative are
continuous for $z = z_0$, then the second derivative of the pressure
$\partial^2 p/\partial z^2$ can have a discontinuity. In this case a break arises on
the curve 1/v(z), as is shown in fig. III.41b. Correspondingly, the equation of
state will have the form shown in fig. III.41c. Such a curve is characteristic
of phase transitions of the second kind.
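
The way in which zeros approaching the real axis produce a jump in 1/v can be seen on a toy function. The Python sketch below assumes, for illustration only, a function of the form 1 + z^N with the role of the volume played by N; this is a textbook-style toy and not the polynomial (63.4). Its zeros exp(i pi (2k+1)/N) lie on the unit circle and close in on the point z = 1 as N grows. The quantity (1/N) ln(1 + z^N) stays continuous, while its logarithmic derivative z^N/(1 + z^N) tends to a step at z = 1. All function names are hypothetical.

```python
import math


def pressure_over_kT(z, N):
    """(1/N) ln(1 + z^N), evaluated in an overflow-safe way."""
    x = N * math.log(z)
    return (max(x, 0.0) + math.log1p(math.exp(-abs(x)))) / N


def density(z, N):
    """(1/N) d ln(1 + z^N) / d ln z = z^N / (1 + z^N)."""
    x = N * math.log(z)
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    return math.exp(x) / (1.0 + math.exp(x))


def nearest_zero_distance(N):
    """Distance from z = 1 to the closest zero exp(i*pi/N) of 1 + z^N."""
    return 2.0 * math.sin(math.pi / (2 * N))


for n in (20, 200):
    print(
        n,
        f"distance of nearest zero = {nearest_zero_distance(n):.4f}",
        f"density(0.9) = {density(0.9, n):.3e}",
        f"density(1.1) = {density(1.1, n):.6f}",
    )
```

As N increases, the nearest zero moves towards the real axis and the density switches from nearly 0 to nearly 1 over an ever narrower interval around z = 1, while the pressure-like quantity only develops a kink there.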

It should be stressed that the theory which is discussed points to the
possibility of phase transitions, but allows one to establish neither the
position of the points of a phase transition nor the character of the transition
itself. Also, the existence of isolated points $z_0$ is not proved. It is
important, therefore, that it has been possible to calculate the partition
function in an explicit form for the simplest two-dimensional lattice
consisting of particles which can be in two states (i.e. the Ising model). This
calculation is cumbersome and cannot be presented here *. It turns out that in
such a lattice one observes a phase transition of just the same type (with an
isolated point $z_0$) as assumed above.


§64. Phase equilibrium curves 


The phase equilibrium curve, i.e. the curve of the dependence of the
equilibrium pressure p on the equilibrium temperature T in the (p,T)-plane,
has a different form for different phase equilibria. As we have already pointed
out, the general form of the curve cannot be determined theoretically. One
can only make some general observations apropos of this curve.

* See L.D.Landau and E.M.Lifshitz, Course of theoretical physics, Vol. 5: Statistical
physics (Pergamon Press, London, 1958).

In considering the problem of the interrelation between a liquid and a gas 
(§48) we have already pointed out the absence of a fundamental difference 
between these states of matter. The existing qualitative differences between a 
liquid and a gas are associated with different roles of the interaction between 
atoms. As the temperature of a gas decreases or the density increases the 
mean distance between the atoms decreases. This corresponds to a decrease 
in the mean free path and to a relative increase in the mean energy of inter- 
action (in comparison with kT). Under certain conditions the thermodynamic 
potential of a system of widely spaced freely moving particles (a gas) turns 
out to be higher than that of a system in which the distances between the 
molecules are small (a liquid). At this moment a phase transition (condensa- 
tion) takes place. The chaotic free motion of molecules, which is characteristic 
of a gas, becomes a disorderly motion of individual molecules in a “cage” 
formed by their closest neighbours. Although the motion of atoms in a gas 
differs considerably from that in a liquid, this difference is of rather a quanti- 
tative character and, in any case, the nature of the motion does not differ: in 
both cases the motion has a completely random character. From this it 
follows that under certain conditions the transition from a random motion at 
small densities to a random motion at large densities can occur gradually, 
without a jump at the condensation point. In other words, by changing the 
parameters p, V and T in a certain way one can get a continuous transition 
from the liquid state to the gaseous state and vice versa, without a dis- 
continuous phase transition associated with the absorption or release of 
latent heat. 

The possibility of a continuous transition between the liquid state and the 
gaseous state imposes an essential restriction upon the character of the 
liquid—gas phase equilibrium curve. Namely, a continuous transition between 
a liquid phase and a gaseous phase is possible only if the phase equilibrium 
curve p(T) ends at a certain point C (fig. 111.42), which is called the critical 
point (according to Mendeleyev — the point of absolute boiling). 

Let $p_c$ and $T_c$ be the pressure and temperature at the critical point, called
the critical pressure and critical temperature respectively. For all values of p
and T lying below $p_c$ and $T_c$ the transition from the liquid into the gas and
vice versa occurs with an intersection of the phase equilibrium curve. On the
curve itself the two phases are in equilibrium with each other and are
separated by a certain interface. Above the point C there is a uniform state of
the substance in which there are no interfaces. This state is often called the
transcritical state. The uniform state of a substance can have a large or a
small density, depending on the temperature and pressure. However, it makes no
sense to call a substance above the critical point a liquid or a gas.

Fig. III.42

The transition from point 1 (liquid) to point 2 (gas) can be performed in 
the way shown by the solid line as well as in the way shown by the dotted 
line in fig. III.42. In the first way the phase equilibrium curve is inter- 
sected, so that the transition is accompanied by the release or absorption of 
latent heat. In the second way the transition takes place via the transcritical 
state and proceeds continuously, without the jump-like change in the charac- 
ter of the motion and without the release or absorption of latent heat. The 
possibility of a continuous transition from the liquid state into the gaseous 
state emphasizes the relative character of the terms “liquid” and “gas”. 

Strictly speaking, use can be made of the terms “liquid” and “gas” only 
when they exist simultaneously and are separated by an interface, i.e. when 
they are phases. 

We shall find the conditions which determine the position of the critical 
point in the (p,T)-plane. Since it lies on the phase equilibrium curve, equilibrium
conditions are fulfilled at it, in particular the conditions


$$\Delta p = 0, \qquad \Delta T = 0,$$


where $\Delta p$ and $\Delta T$ are the differences between the pressures and temperatures
in the phases.

Near the critical point the difference between the two phases becomes 
small, and at the critical point itself it completely vanishes. In particular, 
the change in the density in the phase transition is very small and the densities
of the two phases are close to each other, in contrast to points lying at large
distances from the critical point.

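If the difference between the densities of the phases is denoted by $\Delta\rho$,
then one can always write the formal expansion


$$\Delta p = \left(\frac{\partial p}{\partial \rho}\right)_T \Delta\rho + \frac{1}{2}\left(\frac{\partial^2 p}{\partial \rho^2}\right)_T (\Delta\rho)^2 + \dots \tag{64.1}$$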


Because of the phase equilibrium the sum of this series is equal to zero. Near
the critical point the expansion is simplified. Sufficiently near the critical
point $\Delta\rho$ can be assumed to be infinitesimal and in the expansion (64.1) one
can drop higher terms of the expansion and write


$$\Delta p = \left(\frac{\partial p}{\partial \rho}\right)_T \Delta\rho = 0. \tag{64.2}$$


Since $\Delta\rho$ is an arbitrary infinitesimal quantity, it follows from (64.2) that


$$\left(\frac{\partial p}{\partial \rho}\right)_T = 0. \tag{64.3}$$

The derivative in (64.3) is taken at the critical point. Thus, at the critical
point

$$\left(\frac{\partial p}{\partial V}\right)_T = 0. \tag{64.4}$$


If the quantity $(\partial p/\partial V)_T$ reduces to zero, then it is necessary for the
stability of the substance that the following condition should simultaneously
be fulfilled:

$$\left(\frac{\partial^2 p}{\partial V^2}\right)_T = 0. \tag{64.5}$$


Otherwise the fluctuations of the volume, as we have explained in §57, would
be infinitely large. The conditions (64.4) and (64.5) determine the position of
the critical point. They are the same as the well-known conditions for the
inflection point on the van der Waals curve.
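
The two conditions for the critical point can be checked numerically on the van der Waals isotherm. The following is only a sketch: the equation of state $p = RT/(V-b) - a/V^2$ for one mole and the constants a and b (rough values for carbon dioxide) are assumptions introduced for the illustration and are not taken from the text.

```python
# Van der Waals isotherm for one mole: p = R*T/(V - b) - a/V**2.
# A and B are illustrative constants (roughly those of CO2, SI units);
# they are assumptions made for this example only.
R = 8.314      # gas constant, J/(mol K)
A = 0.364      # Pa m^6 / mol^2
B = 4.27e-5    # m^3 / mol


def pressure(v, t):
    """Pressure on the van der Waals isotherm."""
    return R * t / (v - B) - A / v**2


def dp_dv(v, t):
    """First derivative (dp/dV) at constant T."""
    return -R * t / (v - B)**2 + 2.0 * A / v**3


def d2p_dv2(v, t):
    """Second derivative (d^2 p/dV^2) at constant T."""
    return 2.0 * R * t / (v - B)**3 - 6.0 * A / v**4


# Standard closed-form solution of dp/dV = 0 and d^2p/dV^2 = 0.
V_C = 3.0 * B
T_C = 8.0 * A / (27.0 * R * B)
P_C = A / (27.0 * B**2)

if __name__ == "__main__":
    print("T_c =", T_C, "K;  p_c =", P_C, "Pa;  V_c =", V_C, "m^3/mol")
    print("dp/dV at C     :", dp_dv(V_C, T_C))
    print("d2p/dV2 at C   :", d2p_dv2(V_C, T_C))
```

Both derivatives vanish at the point $(V_c, T_c)$ up to rounding error, while above $T_c$ the isotherm falls monotonically with V.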

Near the critical point the substance possesses a number of remarkable 
properties. The differences between the properties of the liquid and gaseous
phases progressively decrease and vanish at the critical point. It can be
shown * that near the critical point the densities of the two phases depend on 
the temperature according to the law 


$$\rho = \rho_c \pm \left[\mathrm{const}\cdot(T - T_c)\right]^{1/2},$$


where the plus sign refers to the liquid and the minus sign to the gas. At the 
critical point the latent heat of transition reduces to zero, while the heat 
capacity $C_p$ becomes infinite. The surface tension in the liquid–gas interface
also reduces to zero at the critical point. The fact that the compressibility 
becomes infinite at the critical point leads to an essential decrease in the 
work of compression or expansion of a volume element in the system near 
the critical point. 

Transitions from the crystalline state into an isotropic state have a 
different character. We understand an isotropic state to be the amorphous, 
liquid, or gaseous state. In this case a transition from a state of ordered 
motion into a state of chaotic motion takes place. To an ordered motion of 
atoms (or ions) in a crystal there corresponds the location of the atoms near 
the points of the crystal lattice and associated with this ordered distribution 
a definite crystal symmetry. As the temperature increases an increase occurs 
in the vibrational amplitudes of the atoms at the crystal lattice points and the 
number of violations of the regularity of the lattice increases, but the 
general ordered character of the motion and the symmetry of the lattice are 
preserved up to the melting point (or the sublimation point). A catastrophic 
disruption of the lattice occurs at the melting point, the symmetry vanishes, 
and the ordered motion is replaced by a chaotic motion. 

In contrast to transitions between different isotropic phases, a continuous 
transition between a crystal and one of the isotropic phases is impossible. 
This impossibility is associated with a fundamental difference in the character 
of motion in these phases. This is also seen from symmetry considerations. It 
is impossible to transform, in a continuous way, an infinitely symmetric 
(isotropic) body into one with a definite finite symmetry. The same holds 
for phase transitions between different crystal modifications. Each modifica- 
tion possesses a definite ordered motion of atoms and a symmetry corre- 
sponding to this motion. In a phase transition the changes in the character of 
the motion and symmetry of the crystal occur discontinuously without fail. 


* L.D.Landau and E.M.Lifshitz, Course of theoretical physics, Vol. 5: Statistical
physics (Pergamon Press, London, 1958).


Owing to this a crystal ⇄ isotropic-phase equilibrium curve or a crystal ⇄
crystal equilibrium curve cannot have an end point and must go off to
infinity. The transition from one phase into another is always associated with
an intersection of the equilibrium curve and has the character of a jump *.


§65. Surface tension and surface pressure 


Up to now we have considered the equilibrium of phases in contact, 
without taking into account the particular properties of the interface and 
their effect on the equilibrium. If, however, the phases which are in equilib- 
rium possess a developed surface, a complete disregard of surface effects can 
introduce a fundamental error into calculations which are carried out. In this 
section we shall take into account the effect of the surface on the phase 
equilibrium. 

Molecules which are distributed in a thin layer directly adjacent to the 
interface are in conditions which differ from those of molecules inside the 
main volume. They interact not only with molecules of their own phase but 
also with the contiguous layer of molecules of the other phase. Owing to this 
the structure and physical properties of the thin layer of substance, whose 
thickness is of the order of magnitude of the radius of the molecular inter- 
action, turn out to be different from those of the main volume. 

A detailed treatment of the properties of the surface layer would require 
a knowledge of the mechanism of the molecular interaction. Such a theory 
would be very complex. Therefore we have to simplify the problem, re- 
placing the surface layer of finite thickness by an idealized infinitely thin 
interface which separates the two phases. Such an idealized infinitely thin 
surface layer we shall, for brevity, call the surface. The area of the surface is a 
new parameter characterizing the state of the system. For a given volume the 
system can have different values of the surface area $\Sigma$, a definite state of the
system corresponding to each value of $\Sigma$.

A change in the surface of a system is accompanied by a gain or ex- 
penditure of energy. In order to form a new surface a particle from the 
volume must be brought to the surface, which requires that work be done. 
We denote by $\gamma$ the generalized force corresponding to the parameter $\Sigma$.

* For more details about phase transitions see B.G.Levich, Vvedenie v statisticheskuyu
fiziku (Introduction to statistical physics), (Gostekhizdat, Moscow, 1954) §76, and for a
particularly full exposition, L.D.Landau and E.M.Lifshitz, Course of theoretical physics,
Vol. 5: Statistical physics (Pergamon Press, London, 1958).


If a change in the surface takes place at constant temperature, then the
work done in changing the surface ($dW = -\gamma\,d\Sigma$) is equal to the decrease in the
free energy ($-dF_{\mathrm{surf}}$), so that


$$dF_{\mathrm{surf}} = \gamma\,d\Sigma. \tag{65.1}$$


The quantity $\gamma$, which represents the free energy per unit surface, is called
the surface tension. The surface tension $\gamma$ depends on the nature of the
surface (in other words, on the nature of the phases forming it), as well as on
the temperature. The value of $\gamma$ does not depend, however, on the area of the
surface. Hence one can write that


$$F_{\mathrm{surf}} = \gamma\Sigma. \tag{65.2}$$


At constant temperature and volume there corresponds to an equilibrium 
state of an unclosed system the minimum value of the free energy. The phase 
contact area represents an example of such a system. Hence the phase contact 
area in an equilibrium state has the minimum possible value of the free energy. 
For a simultaneous change in the value of the surface and temperature in the 
system the change in the free energy of the surface has the form 


$$dF_{\mathrm{surf}} = -S_{\mathrm{surf}}\,dT + \gamma\,d\Sigma. \tag{65.3}$$


From formula (65.3) it follows that the entropy of the surface is defined by 
the relation 


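$$S_{\mathrm{surf}} = -\left(\frac{\partial F_{\mathrm{surf}}}{\partial T}\right)_{\Sigma} = -\frac{d\gamma}{dT}\,\Sigma, \tag{65.4}$$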


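which allows one to express $S_{\mathrm{surf}}$ in terms of the surface tension.
In addition to the free energy of the surface one can write the expression
for the energy of the surface $E_{\mathrm{surf}}$:


$$E_{\mathrm{surf}} = F_{\mathrm{surf}} + TS_{\mathrm{surf}} = \gamma\Sigma - T\Sigma\,\frac{d\gamma}{dT} = \left(\gamma - T\,\frac{d\gamma}{dT}\right)\Sigma. \tag{65.5}$$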


This formula shows that it would be erroneous to define the surface tension 
as the energy per unit surface. 


For the differential of the surface energy we can write 


$$dE_{\mathrm{surf}} = T\,dS_{\mathrm{surf}} + \gamma\,d\Sigma. \tag{65.6}$$


From the above equality the relation


$$\left(\frac{\partial S_{\mathrm{surf}}}{\partial \Sigma}\right)_{E_{\mathrm{surf}}} = -\frac{\gamma}{T} \tag{65.7}$$


follows.

It should be noted that (65.4) cannot directly be substituted into (65.7).
This would lead to the incorrect formula $d\gamma/dT = -\gamma/T$. The derivative in
(65.7) is taken for a constant surface energy, but from (65.5) it follows that


$$\frac{d\gamma}{dT} = \frac{\gamma\Sigma - E_{\mathrm{surf}}}{T\Sigma},$$


while from (65.4) it follows that


$$S_{\mathrm{surf}} = -\frac{\gamma\Sigma - E_{\mathrm{surf}}}{T}.$$


Differentiating the last expression for $E_{\mathrm{surf}} = \mathrm{const}$, we arrive again at
formula (65.7).

The change in the surface energy can be resolved into the work done and 
the amount of heat absorbed: 


$$dE_{\mathrm{surf}} = T\,dS_{\mathrm{surf}} + \gamma\,d\Sigma = dQ + dW.$$


Taking into account (65.4), it follows from this that the amount of heat
which is absorbed as the area of the surface is reversibly increased by 1 cm²
is equal to


$$Q = T\left(\frac{\partial S_{\mathrm{surf}}}{\partial \Sigma}\right)_T = -T\,\frac{d\gamma}{dT}. \tag{65.8}$$


The heat capacity of unit surface (for the constant value of the surface area
$\Sigma = 1$) can be defined as


$$c_{\Sigma} = T\left(\frac{\partial S_{\mathrm{surf}}}{\partial T}\right)_{\Sigma=1} = -T\,\frac{d^2\gamma}{dT^2}. \tag{65.9}$$

All quantities characterizing the thermodynamic properties of the surface
are determined in terms of the surface tension and its derivatives with respect
to the temperature.
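
As an illustration of how the surface quantities follow from $\gamma(T)$ alone, the sketch below evaluates the entropy (65.4), the energy (65.5) and the heat (65.8) per unit area for a linear model of the surface tension. The linear form and the rough, water-like numbers are assumptions made for the example; they are not data from the text.

```python
# Illustrative model: gamma(T) = GAMMA_REF + SLOPE * (T - T_REF).
# The values below are rough, water-like numbers assumed for the example.
T_REF = 293.0      # K
GAMMA_REF = 72.0   # erg/cm^2
SLOPE = -0.15      # erg/(cm^2 K), i.e. d(gamma)/dT


def gamma(t):
    """Surface tension in erg/cm^2 at temperature t (K)."""
    return GAMMA_REF + SLOPE * (t - T_REF)


def dgamma_dt(t, h=1e-3):
    """Central-difference estimate of d(gamma)/dT."""
    return (gamma(t + h) - gamma(t - h)) / (2.0 * h)


def surface_entropy(t):
    """Entropy per unit area, formula (65.4) with unit surface area."""
    return -dgamma_dt(t)


def surface_energy(t):
    """Energy per unit area, formula (65.5) with unit surface area."""
    return gamma(t) - t * dgamma_dt(t)


def heat_absorbed(t):
    """Heat absorbed per cm^2 of reversibly created surface, formula (65.8)."""
    return -t * dgamma_dt(t)


if __name__ == "__main__":
    t = 293.0
    print("gamma        =", gamma(t), "erg/cm^2")
    print("E per area   =", surface_energy(t), "erg/cm^2")
    print("Q per area   =", heat_absorbed(t), "erg/cm^2")
```

With these assumed numbers the energy per unit area is noticeably larger than $\gamma$ itself, which is the point of the remark that the surface tension must not be identified with the energy per unit surface.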







The temperature dependence of the surface tension can at present be 
established theoretically only for quantum liquids (liquid helium II). It is 
impossible to separate the motion of the molecules of the surface layer from 
the motion of particles inside the liquid at high temperatures. Hence 
available attempts to calculate the surface tension without taking into 
account this connection are beneath criticism. 

The surface tension changes the phase equilibrium condition. The condition
of equilibrium in a system consisting of two phases and an interface
has the form


$$dF = dF_1 + dF_2 + dF_{\mathrm{surf}} = -p_1\,dV_1 - p_2\,dV_2 + \gamma\,d\Sigma = 0.$$

Since the volume of the entire system remains constant, $dV_1 = -dV_2$, so that


$$-(p_1 - p_2)\,dV_1 + \gamma\,d\Sigma = 0,$$


or


$$p_1 - p_2 = \gamma\,\frac{d\Sigma}{dV_1}. \tag{65.10}$$


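The quantity $d\Sigma/dV_1$ represents the curvature of the interface


$$\frac{d\Sigma}{dV_1} = \frac{1}{r_1} + \frac{1}{r_2},$$

where $r_1$ and $r_2$ are the principal radii of curvature. In the case of a spherical
surface

$$\frac{d\Sigma}{dV_1} = \frac{d(4\pi r^2)}{d\left(\frac{4}{3}\pi r^3\right)} = \frac{2}{r}.$$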

In this case the radius vector is assumed to be positive if it is directed towards
the first phase. Thus, we have finally


$$p_1 = p_2 + \gamma\left(\frac{1}{r_1} + \frac{1}{r_2}\right). \tag{65.11}$$


The quantity $\gamma(r_1^{-1} + r_2^{-1})$ is called the Laplace pressure, and formula (65.11)
is called the Laplace formula. The Laplace formula shows that the pressure
in the first phase is balanced by the sum of the pressure in the second phase
and the Laplace pressure. In the particular case of a plane interface the
Laplace pressure reduces to zero, since for a plane surface $r_1 \to \infty$, $r_2 \to \infty$.
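
The order of magnitude of the Laplace pressure is easily estimated from formula (65.11). In the sketch below the value of the surface tension (a rough figure for water) and the drop radius are assumptions chosen for the illustration; CGS units are used, so that the result is in dyn/cm².

```python
def laplace_pressure(gamma, r1, r2):
    """Laplace pressure gamma * (1/r1 + 1/r2) from formula (65.11).

    gamma in erg/cm^2 and the radii in cm give the pressure in dyn/cm^2.
    """
    return gamma * (1.0 / r1 + 1.0 / r2)


def sphere_pressure_jump(gamma, r):
    """Spherical interface: both principal radii are equal, giving 2*gamma/r."""
    return laplace_pressure(gamma, r, r)


GAMMA_WATER = 72.0  # erg/cm^2, assumed illustrative value

if __name__ == "__main__":
    # Drop of radius 1e-4 cm (one micron), an assumed example size.
    print("drop :", sphere_pressure_jump(GAMMA_WATER, 1e-4), "dyn/cm^2")
    # Plane interface: both radii go to infinity and the pressure jump vanishes.
    inf = float("inf")
    print("plane:", laplace_pressure(GAMMA_WATER, inf, inf), "dyn/cm^2")
```

For the assumed micron-sized drop the pressure jump is of the order of an atmosphere ($10^6$ dyn/cm²), whereas for a plane interface it vanishes.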

The presence of surface tension changes not only the condition of 
mechanical equilibrium but also the phase equilibrium condition for particle 
exchange. 

That is, although particles cannot be detained in the interface in the case of 
phase equilibrium and can go from one phase into the other without 
hindrance, condition (61.4) will now be fulfilled for a somewhat changed 
value of the pressure (in comparison with the pressure at the point of phase 
transition at the same temperature for a plane interface). 

Let us consider the case of phase equilibrium between the drops of a 
liquid and its vapour. We shall assume the drops to be spheres of radius r. 

The equilibrium condition for the particle exchange will have the form 


$$\mu_1(p'', T) = \mu_2(p', T), \tag{65.12}$$

where $p''$ is the pressure of the saturated vapour, $p'$ is the corresponding
pressure in the liquid drop, and T is the transformation temperature. We
assume the latter to be equal to the transformation temperature in the case
of a plane surface. The same condition at a plane surface has the form


$$\mu_1(p, T) = \mu_2(p, T). \tag{65.13}$$


Subtracting (65.13) from (65.12), we find 


$$\mu_1(p'', T) - \mu_1(p, T) = \mu_2(p', T) - \mu_2(p, T). \tag{65.14}$$


Since the compressibility of the liquid is very small, the difference between
the pressures $p'$ and $p$ is also small. In view of this one can write that


$$\mu_2(p', T) - \mu_2(p, T) = \mu_2(p + \Delta p, T) - \mu_2(p, T) = \left(\frac{\partial \mu_2}{\partial p}\right)_T \Delta p = v\,\Delta p, \tag{65.15}$$


where $v$ is the volume per particle in the liquid phase, and $\Delta p$ is the difference
between the pressures in the liquid for a spherical surface and plane
surface. This difference is obviously equal to


$$\Delta p = \frac{2\gamma}{r}.$$


The change in the pressure of the saturated vapour above a spherical drop
in comparison with a plane surface is in general not small, and the left-hand
side of (65.14) cannot be expanded in a series. We shall rewrite it, making use
of the general formula for the chemical potential (60.5) of an ideal gas.
Assuming the vapour to be an ideal gas, we find


$$kT \ln\frac{p''}{p} = \frac{2\gamma v}{r}. \tag{65.16}$$


Formula (65.16) relates the pressure of the saturated vapour above the drop
to its radius. Writing it in the form


$$p'' = p \exp\left(\frac{2\gamma v}{rkT}\right), \tag{65.17}$$


we see that $p''$ rapidly increases with decreasing radius of the drop, and that
for small drops it can become considerable.

For example, for water drops with $\gamma \approx 80$ erg/cm², $r = 10^{-6}$ cm and
$T \approx 300$ K the pressure $p'' \approx 1.1p$, i.e. the pressure of the vapour above the
drop exceeds that above a plane surface by 10%.
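
The estimate for water drops can be reproduced from formula (65.17). The sketch below takes $\gamma$, r and T from the example in the text; the volume per molecule in the liquid is not given there and is assumed here to be that of water with molar mass 18 g/mol and density 1 g/cm³.

```python
import math

K_B = 1.380649e-16          # Boltzmann constant, erg/K
N_AVOGADRO = 6.02214076e23  # 1/mol

# Volume per molecule in liquid water, cm^3. Assumption for this example:
# molar mass 18 g/mol and density 1 g/cm^3.
V_WATER = 18.0 / N_AVOGADRO


def vapour_pressure_ratio(gamma, r, t, v):
    """Ratio p''/p from formula (65.17).

    gamma: surface tension, erg/cm^2
    r:     drop radius, cm (a negative r models a concave meniscus)
    t:     temperature, K
    v:     volume per particle in the liquid, cm^3
    """
    return math.exp(2.0 * gamma * v / (r * K_B * t))


if __name__ == "__main__":
    ratio = vapour_pressure_ratio(80.0, 1e-6, 300.0, V_WATER)
    print("p''/p for r = 1e-6 cm:", ratio)
```

The ratio comes out slightly above 1.1, in agreement with the estimate quoted above; taking r negative gives a ratio smaller than unity, which corresponds to a concave surface.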

Formula (65.17) shows that a system consisting of a set of drops of 
different size is in a state of unstable equilibrium. Small drops possessing an 
excess energy which is associated with the surface tension will evaporate, 
while the vapour will condense on big drops. This process, which is called the 
distillation of drops, will go on until the entire liquid transforms into drops 
of the largest size. 

Another important phenomenon, which is associated with the change in 
the pressure of the vapour above a curved surface, is observed in capillary 
tubes wetted by a liquid. In such capillary tubes the surface of the meniscus 
will be concave, so that one has to write a minus sign in formula (65.17). The 
pressure of the saturated vapour above a concave surface turns out to be 
smaller than that above a plane surface. Owing to this, condensation of the 
vapour takes place in fine capillary tubes before it does above a plane surface 
at the same temperature. This phenomenon is called capillary condensation. 

Formula (65.17), which has been derived for equilibrium in a liquid— 
vapour system, is qualitatively valid also in other cases of phase equilibrium, 
for example, a crystal—vapour system. 





§ 66. Gas adsorption 


One of the most important effects associated with the particular properties 
of the phase contact area is adsorption. By adsorption is meant the accumula- 
tion of a substance on the surface of a solid or liquid phase. As a rule the 
adsorbed substance is distributed over the phase contact area and does not 
penetrate at all into the condensed phase. The phenomenon of adsorption is 
observed for many different combinations of phases. In practice one most 
often has to deal with the adsorption of gases on the surface of a solid body.
In this section we shall confine ourselves to the treatment of this case of 
adsorption. 

For the present, the nature of the forces which bind molecules adsorbed 
on a surface with the molecules of the solid or liquid backing (called the 
adsorbent) cannot be considered as well established. In a number of cases 
they have the character of van der Waals forces, in other cases a stronger 
bond is established between the adsorbed molecules and the molecules of the 
adsorbent, corresponding to the formation of a particular chemical combina- 
tion. Intermediate cases are also possible. As a rule, adsorbed molecules are 
distributed on the surface of the adsorbent in the form of a monomolecular 
layer. Adsorbed molecules most often possess no mobility, and in the process 
of adsorption they are fixed to quite definite points on the surface of the 
crystal. We shall call such points on the surface of a crystal at which the 
adsorption of molecules takes place sites. The sites can be the edges of the 
faces of a crystal or any other outstanding points on the surface. 

We denote the number of sites per cm² of the surface of the crystal by
$N_L$. We assume that all sites on the surface are equivalent, so that the adsorbed
molecule is bound to the surface at each of them to the same degree.

We write the partition function for a system of adsorbed particles,
assuming that their density $N_A$ (the number of particles per cm² of the
surface) is small and that the interaction between adsorbed molecules can be
disregarded. Every one of the adsorbed particles possesses a potential energy
(−u) which is a measure of the work done in removing the particle from the
surface.

If it is assumed that the adsorbed molecules are oscillating about an
equilibrium position with a frequency $\nu$, then the partition function for each
molecule can be written in the form


$$z_A = e^{u/kT} z_{\mathrm{oscil}},$$


where $z_{\mathrm{oscil}}$ is the partition function of the oscillator.





The partition function for the entire set of adsorbed particles can be
written in the form


$$Z_A = \frac{N_L!}{N_A!\,(N_L - N_A)!}\,(z_A)^{N_A}, \tag{66.1}$$

where the factor $N_L!/[N_A!\,(N_L - N_A)!]$ represents the statistical weight of the state.

Indeed, to a given energy of the system there correspond a number of
states which differ in the distribution of the particles over sites. The number
of ways in which $N_A$ particles can be distributed over $N_L$ sites is equal to
$N_L!/[N_A!\,(N_L - N_A)!]$. It is this number which gives the statistical weight of
a state of a given energy of the system of particles.

By means of (66.1) one can find the free energy and the chemical potential
of the system. Using Stirling's formula $\ln N! \approx N \ln(N/e)$, we have for the
free energy


$$F_A = -kT \ln Z_A = -kT \ln\left[\frac{N_L!}{N_A!\,(N_L - N_A)!}\,(z_A)^{N_A}\right] = -N_L kT \ln\frac{N_L}{e} + kT N_A \ln\frac{N_A}{e} + (N_L - N_A)\,kT \ln\frac{N_L - N_A}{e} - N_A kT \ln z_A. \tag{66.2}$$


Correspondingly, the chemical potential of adsorbed particles is equal to
[see (60.3′)]


$$\mu_A = \frac{\partial F_A}{\partial N_A} = kT \ln N_A - kT \ln(N_L - N_A) - kT \ln z_A = kT \ln\frac{N_A}{N_L - N_A} - kT \ln z_A. \tag{66.3}$$
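
The expression for the chemical potential of the adsorbed particles can be checked without Stirling's formula, by evaluating the free energy with exact logarithms of factorials and taking the change in $F_A$ when one particle is added. The sketch below does this; the numbers of sites and particles and the value of $z_A$ are arbitrary assumed values, and kT is set equal to unity.

```python
import math


def free_energy(n_a, n_l, z_a, kt=1.0):
    """F_A = -kT ln Z_A with Z_A as in (66.1), using exact log-factorials."""
    log_weight = (math.lgamma(n_l + 1)
                  - math.lgamma(n_a + 1)
                  - math.lgamma(n_l - n_a + 1))
    return -kt * (log_weight + n_a * math.log(z_a))


def mu_formula(n_a, n_l, z_a, kt=1.0):
    """Chemical potential of adsorbed particles from formula (66.3)."""
    return kt * math.log(n_a / (n_l - n_a)) - kt * math.log(z_a)


def mu_finite_difference(n_a, n_l, z_a, kt=1.0):
    """F_A(N_A + 1) - F_A(N_A): free energy cost of one more adsorbed particle."""
    return free_energy(n_a + 1, n_l, z_a, kt) - free_energy(n_a, n_l, z_a, kt)


if __name__ == "__main__":
    # Assumed illustrative values: 10^6 sites, 2*10^5 adsorbed particles, z_A = 3.
    n_l, n_a, z_a = 1_000_000, 200_000, 3.0
    print("formula (66.3)    :", mu_formula(n_a, n_l, z_a))
    print("finite difference :", mu_finite_difference(n_a, n_l, z_a))
```

For large $N_A$ and $N_L$ the two values agree closely, the remaining discrepancy being of relative order $1/N_A$.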


By means of formula (66.3) one can consider the equilibrium between an
adsorbed substance and the gas. Equating $\mu_A$ and the chemical potential of
the gas given by formula (60.5), we obtain


$$kT \ln\frac{N_A}{N_L - N_A} + f(T) = kT \ln p + \varphi(T),$$


whence we find


$$\frac{N_A}{N_L - N_A} = \frac{p}{p_0(T)}, \tag{66.4}$$

where $p_0$ depends only on the temperature. Solving (66.4) for $N_A$, we obtain


$$N_A = N_L\,\frac{p}{p + p_0(T)}. \tag{66.5}$$
Formula (66.5) represents the adsorption isotherm: it determines the number
of adsorbed molecules as a function of the pressure of the gas above the surface
of a solid body at a given temperature. It is obvious that at low pressure,
when $p \ll p_0$, this number is proportional to the pressure of the gas and to
the number of sites:


$$N_A = N_L\,\frac{p}{p_0(T)}.$$


In this case the degree of occupation of the sites is small. At a large pressure
$p \gg p_0$ the phenomenon of saturation occurs, the number of adsorbed molecules
ceases to depend on the pressure and becomes a constant equal to the
number of sites:


$$N_A = N_L.$$


The adsorption isotherms of formula (66.5) are shown in fig. III.43.
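
The two limiting cases of the isotherm (66.5) can be seen directly in a short numerical sketch. The value of $p_0$ and the number of sites used below are arbitrary assumed numbers, chosen only to show the linear region at low pressure and the saturation at high pressure.

```python
def coverage(p, p0):
    """Fraction of occupied sites N_A / N_L from the isotherm (66.5)."""
    return p / (p + p0)


def adsorbed_number(p, p0, n_sites):
    """Number of adsorbed molecules N_A = N_L * p / (p + p0), formula (66.5)."""
    return n_sites * coverage(p, p0)


if __name__ == "__main__":
    P0 = 5.0        # assumed value of p0(T), in the same units as p
    N_SITES = 1e15  # assumed number of sites per cm^2
    for p in (0.01, 0.1, 1.0, 5.0, 50.0, 500.0):
        print(p, adsorbed_number(p, P0, N_SITES))
```

At $p \ll p_0$ the printed numbers grow in proportion to p, at $p = p_0$ exactly half of the sites are occupied, and at $p \gg p_0$ the number of adsorbed molecules approaches the number of sites.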

At high temperatures a mobility of the adsorbed molecules across the 
surface arises. In the limit the adsorbed molecules can move across the surface 
of a solid body like the molecules of a “two-dimensional” gas. Then the form 
of the partition function $z_A$ of the adsorbed molecule changes, but the form
of the adsorption isotherm remains as before. 

It should be noted that the simplest mechanism of adsorption considered 
here, for which all sites on the surface are characterized by one and the same 
binding energy (—u) and which leads to the isotherm (66.5), is seldom en- 
countered. Usually there are different sites on the surface of a solid body with 
different values of the binding energy, i.e. the surface of the adsorbent is 
non-uniform. Moreover, when sites are densely occupied the interaction be- 
tween molecules in the adsorbed layer becomes important and affects the 
shape of the isotherm. As a result of the superposition of these factors the
form of the isotherm can change and be very different from that shown in
fig. III.43. By determining the form of the isotherm from experimental data
and considering it as a result of the superposition of the isotherms (66.5)
with different $p_0(T)$, one can obtain information about the non-uniformity
of the surface.

Fig. III.43 (horizontal axis: p (bar))

As to gas adsorption on the surface of a liquid, the mechanism of adsorp- 
tion here does not differ from that on the surface of a solid body at high 
temperatures; all points on the surface of a liquid are equivalent and the 
adsorbed particles are mobile on the surface for a relatively sparse population. 
The adsorption isotherm has the form (66.5). When the surface is very 
densely populated the interaction between the adsorbed molecules, which 
distorts the form of the isotherms, begins to play an important role. 


§67. Chemical equilibrium in the gas phase 


As the second example of a system with a variable number of particles we 
shall consider equilibrium in a system in which a chemical reaction occurs. 

We assume for concreteness that in the course of the chemical reaction 
atoms A and B combine to form a molecule AB. The molecule AB in its turn 
dissociates into individual atoms A and B. Both processes, that of the combi- 
nation and that of the dissociation, have a certain rate, by which is meant the 
number of events occurring per unit time. The rate of the direct process
A+B → AB is in general not equal to that of the reverse process AB → A+B.
Hence the chemical reaction proceeds mainly in one direction. However, after
the lapse of a certain time, when reagents arising in the course of the more
rapid reaction have accumulated while the amount of reagents vanishing in
the course of the reaction has decreased, the rate of the rapid reaction will
decrease whereas that of the slow reaction will increase. As a result an equilibrium
state will be established in the system. The number of molecules AB
produced and disintegrated will be the same. In this case one speaks of the
equilibrium reaction A+B ⇄ AB. It is more convenient to write the reaction
in the form of an equality. If the reaction takes place between several substances,
then each chemical reaction can be written in the form


$$\sum_i \nu_i g_i = 0, \tag{67.1}$$


where the $g_i$ are the chemical symbols of the reacting substances, and the $\nu_i$
are the number of reacting moles of the corresponding substances. We choose
to write reactions in such a way that the coefficients $\nu_i$ for substances which
are used up in the course of the reaction have a negative sign, while those for
substances produced have a positive sign. For example, the reaction of the
formation of water vapour from the mixture


2H₂ + O₂ = 2H₂O

must from this point of view be written in the form

2H₂O − 2H₂ − O₂ = 0,


so that


$$\nu_{\mathrm{H_2O}} = 2, \qquad \nu_{\mathrm{H_2}} = -2, \qquad \nu_{\mathrm{O_2}} = -1.$$


Let us write the conditions of chemical equilibrium in an arbitrary 
system consisting of initial substances and reaction products. The set of mole- 
cules of the initial substance and the set of molecules of the reaction 
products can be considered as certain quasi-closed systems which are in a 
reservoir and are weakly interacting with each other. The latter condition is 
fulfilled if the number of atoms reacting per unit time is small in comparison 
with the total number of molecules in the system, which is always the case in 
a macroscopic system of substances for an equilibrium reaction. 













One usually investigates the equilibrium states of a reacting system at a 
given temperature and pressure. The equilibrium condition is the requirement 


$$G(p, T, N_i) \to \min,$$


where $N_i$ is the number of particles of a given kind. For constant given
values of the temperature and pressure in the entire system the condition of
the minimum can be rewritten in the form


$$dG = \left(\frac{\partial G}{\partial N_1}\right)_{p,T} dN_1 + \left(\frac{\partial G}{\partial N_2}\right)_{p,T} dN_2 + \dots = \sum_i \mu_i\,dN_i = 0$$


and, noting that the change in the number of particles of a given kind can be
written in the form


$$dN_i = \nu_i\,dN,$$


we obtain the equilibrium condition in a system in the presence of chemical
reactions:


$$\sum_i \nu_i \mu_i = 0. \tag{67.2}$$


When one molecule of the first subsystem transforms into one molecule of
the second subsystem (the case which we have considered before) the
coefficients $\nu_i$ are obviously equal to $\nu_1 = 1$, $\nu_2 = -1$. In this case formula
(67.2) turns out to be identical with (61.4).

We see that chemical equilibrium is determined by the equality of the 
chemical potentials. 


§68. The law of mass action 


In order to apply the conditions (67.2) to definite cases of chemical 
equilibrium it is necessary to know the explicit form of the chemical 
potentials. The latter are known mainly for gases. Therefore the theory 
which follows will refer to chemical equilibrium in a mixture of gases. We 
have, in §60, calculated the chemical potential of a gas. In a mixture of ideal 
gases each of the gases behaves as though it alone occupied the entire volume 
of the container and has a partial pressure $p_i = N_i p/N$, where $N_i$ is the
number of particles of the ith gas and $N$ is the total number of particles of all
kinds in the container. We write $\mu_i$ in the general form:


$$\mu_i = kT \ln p_i + \chi_i(T), \tag{68.1}$$
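where


$$\chi(T) = -\tfrac{5}{2}\,kT \ln kT - kT j_1, \qquad j_1 = \ln\left[\left(\frac{2\pi m}{h^2}\right)^{3/2}\right] \tag{68.2}$$


for a monatomic gas,


$$\chi(T) = -\tfrac{7}{2}\,kT \ln kT - kT j_2 + kT \ln\left(1 - e^{-h\nu/kT}\right) + \varepsilon_0, \qquad j_2 = \ln\left[\left(\frac{2\pi m}{h^2}\right)^{3/2} \frac{8\pi^2 I}{h^2}\right] \tag{68.3}$$


for a diatomic gas at very high temperatures when vibrations are excited,
and


$$\chi(T) = -\tfrac{7}{2}\,kT \ln kT - kT j_2 + \varepsilon_0$$

for a diatomic gas at not very high temperatures, when vibrations are not
excited.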
Let us consider a reaction of the type 


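$$\nu_1 g_1 + \nu_2 g_2 - \nu_3 g_3 = 0.$$


The chemical equilibrium condition reads


$$\nu_1\mu_1 + \nu_2\mu_2 - \nu_3\mu_3 = 0 \tag{68.4}$$

or

$$\nu_1 kT \ln p_1 + \nu_2 kT \ln p_2 - \nu_3 kT \ln p_3 = \nu_3\chi_3 - \nu_1\chi_1 - \nu_2\chi_2. \tag{68.5}$$

Thus,

$$\frac{p_1^{\nu_1} p_2^{\nu_2}}{p_3^{\nu_3}} = K(T), \tag{68.6}$$

where


$$\ln K(T) = \frac{\nu_3\chi_3 - \nu_1\chi_1 - \nu_2\chi_2}{kT}. \tag{68.7}$$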


The quantity K(T) depends only on the temperature and nature of the reacting
molecules but not on the initial pressures or the amount of the reacting
gases.

Formula (68.6) is called the law of mass action. The law of mass action 
shows that, irrespective of the initial composition of the reacting gas mixture, 
in the course of time there is established in it an equilibrium state for which 
the partial pressures have quite definite values which are related by formula 
(68.6). They do not depend on any parameters except the temperature, the 
difference between the zero point energies, and the chemical constants of the 
reacting gases. In the case where the reaction involves not three but a larger 
number of gases the mass action law must be written in the form 


$$\frac{\prod_i p_i^{\nu_i}}{\prod_i (p_i')^{\nu_i'}} = K(T), \tag{68.8}$$

where the product is taken over all the gases taking part in the reaction, and
the primes refer to reaction products.

The law of mass action was first discovered experimentally by 
N.N.Beketov, and it was introduced theoretically, on the basis of statistical 
considerations, by Guldberg and Waage. 

The expression (68.8) has a clear statistical meaning: in order that the 
initial substances may react it is necessary that their molecules should 
simultaneously be in a very small volume v, whose size is of the order of 
magnitude of the diameter of the molecules. Since the gases are assumed to be 
ideal and the motion of the molecules is independent of one another, the 
probability that the molecules of the initial substances will simultaneously be 
found in a given volume is proportional to the numbers of these molecules in 
the gas. These numbers are in their turn proportional to the corresponding 
partial pressures. Thus, the probability of the direct reaction $w_1$ is proportional
to $p_1^{\nu_1} p_2^{\nu_2} \dots$:


$$w_1 = a\,p_1^{\nu_1} p_2^{\nu_2} \dots$$


The same reasoning can also be applied to the reverse reaction. The probability
of the reverse reaction $w_2$ is equal to


$$w_2 = b\,(p_1')^{\nu_1'} (p_2')^{\nu_2'} \dots$$


In an equilibrium state the rate of the direct reaction is equal to that of the
reverse reaction. For this the probability of the direct process must be equal
to that of the reverse process. Equating $w_1$ and $w_2$ and denoting by K the
ratio of the factors of proportionality a/b, we arrive at formula (68.8).

The mass action law represents the basic law of chemical equilibrium. It 
can be derived in a purely thermodynamic way, but then the value of the 
constant K(T) remains indefinite and must be found experimentally. By
means of the statistical expressions for the $\mu_i$ which have been presented
above, the constant K can be calculated theoretically. We shall give an 
example of such a calculation somewhat later. 

The law of mass action is most often expressed not in terms of partial 
pressures but in terms of the so-called molar fractions: 


$$c_i = \frac{p_i}{p}.$$


Substituting $c_i$ into (68.6), we obtain

$$\frac{c_1^{\nu_1} c_2^{\nu_2}}{c_3^{\nu_3}} = p^{\nu_3 - \nu_1 - \nu_2} K(T). \tag{68.9}$$


From formula (68.9) it follows that, if the reaction proceeds without a
change in the number of moles, so that $\nu_3 = \nu_1 + \nu_2$, the equilibrium does not
depend on the total pressure p in the system. As an example of such a reaction
we point to the reaction of the dissociation of hydrogen iodide


−2HI + H₂ + I₂ = 0,


for which $\nu_1 = 1$, $\nu_2 = 1$, $\nu_3 = -2$.

If the reaction proceeds with a change in the number of moles, so that
$|\nu_3| \neq \nu_1 + \nu_2$, then a change in the total pressure shifts the equilibrium.
This means that the ratio between the molar fractions of the initial substance
and the reaction product changes as the total pressure changes. Let, for
example, the dissociation of the $\mathrm{N_2O_4}$ molecule into two $\mathrm{NO_2}$ molecules
take place. We write the reaction in the form


$$2\mathrm{NO_2} - \mathrm{N_2O_4} = 0.$$


The coefficients of the reaction are $\nu_{\mathrm{NO_2}} = 2$, $\nu_{\mathrm{N_2O_4}} = -1$, so that the
reaction proceeds with an increase in the number of moles. The law of mass action
reads:


$$\frac{c_{\mathrm{NO_2}}^2}{c_{\mathrm{N_2O_4}}} = p^{-1} K(T).$$





As the total pressure decreases the number of $\mathrm{NO_2}$ molecules increases, i.e.
the percentage of undissociated $\mathrm{N_2O_4}$ molecules in the equilibrium mixture
decreases. Thus, if the reaction proceeds with an increase in the number of 
moles, $\nu_1 + \nu_2 > |\nu_3|$, then a decrease in the total pressure favours the
reaction, whereas an increase in the pressure hampers it. In the case of reactions
proceeding with a decrease in the number of moles, $\nu_1 + \nu_2 < |\nu_3|$, the change
in the total pressure acts in the reverse direction.
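
The pressure dependence just described can be checked numerically. The following is a minimal Python sketch, assuming an equilibrium mixture that contains only NO2 and N2O4, so that the two molar fractions add up to unity; it solves the law of mass action for the molar fraction of NO2. The value of K(T) and the function name are illustrative assumptions, not data from the text.

```python
import math


def no2_mole_fraction(k_over_p: float) -> float:
    """Solve c**2 / (1 - c) = K(T)/p for c = c_NO2 (assumes c_NO2 + c_N2O4 = 1)."""
    r = k_over_p
    # Positive root of the quadratic c**2 + r*c - r = 0.
    return (-r + math.sqrt(r * r + 4.0 * r)) / 2.0


K_ASSUMED = 0.15  # hypothetical equilibrium constant K(T) in atm, for illustration only

for p in (10.0, 1.0, 0.1, 0.01):  # total pressure in atm
    c_no2 = no2_mole_fraction(K_ASSUMED / p)
    print(f"p = {p:6.2f} atm   c_NO2 = {c_no2:.3f}   c_N2O4 = {1.0 - c_no2:.3f}")
```

In this sketch the molar fraction of NO2 grows monotonically as the total pressure is lowered, as expected for a reaction proceeding with an increase in the number of moles.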


§69. Thermal dissociation of atoms 


We have mentioned earlier the thermal dissociation of atoms which occurs 
at very high temperatures. When the temperature reaches such high values 
that the thermal energy kT becomes comparable with the energy required to 
tear an electron away from an atom (the ionization energy), then thermal 
ionization of atoms takes place. Atoms dissociate into positively charged 
ions and electrons which form the corresponding ideal gases. In addition to 
the process of ionization the reverse process also takes place: recombination, 
in the course of which an ion and an electron combine to form a neutral 
atom. Thus, at very high temperatures two reactions take place in the 
substance: 


$$\mathrm{A} \to \mathrm{I^+} + e, \qquad \mathrm{I^+} + e \to \mathrm{A},$$


where A denotes the atom and $\mathrm{I^+}$ denotes the ion (for simplicity we confine
ourselves to the case of single ionization). If constant conditions of temper- 
ature and pressure are maintained in the system, then an equilibrium state 
will be established in which the number of dissociations is equal to the 
number of recombinations. A system in which an equilibrium reaction


$$\mathrm{A} \rightleftharpoons \mathrm{I^+} + e$$


takes place does not differ in principle from a system in which an
equilibrium chemical reaction takes place. For such a system the law of mass
action can be written in the form


$$\frac{c_e\, c_{I^+}}{c_A} = \frac{K(T)}{p}, \tag{69.1}$$


where $c_e$, $c_{I^+}$, $c_A$ are the molar fractions of the electron gas, ion gas and
atomic gas respectively. The constant $K(T)$ is given by


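$$\ln K(T) = \tfrac{5}{2}\ln kT + j_e - \frac{\Delta\varepsilon_0}{kT}. \tag{69.2}$$


In formula (69.2) all quantities referring to the ion and the atom have been
cancelled, since the difference between the mass of the ion and the mass of
the atom can be neglected. The chemical constant of the electron gas is equal
to


$$j_e = \ln\left(\frac{2\pi m}{h^2}\right)^{3/2},$$


where $m$ is the mass of the electron. The quantity $\Delta\varepsilon_0$ represents the
ionization energy of the atom. Thus,


$$\frac{c_e\, c_{I^+}}{c_A} = \frac{K(T)}{p} = \frac{1}{p}\exp\left(\tfrac{5}{2}\ln kT + j_e - \frac{\Delta\varepsilon_0}{kT}\right). \tag{69.3}$$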

Instead of molar fractions it is more convenient to introduce another,
more intuitive quantity, which is called the degree of dissociation. Let a
fraction $\alpha$ of the atoms undergo dissociation, so that $N(1+\alpha)$ particles arise
out of $N$ atoms. The quantity $\alpha$ characterizing the fraction of ionized atoms is
called the degree of dissociation. Obviously, we have


$$c_e = c_{I^+} = \frac{\alpha}{1+\alpha}, \qquad c_A = \frac{1-\alpha}{1+\alpha}.$$


Substituting molar fractions expressed in terms of the degree of dissociation
into (69.3), we find


$$\frac{\alpha^2}{1-\alpha^2} = \frac{K(T)}{p},$$


whence


$$\alpha = \left[1 + \frac{p}{K(T)}\right]^{-1/2} = \left[1 + \frac{p}{(kT)^{5/2}}\left(\frac{h^2}{2\pi m}\right)^{3/2} e^{\Delta\varepsilon_0/kT}\right]^{-1/2}. \tag{69.4}$$


From this formula it is seen that the degree of dissociation increases 
rapidly with temperature. Estimates of the numerical values of the quantities 
contained in the coefficient of the exponential factor in the denominator 
show that this factor is very small. 

Hence, provided $\Delta\varepsilon_0/kT$ is not very large, the entire expression under the
square root, and also $\alpha$, is of the order of unity. This means that for values
of $kT$ which are comparable with the ionization energy the gas turns out to
be practically completely ionized:


$$\alpha \approx 1; \qquad c_e \approx c_{I^+} \approx \tfrac{1}{2}; \qquad c_A \approx 0.$$


The degree of ionization also increases with decreasing total pressure. This 
is in complete agreement with what was said at the end of §68 for reactions 
proceeding with an increase in the number of moles. 

A well-known application of formula (69.4) is in the elucidation of a 
peculiar feature of the spectrum of the solar atmosphere which at first sight 
seems to be very strange. A method of investigating spectra which originate 
from different layers of the solar atmosphere (chromosphere) has been 
developed in astrophysics. Investigations have shown that in the deeper 
layers of the atmosphere, where the temperature is higher, the degree of 
dissociation of the vapour of calcium is lower than that in outer, cooler 
layers. In the case of calcium the ionization potential amounts to 6 eV. The 
degree of ionization $\alpha$ at 6000 K and a pressure $p = 1$ atm amounts to only
8%, whereas at the same temperature and a pressure of $10^{-2}$ atm it reaches
65%. The explanation lies in the fact that, owing to the effect of the 
coefficient of the exponential term containing p in formula (69.4), the 
increase in the degree of dissociation with decreasing pressure is more rapid 
than its decrease with decreasing temperature in going from the deeper to 
the outer layers of the solar atmosphere. 
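
The calcium figures quoted above can be reproduced approximately by evaluating formula (69.4) directly. The sketch below is only an illustration: the temperature, pressures and the 6 eV ionization potential are those given in the text, the physical constants are standard SI values, the function name is hypothetical, and the statistical weights of the atom, ion and electron are ignored, as in (69.4).

```python
import math

K_B = 1.380649e-23    # Boltzmann constant, J/K
H = 6.62607015e-34    # Planck constant, J*s
M_E = 9.1093837e-31   # electron mass, kg
EV = 1.602176634e-19  # 1 eV in J
ATM = 101325.0        # 1 atm in Pa


def degree_of_ionization(temperature_k, pressure_atm, ionization_ev):
    """Degree of dissociation alpha from formula (69.4), evaluated in SI units."""
    kt = K_B * temperature_k
    # K(T) = (kT)^(5/2) (2 pi m / h^2)^(3/2) exp(-delta_eps0 / kT); it has units of pressure.
    k_of_t = (kt ** 2.5) * (2.0 * math.pi * M_E / H ** 2) ** 1.5 * math.exp(-ionization_ev * EV / kt)
    return (1.0 + pressure_atm * ATM / k_of_t) ** -0.5


for p_atm in (1.0, 1e-2):
    alpha = degree_of_ionization(6000.0, p_atm, 6.0)
    print(f"p = {p_atm:g} atm: alpha = {alpha:.2f}")
```

With these inputs the sketch gives a degree of ionization of the order of 9% at 1 atm and somewhat under 70% at $10^{-2}$ atm, close to the 8% and 65% quoted above; a difference of this size is to be expected when the ionization potential is rounded to 6 eV.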







10


Statistical Distributions in Quantum Statistics and Some of Their Applications


§70. The identity of elementary particles and the calculation of the partition 
function 


We have already pointed out more than once that classical concepts turn 
out to be inadequate for the study of the motion of atomic systems and 
that they must be replaced by the concepts of quantum theory. In ch. 1 we 
presented the minimum amount of information from quantum theory which 
was necessary for the subsequent exposition. However, for a more profound 
analysis of those changes which are introduced into statistical physics by 
quantum theory it is necessary to dwell on certain important results of 
quantum theory. 

As we have already stressed, the following two propositions of quantum 
mechanics are of basic importance for statistical physics: 

(1) the existence of the discrete states of a system, 

(2) the principle of identity of elementary particles. 

We have from the very beginning taken into account the discreteness of 
quantum states. We have established when it is necessary to take into account 
the discrete character of energy levels, and when they can be considered 
approximately as distributed continuously. Also, we have shown the effect 
of the discreteness of the energy spectrum on the behaviour of statistical
systems. However, up to now we have not taken the identity of particles
into account in a systematic way. True, we have considered as one state those
states differing from each other only by a permutation of the particles. For
this we have divided the phase space by the number of possible permutations
of the particles. This division represented the simplest attempt to take into
account the identity of particles. As a matter of fact, the division by N! was
carried out even before the appearance of quantum theory. Otherwise, as we 
have already pointed out in §37, incorrect expressions for thermodynamic 
functions were obtained from statistics. Considerations based on the principle 
of identity of elementary particles justified to a certain degree the division of 
the partition function by N!. However, the inconsistency of this operation is 
obvious. Indeed, we have assumed in the beginning that all particles are 
different from each other, so that one can in principle number them, ascribing 
a definite number or label to each particle. Proceeding from this point of 
view, we have calculated the possible states of a system consisting of N inde- 
pendent particles, integrating with respect to the coordinates and momenta 
of the first, second and so on particles. Thereupon, in contradiction with the
initial assumption of the possibility of numbering the particles, we have
proclaimed the states differing from each other only by a permutation of the
particles to be completely identical and have required that
all of them should be taken into account as one.

Experiment and theory show, however, that the identity of atomic 
particles has a much more profound character. The complete identity of 
atomic particles leads to the fact that the first of the operations which we 
have carried out — the numbering of particles — loses any physical meaning. 
It makes no sense to call one of the particles the first, another the second and 
so on and to integrate them over states, since there are no physical differences 
between the first, second and so on particles. If one calls the particle which 
is at the initial instant in a definite state the first particle, then at a subsequent 
instant it would already be impossible to assert that it is just the first particle 
which is in this state, because it would be impossible to distinguish the first 
particle from a “not first” particle. Hence it is necessary from the very
beginning to renounce any attempt to distinguish between individual atomic 
particles, i.e. to renounce the property of a system of atomic particles which 
we have adopted *. 

We shall now see what changes in the statistical distribution result from 
taking account of the complete identity of atomic particles. All further 
discussion will refer only to a monatomic ideal gas. 


* Exceptions are the so-called systems of localized particles which are separated from 
each other by impenetrable barriers. We shall not consider such systems. 


§71. A second method of deriving the statistical distribution 


In order to derive the statistical distribution in a gas, taking into account 
the principle of the identity of elementary particles, we shall have recourse 
to a special method which is characteristic of the versatility of statistical 
methods. To make the difference between the classical and quantum con- 
siderations particularly striking, we shall first introduce the classical distri- 
bution (the Maxwell distribution) by means of this method. 

Assume that the molecules in a gas can be in individual quantum states
with energies of translational motion $\varepsilon_1, \varepsilon_2, \varepsilon_3, \ldots$ (for the convenience of the
treatment we shall for the present assume that the energy levels are discrete).
In the gas there is a certain distribution of particles over states, such that in
the first state there are $n_1$ particles, in the second state there are $n_2$ particles,
and so on. From the classical point of view we have to describe the states of
the gas in the following way:

particles No. $1, 2, 3, \ldots, n_1$ are in the state with energy $\varepsilon_1$,

particles No. $n_1+1, \ldots, n_1+n_2$ are in the state with energy $\varepsilon_2$,

particles No. $n_1+n_2+1, \ldots, n_1+n_2+n_3$ are in the state with energy $\varepsilon_3$, and so on.

Further, we choose as a quasi-closed subsystem all particles which are in a
certain arbitrarily chosen quantum state with energy $\varepsilon_k$. All the remaining gas
particles, which are in other energy states, then form a reservoir.

The choice of a group of particles which are in a given state as a subsystem
is in complete agreement with those requirements which must be satisfied by
a quasi-closed subsystem (§13). Indeed, as a result of collisions the particles
possessing the energy $\varepsilon_k$ go over into other states. Conversely, owing to the
same mechanism other molecules, which earlier had an energy differing from
$\varepsilon_k$ and which, consequently, belonged to the reservoir, can go into the state
with energy $\varepsilon_k$. If the number of molecules which are coming in or going out
of the subsystem per unit time is small in comparison with the number of 
particles in it, then it can be assumed that the interaction between the sub- 
system and the reservoir is weak. This condition will be fulfilled if the 
collisions between the particles which cause the corresponding transitions 
occur sufficiently seldom, i.e. when the gas is rarefied. Since the interaction 
of the subsystem with the reservoir consists in the passage of particles from 
the subsystem to the reservoir and vice versa, the subsystem which we have 
chosen represents an example of a subsystem with a variable number of 
particles. It differs from the general case of a subsystem with a variable 
number of particles and a variable energy by the fact that the energy of each 
particle in the subsystem is fixed. However, the energy of the subsystem,
which is made up of the energies of all the particles contained in it, of course,
also changes as the number of particles in it changes.

In order to avoid misunderstanding we stress that we are now speaking 
not of the states of an actually existing system but of the states of a 
conditionally introduced subsystem. Our subsystem is not a single system 
but a set of particles with a definite energy which are at different loci of the 
gas and are not bound to each other. 

In the process of variation of the state of the real system the number of
particles in the state with the given energy $\varepsilon_k$ changes. In this sense
the state of our subsystem varies. The energy of the subsystem is equal to


$$\varepsilon = \varepsilon_k n_k, \tag{71.1}$$


where $n_k$ is the number of particles in the subsystem. The quantity $\varepsilon$ varies
together with $n_k$.

In order to characterize completely the state of the subsystem we have 
chosen, it is necessary to know the mean number of particles in it, i.e. the 
mean number of particles which are at the energy level we have chosen. For 
this calculation use can be made of the general formula (59.11) which gives 
the mean number of particles in a subsystem with a variable number of 
particles. In our special case this formula can be essentially simplified. We do 
not have to introduce the double summation over the possible values of the 
energy and the number of particles, because in our system the value of the 
energy is unambiguously determined by the number of particles contained in 
it according to formula (71.1). In summing over the possible values of the 
number of particles in the subsystem we automatically carry out the summa- 
tion over the possible values of its energy. As a matter of fact, it is with this 
simplification that our choice of the subsystem has been associated. Thus, the 
mean number of particles $\bar{n}_k$ in the subsystem is expressed by the formula


$$\bar{n}_k = kT \frac{\partial}{\partial\mu} \ln \sum_{n_k} \left(\exp\frac{\mu-\varepsilon_k}{kT}\right)^{n_k} \Omega(n_k), \tag{71.2}$$


where instead of $\varepsilon$ we have substituted its expression according to formula
(71.1).

The summation in (71.2) is carried out over all possible values of the
number of particles $n_k$ in the subsystem. In order to actually perform the
summation in formula (71.2) it is necessary to know the explicit expression
for the statistical weight (the number of states) $\Omega(n_k)$ of the state of the
system when it contains $n_k$ particles. In quasi-classical statistics, where all
particles can be numbered, the states of our system will always be degenerate,
provided the system contains more than one particle of a given kind. Indeed,
if the system contains $N$ identical particles, then it can be in states which
differ from each other by a permutation of the particles. For example, let
there be in our gas two molecules, No. 1 and No. 2, with energy $\varepsilon_k$. In the
first state molecule No. 1 is at point 1 while molecule No. 2 is at point 2. In
the second state the positions of the molecules are exchanged. The energy of
the subsystem in the two states is the same and equal to $2\varepsilon_k$. Thus, there
are two states of the system with an energy $2\varepsilon_k$ or, in other words, the states
of the system are two-fold degenerate. In the general case the states of a
system containing $n_k$ particles are $n_k!$-fold degenerate.

If we do not wish to consider states differing only by a permutation of the
particles as different states (which would undoubtedly lead us to incorrect
expressions for the thermodynamic functions), then for $\Omega$ use should be
made of the general formula (1.26) and the total volume of the phase space
must be divided by the number of possible permutations of molecules $n_k!$
and the size $h^3$ of a cell.

The volume of phase space corresponding to one quantum state with an
energy $\varepsilon_k$ is obviously equal to $h^3$. Hence for $\Omega(n_k)$ we finally obtain


$$\Omega(n_k) = \frac{1}{n_k!}. \tag{71.3}$$


Substituting the expression (71.3) into formula (71.2), we find


$$\bar{n}_k = kT \frac{\partial}{\partial\mu} \ln \sum_{n_k=0}^{N} \frac{1}{n_k!}\left(\exp\frac{\mu-\varepsilon_k}{kT}\right)^{n_k}. \tag{71.4}$$


The number of particles $n_k$ contained in the subsystem can vary from zero
to the total number of particles in the gas $N$. However, the probability that
all particles of the gas be found simultaneously in one energy state is
extremely low. Hence for $n_k$ which are close to $N$ the terms of the sum (71.4)
are so small that the sum rapidly converges. Therefore we shall not commit
an error if we replace the upper limit $N$ in the sum by infinity. This
corresponds to the addition of negligibly small terms to the sum. For such a
substitution the sum (71.4) goes over into a simple series:








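$$\sum_{n_k=0}^{\infty} \frac{1}{n_k!}\left(\exp\frac{\mu-\varepsilon_k}{kT}\right)^{n_k} = \sum_{n=0}^{\infty} \frac{x^n}{n!} = e^x, \tag{71.5}$$


where we have provisionally denoted $\exp[(\mu-\varepsilon_k)/kT]$ by $x$. Thus,


$$\bar{n}_k = kT \frac{\partial}{\partial\mu} \ln e^x = kT \frac{\partial}{\partial\mu} \exp\frac{\mu-\varepsilon_k}{kT} = \exp\frac{\mu-\varepsilon_k}{kT}. \tag{71.6}$$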

In particular, if the states of the molecule and its energy vary continuously,
which is always valid in classical statistics, then instead of a given energy level
$\varepsilon_k$ one has to consider states with an energy lying between $\varepsilon$ and $\varepsilon + d\varepsilon$.
Then instead of the number of particles $\bar{n}_k$ in a given quantum state one
needs an expression for the mean number of particles with an energy between
$\varepsilon$ and $\varepsilon + d\varepsilon$, which we shall denote by $dn$. Obviously,


$$dn = \bar{n}\,\frac{d\gamma}{h^3} = \exp\left(\frac{\mu-\varepsilon}{kT}\right)\frac{d\gamma}{h^3}, \tag{71.7}$$


where $d\gamma$ is the volume of phase space corresponding to an energy between $\varepsilon$
and $\varepsilon + d\varepsilon$, and $d\gamma/h^3$ is the number of states with this energy. Formula
(71.7) is the same as the Maxwell distribution in the form it had in formula
(60.10).

In later sections we shall make use of the statistical distribution (71.2) to 
obtain the quantum laws of the distribution of molecules in an ideal gas. 

In conclusion we note that the method of deriving the Maxwell—Boltzmann 
distribution described here is often called the method of cells in phase space. 
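
The passage from (71.4) to (71.6) can be checked numerically. The following Python sketch evaluates the logarithm of the truncated sum in (71.4) and differentiates it with respect to the chemical potential by a central finite difference. The truncation at 60 terms, the step size and the function names are assumptions made for illustration; energies are measured in the same arbitrary units as kT.

```python
import math


def log_sum_classical(mu, eps, kt=1.0, n_max=60):
    """ln of the sum in (71.4) with Omega(n) = 1/n!, truncated at n_max terms."""
    x = math.exp((mu - eps) / kt)
    return math.log(sum(x ** n / math.factorial(n) for n in range(n_max + 1)))


def mean_occupation(mu, eps, kt=1.0, step=1e-5):
    """kT * d/dmu of the log-sum, approximated by a central finite difference."""
    upper = log_sum_classical(mu + step, eps, kt)
    lower = log_sum_classical(mu - step, eps, kt)
    return kt * (upper - lower) / (2.0 * step)


mu, eps = -2.0, 0.5
print("from the sum (71.4):", mean_occupation(mu, eps))
print("closed form (71.6): ", math.exp(mu - eps))
```

The two printed numbers should agree to within the accuracy of the finite difference, which illustrates that replacing the upper limit of the sum by infinity adds only negligibly small terms.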


§72. Quantum distributions for an ideal gas 


As we have just stressed, it follows from the principle of identity of 
particles that one cannot distinguish between individual microscopic particles; 
electrons, photons, protons and other elementary particles or atoms or 
molecules *. 


* In the latter case it is actually the same atoms or molecules, which behave iden- 
tically in all possible fields of force, that are identical with one another. Atoms or mole- 
cules differing in any way, for example, containing nuclei of different isotopes or being 
in different rotational states, must be considered as particles of quite different kinds. 




Applying the point of view of the identity of particles consistently,
one has to renounce the numbering of particles. One can then no longer
speak of “two states differing by the exchange of two particles” or of “$n!$
identical states differing by a permutation of $n$ particles”. We have to speak
of “a state with an energy $\varepsilon_k$ in which there are respectively two particles or
$n_k$ particles”.

Instead of indicating the state of the entire gas by listing the particles in 
different energy states, one has to indicate the number of particles in each of 
these states, i.e. to indicate that there are 

$n_1$ particles in the state with energy $\varepsilon_1$,

$n_2$ particles in the state with energy $\varepsilon_2$.

Thus, the description of the state of a gas turns out to be less detailed than in 
classical statistics. 

Since one cannot speak of a permutation of particles in a given state, the 
division by n! makes no sense. Each state, irrespective of the number of 
particles which are in it, has the same statistical weight, namely the weight 
equal to unity. 

The change in the method of calculating states leads to a basic change in 
the form of the statistical distribution. To obtain the latter we shall make use 
of the method of the preceding section. The mean number of particles in a 
state with an energy $\varepsilon_k$ is given by the formula


$$\bar{n}_k = kT \frac{\partial}{\partial\mu} \ln \sum_{n_k} \left(\exp\frac{\mu-\varepsilon_k}{kT}\right)^{n_k} \Omega(n_k). \tag{72.1}$$


Now, however, another value of $\Omega(n_k)$ must be substituted into (72.1). Since
the state of a system containing an arbitrary number of particles $0 \le n_k < \infty$
is non-degenerate, and the necessity of dividing by $n_k!$ disappears, for the
number of states of a system containing $n_k$ particles we have, instead of
(71.3),


$$\Omega(n_k) = 1. \tag{72.2}$$


We have again assumed $\Delta\gamma = h^3$. Substituting this value of $\Omega(n_k)$ into (72.1),
we find


$$\bar{n}_k = kT \frac{\partial}{\partial\mu} \ln \sum_{n_k} \left(\exp\frac{\mu-\varepsilon_k}{kT}\right)^{n_k}. \tag{72.3}$$







The summation is carried out over the number of particles in the state with
energy $\varepsilon_k$. In performing the summation it is necessary to distinguish between
two kinds of particles, about which we have spoken in §1: particles which do
not obey the exclusion principle, and those which do.
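
The change in the method of counting states can be made concrete by brute-force enumeration. The sketch below is an illustration that is not part of the original derivation: it distributes a few particles over a few single-particle states and counts (a) the descriptions with numbered particles, (b) the descriptions by occupation numbers only, each counted once, and (c) those occupation-number descriptions allowed when no state may hold more than one particle. The function name and the small numbers chosen are assumptions.

```python
from itertools import product


def count_descriptions(n_particles, n_states):
    """Return (numbered, occupation-number, exclusion-principle) counts of states."""
    # (a) Numbered particles: each particle is assigned a state label.
    numbered = set(product(range(n_states), repeat=n_particles))
    # (b) Identical particles: only the occupation numbers (n_1, n_2, ...) matter.
    occupation = {
        tuple(assignment.count(state) for state in range(n_states))
        for assignment in numbered
    }
    # (c) Exclusion principle: no occupation number may exceed unity.
    exclusion = {occ for occ in occupation if max(occ) <= 1}
    return len(numbered), len(occupation), len(exclusion)


print(count_descriptions(2, 3))  # two particles over three states
```

For two particles and three states the three counts are 9, 6 and 3: the description by occupation numbers is less detailed than the description with numbered particles, and the exclusion principle removes the doubly occupied states.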

In the first case no restriction is imposed upon the number of particles 
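which are in a given state. Their number can take on all integer values
between zero and the total number of particles in the system. Thus,


$$\bar{n}_k = kT \frac{\partial}{\partial\mu} \ln \sum_{n_k=0}^{N} \left(\exp\frac{\mu-\varepsilon_k}{kT}\right)^{n_k}. \tag{72.4}$$


Replacing the upper limit of the sum by infinity, we obtain


$$\bar{n}_k = kT \frac{\partial}{\partial\mu} \ln \sum_{n_k=0}^{\infty} \left(\exp\frac{\mu-\varepsilon_k}{kT}\right)^{n_k}. \tag{72.5}$$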


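If the following inequality holds


$$\exp\frac{\mu-\varepsilon_k}{kT} < 1, \tag{72.6}$$


then the sum in (72.5) represents an infinitely decreasing geometric
progression and can easily be calculated. Namely,


$$\sum_{n_k=0}^{\infty} \left(\exp\frac{\mu-\varepsilon_k}{kT}\right)^{n_k} = \left(1-\exp\frac{\mu-\varepsilon_k}{kT}\right)^{-1},$$


whence


$$\bar{n}_k = kT\frac{\partial}{\partial\mu}\ln\left(1-\exp\frac{\mu-\varepsilon_k}{kT}\right)^{-1} = \left(\exp\frac{\varepsilon_k-\mu}{kT}-1\right)^{-1}. \tag{72.7}$$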


Formula (72.7) gives the mean number of particles in the ideal gas which
are in a state with energy $\varepsilon_k$, if the particles do not obey the exclusion
principle. To this class of particles there belong atoms having zero spin, the 
molecules of saturated compounds also with zero spin and, in addition, as 
will be shown below, light quanta. The distribution (72.7) is called the Bose— 
Einstein distribution. 




It should be noted that, since the sum (72.5) must always converge for
any value of the energy, in particular for $\varepsilon_k = 0$, in addition to the inequality
(72.6)


$$e^{\mu/kT} < 1 \tag{72.8}$$


must also hold. The inequality (72.8) shows that the chemical potential for
particles obeying the Bose–Einstein distribution must be an essentially
negative quantity:


$$\mu < 0. \tag{72.9}$$


It should be recalled that in the Boltzmann distribution $\mu$ is also an essentially
negative quantity, but is always very large in absolute value (see §60).

In the case of particles obeying the exclusion principle the number of
particles $n_k$ which can simultaneously be in an individual quantum state
cannot exceed unity. Consequently, the possible values of $n_k$ are restricted to
two: $n_k = 0$ and $n_k = 1$. Substituting unity for the upper limit of the
summation in formula (72.1), we have


$$\bar{n}_k = kT \frac{\partial}{\partial\mu} \ln \sum_{n_k=0}^{1} \left(\exp\frac{\mu-\varepsilon_k}{kT}\right)^{n_k} = kT \frac{\partial}{\partial\mu} \ln\left(1+\exp\frac{\mu-\varepsilon_k}{kT}\right) = \left(\exp\frac{\varepsilon_k-\mu}{kT}+1\right)^{-1}. \tag{72.10}$$

The distribution (72.10) represents the distribution of the particles of an 
ideal gas over states in the case where the particles obey the exclusion 
principle. The distribution (72.10) is called the Fermi—Dirac distribution. 
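
The three mean occupation numbers (71.6), (72.7) and (72.10) can be compared side by side. The Python sketch below is a minimal illustration with hypothetical function names; energies and the chemical potential are measured in units of kT. It also evaluates the mean occupation directly from the sum with unit weights, using the fact that kT times the derivative of the log-sum with respect to the chemical potential equals the weighted average of the occupation number.

```python
import math


def boltzmann(eps, mu, kt=1.0):
    """Formula (71.6): exp((mu - eps)/kT)."""
    return math.exp((mu - eps) / kt)


def bose_einstein(eps, mu, kt=1.0):
    """Formula (72.7); requires mu < eps so that the geometric series converges."""
    if mu >= eps:
        raise ValueError("Bose-Einstein distribution requires mu < eps")
    return 1.0 / (math.exp((eps - mu) / kt) - 1.0)


def fermi_dirac(eps, mu, kt=1.0):
    """Formula (72.10)."""
    return 1.0 / (math.exp((eps - mu) / kt) + 1.0)


def mean_from_sum(eps, mu, n_max, kt=1.0):
    """Mean occupation sum(n y^n) / sum(y^n), every state having weight unity."""
    y = math.exp((mu - eps) / kt)
    weights = [y ** n for n in range(n_max + 1)]
    return sum(n * w for n, w in enumerate(weights)) / sum(weights)


for x in (0.5, 2.0, 5.0, 10.0):  # x = (eps - mu)/kT
    print(f"x = {x:4.1f}  Bose = {bose_einstein(x, 0.0):.5f}  "
          f"Boltzmann = {boltzmann(x, 0.0):.5f}  Fermi = {fermi_dirac(x, 0.0):.5f}")
```

Calling `mean_from_sum` with `n_max=1` reproduces the Fermi–Dirac value, and with a large `n_max` it approaches the Bose–Einstein value. For large values of $(\varepsilon-\mu)/kT$ all three columns of the printed table become practically equal.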

In all cases encountered in practice the spacing between the energy levels 
of the translational motion is so small in comparison with the thermal energy 
kT that the energy spectrum can be assumed to be continuous. Then instead 
of the mean number of particles at the kth energy level it is necessary to 
introduce the mean number $dn$ of particles with an energy lying between $\varepsilon$
and $\varepsilon + d\varepsilon$. It is obvious that $dn = \bar{n}\,h^{-3}\,d\gamma$, where $h^{-3}\,d\gamma$ is the number of
states corresponding to an energy in the interval $\varepsilon$, $\varepsilon + d\varepsilon$. Substituting the
mean value of the number of particles which are in one state from (72.7)
and (72.10), we find


$$dn = \left(\exp\frac{\varepsilon-\mu}{kT} \pm 1\right)^{-1} \frac{d\gamma}{h^3}, \tag{72.11}$$


where the plus sign refers to the Fermi distribution, while the minus sign 
refers to the Bose distribution. The chemical potentials figuring in the Bose 
and Fermi distributions are determined from the normalization condition 


$$\int dn = N, \tag{72.12}$$


which expresses the constancy of the number of particles in a given volume. 

Comparing the derivation of the Bose distribution and the Fermi distribu- 
tion with that of the Maxwell—Boltzmann distribution, we see first of all that 
the two distributions represent a realization of the Gibbs distribution for the 
case of an ideal gas whose particles obey the laws of quantum mechanics. 
There is a profound difference between the statistical distribution laws for 
particles obeying the laws of classical mechanics and for particles obeying the 
laws of quantum mechanics. This difference is not associated with any change 
in the statistical laws and not even with taking account of the discrete charac- 
ter of the energy spectrum, but with a radical change in the method of 
calculating the statistical weight of states. The difference between the 
methods for the calculation of statistical weights in classical statistics and the 
two quantum statistics is associated with the principle of identity of particles 
and is due to a profound difference between the behaviour of classical 
mechanical systems and the behaviour of atomic particles. 

The difference between statistical weights in the Bose statistics and Fermi 
statistics is due solely to the difference between the laws of quantum mechanics 
obeyed by particles with an integer spin and those obeyed by particles with a 
half-integer spin. In this sense the often applied terminology “the Maxwell— 
Boltzmann classical statistics” or “the Fermi—Dirac quantum statistics and the 
Bose—Einstein quantum statistics” should be recognized as most inadequate. 
In reality the different forms of statistics are not in question but the different 
laws of quantum mechanics obeyed by the corresponding particles, i.e. the 
two forms of quantum mechanics: that for particles with an integer spin and 
that for particles with a half-integer spin. Statistical laws in all cases remain 
completely invariable. If the particles obey the laws of quantum mechanics 
for particles with an integer spin, the application of the laws of statistics 
leads to the Bose—Einstein distribution for the particles of an ideal gas. But 
if the particles obey the laws of quantum mechanics for particles with a half- 
integer spin, then the same statistics leads to the Fermi—Dirac distribution. 
Finally, if the particles obey the laws of classical mechanics, one obtains the 
Maxwell—Boltzmann distribution for a system of these particles. 




The following question naturally arises. Experiment shows that the motion 
of atomic particles is described by the laws of quantum mechanics. Hence the 
behaviour of any ideal gas consisting of atomic particles must be described by 
one of the quantum distributions (72.7) or (72.10). Hence, is the Maxwell— 
Boltzmann distribution simply incorrect and not valid for real gases? A 
negative answer can be given to this question even without analysing the 
distributions (72.7) and (72.10), on the basis of general considerations. The 
laws of quantum mechanics are the laws of motion of particles including the 
laws of classical mechanics as a first approximation. Under certain conditions 
the laws of classical mechanics are a sufficiently good approximation; within 
the limits of this approximation it can be assumed that the motion of particles 
obeys the laws of classical mechanics. Consequently, there must exist also 
such conditions where the Maxwell—Boltzmann distribution reflects with a 
sufficient degree of accuracy the actual behaviour of ideal gases. We shall call 
gases obeying classical statistics non-degenerate gases. Conversely, ideal gases 
whose behaviour is governed by quantum laws are unified under the general 
name degenerate gases. 

Our first problem is the discussion of the question of under which condi- 
tions a gas is degenerate and under which conditions it is non-degenerate or, 
in other words, when classical statistics can be considered to be applicable 
and when the laws of quantum statistics must be taken into account. In 
solving this problem we shall proceed from the fact that the Bose—Einstein 
distribution and the Fermi—Dirac distribution are more accurate laws. Hence 
use can be made of the Maxwell—Boltzmann law only when the difference 
between it and the quantum distributions (72.7) and (72.10) becomes 
sufficiently small. 

Comparing the distributions (72.7), (72.10) and (71.6) we see that they
have, in the general case, an essentially different character (fig. III.44).
However, this difference vanishes if the following inequality is satisfied:

$$\exp\frac{\varepsilon-\mu}{kT} \gg 1. \tag{72.13}$$

All three distributions have the same functional form when the inequality
(72.13) is fulfilled. The lack of coincidence of the curves in fig. III.44 for
large $\varepsilon$ is associated with the fact that to these curves there correspond
different $\mu$. When (72.13) holds, the one in the denominator in (72.7) and (72.10)
can be neglected, and the Bose and Fermi distributions automatically go over
into the Maxwell–Boltzmann distribution. Thus, the inequality (72.13) is the
condition of applicability of classical statistics.

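Fig. III.44

For the fulfillment of the inequality (72.13) at energies $\varepsilon \sim kT$ (for
$\varepsilon > kT$ the exponential rapidly increases) it is necessary that $e^{-\mu/kT} \gg 1$.
Assume that this inequality is satisfied, so that for all energies


$$\left(\exp\frac{\varepsilon-\mu}{kT} \pm 1\right)^{-1} \approx \exp\frac{\mu-\varepsilon}{kT}.$$


Then from the normalization condition (72.12) we find


$$N = \int \exp\left(\frac{\mu-\varepsilon}{kT}\right) \frac{dp_x\,dp_y\,dp_z\,dV}{h^3} = \frac{2\pi(2m)^{3/2} V e^{\mu/kT}}{h^3} \int_0^\infty e^{-\varepsilon/kT}\,\varepsilon^{1/2}\,d\varepsilon = V\left(\frac{2\pi mkT}{h^2}\right)^{3/2} e^{\mu/kT},$$


whence


$$e^{-\mu/kT} = \frac{V}{N}\left(\frac{2\pi mkT}{h^2}\right)^{3/2}. \tag{72.14}$$


Thus, the criterion of the validity of the application of classical statistics is


$$\frac{V}{N}\left(\frac{2\pi mkT}{h^2}\right)^{3/2} \gg 1. \tag{72.15}$$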







If the reverse inequality is fulfilled, then degeneracy arises and one cannot use 
the Maxwell—Boltzmann distribution. 

We see that the criterion (72.15) contains several parameters. First of all, 
it contains the mass m of the particles; the larger the mass, the larger the 
left-hand side of the inequality. Further, (72.15) contains the density of the 
gas and its temperature T. As was to be expected, the inequality (72.15) is 
satisfied for high temperatures and is violated for low temperatures, so that 
at low temperatures quantum effects must appear. The fulfillment of the 
inequality (72.15) is also favoured by a small gas density. 

In the inverse limiting case, when 


$$\frac{V}{N}\left(\frac{2\pi mkT}{h^2}\right)^{3/2} \ll 1, \tag{72.16}$$


degeneracy of the gas arises. Thus, the degeneracy can be due to the 
following causes: (1) a small particle mass, (2) a large gas density, (3) a low 
temperature. 

In order to estimate the order of magnitude of the quantities, we shall 
consider two numerical examples. 

Suppose we have an electron gas. The mass of the electron is m =
9.1 × 10⁻²⁸ g. We assume that the density of the electron gas is such that
1 cm contains 6 X 102 particles. Then it turns out that the condition of 
degeneracy is fulfilled up to temperatures of the order of 2000—3000 K. 

In the case of atomic hydrogen, which is the lightest gas, degeneracy can 
arise only at very low temperatures and high densities, because the mass of 
a molecule of hydrogen is larger by a factor of 3700 than that of the electron. 
These temperatures and densities are considerably lower than those for 
which the interaction between atoms leading to condensation of the gas 
becomes important. 

Thus, only in the case of an electron gas of high density can degeneracy 
occur at relatively high temperatures. 
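
The orders of magnitude involved can be estimated by evaluating the left-hand side of (72.15) directly. The Python sketch below is an illustration under stated assumptions: the CGS constants are rounded, and the two particle densities (an electron density typical of a metal and a hydrogen density typical of a gas under normal conditions) are illustrative values chosen for the example, not values taken from the text.

```python
import math

H = 6.626e-27     # Planck constant, erg*s
K = 1.381e-16     # Boltzmann constant, erg/K
M_E = 9.109e-28   # electron mass, g
M_H = 1.674e-24   # mass of a hydrogen atom, g

def classical_parameter(n, T, m):
    """Left-hand side of (72.15), (V/N)(2*pi*m*k*T/h^2)^(3/2), for n = N/V in cm^-3."""
    return (2.0 * math.pi * m * K * T / H**2) ** 1.5 / n

def degeneracy_temperature(n, m):
    """Temperature at which the left-hand side of (72.15) equals unity."""
    return H**2 / (2.0 * math.pi * m * K) * n ** (2.0 / 3.0)

# Assumed, illustrative densities (cm^-3).
n_electrons = 6.0e22
n_hydrogen = 2.7e19

print("electrons, 300 K:", classical_parameter(n_electrons, 300.0, M_E))
print("hydrogen,  300 K:", classical_parameter(n_hydrogen, 300.0, M_H))
print("hydrogen degeneracy temperature (K):",
      degeneracy_temperature(n_hydrogen, M_H))
```

With these assumed numbers the parameter is much smaller than unity for the electron gas and much larger than unity for hydrogen, in line with the qualitative conclusions drawn above.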

Another case of the quantum (degenerate) gas is the photon gas whose 
properties will be discussed in §76. 

We shall not dwell in more detail here on the properties of the Bose and 
Fermi distributions, since it is more advisable to discuss them for real physical 
systems (the electron gas and the photon gas). 

For all ordinary gases the difference between quantum statistics and 
classical statistics for not particularly large values of the temperature and 
density turns out to be negligibly small. The two quantum distributions, the 
Bose and Fermi distributions, can, with a high degree of accuracy, be
replaced by the Maxwell distribution. The absence of any difference between
Bose statistics and Fermi statistics can be understood if it is taken into
account that the order of magnitude of the mean number n̄_i of particles in an
individual quantum state can be estimated by the following relations:

$$\bar n_i \approx e^{(\mu-\varepsilon_i)/kT} \sim e^{\mu/kT} \ll 1.$$

In a non-degenerate gas for a high temperature and a small density of the 
gas the density of occupation of states is very small. In each state there is on 
the average much less than one particle, hence it is of no importance whether 
or not two or more particles can get into one state; even a pair will practically 
never in any event get into it. 

In spite of the fact that classical statistics can always be applied to atomic 
gases (or, more precisely, quasi-classical statistics, since it is inevitable that
we take into account discrete energy levels and introduce the factor 1/N!),
only the creation of quantum statistics made it possible to solve a number of
the most important physical problems. Some of these will be discussed in
the following sections.


§ 73. Black-body radiation 


The statistical theory of radiation played a very important role in the 
discovery of quantum theory. The classical electromagnetic theory of light, 
which explained a wide range of phenomena associated with the propagation 
of light and which was generally recognized by the end of the 19th century, 
encountered in the beginning of the 20th century insuperable difficulties in 
connection with the problem of the emission of light, in particular, with the 
problem of thermal radiation. We understand thermal radiation to be the 
whole of the radiation emitted by a heated body. 

As is well known, the character of the light emitted, in particular its 
intensity as well as the dependence of the intensity on the frequency (the 
spectral composition of the radiation), is determined by the temperature and 
nature of the emitting body. 

There are, however, cases where the spectral composition of the radiation 
does not depend on the nature of the emitter and is determined solely by its 
temperature. This is the case for so-called equilibrium radiation. Imagine a
closed cavity with walls which do not conduct heat and which are maintained at a
definite temperature T. The walls of the cavity will emit and absorb
electromagnetic waves. Since all of the electromagnetic radiation is confined
within the closed cavity, a state of statistical equilibrium will be established in the
system after a certain time. The walls of the cavity will emit as much electro- 
magnetic energy per unit time as they absorb. A system of standing electro- 
magnetic waves, stationary in time, will exist in the cavity. 

The energy density of the corresponding electromagnetic field inside the 
cavity will be given by formula (12.6) of Part I: 


$$u = \frac{E^2 + H^2}{8\pi}.$$


The thermal radiation will contain different frequencies. The energy density
ρ(ν) in a given frequency interval dν will, obviously, be different for different
frequencies. The energy density of the radiation will also depend on the
temperature T of the emitting walls. Thus,


$$\rho = \rho(\nu, T).$$


A simple thermodynamic consideration shows, however, that ρ(ν, T) does
not depend on the nature of the emitter, in particular, on the nature of the 
walls (their absorption and emission properties, the state of the surface and 
so on). 

Consider two cavities whose walls are heated to the same temperature but 
are made of different materials. We assume that the spectral energy density 
of the radiation depends on the nature of the emitter and is different in the 
two cavities. Then, by connecting the two cavities, one can disturb the state 
of equilibrium. Radiation will pass over to that cavity in which the density 
of radiation is smaller. As a result, the density will increase in this cavity, the 
walls of the cavity will absorb more radiation, and their temperature will 
increase. A difference between the temperatures of the walls of the two 
cavities will arise, which can be used for obtaining useful work. 

The assumption which we have made leads to the conclusion of the possi- 
bility of spontaneous violation of the equilibrium state in a closed system and 
the possibility of constructing a perpetual motion machine of the second 
kind, which is, as is well known, impossible. Thus, it is proved that the
spectral distribution of the energy density ρ(ν,T) of the equilibrium radiation
is a universal function of the frequency ν and the temperature T.

A study of the absorption and emission properties of material bodies led 
Kirchhoff to the establishment of a very important theorem known as the 
Kirchhoff theorem. 

The quantity E(ν), the energy emitted per cm² of the surface of a body
per second with a frequency between ν and ν + dν per unit frequency interval,
will be called the radiative capacity of an arbitrary body.

Further, the part of the entire radiation energy with a frequency between
ν and ν + dν incident on 1 cm² of the surface of a body which is absorbed
inside the body * per unit frequency interval will be called the absorptive
capacity A(ν) of the body.

The Kirchhoff theorem states that the ratio of the radiative capacity to the
absorptive capacity, E(ν)/A(ν), is a universal function of the frequency and
the temperature of the body but depends neither on the nature and properties
of the body nor on its geometrical dimensions, i.e.


$$\frac{E(\nu)}{A(\nu)} = f(\nu, T). \tag{73.1}$$


It turns out that the universal function f(ν,T) is connected by a simple
relation with the energy density of the equilibrium radiation ρ(ν,T) (T is the
temperature of the body):

$$f(\nu, T) = \frac{c}{4}\,\rho(\nu, T),$$

where c is the velocity of light. Thus, the Kirchhoff theorem can be written
in the form

$$\frac{E(\nu)}{A(\nu)} = \frac{c}{4}\,\rho(\nu, T).$$


The proof of the Kirchhoff theorem, which is of a very general character, 
can be found in any course on radiation theory **. 

Since the absorptive capacity of a body can be found without any special
difficulty from a measurement of the absorption coefficients and from
geometrical considerations, finding the form of the function ρ(ν,T) appeared
to be of great interest. From the Kirchhoff formula (73.1) it follows that a
body with an absorptive capacity A(ν) equal to unity is of special importance.


* Not to be confused with the absorption coefficient, which characterizes the absorp- 
tion of light per unit path in matter. The value of the absorptive capacity characterizes 
the absorption in the entire volume of the body. 

** See, for example, M. Planck, Vorlesungen über die Theorie der Wärmestrahlung
(Theory of thermal radiation) (Barth, Leipzig, 1913).




Such a body absorbs the entire electromagnetic energy incident on it for all 
frequencies. It is called an absolute black body. 
For an absolute black body we have 


$$E(\nu) = \frac{c}{4}\,\rho(\nu, T). \tag{73.2}$$


Formula (73.2) shows that the absolute black body has a higher radiative 
capacity than all other bodies. Its radiative capacity is a universal function 
of the frequency v and the temperature T. 

By measuring the radiative capacity of an absolute black body, one can
determine experimentally the form of the function ρ(ν, T).

Of course, not all bodies encountered in nature are absolute black ones. 
No matter what the nature of the surface of the body may be, a certain part 
of the radiative energy incident on it is reflected. However, the closed cavity 
filled with radiation which we have considered above is an absolute black 
body. Indeed, the entire radiation emitted by the walls of the cavity is also 
absorbed by them. If a small opening is made in the cavity, then by studying 
the spectral distribution of the radiation energy coming from this opening, 
one can find the function ρ(ν,T) experimentally. The size of the opening must
be sufficiently small that radiation leakage through the opening does not lead 
to a fundamental departure from the equilibrium state. By means of such a 
model of an absolute black body the spectral energy distribution at different 
temperatures was investigated experimentally. 

Fig. III.45 (see §75) shows typical curves of this kind. The wavelength of
the radiation is plotted on the horizontal axis, while the energy density of
the radiation ρ(λ,T) corresponding to a wavelength λ is plotted on the vertical
axis. The energy density of radiation at a given wavelength is connected with
ρ(ν,T) by the following relation:

$$\rho(\nu, T)\,d\nu = \rho(\lambda, T)\,d\lambda.$$

Taking into account that (in absolute value)

$$d\nu = c\,\frac{d\lambda}{\lambda^2},$$

we have

$$\rho(\lambda, T) = \frac{c}{\lambda^2}\,\rho(\nu, T).$$



Different curves in fig. III.45 refer to different temperatures. All the
curves have a characteristic form. For long wavelengths the density of the
radiation decreases with increasing λ. At a certain wavelength λ_max it has a
maximum, and then it again tends to zero in the direction of short wavelengths.
The position of the maximum shifts in the direction of short wavelengths
as the temperature increases.


§74. The classical theory of black-body radiation 


We now pass on to the calculation of the spectral distribution function
ρ(ν,T).

The electromagnetic radiation in a closed cavity forms a system of standing 
waves. We have considered such an electromagnetic field in §38 of Part I, 
where it has been shown that it can be replaced by a set of equivalent field 
oscillators. The energy of the field turned out to be equal to the sum of the 
energies of the oscillators. In the case of radiation in the cavity one has, 
corresponding to what has been said above, to ascribe to it a temperature 
which is equal to the temperature T of the radiating walls. We can therefore 
say that to each standing wave in the cavity there corresponds an oscillator 
with a frequency ν and an energy ε(ν,T) which depends on the frequency as
well as on the temperature T.

Every one of the oscillators which replace a system of standing waves can
be in different states and can have a different energy ε(ν,T). We shall,
however, be interested not in the instantaneous but in the mean energy
ε̄(ν,T) of the oscillators. Here the averaging is carried out over all possible
states of the oscillator. The energy of the standing waves in unit volume of
the cavity, whose frequencies lie between ν and ν + dν, is numerically equal
to the mean energy of all the oscillators which replace the normal oscillations
and have frequencies in the same interval. If g(ν)dν is the number of
oscillators, then the aforesaid can be written in the form

$$\rho(\nu, T)\,d\nu = \bar\varepsilon(\nu, T)\,g(\nu)\,d\nu. \tag{74.1}$$


We have found the number of natural oscillations in §38 of Part I. In the 
case of electromagnetic waves it is necessary only to take into account that 
they are polarized and can have two directions of polarization. Formula 
(38.22) of Part I gives the number of oscillations with a frequency between
ν and ν + dν for each direction of the polarization. For both directions of the
polarization the number of oscillations must be doubled:

$$\rho(\nu, T)\,d\nu = \frac{8\pi\nu^2}{c^3}\,\bar\varepsilon\,d\nu. \tag{74.2}$$
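
The factor 8πν²/c³ in (74.2) can be made plausible by counting standing waves directly. The Python sketch below is an illustration under stated assumptions: it takes a cubic cavity of edge L in which the wave-vector components are (π/L)(k₁, k₂, k₃) with positive integers k₁, k₂, k₃, uses ν = c|k|/2π, and introduces the dimensionless radius R = 2Lν/c. The function names are chosen for the example only.

```python
import math

def count_modes(R):
    """Number of standing waves (two polarizations each) whose integer triple
    (k1, k2, k3), k_i >= 1, satisfies k1^2 + k2^2 + k3^2 <= R^2."""
    R2 = R * R
    triples = 0
    for k1 in range(1, R + 1):
        for k2 in range(1, R + 1):
            rest = R2 - k1 * k1 - k2 * k2
            if rest < 1:
                break  # larger k2 only makes `rest` smaller
            triples += math.isqrt(rest)  # admissible k3 = 1 .. floor(sqrt(rest))
    return 2 * triples

def continuum_modes(R):
    """Integral of (8*pi*nu^2/c^3) L^3 dnu from 0 to nu, written in terms of R = 2*L*nu/c."""
    return math.pi * R**3 / 3.0

for R in (10, 30, 60):
    exact = count_modes(R)
    approx = continuum_modes(R)
    print(f"R = {R:3d}  counted = {exact:8d}  continuum = {approx:12.1f}  "
          f"ratio = {exact / approx:.4f}")
```

The ratio of the counted number to the continuum estimate approaches unity as R grows, the remaining deficit being a surface correction that becomes negligible for a macroscopic cavity.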


The derivation of formula (74.2) did not involve any concepts of the 
quantum theory; it was obtained before the creation of the quantum theory. 
For the mean energy of the oscillator ε̄ its classical value

$$\bar\varepsilon = kT$$

was taken, and the density of the equilibrium radiation at a temperature T
was written in the form (the Rayleigh–Jeans law)

$$\rho(\nu, T)\,d\nu = \frac{8\pi kT}{c^3}\,\nu^2\,d\nu. \tag{74.3}$$


The senselessness of formula (74.3) is obvious. It shows that the energy
density of the electromagnetic field in a closed cavity increases monotonically
with increasing frequency. Since oscillations of all frequencies, in particular
ν → ∞, can arise in the cavity, formula (74.3) leads to an infinitely large
energy density as ν → ∞:

$$E = \int_0^\infty \rho(\nu, T)\,d\nu \to \infty.$$


This result means that radiation sources confined in the cavity would have to 
emit radiation until the entire thermal energy contained in them went over 
into the radiation of the field and their temperature fell to absolute zero. 
Thus, for example, if a burning hot solid body placed in the cavity were the 
emitter, then from the result obtained it follows that equilibrium in the 
emitter—electromagnetic-field system would only be established after the hot 
body had cooled to absolute zero. 

This conclusion has a simple meaning. According to the law of equiparti- 
tion of energy, all degrees of freedom are equivalent and in an equilibrium 
state the same energy corresponds to each of them. The thermal energy 
contained in a crystal consisting of N atoms can be assumed to be distributed
between 3N oscillators. The electromagnetic field in the cavity can also be
considered as a set of oscillators. However, the number of these is
immeasurably larger than 3N. The wave numbers of possible standing waves in a
closed cavity which has the form of a cube must satisfy the conditions

$$k_x = \frac{\pi k_1}{L}, \qquad k_y = \frac{\pi k_2}{L}, \qquad k_z = \frac{\pi k_3}{L},$$


where L is the length of an edge of the cube, and k₁, k₂, k₃ are numbers running
over a series of integer values from zero to infinity. These conditions are
equivalent to the conditions (38.11) for the crystal, but in the latter case the
values of k₁, k₂ and k₃ are restricted by the number of particles N. Thus,
the number of standing electromagnetic waves in the cavity and the corre- 
sponding number of electromagnetic field oscillators is infinitely larger than 
the number of oscillators needed for the description of thermal motion in a 
crystal. In an equilibrium state the entire energy must be contained in the 
field, since the same energy must correspond to each oscillator. 

This result is in complete contradiction with experimental data. Experi- 
ment shows that the density of the thermal energy contained in the crystal is 
immeasurably larger than the density of the energy of the electromagnetic 
field. For example, at T = 300 K the density of thermal energy in a solid body 
turns out to be a factor of 10¹⁴ larger than the measured energy density
inside a cavity with radiation. As to the spectral energy density distribution,
expressed by formula (74.3), it turns out to be in agreement with the measured
energy distribution in the black-body spectrum for small frequencies
which satisfy the condition hν ≪ kT. On the contrary, for large frequencies,
when hν ≫ kT, ρ(ν,T) increases with the frequency ν much more slowly than
according to the law ν².

Thus, the law of equipartition of energy leads to a complete disagreement 
of theory with experiment when applied to the problem of black-body radia- 
tion in the range of large frequencies. Historically, this was the first well 
investigated case of the inadequacy of classical concepts. The glaring contra- 
diction with experiment to which classical statistics led, incited contem- 
porary physicists to call the situation “the ultraviolet catastrophe”. A way 
out of the contradiction was found by introducing the quantum theory. 


§75. The Planck formula 


The simplest, although not the most obvious method (from the physical
standpoint) of finding the spectral distribution function ρ(ν,T) taking into
account quantization consists of the following.

We substitute into formula (74.2) the value of the mean energy of a field 
oscillator calculated according to the theory of the quantum oscillator. We 
drop the zero-point energy of the oscillator ½hν, choosing it to be the origin of
energy. Then

$$\varepsilon_n = nh\nu \tag{75.1}$$

and

$$\bar\varepsilon = \frac{h\nu\sum_n n\,e^{-nh\nu/kT}}{\sum_n e^{-nh\nu/kT}} = \frac{h\nu}{e^{h\nu/kT} - 1}. \tag{75.2}$$


Substituting (75.2) into formula (74.2), we find the following expression for 
the mean energy of the vacuum electromagnetic field per unit volume for a 
frequency lying between ν and ν + dν:

$$\rho(\nu, T)\,d\nu = \frac{8\pi h\nu^3\,d\nu}{c^3\left(e^{h\nu/kT} - 1\right)}. \tag{75.3}$$


Formula (75.3) is called the Planck formula. This formula was first derived 
semi-empirically, since formula (75.2) for the energy of an oscillator was 
unknown. Formula (75.3) and the Planck constant h contained in it were 
found from experiment. 
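
The closed form in (75.2) can be verified by carrying out the sums over n numerically. The Python sketch below is an illustration; the energies are measured in units of kT, x stands for hν/kT, and the truncation of the sums at n_max terms is an assumption of the example.

```python
import math

def mean_energy_sum(x, n_max=2000):
    """Mean oscillator energy in units of kT from the truncated sums in (75.2);
    x = h*nu/kT, levels eps_n = n*h*nu with n = 0, 1, ..., n_max - 1."""
    numerator = sum(n * x * math.exp(-n * x) for n in range(n_max))
    denominator = sum(math.exp(-n * x) for n in range(n_max))
    return numerator / denominator

def mean_energy_closed(x):
    """Closed form of (75.2) in units of kT: x / (e^x - 1)."""
    return x / math.expm1(x)

for x in (0.1, 1.0, 5.0):
    print(f"x = {x:4.1f}  sum = {mean_energy_sum(x):.10f}  "
          f"closed form = {mean_energy_closed(x):.10f}")
```

For x ≪ 1 the mean energy tends to the classical value kT, while for x ≫ 1 it is exponentially small, which is what removes the divergence of (74.3).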


In the two limiting cases, hν/kT ≪ 1 and hν/kT ≫ 1, the Planck formula is
simplified. In the first case

$$e^{h\nu/kT} \approx 1 + \frac{h\nu}{kT},$$

and formula (75.3) reduces to the form

$$\rho(\nu, T)\,d\nu \approx \frac{8\pi kT}{c^3}\,\nu^2\,d\nu, \tag{75.4}$$

i.e. goes over into the classical formula (74.3) for the mean energy density
of black-body radiation.
For hν/kT ≫ 1

$$\left(e^{h\nu/kT} - 1\right)^{-1} \approx e^{-h\nu/kT},$$

so that

$$\rho(\nu, T)\,d\nu \approx \frac{8\pi h\nu^3}{c^3}\,e^{-h\nu/kT}\,d\nu. \tag{75.5}$$


This last formula is called Wien’s law. 
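
The two limiting forms can be compared with the full Planck formula numerically. In the Python sketch below, which is only an illustration, the spectral density is measured in units of 8π(kT)³/(h²c³) and written as a function of the dimensionless variable x = hν/kT; both the choice of units and the function names are assumptions of the example.

```python
import math

def planck(x):
    """Planck formula (75.3) in units of 8*pi*(kT)^3/(h^2 c^3); x = h*nu/kT."""
    return x**3 / math.expm1(x)

def rayleigh_jeans(x):
    """Rayleigh-Jeans law (74.3)/(75.4) in the same units."""
    return x**2

def wien(x):
    """Wien's law (75.5) in the same units."""
    return x**3 * math.exp(-x)

for x in (0.01, 0.1, 1.0, 5.0, 20.0):
    print(f"x = {x:6.2f}  Planck = {planck(x):.4e}  "
          f"Rayleigh-Jeans = {rayleigh_jeans(x):.4e}  Wien = {wien(x):.4e}")
```

For small x the Planck and Rayleigh–Jeans values coincide, for large x the Planck and Wien values coincide, and the Rayleigh–Jeans expression exceeds the Planck one by many orders of magnitude at high frequencies.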

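Going from ρ(ν,T) to the spectral distribution of the radiation density
with respect to wavelength, ρ(λ,T), we can write formulae (75.3), (75.4)
and (75.5) in the following forms:

$$\rho(\lambda, T) = \frac{8\pi hc}{\lambda^5}\,\frac{1}{e^{hc/kT\lambda} - 1}, \tag{75.6}$$

$$\rho(\lambda, T) \approx \frac{8\pi kT}{\lambda^4} \qquad \left(\frac{hc}{kT\lambda} \ll 1\right), \tag{75.7}$$

$$\rho(\lambda, T) \approx \frac{8\pi hc}{\lambda^5}\,e^{-hc/kT\lambda} \qquad \left(\frac{hc}{kT\lambda} \gg 1\right). \tag{75.8}$$

The curves corresponding to formula (75.6) are shown in fig. III.45. For
large wavelengths ρ(λ,T) decreases with increasing wavelength as λ⁻⁴, while
for small wavelengths ρ(λ,T) tends to zero as λ⁻⁵ exp(−hc/kTλ). The function
ρ(λ,T) has a maximum at a wavelength λ_max which can be found from the
condition

$$\frac{d\rho(\lambda, T)}{d\lambda} = 0,$$

or

$$-\frac{5}{\lambda^6}\,\frac{1}{e^{hc/kT\lambda} - 1} + \frac{hc}{kT\lambda^7}\,\frac{e^{hc/kT\lambda}}{\left(e^{hc/kT\lambda} - 1\right)^2} = 0.$$

Fig. III.45. Horizontal axis: λ (μm).

Denoting the quantity hc/kTλ_max by x, we can write the last equation in the
form

$$\frac{x\,e^x}{e^x - 1} = 5.$$

The solution of this transcendental equation gives

$$x = 4.96, \qquad \lambda_{\max}T = \frac{hc}{4.96\,k} \approx 0.2\,\frac{hc}{k}. \tag{75.9}$$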


Formula (75.9) shows that the position of the maximum of the energy density 
of black-body radiation shifts in the direction of small wavelengths with 
increasing temperature. This is the so-called displacement law. 

The value of the quantum constant h can be determined from the displacement
law. After the proper choice of this value the Planck formula turns
out to be in excellent agreement with experimental data.
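
The root of the transcendental equation for x is easily found numerically. The Python sketch below is an illustration; the bisection routine, the bracketing interval [1, 10] and the rounded CGS values of h, c and k are assumptions of the example.

```python
import math

def wien_equation(x):
    """x*e^x/(e^x - 1) - 5, rewritten as x/(1 - e^(-x)) - 5 to avoid overflow."""
    return x / (-math.expm1(-x)) - 5.0

def bisect(f, lo, hi, tol=1e-12):
    """Root of f on [lo, hi] by bisection; f(lo) and f(hi) must differ in sign."""
    assert f(lo) * f(hi) < 0.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if f(lo) * f(mid) <= 0.0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

x_max = bisect(wien_equation, 1.0, 10.0)

H = 6.62607e-27   # Planck constant, erg*s
C = 2.99792e10    # velocity of light, cm/s
K = 1.38065e-16   # Boltzmann constant, erg/K
displacement_constant = H * C / (K * x_max)   # lambda_max * T, cm*K

print("x =", x_max)
print("lambda_max * T (cm*K) =", displacement_constant)
```

The product λ_max T obtained in this way is about 0.29 cm·K, so that a measured position of the maximum at a known temperature fixes the ratio h/k.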


§76. The statistics of the photon gas 


As we have already pointed out in the introductory chapter, contemporary 
quantum theory, in accord with experimental data, states that radiation 
possesses corpuscular as well as wave properties. Although from the point of 
view of ordinary ideas it is impossible to combine the properties of a wave 
and a particle in one object, one sometimes has to make use of the wave 
aspect and sometimes the corpuscular aspect in order to explain different 
optical phenomena. Thus, for example, in the phenomena of interference or 
diffraction the wave nature of radiation is manifested, whereas in the photo- 
electric effect or the scattering of hard X-rays the corpuscular nature of 
radiation is revealed. From the corpuscular point of view, radiation can be 
considered as a flux of light quanta, or photons, moving in space with the 
velocity of light c. Photons arise and vanish when light is emitted and
absorbed by atoms respectively. Their energy is equal to ε = ΔE, where ΔE
is the difference between the energy levels of the emitting system.

All photons move in vacuum with the same velocity, but different photons 
can have different energy and different momentum. The energy and momen- 
tum of photons are related by the expression 


$$\varepsilon = pc, \tag{76.1}$$

which is a general formula relating these quantities for any object moving with
the velocity of light. The energy and momentum of a photon depend on the
frequency according to the formulae

$$\varepsilon = h\nu = \hbar\omega, \tag{76.2}$$

$$p = h\nu/c = \hbar\omega/c. \tag{76.3}$$


Like other material particles, photons also possess angular momentum 
(see §1, as well as Part V). 

It turns out that as an emitting system (an atom or a molecule) emits
radiation its angular momentum must decrease by an amount which is a
multiple of ħ. The corresponding angular momentum is carried away by the
outgoing photon. Thus, the angular momentum expressed in units of ħ is an
integer. Like all other particles with an integer angular momentum, photons
obey Bose—Einstein statistics. 

From the corpuscular point of view equilibrium radiation filling a closed 
cavity must be considered as a photon gas filling the volume V of the 
container. The particles of the photon gas are moving randomly in all direc- 
tions in the container, and their directions of flight change in collisions with 
the walls of the container. There is no interaction between the photons. 
Hence the photon gas must in its properties be similar to an ordinary mole- 
cular ideal gas filling a closed container. 

However, in addition to the similarity between the photon gas and the 
molecular gas there is also a very profound difference between them. The 
most fundamental difference between the photon gas and the molecular gas 
is the fact that for the photon gas one cannot speak of a fixed number of 
particles. In contrast to ordinary particles (electrons, protons or atoms), 
photons can arise or vanish at the moment of emission or absorption of light 
by atoms. Hence the number of photons in the cavity cannot be considered as 
fixed. 

Another difference between photons and gas molecules is the fact that the
former all move with the same speed. In fact, however, this property of the
photon gas is not associated with the specific nature of photons. For very 
large values of the kinetic energy of all particles their velocities approach the 
velocity of light and the differences between the velocities of individual 
particles are progressively smoothed over. Hence this difference between the 
photon gas and a molecular gas is not essential. Of importance only is the 
fact that in the photon gas, as well as in a molecular gas, there is a certain 
distribution of particles over momenta and energies. 

Finally, there is one more difference between the photon gas and a gas of
ordinary particles, a difference of principle rather than of a practical
character. As has been shown in §6 and §16, the establishment of the velocity
distribution (or the momentum distribution) of molecules is closely associated 
with the interaction occurring between them in molecular collisions. But 
photons do not collide with each other at all. An equilibrium distribution of 
photons can be established only in the case where there is a body in the 
cavity which can absorb and emit photons. In the process of absorption and 
subsequent emission photons of one frequency transform into photons of 
other frequencies. In this case the number of photons does not remain 
constant, although their total energy must be conserved. In particular, the 
walls of a cavity containing a “photon gas” can act as such a body. 

We shall now show that, proceeding from the concept of a photon gas, 
one can arrive at the Planck formula with the same success as using the wave 
concept. 

From the corpuscular point of view the function ρ(ω,T) can be interpreted
in the following way. In the frequency interval ω, ω + dω, or in the
corresponding energy interval of the photons ε, ε + dε, let there be dΩ
quantum states of photons per unit volume. Further, let the mean number
of photons in each state be equal to n̄(ε). Then the mean number of photons
with an energy between ε and ε + dε is equal to

$$dn = \bar n(\varepsilon)\,d\Omega. \tag{76.4}$$

Their mean energy is equal to ε n̄(ε) dΩ. But this energy represents none other
than the energy density of radiation with a frequency between ω and ω + dω.
Thus,

$$\rho(\omega, T)\,d\omega = \varepsilon\,\bar n(\varepsilon)\,d\Omega. \tag{76.5}$$

Our problem is to find n̄(ε), i.e. the mean number of particles in a gas of a
variable number of particles. Since photons are particles with spin equal to
unity, one can write the Bose–Einstein distribution for n̄(ε).

It is necessary, however, to take into account the peculiar feature of the 
photon gas which is associated with the possibility of absorption and emission 
of photons by the walls of the container or by material bodies placed inside 
the cavity. 

The number of particles in a photon gas is variable and depends on the 
state of the gas. Hence, in contrast to an ordinary molecular gas, the free 
energy of a photon gas depends not only on the variables V and T but also 
on the number of particles N in the gas. For a given value of V and T the
number of photons in an equilibrium state will have a value N₀ such that the
free energy F(V,T,N₀) has its minimum value. Thus, it can be said that an
equilibrium state of a photon gas occurs when the following equality is
fulfilled:

$$\frac{\partial F(V, T, N)}{\partial N} = \mu = 0. \tag{76.6}$$


Here we have made use of formula (60.3’). 

Eq. (76.6) shows that the chemical potential of an equilibrium photon 
gas is zero. 

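Thus, the mean number of particles of a photon gas per unit volume which
have an energy ε must, by virtue of (72.7) and (76.6), be written in the form

$$\bar n(\varepsilon) = \frac{1}{e^{\varepsilon/kT} - 1}. \tag{76.7}$$

The number of photons with an energy between ε and ε + dε is equal to

$$dn(\varepsilon) = \frac{2\cdot 4\pi p^2\,dp}{h^3\left(e^{\varepsilon/kT} - 1\right)} = \frac{8\pi}{h^3c^3}\,\frac{\varepsilon^2\,d\varepsilon}{e^{\varepsilon/kT} - 1}. \tag{76.8}$$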


The factor 2 is introduced to take into account the two-fold degeneracy of 
the states of photons with a given momentum p. For a given value of p there 
are two states corresponding to the two possible independent directions of 
polarization of light. 

The total number of photons in the equilibrium radiation can be found by 
integrating (76.8) over all values of e. 

Making use of formula (76.2), ε can be expressed in terms of the frequency
ω. Performing this substitution and passing over from the number of photons
to their energy, we find

$$\rho(\omega, T)\,d\omega = \varepsilon\,\bar n(\varepsilon)\,d\Omega = \frac{8\pi}{h^3c^3}\,\frac{\varepsilon^3\,d\varepsilon}{e^{\varepsilon/kT} - 1} = \frac{\hbar\omega^3\,d\omega}{\pi^2c^3\left(e^{\hbar\omega/kT} - 1\right)}, \tag{76.9}$$


i.e. the Planck formula. 

It should be stressed that if the Boltzmann distribution and not the Bose— 
Einstein distribution were applied to photons, then instead of the Planck 
formula one would obtain formula (75.5) which is valid only when
ħω/kT ≫ 1. Indeed, substituting the expression n̄ = exp(−ε/kT) = exp(−ħω/kT) for
n̄(ε) in place of (76.7), we obtain Wien’s law (75.5) instead of the Planck formula
(76.9).

Thus, classical statistics cannot, in general, be applied to photons. The region
of applicability of classical statistics to the photon gas is restricted by the
condition ħω/kT ≫ 1. This condition is the reverse of that for classical
statistics to be applicable to electromagnetic field oscillators. Thus, for
high frequencies (or low temperatures) it is the corpuscular properties of 
radiation that are predominant, whereas for low frequencies (or high 
temperatures) it is the wave properties of radiation that are predominant. 

The energy of electromagnetic radiation or the energy per unit volume of
a photon gas is obtained from (76.9) by integrating over all frequencies:

$$u = \int_0^\infty \rho(\omega, T)\,d\omega = \frac{\hbar}{\pi^2c^3}\int_0^\infty \frac{\omega^3\,d\omega}{e^{\hbar\omega/kT} - 1}.$$

To calculate the integral we introduce a new variable, x = ħω/kT. Then

$$u = \frac{\hbar}{\pi^2c^3}\left(\frac{kT}{\hbar}\right)^4\int_0^\infty \frac{x^3\,dx}{e^x - 1}.$$

The value of this integral is obtained in Appendix IV:

$$\int_0^\infty \frac{x^3\,dx}{e^x - 1} = \frac{\pi^4}{15}.$$

Hence finally we have

$$u = \frac{\pi^2(kT)^4}{15(\hbar c)^3} = aT^4. \tag{76.10}$$


The energy density of black-body radiation turns out to be proportional 
to the fourth power of the absolute temperature (the Stefan—Boltzmann law). 
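
Both the value of the integral and the numerical value of the constant a can be checked with a few lines of Python. The sketch below is an illustration; the Simpson quadrature, the cut-off of the integration at x = 50 (beyond which the integrand is negligible) and the rounded CGS constants are assumptions of the example.

```python
import math

def integrand(x):
    """x^3/(e^x - 1), with the limiting value 0 at x = 0."""
    return 0.0 if x == 0.0 else x**3 / math.expm1(x)

def simpson(f, a, b, n):
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0

integral = simpson(integrand, 0.0, 50.0, 5000)

K = 1.380649e-16     # Boltzmann constant, erg/K
HBAR = 1.054572e-27  # Planck constant divided by 2*pi, erg*s
C = 2.997925e10      # velocity of light, cm/s
a = math.pi**2 * K**4 / (15.0 * HBAR**3 * C**3)   # erg cm^-3 K^-4

print("integral =", integral, " pi^4/15 =", math.pi**4 / 15.0)
print("a =", a, "erg cm^-3 K^-4")
print("u(300 K) =", a * 300.0**4, "erg/cm^3")
```

The constant a is of the order of 10⁻¹⁴ erg cm⁻³ K⁻⁴, which shows how small the energy density of equilibrium radiation is at ordinary temperatures.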

The constant a figuring in (76.10) contains only the universal constants 
h, c and k. The Stefan—Boltzmann law is widely applied in heat engineering 
for the calculation of the radiative capacity of heated surfaces. Although 
radiators encountered in practice are not black bodies, the application of the 
Stefan—Boltzmann law leads to good results for all solid radiators except 
metals. For the latter the emitted energy increases as a higher power of the 
temperature. 

The total energy of radiation in a volume V is equal to 


$$E = aVT^4. \tag{76.11}$$


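Further, we find the free energy of black-body radiation. According to
formula (30.12), we have

$$F = -T\int \frac{E\,dT}{T^2} = -\frac{1}{3}\,aVT^4. \tag{76.12}$$

The entropy of radiation is

$$S = -\frac{\partial F}{\partial T} = \frac{4}{3}\,aVT^3. \tag{76.13}$$

And the radiation pressure p is equal to

$$p = -\frac{\partial F}{\partial V} = \frac{1}{3}\,aT^4 = \frac{1}{3}\,\frac{E}{V}. \tag{76.14}$$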


Radiation pressure was first discovered by P.N.Lebedev. This discovery was 
of fundamental importance. It allowed proof of the impossibility of con- 
structing a perpetual motion machine of the second kind in which radiation 
could be used as the working medium. 

Radiation pressure, which is very small in terrestrial conditions, assumes 
an extremely great importance in astrophysics. As is shown by formula 
(76.14), the radiation pressure increases very rapidly with increasing temper- 
ature. At very high temperatures, such as occur in astrophysical conditions, 
the radiation pressure turns out to be larger than the gas pressure, and plays a 
basic role in a number of astrophysical processes. 

Finally, a simple calculation shows that the thermodynamic potential
Φ of radiation is equal to zero:

$$\Phi = F + pV = 0.$$

This is in agreement with our requirement μ = 0 for the photon gas.
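
The thermodynamic relations for radiation can be checked by differentiating the free energy numerically. The Python sketch below is an illustration which assumes the free energy F = −aVT⁴/3 and a rounded CGS value of the constant a; the derivatives are approximated by central differences with a relative step chosen for the example.

```python
A_RAD = 7.566e-15  # constant a, erg cm^-3 K^-4 (rounded)

def free_energy(V, T):
    """Free energy of black-body radiation, F = -a V T^4 / 3."""
    return -A_RAD * V * T**4 / 3.0

def entropy(V, T, rel=1e-4):
    """S = -dF/dT by a central difference with step rel*T."""
    dT = rel * T
    return -(free_energy(V, T + dT) - free_energy(V, T - dT)) / (2.0 * dT)

def pressure(V, T, rel=1e-4):
    """p = -dF/dV by a central difference with step rel*V."""
    dV = rel * V
    return -(free_energy(V + dV, T) - free_energy(V - dV, T)) / (2.0 * dV)

V, T = 2.0, 300.0            # cm^3, K (arbitrary illustrative values)
F = free_energy(V, T)
S = entropy(V, T)
p = pressure(V, T)

print("energy  F + T*S =", F + T * S, " a*V*T^4 =", A_RAD * V * T**4)
print("potential F + p*V =", F + p * V)
print("pressure at 300 K (dyn/cm^2):", p)
print("pressure at 1e7 K (dyn/cm^2):", pressure(V, 1.0e7))
```

The combination F + pV vanishes to within rounding error, and the strong growth of the pressure between 300 K and 10⁷ K illustrates why radiation pressure matters only under astrophysical conditions.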


§77. The properties of liquid helium II


A very interesting example of a macroscopic system in which quantum 
effects are displayed is liquid helium II, the only system which remains 
liquid down to absolute zero. All other liquids solidify at temperatures 
which are too high for quantum effects to be shown. 

As shown by experiment, liquid helium can exist in two modifications
which are called liquid helium I and liquid helium II and which differ
sharply from each other in physical properties.

Fig. III.37 (see §62) shows the phase diagram of helium. It is seen that at
pressures above 30 atm liquid helium I, which represents the high-temperature
modification, goes over into the solid state as the temperature decreases.
However, at pressures below 30 atm helium does not solidify at any temperature
and remains liquid down to T = 0 K. On curve II the phase transition
of helium I into the other modification takes place. The behaviour of the heat
capacity (fig. III.46), density and a number of other properties of helium
depending on the temperature is indicative of this transition. The heat
capacity undergoes a jump at the transition point; on the density curve a
break is observed at this point, and so on. Since the latent heat of the phase
transition from helium I to helium II is zero, this phase transition is a
typical phase transition of the second kind.

Fig. III.46. Horizontal axis: T (K).

Liquid helium II possesses a number of remarkable properties which are 
due to its quantum nature. Some of these will be described below *. 

L.D.Landau proposed a statistical theory of liquid helium II based on 
certain assumptions about the character of the energy spectrum of this 
system. New studies on the theory of helium II will be discussed in Part V. 
In these studies the assumptions which underlie the theory of Landau are 
derived from the general propositions of the quantum mechanics of a system 
of particles. 

Consider a certain amount of liquid helium II confined in a container. The 
liquid as a whole represents a quantum system; its possible energy values form
a certain energy spectrum. At very low temperatures the discrete character
of the energy spectrum of the liquid cannot be disregarded, in spite of the 
fact that the liquid is a macroscopic system. We have to determine the 
character of the energy spectrum of the macroscopic quantum system for 
very small excitation energies, when the system can only be at energy levels 
close to the ground level which the system is in at absolute zero. 

An accurate calculation of the energy levels of a system consisting of a 
large number of strongly interacting particles appears to be impossible at 
present. This holds equally for liquid helium II, crystals, mutually interacting 
electrons and all other systems of interacting particles. Nevertheless, it is 
possible to establish certain general properties of the energy spectrum of such 
systems for small excitation energies. In particular, such an energy spectrum 
will be possessed by liquid helium II, in which the smallness of the excitation 
energy is ensured by the temperature. 

The basic property of the energy spectrum of any macroscopic system for 
small excitation energies is the fact that the excitation energy can be re- 
solved into a set of independent “elementary excitations”. 

We shall consider for concreteness the energy spectrum of the elastic 
oscillations of a crystal or a liquid for small excitation energies. We have seen 
before that the motion of the atoms of a solid body can be resolved into 
independent elastic waves which do not interact with each other and are 
propagating over the entire volume of the body. The only difference between 
a crystal and a quantum liquid is the fact that in the former both longitudinal 
and transverse waves can propagate, whereas in the liquid only longitudinal 
waves (compression and expansion waves) can exist. Each of these waves
carries a definite invariable energy which can be considered as an elementary
excitation. The energy of the entire body can be considered as a set of
elementary excitations, i.e. as the sum of the energies of all independent
elastic waves propagating in the body. From the above it is clear that an
elementary excitation corresponds to an excitation energy of the body as a
whole and can in no way be referred to an individual atom with an excess
energy over other atoms in the body. Each of the elementary excitations
representing a sound wave moves in the body, undergoes reflections from its
walls, moves in a new direction and so on. An elementary excitation
possesses a definite momentum in addition to an energy.

* See the book which we follow: W.H.Keesom, Helium (Elsevier Publ. Co., Amsterdam, 1942, repr. 1959).

The motion of all elementary excitations in a body can be likened to the 
motion of non-interacting quasi-particles, that is excitation quanta, which 
form an ideal gas inside the body. A complete analogy can be drawn between 
light waves and light quanta on the one hand, and between elastic waves and 
excitation quanta in a crystal on the other hand. Just as the light field can be 
treated as a set of light quanta (photons), the field of elastic waves which fill 
a crystal can be replaced by a gas of excitation quanta which are often 
called phonons. 

However, an essential reservation should be made. This analogy, which is 
very convenient for carrying out a number of calculations, has only a formal 
character. Sound quanta have no direct physical reality and serve only for a 
mathematical expression of the properties of a discrete set of elastic waves in 
a crystal. Keeping in mind this reservation, we shall assume in what follows 
that the excitation energy of a body represents the energy of excitation 
quanta which fill its entire volume like an ideal gas. The energy ε of an
excitation quantum has, in the general case, an unknown functional relationship
with its momentum p.

We shall now consider the energy spectrum for the case of liquid helium * 
in more detail. The basic properties of liquid helium can be derived from 
certain simple assumptions about the form of the spectrum. That is, it 
should be assumed that in helium II there are two forms of excitation quanta: 
long-wave and short-wave ones. The first, which have a large wavelength λ,
carry a small momentum p = h/λ and a small energy ε(p). For small p the
function ε(p) can be expanded in a series in powers of p, and one can write


$$\varepsilon \approx \mathrm{const}\cdot p. \qquad (77.1)$$


* We shall return to the problem of the energy spectrum of helium II in Part V. The 
problem of collective excitations will be discussed in more detail in Part VI. 







Long-wave excitations in liquid helium II represent elastic longitudinal 
compression and expansion waves. Therefore the constant in (77.1) is simply 
the velocity c of propagation of sound waves. Thus, for long-wave quanta one 
can write 


$$\varepsilon = cp. \qquad (77.2)$$


In hydrodynamics it turns out that, when sound waves of a small amplitude 
arise in a liquid, then the liquid comes into a state of vortex-free (potential) 
motion. However, in the general case the flow of a non-ideal (viscous) 
liquid is a vortex motion. Hence in addition to longitudinal sound waves in 
liquid helium II there must exist other elementary excitations. We shall 
assume that besides long-wave sound excitation quanta there exist also in 
helium II short-wave excitation quanta whose wavelength is close to a certain
wavelength λ₀. The corresponding momentum of the short-wave quanta is
close to p₀ = h/λ₀. We assume that the energy of quanta with momentum p₀
has a minimum value in comparison with the energy of all quanta whose
momenta are close to p₀. In other words, we assume that the energy of
elementary excitations has the form of the curve shown in fig. III.47. It can
then be said that in the liquid besides long-wave excitation quanta whose
momentum is close to zero there will also exist quanta with a momentum
p ≈ p₀. The energy of such quanta can be written in the form

$$\varepsilon(p) = \varepsilon(p_0) + \frac{(p-p_0)^{2}}{2\mu}, \qquad (77.3)$$

where ε(p₀) and μ are constants whose values must be determined experimentally.
In expansion (77.3) in powers of p − p₀ the term proportional to
the first power of this difference is absent, because, by the assumption, ε(p)
has a minimum at the point p = p₀. The constant in the second term is
denoted by 1/2μ, to stress that the energy of short-wave quanta formally looks
the same as the energy of ordinary particles.

[Fig. III.47. Energy of elementary excitations in helium II as a function of p/h (cm⁻¹).]

Of course, there are no grounds for assuming beforehand that in the excita- 
tion spectrum of the quantum liquid there exist mainly just quanta of the 
two types quoted. However, the introduction of such a spectrum is justified 
by the fact that by means of it, it turns out to be possible to give a quanti- 
tative explanation of all the peculiar phenomena which occur in helium II. 

At the same time it should be kept in mind that in addition to short-wave 
and long-wave excitation quanta in the liquid there are also quanta of inter- 
mediate wavelengths, but that the number of such quanta is relatively small. 
As we have just stressed, excitation quanta move over the entire volume of 
the body without interacting with each other (for small excitations), like the 
particles of an ideal gas filling the volume of the body. Long-wave excitation 
quanta can be likened to photons, while short-wave quanta behave as ordinary 
particles of an ideal gas which have a mass μ.

To avoid any misunderstanding, we stress once more that this analogy has 
only a mathematical nature. In reality, each excitation represents a particular 
form of motion of all the atoms of the liquid. Hence a short-wave excitation 
cannot be represented as a real particle moving in the liquid. However, the 
mathematical analogy between a set of excitation quanta and an ideal gas 
allows one to find the thermodynamic functions of liquid helium easily. 


§78. Statistical theory of liquid helium II 


The presence of thermal excitation quanta in liquid helium II means, from 
the macroscopic point of view, that it has a free energy F which can be 
assumed to be made up of the free energy due to the existence of long-wave 
excitations and the free energy due to the existence of short-wave excitations: 





$$F = F_{\mathrm{l.w}} + F_{\mathrm{s.w}}. \qquad (78.1)$$


We shall write an expression for each of the terms separately. 

The free energy of long-wave quanta F_l.w can immediately be written in
analogy with the free energy of a solid body at low temperature, taking into
account that now only longitudinal waves can exist, whereas transverse waves
are absent. Thus,

$$\frac{4\pi V\nu_{\max}^{3}}{3c^{3}} = 3N, \qquad (78.2)$$

where N is the number of atoms of the liquid in a volume V, ν_max is the
maximum frequency of the sound waves, and c is the velocity of sound.

Hence, substituting the value θ = hν_max/k into (53.7) and taking into
account (78.2), we find for F_l.w

$$F_{\mathrm{l.w}} = -\frac{4\pi^{5}(kT)^{4}V}{45h^{3}c^{3}}. \qquad (78.3)$$

The calculation of F_s.w is somewhat more complicated. Short-wave quanta
behave like the particles of an ideal gas. Their energy, which is determined by
formula (77.3), can be assumed to be large in comparison with kT at sufficiently
low temperatures where it is still possible to speak of independent
elementary excitations. For this the following inequality must in any case be
fulfilled:


$$\varepsilon(p_0) \gg kT.$$


We shall see below that this inequality is indeed fulfilled in liquid helium II. 
Hence the short-wave quantum distribution function has the form of the 
classical Boltzmann distribution. The free energy of a classical ideal gas 
(taking into account the identity of particles) has the form 





$$F_{\mathrm{s.w}} = -N_{\mathrm{s.w}}kT\ln\left[\frac{eV}{N_{\mathrm{s.w}}}\int e^{-\varepsilon/kT}\,\frac{d\gamma}{h^{3}}\right], \qquad (78.4)$$

where dγ = dp_x dp_y dp_z and N_s.w is the number of short-wave excitation
quanta. The value of N_s.w is, however, not a fixed quantity, but depends on
the temperature of the liquid. It increases with increasing excitation, i.e.
with increasing temperature of the liquid *. For a given temperature the
number of short-wave excitation quanta is determined from the condition
of the minimum free energy:

$$\frac{\partial F_{\mathrm{s.w}}}{\partial N_{\mathrm{s.w}}} = 0. \qquad (78.5)$$


Substituting (78.4) into the condition (78.5), we find for the number of 
short-wave quanta the expression 


$$N_{\mathrm{s.w}} = V\int e^{-\varepsilon/kT}\,\frac{d\gamma}{h^{3}}. \qquad (78.6)$$
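
The minimum condition (78.5) can be checked numerically. The sketch below is only an illustration in dimensionless units: the symbol I stands for the integral of exp(−ε/kT) dγ/h³, and the values of kT, V and I used in the usage lines are arbitrary assumptions. It scans F(N) from (78.4) on a grid and compares the location and depth of the minimum with N = VI and F = −kTVI.

```python
import math


def free_energy(N, kT, V, I):
    """Free energy (78.4) of N classical quasi-particles.

    I denotes the integral of exp(-eps/kT) d(gamma)/h^3 (per unit volume).
    """
    return -N * kT * math.log(math.e * V * I / N)


def equilibrium_number(V, I):
    """Number of quanta that follows from the minimum condition (78.5)."""
    return V * I


def scan_minimum(kT, V, I, n_points=20001):
    """Brute-force search for the minimum of F(N) between 0.5 and 1.5 of V*I."""
    N0 = V * I
    best_N, best_F = None, float("inf")
    for i in range(1, n_points + 1):
        N = N0 * (0.5 + i / n_points)
        F = free_energy(N, kT, V, I)
        if F < best_F:
            best_N, best_F = N, F
    return best_N, best_F


if __name__ == "__main__":
    kT, V, I = 1.0, 1.0, 1000.0   # arbitrary illustrative values
    N_min, F_min = scan_minimum(kT, V, I)
    print("grid minimum at N =", N_min, " analytic N =", equilibrium_number(V, I))
    print("F at minimum =", F_min, " analytic -kT*V*I =", -kT * V * I)
```

Since the second derivative of (78.4) with respect to N equals kT/N and is positive, the stationary point is indeed a minimum.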


Substituting N_s.w into (78.4), we find for the free energy

$$F_{\mathrm{s.w}} = -kTV\int e^{-\varepsilon/kT}\,\frac{d\gamma}{h^{3}}. \qquad (78.7)$$


We calculate the integral contained in (78.7). Obviously, we have 


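$$\int e^{-\varepsilon/kT}\,\frac{d\gamma}{h^{3}} = \int \exp\left[-\frac{\varepsilon(p_0)}{kT} - \frac{(p-p_0)^{2}}{2\mu kT}\right]\frac{4\pi p^{2}\,dp}{h^{3}} = 4\pi\exp\left(-\frac{\varepsilon(p_0)}{kT}\right)\int \exp\left(-\frac{(p-p_0)^{2}}{2\mu kT}\right)\frac{p^{2}\,dp}{h^{3}}.$$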

The range of integration with respect to the momentum of the excitation
quanta is not accurately determined. Since, however, the integrand rapidly
decreases with increasing difference p − p₀, and for (p − p₀)²/2μ ≫ kT
practically reduces to zero, the range of integration can be extended to ±∞.
Then we have

$$\int e^{-\varepsilon/kT}\,\frac{d\gamma}{h^{3}} = 4\pi\exp\left(-\frac{\varepsilon(p_0)}{kT}\right)\int_{-\infty}^{\infty}\exp\left(-\frac{(p-p_0)^{2}}{2\mu kT}\right)\frac{p^{2}\,dp}{h^{3}}.$$








* The dependence of N_s.w on the temperature allows us once again to convince
ourselves that the treatment of elementary excitations as quasi-particles has a relative 
character. 





Since the integrand reduces to zero when (p − p₀)²/2μ ≫ kT, the slowly
varying function p² can be brought out of the integral sign and taken at the
point p = p₀. Then

$$\int e^{-\varepsilon/kT}\,\frac{d\gamma}{h^{3}} = 4\pi p_0^{2}\exp\left(-\frac{\varepsilon(p_0)}{kT}\right)\int_{-\infty}^{\infty}\exp\left(-\frac{(p-p_0)^{2}}{2\mu kT}\right)\frac{dp}{h^{3}} = \frac{4\pi p_0^{2}(2\pi\mu kT)^{1/2}}{h^{3}}\exp\left(-\frac{\varepsilon(p_0)}{kT}\right). \qquad (78.8)$$
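
The quality of the step in which p² is taken outside the integral at p = p₀ can be estimated directly. For a Gaussian of width σ, with σ² = μkT, the exact integral of p² exp[−(p − p₀)²/2σ²] over all p equals (2π)^{1/2} σ (p₀² + σ²), so the relative correction to (78.8) is μkT/p₀². The sketch below checks this with a simple trapezoidal sum; the numerical values of p₀ and σ are arbitrary dimensionless assumptions.

```python
import math


def gaussian_moment_numeric(p0, sigma, half_width=10.0, steps=20000):
    """Trapezoidal estimate of the integral of p^2 exp(-(p-p0)^2/(2 sigma^2)) dp."""
    a = p0 - half_width * sigma
    b = p0 + half_width * sigma
    h = (b - a) / steps
    total = 0.0
    for i in range(steps + 1):
        p = a + i * h
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * p * p * math.exp(-((p - p0) ** 2) / (2.0 * sigma**2))
    return total * h


def gaussian_moment_approx(p0, sigma):
    """Approximation used in (78.8): p^2 replaced by p0^2 under the integral."""
    return p0 * p0 * math.sqrt(2.0 * math.pi) * sigma


def relative_error(p0, sigma):
    """Relative difference between the numerical integral and the approximation."""
    return gaussian_moment_numeric(p0, sigma) / gaussian_moment_approx(p0, sigma) - 1.0


if __name__ == "__main__":
    p0, sigma = 1.0, 0.1          # arbitrary units, sigma^2 plays the role of mu*k*T
    print("relative error:", relative_error(p0, sigma))
    print("expected sigma^2/p0^2:", sigma**2 / p0**2)
```

The approximation therefore improves as the temperature is lowered, in line with the condition (p − p₀)²/2μ ≫ kT used above.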
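Thus, we obtain finally

$$F_{\mathrm{s.w}} = -\frac{4\pi(2\pi\mu)^{1/2}(kT)^{3/2}Vp_0^{2}}{h^{3}}\exp\left(-\frac{\varepsilon(p_0)}{kT}\right), \qquad (78.9)$$

$$N_{\mathrm{s.w}} = \frac{4\pi p_0^{2}V(2\pi\mu kT)^{1/2}}{h^{3}}\exp\left(-\frac{\varepsilon(p_0)}{kT}\right). \qquad (78.10)$$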


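Substituting the values of F_l.w and F_s.w from (78.3) and (78.9) into
(78.1), we find the expression for the free energy of liquid helium II:

$$F = -\frac{4\pi^{5}(kT)^{4}V}{45h^{3}c^{3}} - \frac{4\pi(2\pi\mu)^{1/2}(kT)^{3/2}Vp_0^{2}}{h^{3}}\exp\left(-\frac{\varepsilon(p_0)}{kT}\right). \qquad (78.11)$$

Correspondingly, the entropy and heat capacity of helium II are equal to

$$S = \frac{16\pi^{5}k^{4}T^{3}V}{45h^{3}c^{3}} + \frac{4\pi(2\pi\mu)^{1/2}p_0^{2}k^{3/2}T^{1/2}V}{h^{3}}\left[\frac{3}{2} + \frac{\varepsilon(p_0)}{kT}\right]\exp\left(-\frac{\varepsilon(p_0)}{kT}\right), \qquad (78.12)$$

$$C_V = \frac{16\pi^{5}k^{4}T^{3}V}{15h^{3}c^{3}} + \frac{4\pi(2\pi\mu)^{1/2}p_0^{2}k^{3/2}T^{1/2}V}{h^{3}}\left[\frac{3}{4} + \frac{\varepsilon(p_0)}{kT} + \frac{\varepsilon^{2}(p_0)}{k^{2}T^{2}}\right]\exp\left(-\frac{\varepsilon(p_0)}{kT}\right). \qquad (78.13)$$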


We see that all the thermodynamic quantities are made up of two parts 
which are due respectively to the long-wave and the short-wave excitations. 
The first part varies with temperature according to the same power law as in 
the case of crystals. The second part depends on the temperature exponentially,
i.e. proportional to exp[−ε(p₀)/kT]. The values of the constants were
determined from the measurements of the entropy and heat capacity of
helium II, and turned out to be equal to


ε(p₀)/k = 9.6 K,   p₀/h = 12.25 × 10⁸ cm⁻¹,   μ = 0.75 m_He.


For these values of the constants the power part of the heat capacity and
entropy exceeds the exponential part at temperatures which are lower than
about 1 K. On the contrary, at high temperatures the exponential (short-wave)
part predominates. The temperature trend of the thermodynamic
quantities is in complete agreement with experiment.
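
The passage from the free energy to the entropy and the heat capacity, S = −∂F/∂T and C_V = T ∂S/∂T, can be verified by numerical differentiation. The sketch below works in assumed dimensionless units with k = 1: the long-wave part is written as −BT⁴ and the short-wave part as −A T^{3/2} exp(−Δ/T), where A, B and Δ are stand-ins for the constant factors and for ε(p₀) in (78.3) and (78.9). It is an illustrative check of the algebra, not a fit to helium data.

```python
import math


def f_long_wave(T, B=1.0):
    """Long-wave free energy, F = -B T^4 (form of (78.3), units with k = 1)."""
    return -B * T**4


def f_short_wave(T, A=1.0, delta=1.0):
    """Short-wave free energy, F = -A T^(3/2) exp(-delta/T) (form of (78.9))."""
    return -A * T**1.5 * math.exp(-delta / T)


def entropy_numeric(F, T, h=1e-5):
    """S = -dF/dT by a central difference."""
    return -(F(T + h) - F(T - h)) / (2.0 * h)


def heat_capacity_numeric(F, T, h=1e-4):
    """C = T dS/dT = -T d^2F/dT^2 by a central second difference."""
    return -T * (F(T + h) - 2.0 * F(T) + F(T - h)) / h**2


def entropy_short_wave(T, A=1.0, delta=1.0):
    """Closed form of the short-wave entropy, second term of (78.12)."""
    x = delta / T
    return A * math.sqrt(T) * (1.5 + x) * math.exp(-x)


def heat_capacity_short_wave(T, A=1.0, delta=1.0):
    """Closed form of the short-wave heat capacity, second term of (78.13)."""
    x = delta / T
    return A * math.sqrt(T) * (0.75 + x + x * x) * math.exp(-x)


if __name__ == "__main__":
    T = 0.2
    print(entropy_numeric(f_short_wave, T), entropy_short_wave(T))
    print(heat_capacity_numeric(f_short_wave, T), heat_capacity_short_wave(T))
    print(entropy_numeric(f_long_wave, T), 4.0 * T**3)
    print(heat_capacity_numeric(f_long_wave, T), 12.0 * T**3)
```

The long-wave terms give S = 4BT³ and C_V = 12BT³, which reproduces the ratio of the coefficients 16/45 and 16/15 in (78.12) and (78.13).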

The most outstanding property of liquid helium II, which was discovered
by P.L.Kapitza, is its “superfluidity”. Namely, measurements of the viscosity 
of helium II flowing through fine slits and capillary tubes showed the viscosity 
to be negligibly small. Owing to this helium II flows practically without 
obstruction through the finest capillary tubes. A consequence of the superfluidity
of helium II is its extraordinarily high thermal conductivity (“thermal
superconductivity”), which was discovered experimentally before the
superfluidity. Owing to the negligibly small viscosity, characteristic convec- 
tion streams arise in helium II, which make it possible to transfer considerable 
amounts of heat under conditions in which an ordinary viscous liquid devoid 
of convection agitation has a negligibly small thermal conductivity. The 
phenomenon of superfluidity finds a complete explanation in the theory 
given above. It turns out to be closely associated with the character of the 
energy spectrum of helium II. 

Let us consider the flow of helium II along a solid wall. For convenience 
of the treatment we shall transform to the reference frame in which the 
helium is at rest and the solid wall is moving. From the point of view of 
excitation quanta any process of energy dissipation due to the viscosity can 
be considered in the following way. In the reference frame which we have 
chosen the energy of the helium is initially given and is determined by the 
number of elementary thermal excitations. The interaction between the wall 
and the liquid leads to the appearance of an additional internal motion in the 
liquid layer adjacent to the wall. This internal motion represents the thermal 
motion of the particles of the liquid. Thus, the energy dissipation consists 
in the appearance of excitation quanta (thermal motion) in the liquid. We 
shall assume in the beginning that in the helium II there were initially no 
excitation quanta, i.e. that its temperature T was equal to zero. Let an 
excitation quantum with a momentum p and an energy ε(p) arise in the
helium. Then the internal energy of the helium will become equal to ε(p). In
the reference frame in which the helium flows while the wall is at rest, its
energy, according to the rules for the transformation of energy in a relative
motion, is equal to

$$E = \tfrac{1}{2}mv^{2} + \varepsilon(p) + \mathbf{p}\cdot\mathbf{v}, \qquad (78.14)$$

where v is the velocity of flow, ½mv² is the kinetic energy of the liquid, and
ε(p) + p·v is the change in its energy.

When energy is dissipated the kinetic energy of the flowing liquid can only
decrease, i.e. ε + p·v < 0. The smallest value of the quantity ε + p·v is reached
when a quantum with momentum p directed antiparallel to v arises. It is
equal to ε − pv. Consequently, the inequality

$$\varepsilon - pv < 0$$

or

$$v > \varepsilon/p \qquad (78.15)$$

must be fulfilled. This means that, if ε/p ≠ 0, excitation quanta can arise in
the flowing helium, and an energy dissipation can occur only for a sufficiently
large velocity of flow. For a velocity of flow which does not satisfy the inequality
(78.15) no interaction accompanied by the appearance of thermal
excitation quanta can arise between the wall and the helium.

From the form of the energy spectrum of helium II shown in fig. III.47
(§77) it is clear that the quantity ε/p for helium II is always different from
zero. Thus, at the temperature of absolute zero liquid helium II moves along
a solid wall without any interaction and energy dissipation, provided its
velocity of motion does not exceed v₀ = (ε/p)_min, where (ε/p)_min is the
minimum value of the ratio ε/p. It is in this that the phenomenon of
superfluidity consists.
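
The critical velocity v₀ = (ε/p)_min can be sketched for a model spectrum built from the two branches (77.2) and (77.3). Joining the branches by taking the lower of the two at each p is an assumption made here only for illustration, and the parameter values in the usage lines are arbitrary dimensionless numbers, not the helium constants. For the short-wave branch alone the minimum of ε/p is reached at p* = (p₀² + 2με(p₀))^{1/2} and equals (p* − p₀)/μ, which the code compares with a brute-force scan.

```python
import math


def spectrum(p, c, eps0, p0, mu):
    """Model spectrum: the lower of the sound branch c*p and the
    short-wave branch eps0 + (p - p0)^2 / (2 mu). Illustrative only."""
    return min(c * p, eps0 + (p - p0) ** 2 / (2.0 * mu))


def critical_velocity_numeric(c, eps0, p0, mu, p_max=None, steps=200000):
    """Minimum of eps(p)/p found by scanning a uniform grid in p."""
    if p_max is None:
        p_max = 3.0 * p0
    best = float("inf")
    for i in range(1, steps + 1):
        p = p_max * i / steps
        best = min(best, spectrum(p, c, eps0, p0, mu) / p)
    return best


def critical_velocity_analytic(c, eps0, p0, mu):
    """Smaller of the sound velocity and the tangent slope to the parabola."""
    p_star = math.sqrt(p0 * p0 + 2.0 * mu * eps0)
    return min(c, (p_star - p0) / mu)


if __name__ == "__main__":
    c, eps0, p0, mu = 2.0, 1.0, 1.0, 1.0   # arbitrary illustrative values
    print("numeric  v0 =", critical_velocity_numeric(c, eps0, p0, mu))
    print("analytic v0 =", critical_velocity_analytic(c, eps0, p0, mu))
```

Geometrically, v₀ is the slope of the steepest straight line through the origin that stays below the curve ε(p); it is non-zero as long as the curve does not touch the p axis.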

For T ≠ 0 all previous reasoning remains valid, and no new excitation
quanta can arise in helium II when v < v₀. However, the already existing
thermal excitation quanta can interact with a solid wall. 

It turns out that in helium II at T ≠ 0 two forms of motion are possible,
superfluid motion and normal motion, which can occur in the same portion 
of the liquid simultaneously and independently of each other. The superfluid 
motion is without viscosity and is not accompanied by any transfer of thermal 
excitation energy. The normal flow proceeds in the same way as an ordinary 
flow of a liquid with a viscosity different from zero. With each of the forms of
motion is associated the transfer of a part of the mass of the helium. Because
of this helium II can, in an obvious although not strictly accurate way, be
considered as a mixture of two liquids: a superfluid liquid and a normal liquid.
The motion of the superfluid liquid, which carries a part of the helium II at
T ≠ 0, is the same as the motion of the entire helium II at T = 0. However,
at T ≠ 0 a part of the mass of the helium is in the normal state, flows with
friction and carries heat. The properties of the superfluid part of the liquid 
are shown in experiments on the flow of helium through fine capillary tubes. 
It flows through the finest capillary tube unimpeded. In experiments on the 
motion of bodies, for example in experiments on the oscillations of a disc 
immersed in a container of liquid helium, an interaction with the normal part 
of the helium is observed. In this case the motion of the disc is the same as in 
a normal liquid possessing viscosity. However, the mass of normal liquid turns 
out to depend on the temperature. As T → 0 the mass of the normal part of
helium II also decreases to zero. 

One of the remarkable thermal properties of helium II is the so-called 
thermomechanical effect. This effect consists of the fact that, as helium flows 
out of a container through a very fine capillary tube, the temperature of the 
helium which remains in the container increases. On the contrary, as helium 
flows into a container the temperature in the container decreases. 

The origin of the thermomechanical effect is understandable from the 
above. The superfluid part of the helium which moves through the fine 
capillary tube carries no thermal energy. When a certain amount of the 
superfluid helium flows out of the container the thermal energy content 
which existed before is distributed through the remaining amount and its 
temperature increases. As helium flows into the container the reverse 
phenomenon occurs: the thermal energy content which existed initially in 
the helium in the container is distributed over the entire amount of helium. 
The magnitude of the effect increases with decreasing temperature. This 
allows one to make use of the thermomechanical effect in helium to obtain 
very low temperatures. 


§79. The electron gas at absolute zero 


We shall now consider the behaviour of a Fermi system, an electron gas, at 
low temperatures. In expounding the quantum theory of metals in Part VI 
we shall show that in a certain approximation the behaviour of the whole set 
of electrons in metals can be described as the behaviour of an ideal degenerate 
Fermi gas. Hence the properties of the Fermi gas are of very great interest. 

We shall first of all discuss the behaviour of an electron gas at absolute
zero. For this we write the Fermi distribution which, according to (72.11),
has the form

$$dn = 2\times 2\pi\left(\frac{2m}{h^{2}}\right)^{3/2}V\,\frac{\varepsilon^{1/2}\,d\varepsilon}{\exp[(\varepsilon-\mu)/kT]+1}. \qquad (79.1)$$





The factor 2 is introduced in order to take into account the fact that to each 
energy level there correspond two states in which the electrons can have 
spins with opposite orientations. 

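We pass in (79.1) to the limit T → 0. Then the Fermi distribution shown
(for T ≠ 0) in fig. III.48 assumes the form shown in fig. III.49 and is expressed
by the formulae

$$\bar{n} = \begin{cases} 1, & \varepsilon < \mu(T=0) = \varepsilon_{\max}, \\ 0, & \varepsilon > \mu(T=0) = \varepsilon_{\max}, \end{cases} \qquad (79.2)$$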


where μ(T=0) denotes the chemical potential at absolute zero. We call this
quantity the maximum energy at absolute zero, ε_max. This result has a simple
meaning: the energy levels of a system of a large number of electrons moving
freely in a finite volume V, which is bounded by an impenetrable energy
barrier (the walls), form an almost continuous spectrum. According to the
exclusion principle, no more than two electrons with opposite spin orientations
can simultaneously be in each energy level. Two electrons in a metal
will occupy the lowest energy level with an energy equal to zero. The third
and fourth electrons must be at the first excited level. Following electrons are
distributed at higher energy levels, each successive electron pair occupying the
corresponding level. If the total number of electrons in the system is equal to
N, then the first ½N states with energies 0 < ε < ε_max will be occupied at
absolute zero. All other states with ε > ε_max will be free of electrons. This is
schematically shown in fig. III.48.

[Fig. III.48. Occupation of the energy levels by electron pairs up to ε_max at absolute zero (schematic).]

[Fig. III.49. State distribution function n of electrons at T = 0.]

Fig. III.49 shows the state distribution function for electrons at T= 0. 
The number of electrons in each occupied state is equal to two, and in 
unoccupied states it is equal to zero. It is obvious that, because of the ex- 
clusion principle, electrons must get into excited energy levels even at 
absolute zero. 

Let us find the energy ε_max of the highest of the occupied energy states of
electrons at absolute zero. At each energy level there are two electrons, so
that for the total number of occupied levels we have

$$4\pi\left(\frac{2m}{h^{2}}\right)^{3/2}V\int_{0}^{\varepsilon_{\max}}\varepsilon^{1/2}\,d\varepsilon = N,$$

whence

$$N = \frac{8\pi}{3}\left(\frac{2m}{h^{2}}\right)^{3/2}V\varepsilon_{\max}^{3/2}$$

or

$$\varepsilon_{\max} = \frac{h^{2}}{2m}\left(\frac{3N}{8\pi V}\right)^{2/3}. \qquad (79.3)$$

The energy of all the electrons at absolute zero is obviously equal to

$$E_0 = 4\pi V\left(\frac{2m}{h^{2}}\right)^{3/2}\int_{0}^{\varepsilon_{\max}}\varepsilon^{3/2}\,d\varepsilon = \tfrac{3}{5}N\varepsilon_{\max}. \qquad (79.4)$$

The mean energy of an individual electron in an electron gas at T = 0 amounts
to 3/5 of the maximum energy ε_max.

The substitution of numerical values for the quantities contained in (79.3)
gives, for example, for N/V ~ 10¹⁹: ε_max ≈ 5 eV or ε_max = 6.0 × 10⁴ degrees.
One can also find the maximum velocity of electrons at absolute zero:

$$v_{\max} = p_{\max}/m = (2\varepsilon_{\max}/m)^{1/2} = 1.39\times 10^{8}\ \mathrm{cm/sec}. \qquad (79.3')$$


The velocities of electrons turn out to be very large even at absolute zero.
We see that the properties of an electron gas differ radically from those of
a classical atomic gas.

The energy of an electron gas turns out to be proportional to the number
of electrons N raised to the power 5/3 and to the volume of the electron gas
raised to the power −2/3. Electrons at absolute zero are not at rest, as is to be
expected from classical concepts, but are moving with different velocities.
The mean velocity of this motion is very large. In spite of this, the heat
capacity of the electron gas at absolute zero turns out to be exactly equal to
zero. Indeed,

$$C_V = \left(\frac{\partial E}{\partial T}\right)_{T=0} = 0,$$

since the energy of the gas does not depend on the temperature.


Electrons, as any other gas, exert a pressure on the walls of the container.
For the gas pressure one can immediately write the expression (8.2):

$$p = \frac{2mN}{V}\int_{0}^{\infty}v_x^{2}\,\rho(v_x)\,dv_x = \frac{Nm\overline{v^{2}}}{3V}, \qquad (79.5)$$

since all directions of motion of electrons are equivalent and $\overline{v_x^{2}} = \tfrac{1}{3}\overline{v^{2}}$. For $\overline{v^{2}}$
it is necessary to substitute the expression

$$\overline{v^{2}} = 2E_0/mN, \qquad (79.6)$$

where E₀/N is the mean energy per electron. Then we obtain

$$p = \tfrac{2}{3}E_0/V. \qquad (79.7)$$


The same result can also be obtained by a thermodynamic method. The substitution
of numerical values gives for N/V ~ 10¹⁹: p = 2 × 10⁵ atm.
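
Formulae (79.3), (79.3′), (79.4) and (79.7) are easy to evaluate numerically. The sketch below uses CGS constants and an assumed electron density of 5.9 × 10²² cm⁻³, a value of the order found in monovalent metals that is not taken from the text; it was chosen here because it gives a maximum velocity close to the 1.39 × 10⁸ cm/sec quoted in (79.3′). The function names are illustrative.

```python
import math

# CGS constants (standard values, not given in the text)
H = 6.62607015e-27           # Planck constant, erg s
M_E = 9.1093837e-28          # electron mass, g
K_B = 1.380649e-16           # Boltzmann constant, erg/K
ERG_PER_EV = 1.602176634e-12
DYN_CM2_PER_ATM = 1.01325e6


def fermi_energy(n):
    """eps_max from (79.3) for an electron number density n (cm^-3), in erg."""
    return H**2 / (2.0 * M_E) * (3.0 * n / (8.0 * math.pi)) ** (2.0 / 3.0)


def fermi_velocity(n):
    """v_max = (2 eps_max / m)^(1/2) from (79.3'), in cm/sec."""
    return math.sqrt(2.0 * fermi_energy(n) / M_E)


def mean_energy_per_electron(n):
    """E0/N = (3/5) eps_max from (79.4), in erg."""
    return 0.6 * fermi_energy(n)


def pressure(n):
    """p = (2/3) E0/V = (2/5) n eps_max from (79.7), in dyn/cm^2."""
    return 0.4 * n * fermi_energy(n)


if __name__ == "__main__":
    n = 5.9e22  # assumed electron density, cm^-3 (illustrative, not from the text)
    eps = fermi_energy(n)
    print(f"eps_max   = {eps / ERG_PER_EV:.2f} eV")
    print(f"eps_max/k = {eps / K_B:.3e} K")
    print(f"v_max     = {fermi_velocity(n):.3e} cm/sec")
    print(f"p         = {pressure(n) / DYN_CM2_PER_ATM:.3e} atm")
```

Since ε_max varies as (N/V)^{2/3}, the pressure varies as (N/V)^{5/3}, so these figures are sensitive to the density assumed.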




In the foregoing reasoning we have tacitly assumed that the electron gas is 
an ideal Fermi gas for which the interaction between particles can completely 
be disregarded. The validity of such an assumption may appear to be very 
questionable, particularly if it is taken into account that the density of the 
degenerate gas is very large and that it consists of charged particles for which 
the interaction decreases slowly with increasing distance. 

The assumption that the gas is ideal is satisfied if the mean kinetic energy 
of an electron is very large in comparison with the mean energy of its inter- 
action with other particles. 

The mean kinetic energy of an electron is given by formula (79.4). The
energy of interaction of two electrons is in order of magnitude equal to
e²/r̄, where r̄ is the mean distance between them. If N/V is the number of
electrons and ions per unit volume, then the mean distance r̄ is in order of
magnitude equal to

$$\bar{r} \sim (V/N)^{1/3}.$$

The criterion of the smallness of the interaction energy can be written in the
form

$$\frac{e^{2}}{(V/N)^{1/3}} \ll \varepsilon_{\max}.$$

Since ε_max ~ (N/V)^{2/3}, then, upon performing simple transformations, we
obtain

$$\frac{N}{V} \gg \left(\frac{e^{2}m}{h^{2}}\right)^{3}.$$


For a large density of the electron gas the ratio of the interaction energy to 
the kinetic energy can turn out to be small. 

Since the kinetic energy of the gas increases with increasing density more 
rapidly than the potential energy, we arrive at the following, at first sight 
paradoxical, result: for an electron gas to be considered as an ideal gas its 
density must be sufficiently great. 
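
The scaling behind this result can be made concrete with a short calculation. The sketch below, in CGS units with standard constants that are not given in the text, evaluates the density scale (e²m/h²)³ and the order-of-magnitude ratio of the interaction energy e²(N/V)^{1/3} to ε_max from (79.3). Numerical factors of order unity are ignored, as in the estimate above, so the ratio is only indicative.

```python
import math

# CGS constants (standard values, not given in the text)
H = 6.62607015e-27          # Planck constant, erg s
M_E = 9.1093837e-28         # electron mass, g
E_CHARGE = 4.80320471e-10   # electron charge, esu


def fermi_energy(n):
    """eps_max from (79.3) for an electron number density n (cm^-3), in erg."""
    return H**2 / (2.0 * M_E) * (3.0 * n / (8.0 * math.pi)) ** (2.0 / 3.0)


def density_scale():
    """Characteristic density (e^2 m / h^2)^3 of the criterion, in cm^-3."""
    return (E_CHARGE**2 * M_E / H**2) ** 3


def interaction_to_kinetic_ratio(n):
    """Order-of-magnitude ratio e^2 n^(1/3) / eps_max; falls off as n^(-1/3)."""
    return E_CHARGE**2 * n ** (1.0 / 3.0) / fermi_energy(n)


if __name__ == "__main__":
    print(f"(e^2 m / h^2)^3 = {density_scale():.3e} cm^-3")
    for n in (1.0e21, 1.0e23, 1.0e25):   # hypothetical densities, cm^-3
        print(f"n = {n:.1e} cm^-3   ratio = {interaction_to_kinetic_ratio(n):.3f}")
```

An eightfold increase of the density halves the ratio, which is the quantitative content of the statement that the gas becomes more nearly ideal as it is compressed.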


§80. The electron gas at low temperatures 


We shall now consider the properties of an electron gas at temperatures 
which differ from zero but are still comparatively low. Namely, we shall
assume that the temperature of the metal is such that kT is considerably
smaller than the maximum energy of electrons, ε_max, at absolute zero. In this
case the thermal excitation of the electron gas will be relatively negligible.
This means that thermal excitation can bring electrons from energy levels
occupied at T = 0 only into nearby higher energy levels. It is clear, for
example, that thermal excitation is insufficient to raise an electron from an
energy level ε ≪ ε_max to an energy level ε > ε_max. It is only sufficient for the
excitation of electrons which are at energy levels lying in a narrow interval
of the order of magnitude of kT. Some of the electrons from these levels turn
out to be raised to levels higher than the level ε = ε_max but above it by an
amount which does not exceed kT.

Fig. III.50 shows schematically the thermal excitation at low temperature.
Some of the levels lying below ε_max turn out to be unoccupied for most of
the time. Single electrons appear in levels lying above ε_max. The state distribution
function of electrons changes. At T = 0 it is represented by the broken
curve (fig. III.49) while at low temperatures above zero it assumes the form
shown in fig. III.51. The distribution at T = 0 is shown in fig. III.51 by a
broken line. The fall of the curve for ε < ε_max means that the mean number
of electrons in the corresponding levels turns out to be less than unity;
electrons go into levels lying above ε_max. The energy region in which the
mean number of electrons at each level turns out to be smaller than unity
but larger than zero is called the zone of diffuseness of the distribution function.
From fig. III.51 and the above it is clear that the width of the zone of
diffuseness is in order of magnitude equal to kT. The number of electrons
appearing at levels lying above ε_max is very small in comparison with the total
number of electrons in the metal. Similarly the number of single electrons
which are at various energy levels represents a small fraction of the total
number of electrons. The condition of degeneracy kT ≪ ε_max is the same as
that for a Fermi gas.

[Fig. III.50. Thermal excitation of electrons near ε_max at a low temperature (schematic).]

[Fig. III.51. State distribution function of electrons at a low temperature above zero; the distribution at T = 0 is shown by a broken line.]

We now pass on to the discussion of the quantitative theory of an electron 
gas at a low temperature. 

We confine ourselves to the calculation of two thermodynamic quantities,
the chemical potential μ and the mean energy E, as functions of the temperature.
The chemical potential of the electron gas is usually called the Fermi
level or the Fermi surface. The origin of this terminology will be clear from 
what follows. 


To determine the chemical potential we make use of the normalization 
condition. Namely, on the basis of (79.1) we can write 


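$$N = 2\times 2\pi V\left(\frac{2m}{h^{2}}\right)^{3/2}\int_{0}^{\infty}\frac{\varepsilon^{1/2}\,d\varepsilon}{\exp[(\varepsilon-\mu)/kT]+1}. \qquad (80.1)$$

The mean energy of an electron gas is given by the formula

$$E = 4\pi V\left(\frac{2m}{h^{2}}\right)^{3/2}\int_{0}^{\infty}\frac{\varepsilon^{3/2}\,d\varepsilon}{\exp[(\varepsilon-\mu)/kT]+1}. \qquad (80.2)$$

The integrals contained in formulae (80.1) and (80.2) cannot be evaluated in
closed form. To calculate them at low temperatures we make use of the
following method. Let

$$f = \left[\exp\left(\frac{\varepsilon-\mu}{kT}\right)+1\right]^{-1} \qquad (80.3)$$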


and 





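$$I = \int_{0}^{\infty}f\varepsilon^{n}\,d\varepsilon, \qquad n > 0.$$

Integrating by parts, we obtain

$$I = f\,\frac{\varepsilon^{n+1}}{n+1}\bigg|_{0}^{\infty} - \frac{1}{n+1}\int_{0}^{\infty}\varepsilon^{n+1}\,\frac{\partial f}{\partial\varepsilon}\,d\varepsilon. \qquad (80.4)$$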





When the limits are substituted the first term obviously reduces to zero. The
function $\partial f/\partial\varepsilon$ is shown in fig. III.52. We see that it is an even function of its
argument and has such a sharp maximum for $\varepsilon = \mu$ that one can consider it as
one of the representations of the $\delta$-function. We introduce a new variable

$$x = \frac{\varepsilon-\mu}{kT}.$$

Then

$$I = -\frac{1}{n+1}\int_{-\mu/kT}^{\infty}(\mu+kTx)^{n+1}\,\frac{\partial f}{\partial x}\,dx.$$


Since we are considering the low temperature region, the lower limit can be
replaced by minus infinity:

$$I = -\frac{1}{n+1}\int_{-\infty}^{\infty}(\mu+kTx)^{n+1}\,\frac{\partial f}{\partial x}\,dx.$$

Fig. III.52

In that region of variation of $x$ in which $\partial f/\partial x$ has a value differing from zero,
i.e. for $\varepsilon \approx \mu$, the value of $x$ is very small. Hence the first factor in the
integrand can be expanded in a series and one can restrict oneself to the first
terms of the expansion. For large $x$, when this cannot be done, the integrand
reduces to zero owing to the factor $\partial f/\partial x$, which is negligibly small
everywhere except near $x = 0$. Thus,


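$$I = -\frac{1}{n+1}\int_{-\infty}^{\infty}\left[\mu^{n+1} + (n+1)\,\mu^{n}kTx + \frac{(n+1)n}{2}\,\mu^{n-1}(kT)^2x^2 + \dots\right]\frac{\partial f}{\partial x}\,dx. \tag{80.5}$$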


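Since $\partial f/\partial x$ is an even function, we have

$$\int_{-\infty}^{\infty} x\,\frac{\partial f}{\partial x}\,dx = 0, \qquad \int_{-\infty}^{\infty} x^2\,\frac{\partial f}{\partial x}\,dx \neq 0.$$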


Hence 


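$$I = -\frac{1}{n+1}\left[\mu^{n+1}\int_{-\infty}^{\infty}\frac{\partial f}{\partial x}\,dx + \frac{(n+1)n}{2}\,\mu^{n-1}(kT)^2\int_{-\infty}^{\infty}x^2\,\frac{\partial f}{\partial x}\,dx\right]. \tag{80.6}$$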
It is easily seen that

$$\int_{-\infty}^{\infty}\frac{\partial f}{\partial x}\,dx = f(\infty) - f(-\infty) = -1.$$

The second integral in (80.6) is calculated in Appendix IV. It is equal to
$-\tfrac{1}{3}\pi^2$. Finally we have


$$I = \frac{\mu^{n+1}}{n+1}\left[1 + \frac{(n+1)n\,\pi^2}{6}\left(\frac{kT}{\mu}\right)^2\right]. \tag{80.7}$$








The discarded terms are proportional to higher powers of the ratio $kT/\mu$.
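The accuracy of the low-temperature expansion (80.7) can be checked numerically. The following is only a sketch: it works in units where $\mu = 1$, integrates with a plain Simpson rule, and cuts the integral off at $\mu + 40kT$; the function names and the cut-off are illustrative choices, not part of the text.

```python
import math

def fermi(eps, mu, kT):
    """Fermi function f = 1 / (exp((eps - mu)/kT) + 1), guarded against overflow."""
    x = (eps - mu) / kT
    if x > 700.0:
        return 0.0
    return 1.0 / (math.exp(x) + 1.0)

def fermi_integral(n, mu, kT, steps=20000):
    """Composite Simpson estimate of I = int_0^inf f(eps) * eps**n d(eps)."""
    upper = mu + 40.0 * kT      # assumed cut-off; the tail beyond it is ~exp(-40)
    h = upper / steps           # steps must be even for Simpson's rule
    total = 0.0
    for i in range(steps + 1):
        eps = i * h
        weight = 1 if i in (0, steps) else (4 if i % 2 else 2)
        total += weight * fermi(eps, mu, kT) * eps ** n
    return total * h / 3.0

def sommerfeld(n, mu, kT):
    """Right-hand side of (80.7), keeping terms through (kT/mu)**2."""
    correction = (n + 1) * n * math.pi ** 2 / 6.0 * (kT / mu) ** 2
    return mu ** (n + 1) / (n + 1) * (1.0 + correction)

def relative_error(n, mu=1.0, kT=0.02):
    """Relative difference between the numerical integral and (80.7)."""
    exact = fermi_integral(n, mu, kT)
    return abs(exact - sommerfeld(n, mu, kT)) / exact

if __name__ == "__main__":
    for n in (0.5, 1.5):        # the two cases needed for N and E
        print(n, relative_error(n))
```

For $kT/\mu = 0.02$ the two sides should agree far better than the size of the $(kT/\mu)^2$ correction itself, which is what one expects if the discarded terms are of fourth order in $kT/\mu$.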
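Coming back to the normalization condition, in which $n = \tfrac{1}{2}$, we have

$$N = 4\pi V\left(\frac{2m}{h^2}\right)^{3/2}\int_0^\infty \frac{\varepsilon^{1/2}\,d\varepsilon}{\exp[(\varepsilon-\mu)/kT]+1} = 4\pi V\left(\frac{2m}{h^2}\right)^{3/2}\frac{2}{3}\,\mu^{3/2}\left[1 + \frac{\pi^2}{8}\left(\frac{kT}{\mu}\right)^2\right]. \tag{80.8}$$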







For $T = 0$ the second term in (80.8) reduces to zero, and

$$N = \frac{8\pi}{3}\,V\left(\frac{2m}{h^2}\right)^{3/2}\mu_0^{3/2}. \tag{80.9}$$


Comparing (80.9) with (79.3), we find that

$$\mu_{T=0} \equiv \mu_0 = \varepsilon_{\max}. \tag{80.10}$$

The chemical potential of the electron gas at absolute zero turns out to be
equal to the maximum energy of an electron at absolute zero.

For temperatures close to absolute zero, (80.8) can be solved for $\mu$ by a
method of successive approximations. In the small second term of (80.8) $\mu$
can be replaced by $\mu_0 = \varepsilon_{\max}$. Then we have


$$\mu = \mu_0\left[1 - \frac{\pi^2}{12}\left(\frac{kT}{\mu_0}\right)^2\right]. \tag{80.11}$$

Analogously, for the mean energy of the electron gas ($n = \tfrac{3}{2}$) we find

$$E = 4\pi V\left(\frac{2m}{h^2}\right)^{3/2}\left[\frac{2}{5}\,\mu^{5/2} + \frac{\pi^2}{4}\,\mu^{1/2}(kT)^2\right]. \tag{80.12}$$


Substituting the value of $\mu$ from (80.11) into (80.12) and disregarding the
higher powers of the ratio $kT/\mu$, we obtain

$$E = \frac{3}{5}\,N\varepsilon_{\max}\left[1 + \frac{5\pi^2}{12}\left(\frac{kT}{\varepsilon_{\max}}\right)^2\right]. \tag{80.13}$$





From (80.9) or (80.10) it can be seen that the conditions of degeneracy
$kT \ll \varepsilon_{\max}$ and (72.16) are completely identical. Hence the electron gas is
degenerate at temperatures $T \ll \varepsilon_{\max}/k$. We have pointed out that for
N/V ~ 1019 the maximum energy $\varepsilon_{\max}$ expressed in units of temperature
is equal to 6X 109.

From (80.13) one can find the heat capacity of the electron gas: 


$$C_V = \frac{\pi^2}{2}\,Nk\,\frac{kT}{\varepsilon_{\max}}. \tag{80.14}$$
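The chain (80.8), (80.11), (80.13), (80.14) can be tested without any expansion by solving the normalization condition for $\mu$ numerically and differentiating the energy. The sketch below assumes units in which $\varepsilon_{\max} = 1$, $k = 1$ and the prefactor $4\pi V(2m/h^2)^{3/2} = 1$, so that $N = 2/3$; the helper names, the fixed upper limit and the bisection bracket are illustrative assumptions valid only for $kT \ll 1$.

```python
import math

UPPER = 3.0      # assumed upper integration limit, in units of eps_max
STEPS = 4000     # even number of Simpson panels
N_TOTAL = 2.0 / 3.0   # N in these units: int_0^1 eps**0.5 d(eps)

def moment(power, mu, kT):
    """Simpson estimate of int_0^UPPER eps**power / (exp((eps-mu)/kT) + 1) d(eps)."""
    h = UPPER / STEPS
    total = 0.0
    for i in range(STEPS + 1):
        eps = i * h
        weight = 1 if i in (0, STEPS) else (4 if i % 2 else 2)
        total += weight * eps ** power / (math.exp((eps - mu) / kT) + 1.0)
    return total * h / 3.0

def chemical_potential(kT, target=N_TOTAL):
    """Bisect for mu such that the normalization integral equals `target`."""
    lo, hi = 0.5, 1.5   # assumed bracket around mu_0 = 1
    for _ in range(50):
        mid = 0.5 * (lo + hi)
        if moment(0.5, mid, kT) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def energy(kT):
    """Energy at fixed particle number, from the eps**1.5 moment."""
    return moment(1.5, chemical_potential(kT), kT)

def heat_capacity(kT, dT=0.002):
    """Central-difference estimate of dE/dT at fixed N."""
    return (energy(kT + dT) - energy(kT - dT)) / (2.0 * dT)

if __name__ == "__main__":
    kT = 0.02
    print("numerical C_V :", heat_capacity(kT))
    print("formula (80.14):", 0.5 * math.pi ** 2 * N_TOTAL * kT)
```

At $kT = 0.02\,\varepsilon_{\max}$ the numerical derivative and (80.14) should agree to within about a percent, the remaining difference coming from the discarded higher powers of $kT/\varepsilon_{\max}$.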


The heat capacity of the electron gas turns out to be a linear function of the 




temperature, and reduces to zero at T=0. The factor of proportionality 
contains only known quantities: universal constants and the density of the 
electron gas N/V. For simple metals, for example, copper and silver, whose 
atoms have one weakly bound valence electron, the number of free electrons 
per atom can be considered to be equal to unity. Then, for example, for 
copper the theoretical expression for the electron heat capacity has the form 


$$C_V = 0.9 \times 10^{-4}\,NkT. \tag{80.15}$$


Thus, the heat capacity of the electron gas, in agreement with experimental 
data, is very small and at normal temperature represents an immeasurably 
small fraction of the heat capacity of the crystal lattice. 
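The numerical coefficient in (80.15) follows directly from (80.14). As a rough check, the sketch below evaluates $\pi^2 k/2\varepsilon_{\max}$ with $\varepsilon_{\max} = 5$ eV, the value this section quotes for copper; the Boltzmann constant in eV/K and the comparison with a classical lattice value of $3Nk$ at 300 K are assumptions brought in for illustration.

```python
import math

K_BOLTZMANN_EV = 8.617333e-5   # Boltzmann constant in eV/K (assumed value)

def electron_heat_capacity_coefficient(eps_max_ev):
    """Coefficient a in C_V = a * N * k * T from (80.14): a = pi^2 k / (2 eps_max), in 1/K."""
    return math.pi ** 2 * K_BOLTZMANN_EV / (2.0 * eps_max_ev)

a_copper = electron_heat_capacity_coefficient(5.0)   # eps_max = 5 eV for copper

# Fraction of an assumed classical lattice heat capacity 3Nk at room temperature.
room_temperature_fraction = a_copper * 300.0 / 3.0

if __name__ == "__main__":
    print(f"a = {a_copper:.2e} per kelvin")
    print(f"C_electron / (3Nk) at 300 K = {room_temperature_fraction:.4f}")
```

With these inputs the coefficient comes out slightly below $0.9 \times 10^{-4}$ K$^{-1}$, and the electronic contribution at room temperature is under one percent of $3Nk$.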

The heat capacity of a lattice at temperatures which are lower than the
characteristic temperature rapidly decreases and tends to zero as the cube of
the temperature when $T \to 0$. The heat capacity of an electron gas also tends
to zero as $T \to 0$, but much more slowly, as the first power of the temperature.
The ratio of the heat capacity of the electron gas to that of the lattice is equal
to


For copper $\varepsilon_{\max} = 5$ eV and the characteristic temperature is $\theta = 335$ K, so
that


l 
Sy whereemnt i 
Ca TOAN 


The ratio between the heat capacities is of the order of unity at T= 3.3 K. At 
still lower temperatures the heat capacity of the electrons turns out to be 
larger than the heat capacity of the lattice. 
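The order of magnitude of the crossover temperature can be reproduced with a short estimate. The sketch assumes the Debye low-temperature law $C_{\text{lattice}} = \tfrac{12}{5}\pi^4 Nk\,(T/\theta)^3$ for the lattice (this formula is not restated in the present section and is taken here as an assumption), one free electron per atom, and a Boltzmann constant in eV/K.

```python
import math

K_BOLTZMANN_EV = 8.617333e-5   # Boltzmann constant in eV/K (assumed value)

def heat_capacity_ratio(T, eps_max_ev, theta):
    """C_electron / C_lattice per atom, using (80.14) for the electrons and
    the assumed Debye law (12 pi^4 / 5) N k (T / theta)^3 for the lattice."""
    electrons = 0.5 * math.pi ** 2 * K_BOLTZMANN_EV * T / eps_max_ev
    lattice = 12.0 * math.pi ** 4 / 5.0 * (T / theta) ** 3
    return electrons / lattice

def crossover_temperature(eps_max_ev, theta):
    """Temperature at which the two heat capacities are equal."""
    k_theta = K_BOLTZMANN_EV * theta
    return theta * math.sqrt(5.0 * k_theta / (24.0 * math.pi ** 2 * eps_max_ev))

if __name__ == "__main__":
    T_star = crossover_temperature(5.0, 335.0)   # copper values quoted in the text
    print(f"crossover near {T_star:.1f} K")
```

Under these assumptions the ratio reaches unity at a few kelvin, the same order as the temperature quoted in the text; the exact figure depends on the numerical constants used for the lattice term.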

Accurate measurements of the heat capacity confirmed the validity of 
the theoretical formulae. Thus for copper, for which the measurements are 
particularly accurate, the heat capacity at low temperatures turned out to be 
made up of two terms, one of which agreed very accurately with the theo- 
retical formula for the heat capacity of the lattice. The second term agrees 
with an accuracy to within 2% with the theoretical formula (80.14). At 
temperatures which are lower than 3 K the electron heat capacity is larger 
than the heat capacity of the lattice and is superposed on the theoretical 
curve. 







In conclusion we shall calculate the number of electrons in the zone of
diffuseness of the Fermi distribution, which is useful for a simple
interpretation of a number of formulae in what follows.

The number of electrons in the zone of diffuseness, or the number of
unpaired electrons, will also be called the number of effective electrons
$n_{\text{eff}}$. It is these electrons which can change their state under the effect of
external actions. Hence the heat capacity of the electron gas, its electric
conductivity and so on depend on the effective electrons.

The value of $n_{\text{eff}}$ can be found from the following. The probability for an
electron to be found in a given state with an energy $\varepsilon$ is proportional to the
value of the distribution function $f$. The probability that this state is not
occupied is equal to $1 - f$. Since only electrons with antiparallel spins can be
in one state, the product of the probabilities $f(1-f)$ represents the
probability that in the state with energy $\varepsilon$ there is one electron, while the second
electron with the antiparallel spin is not in it. In other words, $f(1-f)$
represents the probability that the state with energy $\varepsilon$ will be occupied by only
one electron. The total number of such states or, what is the same, the
total number of unpaired electrons is equal to the integral of the product of
$f(1-f)$ and the number of states $d\Omega$ with given energy, the integral being
taken over all values of the energy:


$$n_{\text{eff}} = 2\int_0^\infty f(1-f)\,d\Omega = \frac{4\pi(2m)^{3/2}}{h^3}\int_0^\infty f(1-f)\,\varepsilon^{1/2}\,d\varepsilon. \tag{80.16}$$


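In a strongly degenerate gas $\mu \gg kT$, so that

$$f = \left[\exp\left(\frac{\varepsilon-\mu}{kT}\right)+1\right]^{-1} \approx 1, \qquad 1 - f \approx \exp\left(\frac{\varepsilon-\mu}{kT}\right).$$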


For $\varepsilon - \mu \gg kT$ the integrand decreases exponentially. Hence, instead of
integrating up to infinitely large values of the energy, one can confine oneself to
an integration up to a value $\varepsilon \approx \mu$. Then

$$n_{\text{eff}} \approx \frac{4\pi(2m)^{3/2}}{h^3}\int_0^{\mu}\exp\left(\frac{\varepsilon-\mu}{kT}\right)\varepsilon^{1/2}\,d\varepsilon.$$

Since the exponential factor rapidly decreases, one can bring the slowly
varying factor $\varepsilon^{1/2}$ out of the sign of the integral, taking its value at the upper
limit. This gives




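$$\int_0^{\mu}\exp\left(\frac{\varepsilon-\mu}{kT}\right)\varepsilon^{1/2}\,d\varepsilon \approx \mu^{1/2}\int_0^{\mu}\exp\left(\frac{\varepsilon-\mu}{kT}\right)d\varepsilon = kT\mu^{1/2}\left(1 - e^{-\mu/kT}\right) \approx kT\mu^{1/2} \approx kT\mu_0^{1/2},$$

since $\mu \gg kT$, whence

$$n_{\text{eff}} \approx \frac{4\pi(2m)^{3/2}\,kT\,\mu_0^{1/2}}{h^3}. \tag{80.17}$$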


By means of formulae (79.3) and (80.10) $n_{\text{eff}}$ can be written in the form

$$n_{\text{eff}} = \frac{3NkT}{2V\varepsilon_{\max}}. \tag{80.18}$$


Thus, $n_{\text{eff}}$ represents (with an accuracy to within the numerical coefficient) a
small fraction of the total number of electrons per cm³ which is equal
approximately to $kT/\varepsilon_{\max}$.
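The estimate behind (80.17) can be compared with a direct numerical evaluation of the integral in (80.16). The sketch below assumes units with $\mu = 1$ and uses a Simpson rule with a cut-off at $\mu + 40kT$; the function names are illustrative, and the copper figures (5 eV, 300 K) are used only as an example.

```python
import math

K_BOLTZMANN_EV = 8.617333e-5   # Boltzmann constant in eV/K (assumed value)

def unpaired_weight(mu, kT, steps=20000):
    """Simpson estimate of int_0^inf f (1 - f) eps**0.5 d(eps), the integral in (80.16)."""
    upper = mu + 40.0 * kT      # assumed cut-off; the integrand is ~exp(-40) there
    h = upper / steps           # steps must be even
    total = 0.0
    for i in range(steps + 1):
        eps = i * h
        f = 1.0 / (math.exp((eps - mu) / kT) + 1.0)
        weight = 1 if i in (0, steps) else (4 if i % 2 else 2)
        total += weight * f * (1.0 - f) * math.sqrt(eps)
    return total * h / 3.0

def effective_fraction(T, eps_max_ev):
    """Fraction of unpaired electrons according to (80.18): (3/2) kT / eps_max."""
    return 1.5 * K_BOLTZMANN_EV * T / eps_max_ev

if __name__ == "__main__":
    mu, kT = 1.0, 0.02
    print("numerical integral:", unpaired_weight(mu, kT))
    print("kT * mu**0.5      :", kT * math.sqrt(mu))
    print("copper at 300 K   :", effective_fraction(300.0, 5.0))
```

The numerical integral should coincide with $kT\mu^{1/2}$ up to corrections of relative order $(kT/\mu)^2$, and for copper at room temperature the effective electrons make up less than one percent of the total.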

By means of $n_{\text{eff}}$ the heat capacity of the electron gas can be written, in order
of magnitude, in the form

$$C_V \sim \tfrac{3}{2}\,k\,n_{\text{eff}}, \tag{80.19}$$

where $\tfrac{3}{2}k$ is the classical value of the heat capacity.

Formula (80.19) shows that the properties of an electron gas can be
approximately interpreted in the following way. In the gas there are $n_{\text{eff}}$
effective particles which can change their state and absorb energy supplied
from without. Every one of these particles possesses ordinary classical
properties and, in particular, to every one of these there corresponds the
usual value of the heat capacity.





PART IV 


ELECTROMAGNETIC PROCESSES 
IN MATTER 








Electromagnetic Fields in Matter 


§1. The derivation of the basic field equations 


We have already considered electromagnetic processes occurring in a 
vacuum. The motion of electric charges in a vacuum and the electromagnetic 
field accompanying them were considered. We now pass on to the study of 
electromagnetic phenomena in matter (i.e. in a medium). The theory of 
electromagnetic processes in matter is often called macroscopic electro- 
dynamics. 

The character of electromagnetic processes in matter depends essentially 
on the properties of the matter. For example, the mechanisms of the 
passage of current through metal conductors and gaseous plasmas are essen- 
tially different in character and are accompanied by different phenomena; 
magnetic phenomena in ferromagnetics differ strongly from those in diamag- 
netics and paramagnetics, and so on. Nevertheless, it turns out to be possible 
to construct a phenomenological theory of electromagnetic phenomena in 
matter on the basis of certain very general assumptions. It is necessary, first 
of all, to find the general equations of the electromagnetic field in matter. 
For this, as we shall see below, it will be necessary to make certain very 
general assumptions about the concrete properties of the medium in which 
one or other electromagnetic process takes place. 









We have seen that the basic equations of the theory of the electromagnetic 
field, the Maxwell—Lorentz equations, contain quantities referring to a given 
point and a given instant of time. We shall write these equations in a slightly 
different notation, replacing E by e and H by h. We write the Maxwell— 
Lorentz equations as well as the charge conservation law in this new notation: 


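$$\nabla\times\mathbf{e} = -\frac{1}{c}\frac{\partial\mathbf{h}}{\partial t}, \tag{1.1}$$

$$\nabla\cdot\mathbf{h} = 0, \tag{1.2}$$

$$\nabla\times\mathbf{h} = \frac{4\pi}{c}\,\rho\mathbf{v} + \frac{1}{c}\frac{\partial\mathbf{e}}{\partial t}, \tag{1.3}$$

$$\nabla\cdot\mathbf{e} = 4\pi\rho, \tag{1.4}$$

$$\nabla\cdot(\rho\mathbf{v}) + \frac{\partial\rho}{\partial t} = 0. \tag{1.5}$$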


In matter, a medium consisting of atoms or molecules, the Maxwell— 
Lorentz equations characterizing the field at a given point and a given instant 
of time lose their meaning. Indeed, in matter all quantities, including also 
the electromagnetic field, vary very rapidly from point to point and at a 
given point also vary rapidly in time. For example, the electric field strength 
has a relatively small value outside any given atom, becomes very large inside 
the atom, and again decreases outside it. The increase of the field by a factor 
of several million and its subsequent decrease take place on a scale of the 
order of magnitude of atomic dimensions. A change in the field with time at 
a fixed point occurs, for example, because of the thermal motion of the 
atom in small fractions of a second. Hence, as in other macroscopic processes 
taking place in matter, only the mean values of the corresponding quantities 
are of interest and significant (see Part III). 

Let us average the Maxwell-Lorentz equations over a volume $v_0$ large
compared with atomic dimensions but macroscopically small and a time
interval $\tau$, introducing the mean values according to the formula

$$\bar{f} = \frac{1}{v_0\tau}\int\!\!\int f\,dV\,dt. \tag{1.6}$$




We then have 


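$$\nabla\times\bar{\mathbf{e}} = -\frac{1}{c}\frac{\partial\bar{\mathbf{h}}}{\partial t}, \tag{1.7}$$

$$\nabla\cdot\bar{\mathbf{h}} = 0, \tag{1.8}$$

$$\nabla\times\bar{\mathbf{h}} = \frac{4\pi}{c}\,\overline{\rho\mathbf{v}} + \frac{1}{c}\frac{\partial\bar{\mathbf{e}}}{\partial t}, \tag{1.9}$$

$$\nabla\cdot\bar{\mathbf{e}} = 4\pi\bar\rho, \tag{1.10}$$

$$\nabla\cdot\overline{\rho\mathbf{v}} + \frac{\partial\bar\rho}{\partial t} = 0. \tag{1.11}$$

We introduce the notation

$$\bar{\mathbf{e}} = \mathbf{E}, \tag{1.12}$$

$$\bar{\mathbf{h}} = \mathbf{B}. \tag{1.13}$$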


The field strength E denotes the mean value of the electric field strength in 
the medium, and the magnetic induction B the mean value of the magnetic 
field strength in the medium (this name for the mean magnetic field is asso- 
ciated solely with historical tradition). Then eqs. (1.7) — (1.11) assume the 
form 


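$$\nabla\times\mathbf{E} = -\frac{1}{c}\frac{\partial\mathbf{B}}{\partial t}, \tag{1.14}$$

$$\nabla\cdot\mathbf{B} = 0, \tag{1.15}$$

$$\nabla\times\mathbf{B} = \frac{4\pi}{c}\,\overline{\rho\mathbf{v}} + \frac{1}{c}\frac{\partial\mathbf{E}}{\partial t}, \tag{1.16}$$

$$\nabla\cdot\mathbf{E} = 4\pi\bar\rho, \tag{1.17}$$

$$\nabla\cdot\overline{\rho\mathbf{v}} + \frac{\partial\bar\rho}{\partial t} = 0. \tag{1.18}$$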







Further transformations of the equations are associated with the calculation
of the mean values $\bar\rho$ and $\overline{\rho\mathbf{v}}$. To find these mean values it is necessary to
introduce certain assumptions about the structure of matter.

In the theory of the electromagnetic field in matter, the latter is considered 
as a continuous medium whose properties are described by means of a number 
of formal macroscopic characteristics, such as the dielectric constant, con- 
ductivity and so on. Some of these formal characteristics can be found by 
applying the methods of statistical physics and are connected with the mole- 
cular structure of matter. 

However, we shall, from the very beginning, have to divide all matter into
two groups: conductors and dielectrics. By conductors we shall understand
bodies in which, under the action of an external static field, a displacement of 
charges occurs over the volume and an electric current arises corresponding 
to this motion of charges. In dielectrics an external field does not give rise to 
the motion of charges, although it can cause their displacement to new 
equilibrium positions. 

It is clear from these definitions that the division of bodies into conductors 
and dielectrics is somewhat arbitrary. In practice, the external electric field 
gives rise to a finite, although very small, current in dielectrics. On the other 
hand, in some conductors also the current can be small. Semiconductors play 
a most important role in physics and technology. Under certain conditions 
the electric current in semiconductors is relatively large and approaches that 
in conductors. At other times the current in semiconductors is as small as in 
the best dielectrics. 

Nevertheless, the schematic separation of all bodies into conductors and 
dielectrics appears to be a sufficiently good approximation on the basis of 
which it is possible to construct a phenomenological theory of electromag- 
netic phenomena in continuous media. 


§2. The polarization of a medium in an electric field 


In calculating $\bar\rho$ one has to distinguish between the cases of a body which
is as a whole electrically neutral and one having a charge different from zero.

Let us consider the first case. If an electrically neutral body is placed 
in an external electric field, a displacement with respect to each other of 
the positive and negative charges occurs in the atoms and molecules. The 
body, remaining electrically neutral, acquires a dipole moment. We shall 
characterize the body by the total dipole moment of all its particles in unit 
volume. The total dipole moment per unit volume will be called the polarization
vector or, briefly, the polarization $\mathbf{P}$. The dipole moment acquired by the
body is by definition equal to


$$\mathbf{d} = \int\mathbf{P}\,dV = \int\bar\rho_{\text{bound}}\,\mathbf{r}\,dV. \tag{2.1}$$


In view of the arbitrariness of the volume of integration,

$$\mathbf{P} = \bar\rho_{\text{bound}}\,\mathbf{r}. \tag{2.1'}$$


If the body is uniformly polarized, i.e. if $\mathbf{P}$ is the same at all points of the
body, then $\mathbf{P} = \mathbf{d}/V$. The appearance of polarization in a body can, under
certain conditions which will be clear from what follows, be accompanied by
the appearance of a mean charge density $\bar\rho_{\text{bound}}$. The subscript "bound"
means that the mean charge density is due to the displacement of charges
which are bound in the atoms of the body. The appearance of the bound, or
induced, charge is called electrostatic induction or polarization.

To find $\bar\rho_{\text{bound}}$ we shall make use of the definition (2.1) and shall
transform the integral $\int\mathbf{P}\,dV$ into the form $\int f(\mathbf{P})\,\mathbf{r}\,dV$, where $f(\mathbf{P})$ is a function of
the polarization. Then, comparing $\int f(\mathbf{P})\,\mathbf{r}\,dV$ with $\int\bar\rho_{\text{bound}}\,\mathbf{r}\,dV$, one can find
the value of $\bar\rho_{\text{bound}}$. Making use of formula (1.53) and assuming in it that
$\mathbf{a} = \mathbf{P}$, $\mathbf{b} = \mathbf{r}$, we find:


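$$\oint\mathbf{r}\,(\mathbf{P}\cdot\mathbf{n})\,dS = \int\mathbf{r}\,(\nabla\cdot\mathbf{P})\,dV + \int(\mathbf{P}\cdot\nabla)\,\mathbf{r}\,dV = \int\mathbf{r}\,(\nabla\cdot\mathbf{P})\,dV + \int\mathbf{P}\,dV.$$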


Choosing the surface of integration to be outside the volume occupied by 
the body, we have 


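$$\oint\mathbf{r}\,(\mathbf{P}\cdot\mathbf{n})\,dS = 0,$$

since outside the body the polarization is equal to zero. Hence

$$\int\mathbf{P}\,dV = -\int\mathbf{r}\,(\nabla\cdot\mathbf{P})\,dV. \tag{2.2}$$

Comparing (2.2) with (2.1), we obtain

$$\bar\rho_{\text{bound}} = -\nabla\cdot\mathbf{P}. \tag{2.3}$$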
Thus, if the polarization of the body is not uniform and the vector $\mathbf{P}$
varies from point to point in such a way that $\nabla\cdot\mathbf{P} \neq 0$, a charge density $\bar\rho_{\text{bound}}$
defined by formula (2.3) arises in the body.







Formula (2.3) can be given an obvious meaning. Let there be inside the
body two imaginary planes $S_1$ and $S_2$. In an external field the charges which
are bound in the atoms confined in the space between the planes are
displaced. If the polarization is not uniform in space, then, for example, the
surface $S_2$ will be crossed by a larger number of charges going out of the
volume than the surface $S_1$. As a result, the number of charges going out of
the volume confined between the planes is larger than that of charges coming
in from the neighbouring volume. A charge density $\bar\rho_{\text{bound}}$ arises in the space
between $S_1$ and $S_2$.

In particular, when the surface is the boundary surface of the body, then
the charges arising from polarization form a surface charge. Its density
$\sigma_{\text{bound}}$ is determined in the following simple calculation. We integrate (2.3)
with respect to the volume confined between the two surfaces. We choose
these surfaces in such a way that one of them is outside the body and the
other inside it and that they are at a very small distance $h$ from the external
surface of the body.

Then we have 


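$$\int(\nabla\cdot\mathbf{P})\,dV = \oint P_n\,dS = \int_{\text{outside the body}} P_n\,dS - \int_{\text{inside the body}} P_n\,dS = -\int_{\text{inside the body}} P_n\,dS = -\int\bar\rho_{\text{bound}}\,dV. \tag{2.4}$$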

Passing to the limit $h \to 0$, we can write

$$\lim_{h\to 0}\int\bar\rho_{\text{bound}}\,dV = \oint_S\sigma_{\text{bound}}\,dS,$$

$$\int_{\text{inside the body}} P_n\,dS \to \oint_S P_n\,dS,$$

where $S$ denotes the surface of the body and the value of $P_n$ is taken at the
surface. In view of the arbitrary character of the surface of the body, it
follows from the equality

$$\oint_S\sigma_{\text{bound}}\,dS = \oint_S P_n\,dS$$

that

$$\sigma_{\text{bound}} = P_n. \tag{2.5}$$


In this case it is clear that the total charge of an electrically neutral body
placed in an external field remains equal to zero. Choosing the surface of
integration in (2.4) to be outside the volume of the body, one can write

$$\int\bar\rho_{\text{bound}}\,dV = -\int(\nabla\cdot\mathbf{P})\,dV = -\oint P_n\,dS = 0.$$

Thus, the total bound charge arising within the body is equal in magnitude
and opposite in sign to the total charge induced at its surface.
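Formulae (2.3) and (2.5) can be illustrated on a one-dimensional example. The sketch below considers a slab $0 \le x \le L$ polarized along $x$, computes the volume bound charge per unit area from $\bar\rho_{\text{bound}} = -dP_x/dx$ and the two surface charges from $\sigma_{\text{bound}} = P_n$, and checks that they sum to zero. The slab geometry, the polarization profile and the function name are illustrative assumptions.

```python
def bound_charges_in_slab(polarization, length, cells=1000):
    """Return (volume bound charge, surface charge at x=0, surface charge at x=L),
    all per unit area, for a slab polarized along x with P_x = polarization(x)."""
    h = length / cells
    volume_charge = 0.0
    for i in range(cells):
        x = (i + 0.5) * h
        # rho_bound = -dP/dx by (2.3), central difference across the cell
        rho = -(polarization(x + 0.5 * h) - polarization(x - 0.5 * h)) / h
        volume_charge += rho * h
    sigma_left = -polarization(0.0)      # outward normal is -x, so P_n = -P_x
    sigma_right = polarization(length)   # outward normal is +x, so P_n = +P_x
    return volume_charge, sigma_left, sigma_right

if __name__ == "__main__":
    # Hypothetical non-uniform polarization profile, arbitrary units.
    volume, left, right = bound_charges_in_slab(lambda x: 1.0 + x * x, 2.0)
    print("volume bound charge:", volume)
    print("surface charges    :", left, right)
    print("total              :", volume + left + right)
```

For any profile the volume charge equals $-(P_x(L) - P_x(0))$ and is cancelled by the two surface charges, so the body as a whole stays neutral.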

Up to now we have assumed the body to be electrically neutral. If the
total free charge of the body is different from zero and is distributed in it
with a volume density $\rho$, then the mean charge density is

$$\bar\rho = \bar\rho_{\text{bound}} + \rho$$

and

$$\int\bar\rho\,dV = \int(\bar\rho_{\text{bound}} + \rho)\,dV = \int\rho\,dV = e. \tag{2.6}$$


The charge of the body characterized by the density $\rho$ is not bound to the
atoms of the substance and is not induced by an external field. We shall see
below that in a constant field the free charge $\rho$ can exist only in dielectrics.
In conductors free charges are mobile and move until they come to the
surface of the conductor, forming a surface charge.


§3. The mean current density and mean charge density in a medium 


The calculation of the mean current density $\overline{\rho\mathbf{v}}$ in a medium is a somewhat
more complex problem. This calculation can be carried out either on the
basis of certain model representations or, more formally, by proceeding from
the general notions of the electromagnetic properties of a medium.

We choose the second method here, since the atomic and molecular models 
which are usually employed in expounding macroscopic electrodynamics are 
not only far from reality but also contain of necessity obviously incorrect 
assumptions. As will be explained below, quantum effects play a fundamental 
role in the magnetic properties of atoms. Hence the consideration of the 
electric and, in particular, the magnetic properties of atomic systems cannot 
be carried out on the basis of classical models. In a formal treatment we can
restrict ourselves to one physical assumption only: if a body is placed in an
external electromagnetic field, then the mean field within the body is small
in comparison with the intra-atomic fields. In other words, we shall assume
that mean fields inside the body are weak. In addition, we shall confine
ourselves to cases of homogeneous and isotropic bodies.

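The mean current density, $\overline{\rho\mathbf{v}}$, at each point of the body is a function of
the strengths of the electric and magnetic fields. In addition, if the fields vary
in space and time, the mean current density will depend on the rate of
change in time of the vectors $\mathbf{E}$ and $\mathbf{B}$ and on the spatial derivatives of these
quantities, i.e.

$$\overline{\rho\mathbf{v}} = \mathbf{f}\left(\mathbf{E},\ \mathbf{B},\ \frac{\partial\mathbf{E}}{\partial t},\ \frac{\partial\mathbf{B}}{\partial t},\ \frac{\partial E_i}{\partial x_k},\ \frac{\partial B_i}{\partial x_k}\right). \tag{3.1}$$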


Since the fields are weak, one can expand the function $\mathbf{f}$ in a series in powers
of the variables and confine oneself to the first powers of the expansion *.
This expansion is carried out in powers of a small ratio of the type
$|\mathbf{E}|/|\mathbf{E}_{\text{int.at}}|$, where $\mathbf{E}_{\text{int.at}}$ is the strength of the intra-atomic field.

Expanding $\mathbf{f}$ in a series in powers of the variables, we have to take into
account that $\overline{\rho\mathbf{v}}$ is a polar vector. Hence all terms of the series must also
represent polar vectors. They can be neither scalars nor axial vectors.

We recall (see §6 of Part I) that the electric field strength $\mathbf{e}$ and,
consequently, also the mean electric field strength $\mathbf{E}$ in a medium are polar vectors.
On the contrary, the magnetic field strength $\mathbf{h}$ and the mean magnetic field
$\mathbf{B}$ are axial vectors or pseudovectors. Hence the vector $\mathbf{E}$ but not the vector $\mathbf{B}$
can appear in the required expansion.

The derivative of the vector $\mathbf{E}$ with respect to time, $\partial\mathbf{E}/\partial t$, is a polar vector
and must appear among the first-order terms of the expansion. The spatial
derivatives $\partial E_i/\partial x_k$ can be grouped in the form of two combinations of the
derivatives, $\nabla\times\mathbf{E}$ and $\nabla\cdot\mathbf{E}$. The first combination of the derivatives,
$\nabla\times\mathbf{E}$, forms an axial vector, while the second forms a scalar. These two
quantities cannot by themselves occur in the expansion. The first derivatives of
$\mathbf{B}$ are: $\partial\mathbf{B}/\partial t$, $\nabla\times\mathbf{B}$ and $\nabla\cdot\mathbf{B}$. The quantity $\partial\mathbf{B}/\partial t$ is an axial vector,
$\nabla\cdot\mathbf{B} = 0$ by virtue of (1.15), while $\nabla\times\mathbf{B}$ is a polar vector. Hence the vector
$\nabla\times\mathbf{B}$ must be retained in the expansion of $\mathbf{f}$. Thus, to the first order of small
quantities, there are only three quantities which are polar vectors: $\mathbf{E}$, $\partial\mathbf{E}/\partial t$ and
$\nabla\times\mathbf{B}$. Of course, in the next approximation one can form a number of polar
vectors from the scalars and axial vectors, but we shall not be interested in the
expansion terms of the second order.

* See I.E. Tamm, Osnovy teorii elektrichestva (Introduction to the theory of
electricity) (Gostekhizdat, Moscow, 1954) p. 428.

In expanding in powers of the ratio $|\mathbf{E}|/|\mathbf{E}_{\text{int.at}}|$ one has, in general, to take
into account the anisotropy of the body, because $\mathbf{E}_{\text{int.at}}$ varies in different
directions according to different laws. In isotropic bodies (such are gases,
liquids and, to a certain extent, polycrystalline bodies) the ratio $|\mathbf{E}|/|\mathbf{E}_{\text{int.at}}|$
has the same value in all directions. Then, with an accuracy to terms of the
first order of small quantities,

$$\overline{\rho\mathbf{v}} = \sigma\mathbf{E} + \chi\,\frac{\partial\mathbf{E}}{\partial t} + \alpha c\,(\nabla\times\mathbf{B}), \tag{3.2}$$

where $\sigma$, $\chi$ and $\alpha$ are scalars depending on the properties of the medium. The
reason why the factor $c$ (the velocity of light) is introduced into the last term
will be clear from what follows.

The zeroth order term (the term containing no field) is absent in expansion 
(3.2), since a mean current equal to zero must correspond to the absence 
of a field. 

Before introducing this expression for $\overline{\rho\mathbf{v}}$ into Maxwell's equations, we
shall discuss the physical meaning of individual terms of the expansion.

The first term of the expansion, $\sigma\mathbf{E}$, has a very simple and obvious
meaning. It shows that in the presence of an electric field a current arises in the
medium whose mean density is proportional to the mean field strength $\mathbf{E}$.
The quantity $\sigma$ is called the conductivity or electrical conductivity of the
body.

We shall make use of the notation 


$$\mathbf{j} = \sigma\mathbf{E}. \tag{3.3}$$

The vector $\mathbf{j}$ represents the mean charge passing per second through 1 cm² of
surface inside the body perpendicular to $\mathbf{j}$. The vector $\mathbf{j}$ is called the
conduction current density. The relation (3.3) connects the current density at a
given point of the body with the field strength at that point. We shall call the
relation (3.3) Ohm's law in differential form. Below, in §16, we shall establish
the connection of (3.3) with the usual formulation of Ohm's law.

The properties of bodies are characterized by the value of the conductivity
$\sigma$. For ideal dielectrics it should be assumed, in accordance with their
definition, that $\sigma = 0$. For real dielectrics $\sigma \neq 0$, but is very small in comparison
with its value in semiconductors and particularly in metals.

The problem of conductivity in bodies of different natures will be
considered below. Meanwhile we note only that the conduction current is
due to free charges: free electrons in metals, ions and electrons in gases, ions
in liquids. They are called free because they are not bound to any atom and,
under the action of a field, can move over the entire volume of the body.

We shall denote the mean density of free charges in a body by $\rho$. It is
obviously connected with the current density $\mathbf{j}$ by the equation of continuity

$$\nabla\cdot\mathbf{j} + \frac{\partial\rho}{\partial t} = 0. \tag{3.4}$$

It is here assumed that the free charges existing in a body do not become
bound and do not arise from bound charges.

In order to explain the physical meaning of the second and third terms
of the expansion (3.2) we transform (3.2) in such a way that the vectors $\mathbf{E}$
and $\mathbf{B}$ are separated from each other. We take, for example, the divergence
of both sides of eq. (3.2). We then have, obviously,

$$\nabla\cdot\overline{\rho\mathbf{v}} = \nabla\cdot\mathbf{j} + \chi\,\frac{\partial}{\partial t}(\nabla\cdot\mathbf{E}).$$


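By virtue of the equations of continuity for the total current $\overline{\rho\mathbf{v}}$ and the free
charge current $\mathbf{j}$, given by formulae (1.18) and (3.4),

$$-\frac{\partial}{\partial t}(\bar\rho - \rho) = \chi\,\frac{\partial}{\partial t}\,\nabla\cdot\mathbf{E}. \tag{3.5}$$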


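The difference $\bar\rho - \rho$ is obviously equal to the mean density of bound charges
$\bar\rho_{\text{bound}}$. Thus, for a homogeneous medium we have

$$\frac{\partial\bar\rho_{\text{bound}}}{\partial t} = -\frac{\partial}{\partial t}\,\nabla\cdot(\chi\mathbf{E}).$$

Integrating, one can write

$$\bar\rho_{\text{bound}} = -\nabla\cdot(\chi\mathbf{E}). \tag{3.6}$$

The integration constant may be set equal to zero, since in the absence of a
field $\bar\rho_{\text{bound}}$ must be equal to zero.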
Comparing (3.6) with (2.3), we verify, first of all, the formula

$$\mathbf{P} = \chi\mathbf{E}, \tag{3.7}$$




which relates the polarization in the body to the field strength $\mathbf{E}$. The
polarization $\mathbf{P}$ turns out to be proportional to the field $\mathbf{E}$. The factor of
proportionality $\chi$ is called the polarization coefficient or dielectric susceptibility
of the body. The value of the dielectric susceptibility for bodies consisting
of simple molecules will be calculated in §12 and §13. There it will be
shown that $\chi$ is an essentially positive quantity, so that the vector $\mathbf{P}$ is always
directed in the same direction as the vector $\mathbf{E}$.

By virtue of (3.7) the second term of (3.2) can be written in the form 


$$\chi\,\frac{\partial\mathbf{E}}{\partial t} = \frac{\partial\mathbf{P}}{\partial t}. \tag{3.8}$$


Formula (3.8) shows that the variation of the polarization vector in time is 
equivalent to the appearance of a certain current. This current is called the 
polarization current, and its density is 


$$\mathbf{j}_{\text{pol}} = \frac{\partial\mathbf{P}}{\partial t}, \tag{3.9}$$


i.e. is equal to the rate of change of the dipole moment per unit volume at a 
given point of the body. 
Indeed, from the definition (2.1) of the vector P it follows that 


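$$\int\frac{\partial\mathbf{P}}{\partial t}\,dV = \frac{\partial}{\partial t}\int\bar\rho_{\text{bound}}\,\mathbf{r}\,dV = \int\mathbf{r}\,\frac{\partial\bar\rho_{\text{bound}}}{\partial t}\,dV. \tag{3.10}$$

Here we have changed the order of differentiation and integration. The
quantity $\mathbf{r}$ represents the integration variable and does not depend on $t$.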

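Making use of (1.5) and applying the relation (2.2) to the vector $\overline{\rho_{\text{bound}}\mathbf{v}}$,
we find

$$\int\frac{\partial\mathbf{P}}{\partial t}\,dV = -\int\mathbf{r}\left[\nabla\cdot\left(\overline{\rho_{\text{bound}}\mathbf{v}}\right)\right]dV = \int\overline{\rho_{\text{bound}}\mathbf{v}}\,dV. \tag{3.11}$$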


The meaning of the density of the current of bound charges is very simple. 
The variation in time of the polarization of a given volume means that bound 
charges are going out of it (or coming into it). It is clear, however, that the 
displacement of bound charges, from the point of view of the electricity 
which they transport, is equivalent to the motion of free charges. 

We now go on to the explanation of the meaning of the last term in (3.2). 
For this we form the vector product of formula (3.2) and r and integrate the 
result with respect to the volume of the body: 







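$$\int\mathbf{r}\times\overline{\rho\mathbf{v}}\,dV = \int\mathbf{r}\times\mathbf{j}\,dV + \int\mathbf{r}\times\frac{\partial\mathbf{P}}{\partial t}\,dV + \alpha c\int\mathbf{r}\times(\nabla\times\mathbf{B})\,dV.$$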


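By virtue of (3.10):

$$\int\mathbf{r}\times\frac{\partial\mathbf{P}}{\partial t}\,dV = \int\mathbf{r}\times\frac{\partial}{\partial t}\left(\bar\rho_{\text{bound}}\,\mathbf{r}\right)dV = \frac{\partial}{\partial t}\int\mathbf{r}\times\bar\rho_{\text{bound}}\,\mathbf{r}\,dV = \frac{\partial}{\partial t}\int\bar\rho_{\text{bound}}\,(\mathbf{r}\times\mathbf{r})\,dV = 0.$$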


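Consequently,

$$\alpha c\int\mathbf{r}\times(\nabla\times\mathbf{B})\,dV = \int\mathbf{r}\times\left(\overline{\rho\mathbf{v}} - \mathbf{j}\right)dV = \int\mathbf{r}\times\overline{\rho_{\text{bound}}\mathbf{v}}\,dV, \tag{3.12}$$

where $\overline{\rho_{\text{bound}}\mathbf{v}} = \overline{\rho\mathbf{v}} - \mathbf{j}$ is the current density carried by bound charges.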
We introduce the notation 


$$\mathbf{M} = \alpha\mathbf{B}. \tag{3.13}$$


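Then (3.12) can be written in the form

$$\int\mathbf{r}\times(\nabla\times\mathbf{M})\,dV = \frac{1}{c}\int\mathbf{r}\times\overline{\rho_{\text{bound}}\mathbf{v}}\,dV. \tag{3.14}$$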
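By virtue of formula (1.54), in which $\mathbf{a} = \mathbf{M}$, $\mathbf{b} = \mathbf{r}$,

$$\int\mathbf{r}\times(\nabla\times\mathbf{M})\,dV = -\int(\mathbf{M}\times\nabla)\times\mathbf{r}\,dV - \oint(\mathbf{n}\times\mathbf{M})\times\mathbf{r}\,dS.$$

If the integration surface is outside the volume occupied by the body, then
at this surface $\mathbf{M} = 0$ and, thus,

$$\int\mathbf{r}\times(\nabla\times\mathbf{M})\,dV = -\int(\mathbf{M}\times\nabla)\times\mathbf{r}\,dV.$$

Here

$$(\mathbf{M}\times\nabla)\times\mathbf{r} = (\mathbf{M}\cdot\nabla)\,\mathbf{r} - \mathbf{M}\,(\nabla\cdot\mathbf{r}) = \mathbf{M} - 3\mathbf{M} = -2\mathbf{M}.$$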


Hence, finally,

$$\int\mathbf{r}\times(\nabla\times\mathbf{M})\,dV = 2\int\mathbf{M}\,dV. \tag{3.15}$$
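The identity (3.15) holds for any magnetization that vanishes outside the body, and it can be checked numerically on a simple case. The sketch below takes the test field $\mathbf{M} = (0, 0, e^{-r^2})$, which decays fast enough to play the role of a localized body, and compares the $z$-components of both sides with a plain Riemann sum on a cubic grid. The choice of field, grid size and step are illustrative assumptions.

```python
import math

def check_moment_identity(half_width=5.0, step=0.25):
    """Compare the z-components of int r x (curl M) dV and 2 int M dV
    for the test field M = (0, 0, exp(-r^2)) on a cubic grid."""
    n = int(round(2 * half_width / step))
    coords = [-half_width + i * step for i in range(n + 1)]
    cell = step ** 3
    lhs = 0.0
    rhs = 0.0
    for x in coords:
        for y in coords:
            for z in coords:
                g = math.exp(-(x * x + y * y + z * z))
                # curl M = (dMz/dy, -dMz/dx, 0) = (-2 y g, 2 x g, 0)
                curl_x = -2.0 * y * g
                curl_y = 2.0 * x * g
                # z-component of r x (curl M)
                lhs += (x * curl_y - y * curl_x) * cell
                # z-component of 2 M
                rhs += 2.0 * g * cell
    return lhs, rhs

if __name__ == "__main__":
    lhs, rhs = check_moment_identity()
    print("int r x (curl M) dV, z-component:", lhs)
    print("2 int M dV, z-component         :", rhs)
```

Both sums should approach $2\pi^{3/2}$, the exact value of $2\int M_z\,dV$ for this Gaussian field, which confirms the factor 2 in (3.15) for this example.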


Comparing (3.15) with (3.14), we obtain

$$\int\mathbf{M}\,dV = \int\frac{\mathbf{r}\times\overline{\rho_{\text{bound}}\mathbf{v}}}{2c}\,dV. \tag{3.16}$$


Eq. (3.16) shows that the vector $\mathbf{M}$ introduced by the relation (3.13)
represents the mean magnetic moment per unit volume of the body produced by
moving bound charges (see §22 of Part I).
The statistical theory of magnetic phenomena will be expounded in ch. 6.
By means of the quantities $\mathbf{j}$, $\mathbf{P}$ and $\mathbf{M}$ the mean current density in matter
can finally be written in the form

$$\overline{\rho\mathbf{v}} = \mathbf{j} + \frac{\partial\mathbf{P}}{\partial t} + c\,(\nabla\times\mathbf{M}). \tag{3.17}$$


§4. The system of equations for the electromagnetic field in a medium 


Having calculated the mean values of the quantity pv, we can pass on to 
the final formulation of the equations for the electromagnetic field in a 
medium. Substituting (3.17) into (1.16), we find 


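$$\nabla\times\mathbf{B} = \frac{4\pi}{c}\,\mathbf{j} + \frac{1}{c}\frac{\partial\mathbf{E}}{\partial t} + \frac{4\pi}{c}\frac{\partial\mathbf{P}}{\partial t} + 4\pi\,(\nabla\times\mathbf{M}),$$

or

$$\nabla\times(\mathbf{B} - 4\pi\mathbf{M}) = \frac{4\pi}{c}\,\mathbf{j} + \frac{1}{c}\frac{\partial}{\partial t}(\mathbf{E} + 4\pi\mathbf{P}). \tag{4.1}$$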


We now introduce a new notation, writing

$$\mathbf{H} = \mathbf{B} - 4\pi\mathbf{M} \tag{4.2}$$

and

$$\mathbf{E} = \mathbf{D} - 4\pi\mathbf{P}. \tag{4.3}$$





We rewrite (4.1) in the form

$$\nabla\times\mathbf{H} = \frac{4\pi}{c}\,\mathbf{j} + \frac{1}{c}\frac{\partial\mathbf{D}}{\partial t}. \tag{4.4}$$
We now consider eq. (1.17). By expressing the mean charge density according
to formula (2.6) in terms of the mean density of the free charge $\rho$ and the
mean density of the bound charge $\bar\rho_{\text{bound}}$, and making use of (2.3), we obtain

$$\nabla\cdot\mathbf{E} = 4\pi\rho + 4\pi\bar\rho_{\text{bound}} = 4\pi\rho - 4\pi\,(\nabla\cdot\mathbf{P}).$$

Hence, by virtue of (4.3),

$$\nabla\cdot\mathbf{D} = 4\pi\rho. \tag{4.5}$$


The two remaining equations (1.14) and (1.15), as we have seen, are averaged 
without any difficulties: 


$$\nabla\times\mathbf{E} = -\frac{1}{c}\frac{\partial\mathbf{B}}{\partial t}, \tag{4.6}$$

$$\nabla\cdot\mathbf{B} = 0. \tag{4.7}$$


In this case $\rho$ and $\mathbf{j}$ are related by the equation of continuity

$$\frac{\partial\rho}{\partial t} + \nabla\cdot\mathbf{j} = 0. \tag{4.8}$$


Eqs. (4.4) — (4.8) form the system of equations for the field in a medium. 
This system of equations was established by Maxwell in 1873 and they are 
called Maxwell’s equations (as distinct from the Maxwell—Lorentz equations 
(1.1) —(1.5) of Part I). It is clear that this system is still not complete, since it 
contains four unknown vectors: the mean fields E and B and the auxiliary 
quantities H and D. For historical reasons we call the vector H the magnetic 
field in the medium, and the vector D the electric displacement. 

In order that the Maxwell system of equations may be complete, it is 
necessary to give certain additional relations connecting the basic and 
auxiliary field vectors. These relations are called constitutive equations. In 
the simplest (homogeneous and isotropic) medium which we have considered 
above, the form of the constitutive equations follows directly from (3.2). 


By virtue of (3.13) and (4.2), we have 

$$\mathbf{H} = \mathbf{B} - 4\pi\alpha\,\mathbf{B} = (1 - 4\pi\alpha)\,\mathbf{B}. \tag{4.9}$$

For historical reasons the constitutive equation (4.9) is written in the form 

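$$\mathbf{B} = \mu\mathbf{H}, \tag{4.10}$$

where the constant $\mu = (1 - 4\pi\alpha)^{-1}$ is called the magnetic permeability of
the medium. The induced magnetic moment per unit volume $\mathbf{M}$ is expressed
not in terms of $\mathbf{B}$ but in terms of $\mathbf{H}$ according to the formula

$$\mathbf{M} = \mu\alpha\,\mathbf{H} = \kappa\mathbf{H}, \tag{4.11}$$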


where $\kappa$ is called the magnetic susceptibility. It follows from (4.2) and (4.11)
that

$$\mu = 1 + 4\pi\kappa. \tag{4.12}$$
Analogously, from (3.7) and (4.3) we obtain the constitutive equation
$$\mathbf{D} = \varepsilon\mathbf{E}, \tag{4.13}$$

where the coefficient

$$\varepsilon = 1 + 4\pi\chi \tag{4.14}$$


is called the dielectric permittivity or, according to an obsolete but fre- 
quently applied terminology, the dielectric constant. The relation (3.3) is 
also usually referred to as a constitutive equation. We would like to stress that, 
in contrast to the Maxwell—Lorentz equations, which are among the most 
accurate and universal of the known laws of nature, Maxwell’s equations have 
limited applicability in consequence of the limited region of applicability of 
the constitutive equations. We shall dwell on this problem in more detail 
below. 
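The relations (4.9)–(4.12) can be checked numerically. The following is a minimal sketch, assuming Gaussian units as in the text and a medium in which M = αB; the function name `permeability_from_alpha` and the sample value of α are illustrative and do not come from the text.

```python
import math

def permeability_from_alpha(alpha):
    """Return (mu, kappa) for a medium in which M = alpha * B (Gaussian units).

    mu    = 1 / (1 - 4*pi*alpha)   -- eq. (4.10)
    kappa = mu * alpha             -- eq. (4.11)
    """
    mu = 1.0 / (1.0 - 4.0 * math.pi * alpha)
    kappa = mu * alpha
    return mu, kappa

# Illustrative value only: a weakly diamagnetic medium (alpha < 0).
mu, kappa = permeability_from_alpha(-1.0e-6)
# Consistency check of eq. (4.12): mu = 1 + 4*pi*kappa.
assert abs(mu - (1.0 + 4.0 * math.pi * kappa)) < 1e-12
```

A negative α gives κ < 0 and μ < 1 (diamagnetic), a positive α gives κ > 0 and μ > 1 (paramagnetic).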

In what follows it will be shown that for all bodies the electric
susceptibility $\chi$ is larger than 0 and, consequently, the dielectric constant
$\varepsilon$ is larger than 1. On the contrary, the magnetic susceptibility can be
either positive or negative.

Substances for which $\kappa > 0$ are called paramagnetic, while those for which
$\kappa < 0$ are called diamagnetic.

For what follows we shall need the integral form of Maxwell’s equations,
which we can write as follows:


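$$\oint\mathbf{E}\cdot d\mathbf{l} = -\frac{1}{c}\frac{\partial}{\partial t}\int\mathbf{B}\cdot d\mathbf{S}, \tag{4.15}$$
$$\oint\mathbf{B}\cdot d\mathbf{S} = 0, \tag{4.16}$$
$$\oint\mathbf{H}\cdot d\mathbf{l} = \frac{4\pi}{c}\int\mathbf{j}\cdot d\mathbf{S} + \frac{1}{c}\frac{\partial}{\partial t}\int\mathbf{D}\cdot d\mathbf{S}, \tag{4.17}$$
$$\oint\mathbf{D}\cdot d\mathbf{S} = 4\pi\int\rho\,dV. \tag{4.18}$$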


In just the same way as has been done for the electromagnetic field in 
vacuum in §10 of Part I, one can introduce the electromagnetic potentials 
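$\varphi$ and $\mathbf{A}$ in matter. We define them by the formulae

$$\mathbf{B} = \nabla\times\mathbf{A}, \tag{4.19}$$
$$\mathbf{E} = -\frac{1}{c}\frac{\partial\mathbf{A}}{\partial t} - \nabla\varphi, \tag{4.20}$$
$$\nabla\cdot\mathbf{A} = -\frac{\varepsilon\mu}{c}\frac{\partial\varphi}{\partial t}. \tag{4.21}$$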


Reproducing the calculations of §10 of Part I, one can easily arrive at 
the equations for the potentials 


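$$\nabla^2\mathbf{A} - \frac{\varepsilon\mu}{c^2}\frac{\partial^2\mathbf{A}}{\partial t^2} = -\frac{4\pi\mu}{c}\,\mathbf{j}, \tag{4.22}$$

$$\nabla^2\varphi - \frac{\varepsilon\mu}{c^2}\frac{\partial^2\varphi}{\partial t^2} = -\frac{4\pi\rho}{\varepsilon}. \tag{4.23}$$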
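As a sketch of the omitted calculation for the vector potential, consider a homogeneous isotropic medium with constant ε and μ. The gauge condition is assumed here in the Lorentz form ∇·A = −(εμ/c) ∂φ/∂t, chosen so that the equations for A and φ decouple. Substituting (4.19) and (4.20) into (4.4), with B = μH and D = εE, gives:

```latex
\begin{aligned}
\nabla\times(\nabla\times\mathbf{A})
  &= \frac{4\pi\mu}{c}\,\mathbf{j}
   + \frac{\varepsilon\mu}{c}\,\frac{\partial}{\partial t}
     \left(-\frac{1}{c}\frac{\partial\mathbf{A}}{\partial t}-\nabla\varphi\right),\\
\nabla(\nabla\cdot\mathbf{A})-\nabla^{2}\mathbf{A}
  &= \frac{4\pi\mu}{c}\,\mathbf{j}
   - \frac{\varepsilon\mu}{c^{2}}\frac{\partial^{2}\mathbf{A}}{\partial t^{2}}
   - \nabla\!\left(\frac{\varepsilon\mu}{c}\frac{\partial\varphi}{\partial t}\right),\\
\nabla^{2}\mathbf{A}-\frac{\varepsilon\mu}{c^{2}}\frac{\partial^{2}\mathbf{A}}{\partial t^{2}}
  &= -\frac{4\pi\mu}{c}\,\mathbf{j}
  \quad\text{after using}\quad
  \nabla\cdot\mathbf{A} = -\frac{\varepsilon\mu}{c}\frac{\partial\varphi}{\partial t}.
\end{aligned}
```

The equation for φ follows in the same way from ∇·D = 4πρ.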





§5. Boundary conditions 


In what follows we shall usually have to consider electromagnetic phe- 
nomena in bodies which are limited in space. Hence it is necessary to find 
out how the vectors of the electromagnetic field change near the boundary 
of the body. In the most general case the boundary of the body is the inter- 
face of two media with different properties. We shall assume that with a 
sufficient degree of accuracy the interface can be considered as a surface, 
and we shall not be interested in the properties of the electromagnetic field 
in the transition layer near the boundary. 

Let one medium be characterized by $\varepsilon_1$ and $\mu_1$, and the other by $\varepsilon_2$ and
$\mu_2$. The behaviour of the vectors of the field at the interface can be
established from Maxwell’s equations written in integral form.

Let us consider eq. (4.15) and apply it to an infinitesimal contour $L$ shown
in fig. IV.1, assuming the length $l_1$ to be an infinitesimal quantity of the first
order, and the length $l_2$ to be an infinitesimal quantity of the second order.
Then we have

$$\oint\mathbf{E}\cdot d\mathbf{l} = E_t^{(1)} l_1 - E_t^{(2)} l_1 + \text{an infinitesimal quantity of the second order}.$$

Here $E_t$ denotes the component of the electric field strength vector tangent
to the interface; the indices refer to the first and second media.
Eq. (4.15) gives:

$$\left(E_t^{(1)} - E_t^{(2)}\right) l_1 = -\frac{1}{c}\frac{\partial\Phi}{\partial t}, \tag{5.1}$$

where $\Phi$ is the flux of magnetic induction through the surface enclosed by
the contour $L$. Obviously, $\Phi \sim l_1 l_2$ and is a small quantity of higher order.
Hence, passing over to the limit $l_2 \to 0$, we find from (5.1)

$$E_t^{(1)} = E_t^{(2)}. \tag{5.2}$$




Fig. IV.1 





The tangential component of the electric field strength remains continuous 
in passing through the interface of the media. 


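We apply an analogous method to formula (4.17), dropping at once the
term of second order in small quantities $\int\mathbf{D}\cdot d\mathbf{S}$. We then find

$$\oint\mathbf{H}\cdot d\mathbf{l} = (\mathbf{H}_2 - \mathbf{H}_1)\cdot\mathbf{l}\,dl_1 = \frac{4\pi}{c}\int\mathbf{j}\cdot d\mathbf{S},$$

where $\mathbf{l}$ is the unit vector of the contour lying in the plane of the interface
(fig. IV.1).

We denote the normal to the interface by $\mathbf{n}$, and the normal to the surface
enclosed by the contour $L$ by $\mathbf{n}_1$. The vectors $\mathbf{n}$, $\mathbf{n}_1$ and $\mathbf{l}$ form a right-handed
screw system

$$\mathbf{l} = \mathbf{n}_1\times\mathbf{n}.$$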


Further, we introduce the concept of surface current density $\mathbf{j}_s$, by which is
meant the amount of electricity passing per second through unit length on the
surface

$$\lim_{dl_2\to 0}\,(\mathbf{j}\cdot\mathbf{n}_1)\,dl_1\,dl_2 = (\mathbf{j}_s\cdot\mathbf{n}_1)\,dl_1.$$


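Then (4.17) gives

$$\oint\mathbf{H}\cdot d\mathbf{l} = \frac{4\pi}{c}\,(\mathbf{j}_s\cdot\mathbf{n}_1)\,dl_1 = (\mathbf{H}_2 - \mathbf{H}_1)\cdot\mathbf{l}\,dl_1,$$

or

$$(\mathbf{H}_2 - \mathbf{H}_1)\cdot\mathbf{l} = \frac{4\pi}{c}\,(\mathbf{j}_s\cdot\mathbf{n}_1).$$
Expressing $\mathbf{l}$ in terms of $\mathbf{n}_1$, we have

$$(\mathbf{H}_2 - \mathbf{H}_1)\cdot(\mathbf{n}_1\times\mathbf{n}) = \mathbf{n}_1\cdot[\mathbf{n}\times(\mathbf{H}_2 - \mathbf{H}_1)],$$
hence

$$\mathbf{n}_1\cdot[\mathbf{n}\times(\mathbf{H}_2 - \mathbf{H}_1)] = \frac{4\pi}{c}\,(\mathbf{j}_s\cdot\mathbf{n}_1).$$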




Since the orientation of the vector $\mathbf{n}_1$ in the plane of the interface can be
arbitrary, the following equality must be fulfilled:

$$\mathbf{n}\times(\mathbf{H}_2 - \mathbf{H}_1) = \frac{4\pi}{c}\,\mathbf{j}_s. \tag{5.3}$$

In the presence of surface currents the tangential component of the field
strength $\mathbf{H}$ is discontinuous at the interface of the media. The value of the
discontinuity $\Delta H_t$ is equal to $(4\pi/c)\,j_s$.

If there is no surface current at the interface, $\mathbf{j}_s = 0$, then

$$H_t^{(1)} = H_t^{(2)}. \tag{5.4}$$
The boundary conditions for the normal components of the induction
vector $\mathbf{B}$ and the displacement vector $\mathbf{D}$ are obtained from (4.16) and (4.18),
if the infinitesimal surface $S$ shown in fig. IV.2 is chosen as the integration
surface. The area $S_1$ of the bases is an infinitesimal quantity of the first
order, while the area of the lateral faces is an infinitesimal quantity of the
second order. From (4.16) we find

$$\left(B_n^{(1)} - B_n^{(2)}\right) S_1 + \text{an infinitesimal quantity of the second order} = 0,$$
or, passing over to the limit,
$$B_n^{(1)} = B_n^{(2)}. \tag{5.5}$$


The normal component of the magnetic induction vector is conserved in 
passing through the interface. 





Fig. IV.2 




Analogously, from (4.18) we find

$$D_n^{(2)} = D_n^{(1)} + 4\pi\omega_s, \tag{5.6}$$

where $\omega_s$ is the surface charge density defined, as above, by

$$\lim\int\rho\,dV = \int\omega_s\,dS.$$

If the surface charge density is $\omega_s = 0$, then

$$D_n^{(1)} = D_n^{(2)}. \tag{5.7}$$


Eqs. (5.2), (5.3), (5.5) and (5.6) are the boundary conditions which must 
be satisfied by the field vectors at the interface of the media. In particular, 
at the boundary with a vacuum one must assume in these formulae that 


$$\varepsilon = \mu = 1.$$


§6. The limits of applicability of the system of constitutive equations 


We have obtained Maxwell’s equations (together with the constitutive 
equations) from the Maxwell—Lorentz equations by means of averaging, based 
on certain assumptions about the properties of the medium. 

Although these assumptions have a rather general character, they are not 
always fulfilled. It turns out that the region of applicability of the constitutive 
equations and, consequently, Maxwell’s equations in the simplest form (4.4) — 
(4.7) is rather limited. We have to stress these limitations particularly because 
nowadays one has to deal more and more often in physics with systems to 
which Maxwell’s equations in the form we have written down are not
applicable. Usually one of the terms on the right-hand side of eq. (4.4) is large
while the other is small. Thus, in ideal dielectrics the conduction current is
$\mathbf{j} = 0$, and in real dielectrics it is very small in comparison with the
displacement current $\partial\mathbf{D}/\partial t$. On the other hand, in conductors the displacement
current is usually small. Therefore eq. (4.17) should be considered as a
general expression which includes the limiting real cases but which is almost
always simplified in actual use. Nevertheless, Maxwell’s equations are presented
in all the old and many new textbooks without adequate reservations.

We now go on to the discussion of the assumptions made in deriving 
Maxwell’s equations. The basic assumption is the expansion (3.2). This
expansion means, first of all, that the medium is isotropic, so that the
connection between the vector $\rho\mathbf{v}$ and the vectors $\mathbf{E}$, $\partial\mathbf{E}/\partial t$ and $\nabla\times\mathbf{B}$ is
characterized by the scalar constants $\sigma$, $\chi$, $\mu$. It is clear that this assumption
is not applicable to anisotropic media, in particular to monocrystals. Hence in
anisotropic media the relations (3.3), (4.13) and (4.10) do not hold and must be
replaced by the tensor expressions

$$j_i = \sigma_{ik}E_k, \qquad D_i = \varepsilon_{ik}E_k, \qquad B_i = \mu_{ik}H_k. \tag{6.1}$$


The quantities $\varepsilon_{ik}$ and $\mu_{ik}$ are symmetric second-rank tensors. The proof of
this statement will be given in § 12. 

Further, the very existence of a linear relation between $\mathbf{M}$ and $\mathbf{B}$ or, what
is the same, between B and H, turns out not to be fulfilled for a relatively 
small but very important group of substances which are called ferromagnetics 
and antiferromagnetics. For these substances the functional relationship 
between B and H has a complex non-linear and ambiguous character. We 
shall dwell on the theory of ferromagnetism below, cf. §20. 

Another group of substances, called ferroelectrics, possess electric proper- 
ties which are analogous to the magnetic properties of ferromagnetics: the 
dependence of $\mathbf{D}$ on $\mathbf{E}$ is non-linear and ambiguous.

Thus, for ferromagnetics and ferroelectrics the expression (3.2), in which 
the first terms of the expansion are retained, loses its meaning. 

A number of substances, called superconductors, show a profound change 
in their magnetic properties at low temperatures. The magnetic field does not 
penetrate inside superconductors, so that in the volume of a superconductor 
the following condition is fulfilled: 


B=0. 


It is obvious that Maxwell’s equations in the form (4.4)—(4.7) do not describe 
the behaviour of superconductors. The properties of superconductors will be 
considered in §21, while the theory of superconductivity will be given in 
Part V. 

The problem of the applicability of the expansion (3.2) in the case of 
high-frequency fields and for spatially non-uniform systems is more complex. 

In considering high frequencies the division of the current into two parts, 
the current of free charges and the current of bound charges, ceases to have 
any meaning. 

It is clear that in a high-frequency field the free charges as well as the
bound charges perform practically the same oscillatory motion. Reasoning
further quantitatively, one can say that the expansion (3.2) assumes that the
variation of fields in space must be rather smooth and occur over distances
which are large in comparison with molecular dimensions $a$. Otherwise the
macroscopic averaging over the volume, which must include a sufficiently
large number of molecules, would be meaningless. Since the variation in space
takes place over a wavelength $\lambda$, the inequality $\lambda \gg a$ must be fulfilled and,
consequently, we are confined to frequencies which are much lower than
$\omega_0 \sim c/a$.

However, this restriction is still not the strongest one. In the range $\omega < \omega_0$
there is a wide interval of frequencies in which the constitutive equation (4.13)
turns out to be inapplicable. This equation, as well as (3.2), assumes that the
frequency of the field is small compared with the inverse of the relaxation
time $\tau^{-1}$ which is characteristic of the given substance. In this case the
polarization $\mathbf{P}$ at a given point of space at a particular time $t$ is determined by
the displacement (mean field) at the same point and at the same instant of
time. But if the frequency is comparable with the inverse of the relaxation
time $\tau^{-1}$, then the polarization will lag behind the field and will become
dependent on the history of the process. Then instead of (4.13) we shall have
to write the more general relation


$$\mathbf{D}(\mathbf{r},t) = \int_{-\infty}^{t}\varepsilon(t,t')\,\mathbf{E}(\mathbf{r},t')\,dt', \tag{6.2}$$


where the integration is carried out with respect to past times ($t' < t$) and the
function $\varepsilon(t,t')$ determines the production of polarization as a result of the
action of the field at preceding instants of time. Since time flows uniformly,
there is no unique instant of time. This means that the function $\varepsilon(t,t')$ can
depend only on the time which passed between the instant $t'$ and the instant
$t$, in other words, on the argument $t - t'$. Thus

$$\mathbf{D}(\mathbf{r},t) = \int_{-\infty}^{t}\varepsilon(t-t')\,\mathbf{E}(\mathbf{r},t')\,dt'. \tag{6.3}$$
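To make the memory integral concrete, the sketch below assumes a Debye-type kernel: an instantaneous response plus an exponentially decaying memory term (d_eps/tau)·exp(−s/tau), with s = t − t'. This model, the names `debye_kernel`, `eps_closed_form`, `eps_numeric` and the parameters `d_eps` and `tau` are illustrative assumptions, not taken from the text. For a field varying as exp(−iωt) the integral reduces to a frequency-dependent permittivity, which the code evaluates both in closed form and by direct numerical integration.

```python
import cmath
import math

def debye_kernel(s, d_eps, tau):
    """Assumed memory part of the kernel, as a function of the delay s = t - t'."""
    return (d_eps / tau) * math.exp(-s / tau)

def eps_closed_form(omega, d_eps, tau):
    """eps(omega) = 1 + d_eps / (1 - i*omega*tau) for a field ~ exp(-i*omega*t)."""
    return 1.0 + d_eps / (1.0 - 1j * omega * tau)

def eps_numeric(omega, d_eps, tau, n_steps=20000, s_max_in_tau=40.0):
    """Trapezoidal integration of kernel(s) * exp(i*omega*s) over 0 <= s <= s_max."""
    s_max = s_max_in_tau * tau
    h = s_max / n_steps
    total = 0j
    for k in range(n_steps + 1):
        s = k * h
        weight = 0.5 if k in (0, n_steps) else 1.0
        total += weight * debye_kernel(s, d_eps, tau) * cmath.exp(1j * omega * s)
    return 1.0 + total * h
```

For ωτ ≪ 1 the result tends to the static value 1 + d_eps, which recovers the local relation (4.13). For ωτ ~ 1 an imaginary part appears, which expresses the lag of the polarization behind the field.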


We note that the frequencies for which eq. (4.13) is inapplicable are very
small in comparison with $\omega_0$. Indeed, relaxation times $\tau$ are always in order
of magnitude equal to characteristic atomic times $\tau_{\text{at}} \sim a/v$, where $v$ is the
velocity of electrons in atoms. Hence

$$\omega \sim \frac{1}{\tau} \sim \frac{v}{a} \sim \frac{v}{c}\,\omega_0 \ll \omega_0. \tag{6.4}$$
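For orientation, the two frequency scales can be compared with rough numbers. The values a ~ 10⁻⁸ cm and v ~ 10⁸ cm/s are typical atomic orders of magnitude assumed here for illustration; they are not quoted in the text.

```latex
\omega_0 \sim \frac{c}{a} \sim \frac{3\times10^{10}\ \text{cm/s}}{10^{-8}\ \text{cm}}
        = 3\times10^{18}\ \text{s}^{-1},
\qquad
\frac{1}{\tau} \sim \frac{v}{a} \sim \frac{10^{8}\ \text{cm/s}}{10^{-8}\ \text{cm}}
        = 10^{16}\ \text{s}^{-1},
\qquad
\frac{1/\tau}{\omega_0} \sim \frac{v}{c} \sim 3\times10^{-3}.
```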




In §32 we will show that in the frequency range approaching characteristic 
atomic frequencies the dielectric constant is dependent on the frequency. 
This phenomenon is therefore called frequency dispersion or time dispersion. 

Another phenomenon, called spatial dispersion, also occurs at high
frequencies. In order that eq. (4.13) may apply the wavelength must be large
not only in comparison with the size of the atom but also in comparison with
the size of any region of spatial non-uniformity in the substance.

If the size of a region of non-uniformity is denoted by $l$, then it is necessary
that the inequality $\lambda \gg l$ be satisfied. If $\lambda \approx l$, then the polarization at a given
point of space will depend on the value of the field at neighbouring points of
space at preceding instants of time, i.e.

$$\mathbf{D}(\mathbf{r},t) = \int_{-\infty}^{t}dt'\int\varepsilon(t-t',\mathbf{r},\mathbf{r}')\,\mathbf{E}(\mathbf{r}',t')\,dV'. \tag{6.5}$$


Formula (6.5) means that a contribution to the polarization at a given point 
at a given instant of time is produced by charges which have earlier been at 
neighbouring points of space. This phenomenon is called spatial dispersion. 
Media with a spatial dispersion are encountered in a number of important 
cases (e.g.: plasma, metals). All that has been said about the time dispersion 
and spatial dispersion of dielectric permittivity also holds for magnetic 
permeability. 

We thus see that the way of writing the constitutive equations and, conse- 
quently, Maxwell’s equations as presented in §4 is the simplest form and is 
applicable only to the case of isotropic media which do not display a time 
dispersion or spatial dispersion and are neither ferromagnetic nor ferroelectric. 


§7. The law of conservation of energy in a medium 


The derivation of the law of conservation of energy for a non-absorbing, 
isotropic and non-ferromagnetic medium does not differ from the analogous 
calculations carried out in §12 of Part I. Namely, if all bodies (conductors 
and dielectrics) which are placed in a field are at rest, then the work done 
by the electric field on the charges per unit time is equal to 


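$$\frac{dW}{dt} = \int\mathbf{j}\cdot\mathbf{E}\,dV = \frac{c}{4\pi}\int\mathbf{E}\cdot(\nabla\times\mathbf{H})\,dV - \frac{1}{4\pi}\int\mathbf{E}\cdot\frac{\partial\mathbf{D}}{\partial t}\,dV$$

$$= \frac{c}{4\pi}\int\left[\mathbf{E}\cdot(\nabla\times\mathbf{H}) - \mathbf{H}\cdot(\nabla\times\mathbf{E})\right]dV - \frac{1}{4\pi}\int\left(\mathbf{E}\cdot\frac{\partial\mathbf{D}}{\partial t} + \mathbf{H}\cdot\frac{\partial\mathbf{B}}{\partial t}\right)dV$$

$$= -\oint\boldsymbol{\sigma}\cdot d\mathbf{S} - \frac{\partial}{\partial t}\int u_0\,dV.$$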


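Here we have added an expression equal to zero to the integral to be
calculated by virtue of (1.14), and have denoted the energy density of the
electromagnetic field in the medium by $u_0$

$$u_0 = \frac{\varepsilon E^2 + \mu H^2}{8\pi} \tag{7.1}$$
and the Poynting vector by $\boldsymbol{\sigma}$

$$\boldsymbol{\sigma} = \frac{c}{4\pi}\,\mathbf{E}\times\mathbf{H} = \frac{c}{4\pi\mu}\,\mathbf{E}\times\mathbf{B}. \tag{7.2}$$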
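The passage from the volume integrals to the surface integral and the time derivative of the energy rests on a standard vector identity, Gauss’s theorem, and the constancy of ε and μ in time. Written out for reference, with E×H taken under the surface integral as in the definition of the Poynting vector:

```latex
\begin{aligned}
\nabla\cdot(\mathbf{E}\times\mathbf{H})
  &= \mathbf{H}\cdot(\nabla\times\mathbf{E}) - \mathbf{E}\cdot(\nabla\times\mathbf{H}),\\
\frac{c}{4\pi}\int\left[\mathbf{E}\cdot(\nabla\times\mathbf{H})
  - \mathbf{H}\cdot(\nabla\times\mathbf{E})\right]dV
  &= -\frac{c}{4\pi}\oint(\mathbf{E}\times\mathbf{H})\cdot d\mathbf{S},\\
\frac{1}{4\pi}\left(\mathbf{E}\cdot\frac{\partial\mathbf{D}}{\partial t}
  + \mathbf{H}\cdot\frac{\partial\mathbf{B}}{\partial t}\right)
  &= \frac{\partial}{\partial t}\,\frac{\varepsilon E^{2} + \mu H^{2}}{8\pi}
  \qquad (\mathbf{D} = \varepsilon\mathbf{E},\ \mathbf{B} = \mu\mathbf{H}).
\end{aligned}
```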


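It is convenient to write the energy conservation law in the form

$$-\frac{\partial}{\partial t}\int u_0\,dV = \oint\boldsymbol{\sigma}\cdot d\mathbf{S} + \frac{dW}{dt}. \tag{7.3}$$

In differential form the energy conservation law is expressed by the
relation

$$-\frac{\partial u_0}{\partial t} = \mathbf{j}\cdot\mathbf{E} + \nabla\cdot\boldsymbol{\sigma}. \tag{7.4}$$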
The interpretation of the conservation law for a medium does not, in 
principle, differ from that given in ch. 1 for a vacuum. All one should note 
is the fact that (7.4) contains only the free charge current density j. As free 
charges are moving in a conductor at rest the entire mechanical work of the 
field is transformed completely into heat, called the Joule heat. The mecha- 
nism of this process depends on the actual properties of the conductor. We 
shall touch upon this problem later. Denoting the Joule heat released in
unit volume per second by $Q_0$, equal to

$$Q_0 = \mathbf{j}\cdot\mathbf{E} = j^2/\sigma$$

(here $\sigma$ in the denominator is the conductivity), we can write the law of
energy conservation in the form

$$-\frac{\partial}{\partial t}\int u_0\,dV = \int Q_0\,dV + \oint\boldsymbol{\sigma}\cdot d\mathbf{S}.$$


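The quantities

$$\frac{\varepsilon E^2}{8\pi} = \frac{\mathbf{D}\cdot\mathbf{E}}{8\pi} \qquad\text{and}\qquad \frac{\mu H^2}{8\pi} = \frac{\mathbf{B}\cdot\mathbf{H}}{8\pi}$$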








represent respectively the densities of the electric and magnetic energies. 

Since, however, the quantities $\varepsilon$ and $\mu$ in matter are functions of
temperature (see ch. 6), our overall treatment assumes the constancy of the
temperature of the medium. Hence the quantity $(\mathbf{E}\cdot\mathbf{D} + \mathbf{B}\cdot\mathbf{H})/8\pi$ must be
interpreted as the free energy per unit volume of the medium.








Electrostatics 


§8. The electrostatic field 


The equations of the electromagnetic field are greatly simplified in the
case of electrostatics.
Assuming that the fields do not depend on time and that the current is
equal to zero, one can write Maxwell’s equations in the form

$$\nabla\times\mathbf{E} = 0, \qquad \nabla\cdot\mathbf{D} = 4\pi\rho, \tag{8.1}$$
$$\nabla\cdot\mathbf{B} = 0, \qquad \nabla\times\mathbf{H} = 0. \tag{8.2}$$


We shall not consider constant magnetic fields in matter *. 

Eqs. (8.1) and (8.2) determine the electrostatic field in a medium 
completely. 

It should be noted that $\rho$ denotes the mean density of free charges which
are introduced into the dielectric from outside. In contrast to conductors, 
free charges in a dielectric are at rest and fixed in definite positions. 


* See I.E.Tamm, Osnovy teorii elektrichestva (Introduction to the theory of elec- 
tricity) (Gostekhizdat, Moscow, 1954). 




If a dielectric is placed in an external electric field, the bound charges 
contained in the molecules, or ions forming the ion lattice, are displaced with 
respect to each other in such a way that the dielectric becomes polarized. 
For non-uniform polarization, a bound volume charge arises in the dielectric 
whose mean density is determined by formula (3.6). We recall that the total 
bound charge which appears in the polarized dielectric is always equal to 
zero. 

In addition to the volume density of bound charges, a surface charge
density $\omega_s = P_n$ arises at the surface of a polarized dielectric.

The field inside a dielectric is, according to (8.1), irrotational. The
electrostatic field potential, defined by (6.2), satisfies the equation

$$\nabla\cdot(\varepsilon\nabla\varphi) = -4\pi\rho. \tag{8.3}$$


In a homogeneous dielectric $\varepsilon = \text{const}$, and the last equation becomes the
Poisson equation

$$\nabla^2\varphi = -4\pi\rho/\varepsilon. \tag{8.4}$$
From this equation it follows that for a homogeneous dielectric the potential
of the field produced by the free charge volume density $\rho$ can be written in
the form

$$\varphi = \frac{1}{\varepsilon}\int\frac{\rho\,dV}{R}. \tag{8.5}$$

Formula (8.5) shows that the potential of the field and the field itself in a
dielectric medium are decreased by a factor of $\varepsilon$ in comparison with the field
produced by the same charges in vacuum. In particular, the field of a point
charge in a dielectric medium is equal to

$$E = \frac{e}{\varepsilon R^2}.$$

The corresponding force of interaction of charges according to the Coulomb 
law is smaller in a homogeneous dielectric medium than in vacuum by a 
factor $\varepsilon$. This result has a simple meaning: the electric field produced by free
charges which are put in a dielectric medium gives rise to its polarization. As 
a result of the polarization the bound charges are displaced and produce in the 
medium a field which decreases the field of the free charges. 
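The reduction by the factor ε can be traced to the bound charge explicitly. The following short worked step is a sketch for a homogeneous isotropic dielectric, using only P = χE, ε = 1 + 4πχ, D = εE and the expression ρ_bound = −∇·P for the bound charge density:

```latex
\rho_{\text{bound}} = -\nabla\cdot\mathbf{P}
  = -\chi\,\nabla\cdot\mathbf{E}
  = -\frac{\varepsilon-1}{4\pi\varepsilon}\,\nabla\cdot\mathbf{D}
  = -\frac{\varepsilon-1}{\varepsilon}\,\rho,
\qquad
\rho + \rho_{\text{bound}} = \frac{\rho}{\varepsilon}.
```

A point charge e is thus surrounded by a bound charge −e(ε−1)/ε, and the field in the medium is that of the net charge e/ε.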







It should be stressed that this refers only to the field of charges which 
are put inside a homogeneous and isotropic dielectric. From examples which 
will be discussed below it will be clear that this conclusion is quite inappli- 
cable to an inhomogeneous dielectric. 

At the interface of two dielectrics the potential must satisfy the following 
boundary conditions: 


$$\varphi_1 = \varphi_2, \tag{8.6}$$

$$\varepsilon_1\left(\frac{\partial\varphi}{\partial n}\right)_1 = \varepsilon_2\left(\frac{\partial\varphi}{\partial n}\right)_2. \tag{8.7}$$

The first of these is equivalent to (5.2), while the second follows directly
from (5.6) if the surface density of free charges is assumed to be $\omega_s = 0$.

Representing the field distribution in an obvious way by means of lines
of force, we see from the boundary conditions (8.6) and (8.7) that the lines
of force are broken at the interface of two dielectrics and that

$$\frac{\tan\alpha_1}{\tan\alpha_2} = \frac{\varepsilon_1}{\varepsilon_2}. \tag{8.8}$$

The meaning of the angles $\alpha_1$ and $\alpha_2$ is shown in fig. IV.3.
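The refraction law (8.8) can be reproduced directly from the two boundary conditions. The sketch below assumes that the angles are measured between the field line and the normal to the interface (the figure is not available here, so this convention is an assumption), and that there is no free surface charge. The function names and the numerical values are illustrative.

```python
import math

def refract_field(e1_normal, e1_tangential, eps1, eps2):
    """Field components on side 2 of a charge-free interface.

    Uses continuity of the tangential component of E, eq. (5.2), and of the
    normal component of D = eps * E, eq. (5.7).
    """
    e2_tangential = e1_tangential
    e2_normal = eps1 * e1_normal / eps2
    return e2_normal, e2_tangential

def angle_to_normal(e_normal, e_tangential):
    """Angle between the field line and the interface normal, in radians."""
    return math.atan2(e_tangential, e_normal)

# Illustrative numbers (not from the text): a field line at 30 degrees to
# the normal, passing from eps1 = 1 into eps2 = 4.
alpha1 = math.radians(30.0)
e1n, e1t = math.cos(alpha1), math.sin(alpha1)
e2n, e2t = refract_field(e1n, e1t, eps1=1.0, eps2=4.0)
alpha2 = angle_to_normal(e2n, e2t)
# Eq. (8.8): tan(alpha1) / tan(alpha2) = eps1 / eps2.
assert abs(math.tan(alpha1) / math.tan(alpha2) - 1.0 / 4.0) < 1e-12
```

In the medium with the larger ε the field lines are bent away from the normal, and in the limit ε₂ → ∞ they become perpendicular to the interface on the side of medium 1.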

Let us now consider the electrostatic field in conductors. As we have 
already stressed, a distinctive feature of conductors is the presence of a large 
number of mobile charges in them. The presence of a mean field E inside 
a conductor always produces an electric current in it, representing the motion 
of free charges under the action of the field. If current is absent in a
conductor, and this is just the requirement of electrostatics, then inevitably
the field strength in the conductor is also equal to zero:

$$\mathbf{E} = 0.$$

Fig. IV.3


From the equality of the field to zero in the entire region of space occupied 
by the conductor it follows that the volume charge density inside the con- 
ductor is also equal to zero. 

The reduction to zero of the electrostatic field strength vector in the 
conductor can be presented in the following obvious way. When a conductor 
is placed in an external field, free charges in it begin to move towards the 
surface, and this displacement continues until the field of the surface 
charges completely compensates for the field inside the conductor due to 
external sources. 

The process proceeds as though the conductor possessed an infinitely
large polarizability for any value of the field strength or, as is seen from (8.7),
an infinitely large dielectric constant $\varepsilon \to \infty$. As is seen from (8.8), the lines
of the electrostatic field are then normal to the surface of the conductor. The
identification of the conductor with a dielectric having an infinitely
large permittivity has, however, a purely formal character: the concept of
polarization is not applicable to free charges in conductors, and the
identification is useful only for a graphic interpretation of the relations of
electrostatics.

The value of the surface charge density for conductors is obtained from
the boundary condition (5.6), if the field strength inside the metal is
assumed to be equal to zero. Then we find

$$\omega_s = \varepsilon E_n/4\pi, \tag{8.9}$$

where $\varepsilon$ is the dielectric constant of the medium surrounding the conductor,
and $E_n$ is the normal component of the external field at its surface. The
boundary condition (5.2) gives for the tangential component of the external
field at the surface of the conductor

$$E_t = 0, \tag{8.10}$$


since in the conductor itself it is equal to zero. 

Thus, the field external to the conductor is directed perpendicularly to its
surface and $E_n = E$. Formulae (8.9) and (8.10) can be rewritten by
introducing the external field potential $\varphi$ in the form

$$\omega_s = -\frac{\varepsilon}{4\pi}\frac{\partial\varphi}{\partial n}, \qquad \varphi = \varphi_m = \text{const} \qquad \text{at the surface of the conductor}. \tag{8.11}$$


In particular, if the metal is in vacuum, the value of e must be set equal to 
unity. 
The total charge at the surface of the conductor is

$$e = \oint\omega_s\,dS = -\frac{\varepsilon}{4\pi}\oint\frac{\partial\varphi}{\partial n}\,dS, \tag{8.12}$$

where the integration is carried out over the entire surface $S$ of the conductor.
Formula (8.12) establishes the relation between the surface potential $\varphi_m$ of
the conductor and its charge $e$. It is easy to verify that this relation has a
linear character: if $\varphi$ is increased by a factor $k$, then by virtue of (8.12) and
the constancy of the potential along the surface of the metal its charge $e$ will
increase by the same factor.

The ratio of the charge of a conductor to its potential is called its 
capacitance: 


$$C = \frac{e}{\varphi_m} = -\frac{\varepsilon}{4\pi\varphi_m}\oint\left(\frac{\partial\varphi}{\partial n}\right)dS. \tag{8.13}$$


The capacitance is proportional to the dielectric constant $\varepsilon$ of the medium,
and in other respects is determined solely by the form of the conductor. 
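As a simple check of (8.13), consider an isolated conducting sphere of radius a carrying a charge e in a homogeneous medium of permittivity ε. This is a hypothetical example, not treated in the text. Outside the sphere the potential is that of a point charge in the medium, so that:

```latex
\begin{aligned}
\varphi(r) &= \frac{e}{\varepsilon r}\quad (r \ge a),
\qquad
\varphi_m = \frac{e}{\varepsilon a},
\qquad
\left(\frac{\partial\varphi}{\partial n}\right)_{r=a} = -\frac{e}{\varepsilon a^{2}},\\
C &= -\frac{\varepsilon}{4\pi\varphi_m}\oint\frac{\partial\varphi}{\partial n}\,dS
  = \frac{\varepsilon}{4\pi}\cdot\frac{\varepsilon a}{e}\cdot\frac{e}{\varepsilon a^{2}}\cdot 4\pi a^{2}
  = \varepsilon a .
\end{aligned}
```

The result is proportional to ε and has the dimensions of length, equal in order of magnitude to the size of the conductor.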

As is seen from (8.13), the capacitance has the dimensions of length, and 
in order of magnitude is the same as the linear dimension of the conductor. 
In the case where there is not one but several conductors in the electrostatic 
field, the boundary conditions (8.11) must be fulfilled for each of these.

In this case the charges on the conductors and their potentials are not 
arbitrary. There is a certain relation between them which follows from the 
uniqueness of the solution of the Laplace equation with boundary conditions 
of the type (8.11). From the mathematical standpoint finding the influence 
of one conductor on the others is a complex problem, and we shall not 
touch upon it *. 


* See, for example, W.R.Smythe, Static and dynamic electricity (McGraw-Hill, New 
York, 1950). 


§9. The solution of electrostatic problems 


We can now formulate the problem of electrostatics in a general way. 

Let a system of conductors and dielectrics of different natures be given
in a finite region of space. Let the volume charge density $\rho$ at all points of
space also be given (in the volume occupied by the conductors $\rho = 0$). Then
the equation for the potential has the form

$$\nabla^2\varphi = -4\pi\rho/\varepsilon_i$$

($i$ corresponds to the medium; in vacuum $\varepsilon = 1$).

At the vacuum–dielectric, vacuum–conductor and dielectric–conductor
interfaces the system of boundary conditions which have been
discussed is fulfilled. The potential at infinity satisfies the condition

$$\varphi = O\!\left(r^{-\alpha}\right) \quad (\alpha > 0), \qquad r \to \infty. \tag{9.1}$$


We wish to find the potential distribution and electric field distribution 
over all space. This problem is often called the direct problem of electro- 
statics. The inverse problem is to find the charge distribution from the known 
potential distribution. The direct problem of electrostatics is one of the basic 
problems of mathematical physics. The electrostatic problem can be solved 
by elementary methods only for particular simple cases. A number of 
methods for its solution have been devised, and solutions for a great number 
of actual systems have been calculated. In these an important role is played 
by approximate and numerical methods. Our purpose does not include the 
analysis of different cases of electrostatic fields. We shall consider only some 
simple examples of general interest. 

The solution of Poisson’s or Laplace’s equations for given boundary 
conditions represents one of the most important problems of mathematical 
physics. 

Standard methods, exact as well as approximate, have been devised for 
the solution of these problems. The solution of the boundary value problems 
of electrostatics is of great practical importance, but is of no great interest 
from the point of view of theoretical physics. Hence, we refer the reader to 
handbooks of mathematical physics and we shall confine ourselves only to 
the briefest exposition. 

First of all, we point out the existence of two types of boundary
conditions. If the region in which the potential is to be found is bounded by
metal conductors at the surface of which the potential (i.e. the function
sought for) is defined, then it is said that Dirichlet’s boundary conditions
are defined. But if the boundary surface is a conductor with a given surface
charge density, then according to (8.11) the derivative of the potential with
respect to the normal is defined. In this case the boundary conditions are
called Neumann’s conditions.

The definition of the Laplace (or Poisson) equation together with the 
boundary conditions at all boundary surfaces and at infinity defines the 
boundary value problem. 

It can easily be shown that the boundary value problem with Dirichlet’s 
or Neumann’s conditions has a unique solution. 

We shall dwell very briefly on the solution of boundary value problems 
for Laplace’s equation, when volume charges are absent. 

There are a number of methods for the solution of such equations. We 
shall mention only some of these. 

If the boundary value problem has a particular simple symmetry, then the 
Laplace operator is most conveniently written in coordinates which express 
the symmetry of the system. Thus, for example, to find fields with a spherical 
or cylindrical symmetry one has to express the Laplace operator in spherical 
or cylindrical coordinates. 

In such cases the variables are separable and the solution of the equation 
with partial derivatives reduces to the solution of equations in total deri- 
vatives. There are eleven different coordinate systems in which the Laplace 
operator in three dimensions allows one to separate the variables. They are 
given in the book of P.M.Morse and H.Feshbach *. 

As the simplest examples we shall consider the solutions of Laplace’s 
equations in Cartesian coordinates. In these coordinates Laplace’s equation 
has the form: 


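$$\frac{\partial^2\varphi}{\partial x^2} + \frac{\partial^2\varphi}{\partial y^2} + \frac{\partial^2\varphi}{\partial z^2} = 0. \tag{9.2}$$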


Assume that the boundary conditions are given at surfaces which have in 
Cartesian coordinates a simple form, for example, 


*P.M.Morse and H.Feshbach, Methods of theoretical physics (McGraw-Hill, New 
York, 1953). 


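$$\begin{aligned}
\varphi &= \varphi_1(x,y) && \text{for } z = 0,\ |x| \le L,\ |y| \le L,\\
\varphi &\to 0 && \text{for } z \to \infty,\\
\varphi &= 0 && \text{for } x = \pm L,\ y = \pm L.
\end{aligned} \tag{9.3}$$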


These conditions mean that at the surface of an open parallelepiped the 
potential is equal to zero at the lateral walls and is defined at the base. 

Then eq. (9.2) is easily solved by the method of separation of variables.
We shall seek the solution of (9.2) in the form of the product

$$\varphi = X(x)\,Y(y)\,Z(z), \tag{9.4}$$

where $X$, $Y$, $Z$ are functions of one variable. The substitution of (9.4) leads
to the equation

$$\frac{1}{X}\frac{d^2X}{dx^2} + \frac{1}{Y}\frac{d^2Y}{dy^2} + \frac{1}{Z}\frac{d^2Z}{dz^2} = 0. \tag{9.5}$$


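Since each of the terms depends only on its own argument, the equality (9.5)
is fulfilled if they are all equal to constants, i.e.

$$\frac{1}{X}\frac{d^2X}{dx^2} = -\alpha^2, \qquad \frac{1}{Y}\frac{d^2Y}{dy^2} = -\beta^2, \qquad \frac{1}{Z}\frac{d^2Z}{dz^2} = \gamma^2, \qquad \alpha^2 + \beta^2 = \gamma^2.$$
Consequently, the solution (9.4) has the form
$$\varphi = A\,\sin(\alpha x)\,\sin(\beta y)\,e^{-\gamma z}.$$
The boundary conditions (9.3) are fulfilled if
$$\alpha = \frac{n\pi}{L}, \qquad \beta = \frac{m\pi}{L}, \qquad \gamma = \frac{\pi}{L}\left(n^2 + m^2\right)^{1/2}; \qquad n, m = 1, 2, 3, \ldots$$

so that

$$\varphi = \sum_{n,m} A_{nm}\,\sin\!\left(\frac{n\pi}{L}x\right)\sin\!\left(\frac{m\pi}{L}y\right)\exp\!\left[-\frac{\pi}{L}\left(n^2 + m^2\right)^{1/2} z\right]. \tag{9.6}$$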





The coefficients $A_{nm}$ must now be chosen in such a way that the following
condition be satisfied:

$$\varphi_1(x,y) = \sum_{n,m} A_{nm}\,\sin\!\left(\frac{n\pi}{L}x\right)\sin\!\left(\frac{m\pi}{L}y\right). \tag{9.7}$$

This can always be done, since the sines form a system of orthogonal
functions and by multiplying (9.7) through by $\sin(n\pi x/L)\sin(m\pi y/L)$ and
integrating with respect to $x$ and $y$ one can write

$$A_{nm} = \frac{1}{L^2}\int_{-L}^{+L}\int_{-L}^{+L}\varphi_1(x,y)\,\sin\!\left(\frac{n\pi}{L}x\right)\sin\!\left(\frac{m\pi}{L}y\right)dx\,dy.$$
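The projection onto the sines and the summation of the series can be sketched numerically. The code below is a minimal illustration in pure Python: `coefficient` evaluates the double integral for A_nm with the midpoint rule and `potential` sums the series for a finite set of coefficients. The function names, the grid size and the sample base potential are assumptions made for the example and do not come from the text.

```python
import math

def coefficient(phi1, n, m, L, steps=100):
    """A_nm = (1/L^2) * integral of phi1 * sin(n*pi*x/L) * sin(m*pi*y/L)
    over -L <= x, y <= L, evaluated with the midpoint rule."""
    h = 2.0 * L / steps
    total = 0.0
    for i in range(steps):
        x = -L + (i + 0.5) * h
        sx = math.sin(n * math.pi * x / L)
        for j in range(steps):
            y = -L + (j + 0.5) * h
            total += phi1(x, y) * sx * math.sin(m * math.pi * y / L)
    return total * h * h / L**2

def potential(coeffs, x, y, z, L):
    """Sum the series (9.6) for a dict {(n, m): A_nm}."""
    return sum(
        a * math.sin(n * math.pi * x / L) * math.sin(m * math.pi * y / L)
        * math.exp(-math.pi * math.sqrt(n * n + m * m) * z / L)
        for (n, m), a in coeffs.items()
    )

# Hypothetical base potential chosen so that only A_12 is non-zero.
L = 1.0

def phi1(x, y):
    return math.sin(math.pi * x / L) * math.sin(2.0 * math.pi * y / L)

coeffs = {(n, m): coefficient(phi1, n, m, L) for n in (1, 2) for m in (1, 2)}
```

For this choice of base potential the series reduces to a single term that decays with height as exp(−π√5 z/L). Higher harmonics of a general base potential decay faster, so far from the base the field is dominated by the lowest term.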


§ 10. The method of images and method of inversion 


In solving a number of electrostatic problems it is useful to employ the 
uniqueness theorem. Indeed, suppose that for a given system we managed 
to choose potentials or electric fields which satisfy the differential equations 
and boundary conditions. By virtue of the uniqueness theorem the solution 
found can be considered as correct irrespective of the way in which it is 
obtained. As an example we consider the following problem. 

Let an infinite plane x = 0 be the surface of a metal which occupies a half- 
space x <0. In vacuum, at a distance a from the metal—vacuum boundary, 
let there be a charge e. We require to find the field strength over all space. It 
is obvious that the electric field inside the metal (region x < 0) is equal to 
zero. 

Poisson’s equation for the potential in the region $x > 0$ has the form

$$\nabla^2\varphi = -4\pi e\,\delta(\mathbf{r} - \mathbf{a}). \tag{10.1}$$
The boundary conditions for our problem have the following form:
$$\varphi \to 0 \quad\text{as}\quad r \to \infty.$$

In the plane $x = 0$ the following relations must be satisfied:

$$D_n^{(2)} = D_n^{(1)} + 4\pi\omega_s = 4\pi\omega_s, \qquad E_t^{(1)} = E_t^{(2)}. \tag{10.2}$$

Here the fact that $\mathbf{D}^{(1)} = 0$ in the metal and at its surface is taken into
account. The condition for $E_t$ is equivalent to the requirement of the constancy
of the potential at the surface of the metal. If we find the electric field out- 
side the metal, then the boundary condition (10.2) will allow us to determine 
the surface charge. 

To obtain the solution we note that eq. (10.1) and the condition of the
constancy of the potential at the surface x = 0 are satisfied by the field
potential of two charges, one of which (e) is located at the point x = +a and
the other (−e) at the point x = −a:

$$\varphi = \frac{e}{r} - \frac{e}{r_1}, \qquad (10.3)$$

where $r$ and $r_1$ are the distances from the charges to the point of observation.
By virtue of the uniqueness of the solutions of Maxwell’s equations, formula
(10.3) gives the distribution to be determined.

The calculation of the gradient of y allows one to determine the electric 
field E. 
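The image solution can be checked numerically. The sketch below assumes Gaussian units and illustrative values of e and a; the surface density is taken as the normal field at x = 0 divided by 4π, which for the two-charge potential gives $-ea/[2\pi(a^2+\rho^2)^{3/2}]$ at a distance ρ from the foot of the perpendicular. The function names are ours.

```python
import math

e, a = 1.0, 0.5  # illustrative charge and distance from the plane (Gaussian units)

def potential(x, y, z):
    """Potential of the charge e at x = +a and its image -e at x = -a."""
    r = math.sqrt((x - a)**2 + y**2 + z**2)
    r1 = math.sqrt((x + a)**2 + y**2 + z**2)
    return e / r - e / r1

def surface_density(rho):
    """Induced surface density E_x/(4 pi) on the plane x = 0."""
    return -e * a / (2.0 * math.pi * (a * a + rho * rho)**1.5)

def total_induced_charge(rho_max=2000.0, steps=200000):
    """Integrate the surface density over the plane out to radius rho_max."""
    h = rho_max / steps
    total = 0.0
    for i in range(steps):
        rho = (i + 0.5) * h
        total += surface_density(rho) * 2.0 * math.pi * rho * h
    return total

phi_on_plane = potential(0.0, 0.3, -1.2)  # vanishes on the metal surface
q_induced = total_induced_charge()        # approaches -e as rho_max grows
```

The potential vanishes everywhere on the plane, and the induced charge tends to −e, i.e. to the value of the fictitious image charge.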

We now pass on to the treatment of the more complex case where the
plane x = 0 is the interface of two dielectrics. The half-spaces x > 0 and
x < 0 are occupied by dielectrics with dielectric constants $\varepsilon_1$ and $\varepsilon_2$
respectively. At the point x = a there is a charge e. We have to find the
distribution of the field over all space produced by the charge located at
the point x = a.

The differential equation for the potential in the region x > 0 has the form

$$\nabla^2\varphi = -\frac{4\pi e}{\varepsilon_1}\,\delta(\mathbf{r}-\mathbf{a}).$$


We assume that there are no free charges at the interface, so that the boundary
conditions can be written in the form

$$\varphi \to 0 \quad \text{as} \quad r \to \infty,$$

$$D_n^{(2)} = D_n^{(1)}, \qquad (10.4)$$

$$E_{tg}^{(2)} = E_{tg}^{(1)}. \qquad (10.5)$$


We shall try to solve this problem by analogy with the preceding one,
assuming that the field potential is the same as the potential of the field of
equivalent charges at the points x = a and x = −a.

We assume that the field in the region x > 0 is equivalent to the field of
the two charges: charge e located at the point x = a and an unknown charge
$e_1$ located at the point x = −a. In this case the field potential in the region
x > 0 has the form (fig. IV.4)

$$\varphi_1 = \frac{e}{\varepsilon_1 r} + \frac{e_1}{r_1} \quad (x > 0). \qquad (10.6)$$

Further we assume that the potential in the region x < 0 is also represented in
the form of the field potential of a point charge

$$\varphi_2 = \frac{e_2}{r}, \qquad (10.7)$$

where $e_2$ is a certain unknown charge located at the point x = a.
It is obvious that the potentials $\varphi_1$ and $\varphi_2$ are the solutions of the
equations of electrostatics for the corresponding point charges. The electric
fields in the two regions have the form

$$\mathbf{E}_1 = \frac{e\,\mathbf{r}}{\varepsilon_1 r^3} + \frac{e_1\mathbf{r}_1}{r_1^3}, \qquad \mathbf{E}_2 = \frac{e_2\,\mathbf{r}}{r^3}.$$

Fig. IV.4

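Substituting these values into the boundary conditions (10.4) and (10.5), we
find

$$\frac{\varepsilon_2 e_2 (\mathbf{r})_n}{r^3} = \frac{e\,(\mathbf{r})_n}{r^3} + \frac{\varepsilon_1 e_1 (\mathbf{r}_1)_n}{r_1^3},$$

$$\frac{e_2 (\mathbf{r})_{tg}}{r^3} = \frac{e\,(\mathbf{r})_{tg}}{\varepsilon_1 r^3} + \frac{e_1 (\mathbf{r}_1)_{tg}}{r_1^3}.$$

Here $(\mathbf{r})_n$, $(\mathbf{r})_{tg}$ and $(\mathbf{r}_1)_n$, $(\mathbf{r}_1)_{tg}$ are respectively the normal and
tangential projections of the vectors $\mathbf{r}$ and $\mathbf{r}_1$.

In the plane x = 0 the following obvious relations hold:

$$|\mathbf{r}| = |\mathbf{r}_1|, \qquad (\mathbf{r}_1)_{tg} = (\mathbf{r})_{tg}, \qquad (\mathbf{r}_1)_n = -(\mathbf{r})_n.$$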


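Hence we obtain two equations for the determination of the charges $e_1$ and
$e_2$:

$$e_2 = e_1 + \frac{e}{\varepsilon_1}, \qquad e - \varepsilon_1 e_1 = \varepsilon_2 e_2,$$

from which it follows that

$$e_2 = \frac{2e}{\varepsilon_1+\varepsilon_2}, \qquad e_1 = \frac{(\varepsilon_1-\varepsilon_2)\,e}{\varepsilon_1(\varepsilon_1+\varepsilon_2)}.$$

Thus, the field potential is expressed by the formulae

$$\varphi_1 = \frac{e}{\varepsilon_1 r} + \frac{(\varepsilon_1-\varepsilon_2)}{\varepsilon_1(\varepsilon_1+\varepsilon_2)}\,\frac{e}{r_1} \quad (x > 0),$$

$$\varphi_2 = \frac{2e}{(\varepsilon_1+\varepsilon_2)\,r} \quad (x < 0).$$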


The method of images is applied in an analogous way to obtain the solution 
of problems which are more complex from the geometrical point of view. 
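As a check on the two-dielectric result, one can verify numerically that the equivalent charges make the normal component of D and the tangential component of E continuous across x = 0. The sketch below is only an illustration: the numerical values of e, a and the dielectric constants are arbitrary, and the function names are ours.

```python
import math

def image_charges(e, eps1, eps2):
    """Equivalent charges e1 (at x = -a) and e2 (at x = +a)."""
    e1 = (eps1 - eps2) * e / (eps1 * (eps1 + eps2))
    e2 = 2.0 * e / (eps1 + eps2)
    return e1, e2

def fields_at_interface(e, a, eps1, eps2, y, z):
    """Fields on both sides of the plane x = 0 at the point (0, y, z)."""
    e1, e2 = image_charges(e, eps1, eps2)
    r = (-a, y, z)   # vector from the point x = +a to the point of observation
    r1 = (a, y, z)   # vector from the point x = -a to the point of observation
    d3 = (a * a + y * y + z * z)**1.5  # |r|^3 = |r1|^3 on the plane
    E1 = tuple(e * c / (eps1 * d3) + e1 * c1 / d3 for c, c1 in zip(r, r1))
    E2 = tuple(e2 * c / d3 for c in r)
    return E1, E2

eps1, eps2 = 2.0, 5.0
E1, E2 = fields_at_interface(1.0, 1.0, eps1, eps2, 0.3, -0.7)
jump_Dn = eps1 * E1[0] - eps2 * E2[0]          # normal component of D
jump_Etg = (E1[1] - E2[1], E1[2] - E2[2])      # tangential components of E
```

Both jumps vanish to rounding accuracy for any point of the interface, which is the content of conditions (10.4) and (10.5).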




Another useful method of solving electrostatic problems is the method of 
inversion. 

If $\varphi(r, \theta, \psi)$ is a potential satisfying the Laplace equation, then a simple
calculation shows that the function

$$\varphi_1(r, \theta, \psi) = -\frac{R}{r}\,\varphi\left(\frac{R^2}{r}, \theta, \psi\right)$$

also satisfies this equation. The relation

$$\varphi_1 + \varphi = 0$$

is satisfied on a sphere of radius r = R. From the geometrical point of view the
transformation $r \to R^2/r$ represents the mirror reflection in a sphere of radius
R.

As a simple example of the application of the method of inversion we give
the problem of the potential distribution in a system of a point charge and a
grounded sphere of radius R. We assume the potential of the sphere to be
equal to zero. If the charge $e_1$ is located at a distance $\rho_1$ from the centre of
the sphere, then the potential of the field produced by this charge and by the
charge induced on the sphere must be zero at the surface r = R. This condition
is satisfied by the potential outside the sphere, given by the formula

$$\varphi = \frac{e_1}{r_1} - \frac{e_2}{r_2},$$

if the unknown quantities, the charge $e_2$ and its position, are chosen
properly. Here $r_1$ and $r_2$ are the distances from the charges $e_1$ and $e_2$ to the
point of observation.

We assume that a fictitious charge is placed at a point a distance $\rho_2$ from
the centre of the sphere and that

$$\rho_1\rho_2 = R^2.$$

If the absolute value of this charge is equal to

$$e_2 = e_1 (\rho_2/\rho_1)^{1/2},$$

then the two charges produce a potential equal to zero at the surface of the
sphere. Thus,

$$\varphi = \frac{e_1}{r_1} - \frac{R}{\rho_1}\,\frac{e_1}{r_2}.$$
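The vanishing of the potential on the grounded sphere is easy to confirm numerically. The sketch below assumes that the fictitious charge lies on the line joining the centre of the sphere to the real charge, at the inverse point $\rho_2 = R^2/\rho_1$, with magnitude $e_1 R/\rho_1$; the numerical values and function name are illustrative.

```python
import math

def sphere_potential(e1, rho1, R, r, theta):
    """Potential of a charge e1 at distance rho1 from the centre of a grounded
    sphere of radius R, plus its image, at the point (r, theta); theta is
    measured from the line joining the centre to the charge."""
    rho2 = R * R / rho1   # position of the fictitious charge (inverse point)
    e2 = e1 * R / rho1    # magnitude of the fictitious charge
    r1 = math.sqrt(r * r + rho1 * rho1 - 2.0 * r * rho1 * math.cos(theta))
    r2 = math.sqrt(r * r + rho2 * rho2 - 2.0 * r * rho2 * math.cos(theta))
    return e1 / r1 - e2 / r2

R, rho1, e1 = 1.0, 2.5, 1.0
on_surface = [sphere_potential(e1, rho1, R, R, 0.1 * k) for k in range(32)]
far_away = sphere_potential(e1, rho1, R, 50.0, 1.0)
```

All entries of `on_surface` vanish to rounding accuracy, while the potential at points off the sphere does not.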





§ 11. The energy of a system of conductors 


We now pass on to the consideration of energy relations in the electrostatic 
field. The energy relations have their simplest form for conductors. 

Since the field does not penetrate inside conductors, their thermodynamic 
properties do not change. However, the field in the space surrounding a 
conductor depends on the presence of the conductor, its size and so on. 

We shall call the energy of the electromagnetic field surrounding a con- 
ductor the electromagnetic energy of the conductor. The total energy of a 
conductor is equal to the sum of its internal and electromagnetic energies. 

We shall only be interested below in the latter, and for brevity we shall 
speak simply of the energy of the conductor. 

We assume first that the conductors are in vacuum. The total energy of the
electrostatic field is

$$U = \frac{1}{8\pi}\int E^2\,dV.$$

We transform the integral, writing it in the form

$$U = -\frac{1}{8\pi}\int \mathbf{E}\cdot\nabla\varphi\,dV = -\frac{1}{8\pi}\int \nabla\cdot(\varphi\mathbf{E})\,dV + \frac{1}{8\pi}\int \varphi\,(\nabla\cdot\mathbf{E})\,dV = -\frac{1}{8\pi}\oint_{\infty} \varphi E_n\,dS + \frac{1}{8\pi}\sum_i \oint_{S_i} \varphi E_n\,dS.$$

The integration is carried out over all surfaces bounding the integration
volume. These are the external surface of radius $R \to \infty$ and the surfaces
of the conductors. Since the external normals of these surfaces are
oriented in opposite directions, different signs occur in front of the surface
integrals. We have also made use of the fact that in vacuum

$$\nabla\cdot\mathbf{E} = 0.$$

The integral over the infinitely distant surface is equal to zero
by virtue of (9.1). At metal surfaces $\varphi_{mi}$ = const, hence

$$U = \frac{1}{8\pi}\sum_i \varphi_{mi}\oint_{S_i} E_n\,dS_i = -\frac{1}{8\pi}\sum_i \varphi_{mi}\oint_{S_i}\left(\frac{\partial\varphi}{\partial n}\right) dS_i = \frac{1}{2}\sum_i e_i\varphi_{mi}.$$
l l 


The energy of a system of conductors is formally the same as the energy of a 
system of charges. 
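For a single charged conducting sphere in vacuum the two expressions for the energy can be compared directly. In the sketch below the field outside the sphere is taken as $e/r^2$ and the potential of the sphere as $e/R$ (Gaussian units); the integration in the variable ln r and the cut-off radius are our numerical choices.

```python
import math

def field_energy(e, R, r_max_factor=1.0e6, steps=20000):
    """(1/8 pi) * integral of E^2 dV outside a sphere of radius R,
    evaluated by the midpoint rule in the variable t = ln r."""
    t0 = math.log(R)
    t1 = math.log(R * r_max_factor)
    h = (t1 - t0) / steps
    total = 0.0
    for i in range(steps):
        r = math.exp(t0 + (i + 0.5) * h)
        E = e / r**2
        # volume element 4 pi r^2 dr with dr = r dt
        total += E * E / (8.0 * math.pi) * 4.0 * math.pi * r * r * r * h
    return total

def conductor_energy(e, R):
    """One half of the charge times the potential of the sphere."""
    phi = e / R
    return 0.5 * e * phi

U_field = field_energy(2.0, 0.5)
U_conductor = conductor_energy(2.0, 0.5)
```

The two numbers agree up to the small contribution of the region beyond the cut-off radius.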

We now assume that the space between the conductors is filled with a 
dielectric. Then two cases are, in principle, possible: 
(a) the conductors are insulated, so that the charges of all the conductors 
have a constant value, 
(b) the conductors are connected to devices which maintain their potentials 
constant. 

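In the first case, as is seen from formula (8.12), we have

$$e = -\frac{\varepsilon}{4\pi}\oint \frac{\partial\varphi'}{\partial n}\,dS,$$

so that

$$\varphi' = \varphi/\varepsilon.$$

Correspondingly

$$U' = \frac{1}{2}\sum_i e_i\varphi'_{mi} = \frac{U}{\varepsilon}.$$

The fields and energy are decreased by a factor of ε. The energy is used in
the work done in filling the space with the dielectric.

In the second case, after filling the interspace with the dielectric the
charge of each of the conductors will be equal to

$$e' = -\frac{\varepsilon}{4\pi}\oint \frac{\partial\varphi}{\partial n}\,dS = \varepsilon e,$$

and the energy to

$$U' = \varepsilon U.$$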




The increase in the energy and the work done on the dielectric arise from the 
devices which maintain the potentials of the conductors constant. 


§ 12. Dielectrics and conductors in an external electrostatic field 


If an uncharged conductor or dielectric is placed in an external field
$\mathbf{E}_0$, the configuration of the field near the body will change. This change
depends essentially on the form of the body and the character of the external
field, as is clearly seen from the examples which are discussed below.

We shall consider first the case of a dielectric. 

In what follows we shall assume the external field to be uniform over the 
extent of the body. We indicate by the index i all quantities which refer to 
the region of space inside the body. 

As the simplest examples we shall consider a long cylinder whose axis 
is oriented along the field and a thin parallel-sided plate situated perpendi- 
cularly to the field. 

In the first case the polarization of the dielectric will give rise to surface
charges at the bases of the cylinder. However, since the length of the cylinder
is large, these charges will produce only a weak field which does not distort
the field due to external sources inside the cylinder. This is seen from the
boundary condition (5.2) which gives

$$E_{tg}^{(i)} = E_{tg}^{(e)}$$

and, since E is, by symmetry, parallel to the axis of the cylinder,

$$\mathbf{E}^{(i)} = \mathbf{E}_0. \qquad (12.1)$$

In the second example, surface charges produce an appreciable weakening
of the field inside the plate. The boundary condition (5.7) gives

$$D^{(i)} = E_0$$

or

$$E^{(i)} = E_0 - 4\pi P. \qquad (12.2)$$

The field inside the plate is reduced by an amount 4πP. This inference, as
is shown by calculation, is preserved qualitatively for bodies of a more
complicated form.


As the next example we consider a dielectric sphere placed in a uniform
external field $\mathbf{E}_0$. The potential distribution will be determined from the
solution of Laplace’s equation

$$\nabla^2\varphi = 0,$$

which satisfies the requirement

$$\varphi = -\mathbf{E}_0\cdot\mathbf{r} \qquad (12.3)$$

at a large distance from the sphere ($r \gg R$) and satisfies the boundary
conditions (8.6) and (8.7) at its surface. We seek the solution in the form

$$\varphi^{(e)} = \varphi_0 + \varphi_1^{(e)}(r, \theta, \mathbf{E}_0),$$

where $\varphi_0 = -\mathbf{E}_0\cdot\mathbf{r}$, and $\varphi_1^{(e)}$ represents the change in the potential near
the sphere.

The function $\varphi_1^{(e)}$ must decrease for $r \gg R$. Since $\varphi_1^{(e)}$ depends only on
the vector $\mathbf{E}_0$ and on the scalars r and θ and must itself be a scalar, one can
write it in the form of a combination of the vectors $\mathbf{E}_0$ and $\mathbf{r}$ or $\mathbf{E}_0$ and the
gradient of $r^{-1}$. In the first case $\varphi_1^{(e)}$ would not satisfy the condition of
decrease at infinity, so that the only possible form of $\varphi_1^{(e)}$ is the following:

$$\varphi_1^{(e)} = \alpha\,\mathbf{E}_0\cdot(\nabla r^{-1}),$$

where α is a constant. Hence

$$\varphi^{(e)} = -\mathbf{E}_0\cdot\mathbf{r} + \alpha\,\mathbf{E}_0\cdot(\nabla r^{-1}). \qquad (12.4)$$


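On the contrary, inside the sphere the potential $\varphi^{(i)}$ at all points must
remain finite and hence must be written in the form

$$\varphi^{(i)} = \beta\,(\mathbf{E}_0\cdot\mathbf{r}). \qquad (12.5)$$

It is obvious that both (12.4) and (12.5) satisfy Laplace’s equation, and that
$\varphi^{(e)}$ behaves at infinity in accordance with (12.3).

To find the two unknown quantities α and β one can make use of two
boundary conditions:

$$\varphi^{(i)} = \varphi^{(e)}, \qquad \varepsilon^{(i)}\frac{\partial\varphi^{(i)}}{\partial r} = \varepsilon^{(e)}\frac{\partial\varphi^{(e)}}{\partial r} \qquad \text{for } r = R.$$

Elementary calculations, which are most conveniently carried out in spherical
coordinates, give

$$\beta = -\frac{3\varepsilon^{(e)}}{\varepsilon^{(i)} + 2\varepsilon^{(e)}}, \qquad \alpha = -\frac{\varepsilon^{(i)} - \varepsilon^{(e)}}{\varepsilon^{(i)} + 2\varepsilon^{(e)}}\,R^3,$$

so that finally

$$\varphi^{(i)} = -\frac{3\varepsilon^{(e)}}{\varepsilon^{(i)} + 2\varepsilon^{(e)}}\,(\mathbf{E}_0\cdot\mathbf{r}), \qquad (12.6)$$

$$\varphi^{(e)} = -(\mathbf{E}_0\cdot\mathbf{r}) + \frac{(\mathbf{P}\cdot\mathbf{r})\,V}{r^3}, \qquad (12.7)$$

where

$$\mathbf{P} = \frac{3}{4\pi}\,\frac{\varepsilon^{(i)} - \varepsilon^{(e)}}{\varepsilon^{(i)} + 2\varepsilon^{(e)}}\,\mathbf{E}_0, \qquad V = \frac{4\pi}{3}R^3. \qquad (12.8)$$

In particular, in vacuum $\varepsilon^{(e)} = 1$ and

$$\mathbf{P} = \frac{3}{4\pi}\,\frac{\varepsilon^{(i)} - 1}{\varepsilon^{(i)} + 2}\,\mathbf{E}_0 = \frac{3\kappa}{4\pi\kappa + 3}\,\mathbf{E}_0. \qquad (12.9)$$

If the susceptibility κ of the body is small, so that $\varepsilon^{(i)} \approx 1$, then

$$\mathbf{P} \approx \kappa\mathbf{E}_0.$$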


Hence the field strength outside and inside the sphere is

$$\mathbf{E}^{(e)} = -\nabla\varphi^{(e)} = \mathbf{E}_0 + \left[\frac{3(\mathbf{P}\cdot\mathbf{r})\,\mathbf{r}}{r^5} - \frac{\mathbf{P}}{r^3}\right] V \quad (r > R); \qquad (12.10)$$

$$\mathbf{E}^{(i)} = -\nabla\varphi^{(i)} = -\frac{4\pi}{3}\mathbf{P} + \mathbf{E}_0 \quad (r < R). \qquad (12.11)$$

In calculating the gradient according to (1.47) we have taken into account the
constancy of the vector P.

From formulae (12.6) to (12.11) it follows that
(1) in a uniform external field a sphere becomes polarized, acquires a dipole
moment PV, and produces an additional field identical with the field of a
dipole placed at the centre of the sphere;

(2) inside the sphere the field has a constant value and is weakened by a
factor of $3\varepsilon^{(e)}/(\varepsilon^{(i)} + 2\varepsilon^{(e)})$ in comparison with the external field.
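The coefficients of the sphere problem can be checked against the two boundary conditions at r = R. The sketch below writes the outer potential as $-E_0 r\cos\theta - \alpha E_0\cos\theta/r^2$ and the inner one as $\beta E_0 r\cos\theta$, and evaluates the radial derivatives by central differences; the numerical values and function names are illustrative assumptions.

```python
import math

def sphere_coefficients(eps_i, eps_e, R):
    """Constants alpha and beta of the outer and inner potentials."""
    alpha = -(eps_i - eps_e) / (eps_i + 2.0 * eps_e) * R**3
    beta = -3.0 * eps_e / (eps_i + 2.0 * eps_e)
    return alpha, beta

def phi_outside(r, theta, E0, alpha):
    # -E0.r + alpha * E0.grad(1/r), written in spherical coordinates
    return -E0 * r * math.cos(theta) - alpha * E0 * math.cos(theta) / r**2

def phi_inside(r, theta, E0, beta):
    return beta * E0 * r * math.cos(theta)

def boundary_residuals(eps_i, eps_e, R, E0, theta, h=1.0e-5):
    """Jumps of the potential and of eps * d(phi)/dr across r = R."""
    alpha, beta = sphere_coefficients(eps_i, eps_e, R)
    jump_phi = phi_inside(R, theta, E0, beta) - phi_outside(R, theta, E0, alpha)
    d_in = (phi_inside(R + h, theta, E0, beta)
            - phi_inside(R - h, theta, E0, beta)) / (2.0 * h)
    d_out = (phi_outside(R + h, theta, E0, alpha)
             - phi_outside(R - h, theta, E0, alpha)) / (2.0 * h)
    return jump_phi, eps_i * d_in - eps_e * d_out

jump_phi, jump_D = boundary_residuals(4.0, 1.0, 1.0, 1.0, 0.7)
inner_factor = -sphere_coefficients(4.0, 1.0, 1.0)[1]  # E_inside / E0
```

For a dielectric constant of 4 inside and 1 outside, the inner field is one half of the external field, in agreement with the factor quoted in item (2).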

Let us now consider an uncharged metal sphere placed in a uniform
external field $\mathbf{E}_0$. The field outside the sphere will be found from the
solution of Laplace’s equation satisfying the condition (12.3) at infinity and
the condition

$$\varphi^{(e)} = \varphi_m = 0 \qquad (12.12)$$

at the surface of the metal (the potential $\varphi_m$ can be assumed to be zero).

One can, however, simply assume that in the formulae for $\varphi^{(e)}$, $\mathbf{E}^{(e)}$ and $\mathbf{P}$,
which have been obtained above for a dielectric sphere, the dielectric
permittivity inside the sphere $\varepsilon^{(i)}$ tends to infinity. One then obtains
immediately the expression

$$\varphi^{(e)} = -E_0 r\cos\theta + \frac{R^3}{r^2}E_0\cos\theta = -(\mathbf{E}_0\cdot\mathbf{r}) + \frac{(\mathbf{P}\cdot\mathbf{r})\,V}{r^3},$$

where it is assumed that

$$\mathbf{P} = \frac{3}{4\pi}\mathbf{E}_0.$$

The field inside the sphere is

$$\mathbf{E}^{(i)} = 0.$$

The surface charge density at the surface of the sphere is equal to

$$\omega_s = \frac{1}{4\pi}E_n^{(e)}\Big|_{r=R} = \frac{3}{4\pi}E_0\cos\theta.$$

The total charge of the uncharged sphere obviously remains equal to zero.
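A short numerical check of the last statement: the sketch below assumes the induced surface density $3E_0\cos\theta/4\pi$, which follows from the normal field at r = R of the potential given above for the metal sphere, and integrates it over the sphere. The names `sigma` and `sphere_moments` are ours.

```python
import math

def sphere_moments(E0, R, steps=2000):
    """Total induced charge and dipole moment of a metal sphere in a field E0,
    from the assumed surface density 3 E0 cos(theta) / (4 pi)."""
    h = math.pi / steps
    charge = 0.0
    dipole = 0.0
    for i in range(steps):
        theta = (i + 0.5) * h
        sigma = 3.0 * E0 * math.cos(theta) / (4.0 * math.pi)
        dS = 2.0 * math.pi * R * R * math.sin(theta) * h  # ring of area on the sphere
        charge += sigma * dS
        dipole += sigma * R * math.cos(theta) * dS
    return charge, dipole

E0, R = 2.0, 1.5
total_charge, dipole_moment = sphere_moments(E0, R)
expected_dipole = 3.0 * E0 / (4.0 * math.pi) * (4.0 * math.pi / 3.0) * R**3  # P * V
```

The total charge integrates to zero, and the dipole moment of the induced charges coincides with PV for $P = 3E_0/4\pi$.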


§ 13. The thermodynamic potentials of a dielectric and the dielectric
susceptibility


We have seen in the preceding section that the field inside a dielectric is 
essentially different from the external field, and that the value of the field 
strength inside the dielectric depends on its shape. 

Let a dielectric be placed in an external electric field of strength $\mathbf{E}_0$. For
simplicity we shall restrict ourselves to the case of a uniform and isotropic
dielectric. From the statistical point of view a dielectric can be considered as
a quasi-closed subsystem placed in an external field of force.

The energy of such a subsystem will differ from the energy of the same 
subsystem without a field. The field strength is an external parameter with 
which the energy levels of the system change. 

According to the results of §22 of Part III, the energy of the ith level
changes with the change of the parameter according to formula (22.1). In
our case the external parameter is the vector $\mathbf{E}_0$. Hence its change by an
amount $\delta\mathbf{E}_0$ leads to a change in the energy of the ith level by an amount

$$\delta\varepsilon_i = -\mathcal{P}_i\cdot\delta\mathbf{E}_0, \qquad (13.1)$$

where $\mathcal{P}_i$ represents the generalized force corresponding to the external
parameter $\mathbf{E}_0$ *:

$$\mathcal{P}_i = -\partial\varepsilon_i/\partial\mathbf{E}_0.$$

Since we shall be interested in mean values, then, averaging (13.1), we have

$$\delta U = \overline{\delta\varepsilon_i} = -\overline{\mathcal{P}_i}\cdot\delta\mathbf{E}_0 = -\mathcal{P}\cdot\delta\mathbf{E}_0, \qquad (13.2)$$

where

$$\mathcal{P} = \overline{\mathcal{P}_i} = -\partial U/\partial\mathbf{E}_0 \qquad (13.3)$$

and U is the mean energy of the system in the external field (for convenience
we do not write the averaging bar and simply drop the index i).

* The derivative with respect to the vector has the usual meaning of the abbreviated
notation of the relations for vector components.

Comparing (13.2) with (17.8) of Part I, it is easily seen that $\mathcal{P}$ represents
the mean dipole moment of the entire dielectric.

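From the definition of the mean, we find

$$\mathcal{P} = \frac{\sum_i \mathcal{P}_i\,e^{-\varepsilon_i/kT}}{\sum_i e^{-\varepsilon_i/kT}} = -\frac{\sum_i (\partial\varepsilon_i/\partial\mathbf{E}_0)\,e^{-\varepsilon_i/kT}}{\sum_i e^{-\varepsilon_i/kT}} = kT\,\frac{\partial}{\partial\mathbf{E}_0}\ln\sum_i e^{-\varepsilon_i/kT} = kT\,\frac{\partial\ln Z}{\partial\mathbf{E}_0} = -\frac{\partial F}{\partial\mathbf{E}_0}, \qquad (13.4)$$

where F is the free energy.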

The derivatives with respect to the projections of the external parameter
$\mathbf{E}_0$ are taken for constant values of the temperature and all other external
parameters characterizing the state of the system. Formula (13.4) shows that
the mean dipole moment $\mathcal{P}$ of the body and its polarization $\mathbf{P} = \mathcal{P}/V$ can be
calculated if the partition function of the system and its dependence on the
strength of the external field are known. On the basis of (13.4) we can
establish the relation between the change in the free energy with the field, dF,
and without the field, $dF_0$, in the form

$$dF = dF_0 - \mathcal{P}\cdot d\mathbf{E}_0. \qquad (13.5)$$

Integrating (13.5), we find

$$F = F_0 - \int_0^{E_0} \mathcal{P}\cdot d\mathbf{E}_0 = F_0 - V\int_0^{E_0} \mathbf{P}\cdot d\mathbf{E}_0 = F_0 - \tfrac{1}{2}(\mathbf{P}\cdot\mathbf{E}_0)\,V. \qquad (13.6)$$

We have assumed that in a uniform field and uniform dielectric medium
the polarization vector $\mathbf{P} \propto \mathbf{E}_0$. This always holds for uniformly polarized
bodies (in particular, bodies of ellipsoidal form).

The polarization P of the body is connected with the field E by the
relation (3.7). Generally speaking, the field E differs from the external field
$\mathbf{E}_0$, since the dielectric distorts the external field in the space surrounding
it. If, however, its dielectric constant ε is close to unity, then $\mathbf{E}_0 \approx \mathbf{E}$ and
$\mathbf{P} = \kappa\mathbf{E} \approx \kappa\mathbf{E}_0$, and

$$dF = dF_0 - V\,\mathbf{P}\cdot d\mathbf{E} \qquad (13.7)$$

or

$$F = F_0 - \tfrac{1}{2}\kappa V E^2 = F_0 - \tfrac{1}{2}(\mathbf{P}\cdot\mathbf{E})\,V. \qquad (13.8)$$

As well as the free energy F of a dielectric in an external field one often
considers the total free energy of the dielectric allowing for its potential
energy in the external field

$$dF' = dF_0 + V\,\mathbf{E}_0\cdot d\mathbf{P} = dF_0 + V\,\mathbf{E}\cdot d\mathbf{P}, \qquad (13.9)$$

$$F' = F + V\,\mathbf{P}\cdot\mathbf{E}. \qquad (13.10)$$

Sometimes the term free energy is used for the quantity

$$F'' = F' + \frac{E^2 V}{8\pi} = F + \frac{E^2 V}{8\pi} + V\,\mathbf{P}\cdot\mathbf{E}, \qquad (13.11)$$

which represents the free energy of the dielectric (allowing for its potential
energy in the external field) plus the energy of the external electric field in
the volume occupied by the dielectric.

Making use of (13.8), it can easily be shown that

$$dF'' = dF_0 + V\,\mathbf{E}\cdot d\mathbf{P} + \frac{V}{4\pi}\,\mathbf{E}\cdot d\mathbf{E} = dF_0 + V\,\mathbf{E}\cdot\left(d\mathbf{P} + \frac{d\mathbf{E}}{4\pi}\right) = dF_0 + \frac{V}{4\pi}\,\mathbf{E}\cdot d\mathbf{D}. \qquad (13.12)$$

Thus, F'' represents the energy of the dielectric without field plus the energy
of the field in the volume occupied by the dielectric.
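It follows from (13.12) that

$$E_i = \frac{4\pi}{V}\,\frac{\partial F''}{\partial D_i}, \qquad (13.13a)$$

$$\frac{\partial E_i}{\partial D_k} = \frac{4\pi}{V}\,\frac{\partial^2 F''}{\partial D_i\,\partial D_k}. \qquad (13.13b)$$

In an anisotropic medium the relation between D and E is given by formula
(6.1) which we shall write in the form

$$E_i = \varepsilon^{-1}_{ik} D_k,$$

where $\varepsilon^{-1}_{ik}$ is the tensor inverse to $\varepsilon_{ik}$. Then (13.13) gives

$$\frac{V}{4\pi}\,\varepsilon^{-1}_{ik} = \frac{\partial^2 F''}{\partial D_i\,\partial D_k} = \frac{\partial^2 F''}{\partial D_k\,\partial D_i} = \frac{V}{4\pi}\,\varepsilon^{-1}_{ki}.$$

Thus $\varepsilon_{ik}$ is a symmetric tensor.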
For practical purposes it is convenient to use the thermodynamic potential
instead of the free energy. Then instead of (13.8) we obtain

$$G = G_0 - \tfrac{1}{2}(\mathbf{P}\cdot\mathbf{E}_0)\,V. \qquad (13.14)$$

The volume of the dielectric in the field is

$$V = \left(\frac{\partial G}{\partial p}\right)_T = V_0 - \frac{1}{2}\left[\frac{\partial(\mathbf{P}\cdot\mathbf{E}_0 V)}{\partial p}\right]_T.$$

The change in the volume of the dielectric when the external field is
applied isothermally is called electrostriction. As can be seen from the last
formula, the sign and magnitude of the effect depend on the polarizability
of the body as well as on its compressibility.

When the field is applied, a change in the entropy of the dielectric occurs
in addition to the change in the volume. Namely:

$$S = -\left(\frac{\partial G}{\partial T}\right)_p = S_0 + \frac{1}{2}\left[\frac{\partial(\mathbf{P}\cdot\mathbf{E}_0 V)}{\partial T}\right]_p.$$

The change in the entropy when the field is applied isothermally is
accompanied by the release of heat Q = TΔS (the so-called electro-caloric
effect at constant pressure). When the field is applied adiabatically, a change
in the temperature of the dielectric occurs.

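As an example we shall find the electrostriction and electro-caloric effect
for a sphere in a uniform field.

According to (12.9) we have

$$G = G_0 - \tfrac{1}{2}(\mathbf{P}\cdot\mathbf{E}_0)\,V = G_0 - \frac{3}{8\pi}\,\frac{\varepsilon^{(i)} - 1}{\varepsilon^{(i)} + 2}\,V E_0^2,$$

whence

$$V = \left(\frac{\partial G}{\partial p}\right)_T = V_0 - \frac{3E_0^2}{8\pi}\,\frac{\varepsilon^{(i)} - 1}{\varepsilon^{(i)} + 2}\left(\frac{\partial V}{\partial p}\right)_T - \frac{9E_0^2 V}{8\pi}\,\frac{1}{(\varepsilon^{(i)} + 2)^2}\left(\frac{\partial\varepsilon^{(i)}}{\partial p}\right)_T. \qquad (13.16)$$

Analogously

$$Q = T\Delta S = \frac{3TE_0^2}{8\pi}\,\frac{\varepsilon^{(i)} - 1}{\varepsilon^{(i)} + 2}\left(\frac{\partial V}{\partial T}\right)_p + \frac{9TE_0^2 V}{8\pi}\,\frac{1}{(\varepsilon^{(i)} + 2)^2}\left(\frac{\partial\varepsilon^{(i)}}{\partial T}\right)_p. \qquad (13.17)$$

It is easily shown that if $\varepsilon^{(i)}$ decreases with increasing temperature (as is
the case for most substances), Q < 0, i.e. as the polarization increases heat is
released.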

The next problem which naturally arises is the calculation of the electric
susceptibility κ of the dielectric.

In contrast to the preceding relations which have a very general character 
and do not require the knowledge of the actual form of the partition function 
Z, to find the electric susceptibility it is necessary to obtain the partition 
function for the body. 

Let us first consider the electric susceptibility of ideal gases. To find the 
value of the susceptibility per molecule it is necessary to find the correspond- 
ing partition function of the molecule in the electric field. One has to dis- 
tinguish between two cases: 

(a) the molecules possess an intrinsic constant dipole moment, 
(b) the molecules possess no intrinsic dipole moment. 
The dipole moment is defined by the expression

$$\mathbf{d} = \sum_i e_i\mathbf{r}_i,$$

where the summation is carried out over all charges in the molecule. In atoms
and molecules of symmetric form positive and negative charges are distributed
symmetrically. Because of this the summation over the negative and positive
charges gives zero. All atoms and such symmetric molecules as H₂, O₂, CH₄
etc. have no intrinsic dipole moment. On the contrary, for strongly asymmetric
molecules, for example molecules made up of two different ions, such
as HCl, HBr and so on, or molecules having an asymmetric form, such as
CH₃Cl, CH₃COOH, H₂O, the dipole moment is different from zero. Such
molecules are called polar molecules.

Let us first consider the properties of a gas with polar molecules having
an intrinsic moment $\mathbf{d}_0$. If $\varepsilon^{(0)}$ is the energy of the molecule in the absence
of a field, then, when it is placed in an electric field of strength E, its energy
will be equal to

$$\varepsilon = \varepsilon^{(0)} - \mathbf{d}_0\cdot\mathbf{E} = \varepsilon^{(0)} - d_0|E|\cos\theta, \qquad (13.18)$$

where θ is the angle between the direction of the applied field and the axis of
the molecule.

From formula (13.18) it follows that the change in the energy of a polar
molecule in an electric field leads to the appearance of a potential energy
equal to $(-d_0|E|\cos\theta)$ for the molecule. This energy in a uniform electric
field does not depend on the position of the molecule and is determined solely
by its orientation. The potential energy has a minimum value for a molecule
oriented along the field, and a maximum value for a molecule oriented in the
opposite direction. In the absence of a field, molecules are oriented
completely at random; all orientations are equally probable. The electric field
has an orienting effect and tends to arrange all the molecular dipoles along the
field, where their potential energy is a minimum. The orientation of molecules
along the field becomes more probable than against the field. The
extent to which the field manages to orient all the dipoles is determined by
the ratio of the energy $\mathbf{d}_0\cdot\mathbf{E}$ acquired by the dipole molecules in the electric
field to their thermal energy kT. If the latter is large, i.e. if $kT \gg \mathbf{d}_0\cdot\mathbf{E}$,
then the orienting effect of the field is relatively weak. On the contrary, if
$kT \ll \mathbf{d}_0\cdot\mathbf{E}$ all the dipoles will be oriented along the field.

The values of the dipole moments of the molecules of certain gases are
given in table 1. In order of magnitude they are equal to the product of the
charge of the electron and the size of the molecule.


Table 1

Gas                            HCl   HBr   H₂O   SO₂   CO₂   CO    CHCl₃  CH₂Cl₂  CH₃Cl
Dipole moment (10⁻¹⁸ esu·cm)   1.03  0.79  1.84  1.61  0.00  0.12  0.95   1.59    1.89





By means of the data of table 1 the order of magnitude of $\mathbf{d}_0\cdot\mathbf{E}/kT$ can
be estimated. Because of the smallness of the dipole moment $d_0$ it turns out
that this quantity is very small for all temperatures at which the gases are
still not condensed and in all practically attainable fields. In order that
$\mathbf{d}_0\cdot\mathbf{E}$ may be of the order of magnitude of kT it is necessary that |E| be of
the order of magnitude of $kT/d_0 \approx 10^4\,T$ V/cm. For T = 300 K we obtain
$|E| \approx 3 \times 10^6$ V/cm, which is obviously impractical. Thus, the orienting
effect of the field is weak. Nevertheless, the appearance of a preferential
orientation of molecular dipoles gives rise to a mean dipole moment $\mathcal{P}$ of
the entire gas different from zero.
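The smallness of the orientation parameter can be illustrated with round numbers. The sketch below assumes a representative dipole moment of 10⁻¹⁸ esu·cm, the Boltzmann constant in erg/K, and the conversion 1 statvolt ≈ 299.79 V; the chosen field of 10⁴ V/cm is only an example of a strong laboratory field.

```python
K_BOLTZMANN = 1.380649e-16       # erg/K
STATVOLT_IN_VOLTS = 299.792458   # 1 statvolt expressed in volts

def orientation_ratio(d0, field_v_per_cm, temperature):
    """Dimensionless ratio d0*|E|/(kT) for a field given in V/cm."""
    field_esu = field_v_per_cm / STATVOLT_IN_VOLTS  # statvolt/cm
    return d0 * field_esu / (K_BOLTZMANN * temperature)

def field_for_unit_ratio(d0, temperature):
    """Field in V/cm at which d0*|E| becomes equal to kT."""
    return K_BOLTZMANN * temperature / d0 * STATVOLT_IN_VOLTS

d0 = 1.0e-18  # esu*cm, assumed representative value
ratio = orientation_ratio(d0, 1.0e4, 300.0)
needed_field = field_for_unit_ratio(d0, 300.0)
```

With these assumptions the ratio is below 10⁻³ at room temperature, and the field needed to make it of order unity comes out between 10⁶ and 10⁸ V/cm, comparable in order of magnitude with the estimate quoted above.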

We shall calculate the mean dipole moment of the gas according to formula
(13.4). In this case, since the ratio $(\mathbf{d}_0\cdot\mathbf{E})/kT$ is very small, the temperature
can practically always be considered as high and the summation can be
replaced by integration. The partition function of a gas of dipole molecules in
an electric field has the form

$$Z = Z_0\left(\int_0^{\pi} \exp\left(\frac{d_0|E|\cos\theta}{kT}\right)\sin\theta\,d\theta \int_0^{2\pi} d\psi\right)^N. \qquad (13.19)$$

Here N is the number of molecules in the gas, and $Z_0$ denotes the partition
function of the gas in the absence of a field. The integration is carried out
with respect to all states of the molecule, which are determined by the
angles θ, ψ. Since $d_0|E|/kT \ll 1$, we have

$$\int_0^{\pi} \exp\left(\frac{d_0|E|\cos\theta}{kT}\right)\sin\theta\,d\theta \approx \int_0^{\pi}\left[1 + \frac{d_0|E|}{kT}\cos\theta + \frac{1}{2}\left(\frac{d_0|E|}{kT}\right)^2\cos^2\theta\right]\sin\theta\,d\theta = 2\left[1 + \frac{1}{6}\left(\frac{d_0|E|}{kT}\right)^2\right],$$

so that

$$Z = Z_0\,(4\pi)^N\left[1 + \frac{1}{6}\left(\frac{d_0|E|}{kT}\right)^2\right]^N.$$

On taking the logarithm we have

$$\ln Z \approx \ln Z_0 + N\ln 4\pi + \frac{N}{6}\left(\frac{d_0|E|}{kT}\right)^2. \qquad (13.20)$$

Substituting (13.20) into (13.4), we find

$$\mathcal{P} = \frac{N d_0^2\,|E|}{3kT}. \qquad (13.21)$$


This formula has a very simple meaning. If it is written in the form of the
product of two factors

$$\mathcal{P} = \frac{d_0|E|}{3kT}\cdot N d_0,$$

then it is clear that the first factor characterizes the degree of orientation of
the molecular dipoles, which is higher the stronger the applied field |E|, and
lower the stronger the disorienting effect of the thermal motion which is
characterized by the value of kT. The second factor is the total dipole
moment of the molecules in the gas. If all the dipoles were oriented along the
field, the resulting polarization of the gas would be equal to $d_0 N$. In the
presence of thermal motion the polarization represents only a small fraction
of this limiting polarization. One can picture the smallness of the orientation
factor $d_0|E|/kT$ in an obvious way as follows. Most of the polar molecules
are moving and, in particular, are rotating in the gas with an energy which is
considerably larger than the potential energy in the field $d_0 E$. Hence the
applied field cannot stop their rotation. As the molecules are rotating the
dipole moment is averaged over all directions, and the mean contribution of
a rotating molecule to the total dipole moment of the gas turns out to be
equal to zero. However, there is in the gas a small percentage of molecules
for which the kinetic energy of rotary motion is smaller than $\mathbf{d}_0\cdot\mathbf{E}$. Such
molecules cannot be oriented against the field, and in the field their rotary
motion is replaced by torsional oscillations about the direction of the field.
It is only these molecules which contribute to the mean dipole moment of the
gas.
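The weak-field result (13.21) can be compared with the mean orientation obtained without expanding the exponential. Writing $x = d_0|E|/kT$, the exact mean value of cos θ over the Boltzmann distribution is coth x − 1/x, which reduces to x/3 for small x. The sketch below checks this by direct numerical integration over θ; the function names and test values of x are our own choices.

```python
import math

def mean_cos_exact(x):
    """Closed-form mean of cos(theta) with weight exp(x cos(theta)) sin(theta)."""
    return 1.0 / math.tanh(x) - 1.0 / x

def mean_cos_numeric(x, steps=20000):
    """The same mean obtained by the midpoint rule over 0 <= theta <= pi."""
    h = math.pi / steps
    num = 0.0
    den = 0.0
    for i in range(steps):
        t = (i + 0.5) * h
        w = math.exp(x * math.cos(t)) * math.sin(t)
        num += math.cos(t) * w
        den += w
    return num / den

x_small = 0.01
weak_field = x_small / 3.0               # orientation factor implied by (13.21)
exact_small = mean_cos_exact(x_small)
numeric_small = mean_cos_numeric(x_small)
exact_large = mean_cos_exact(5.0)        # approaches 1 when d0|E| >> kT
```

For small x the three values coincide to high accuracy, while for x of several units the orientation approaches saturation, in line with the qualitative discussion above.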

From (13.21) one can find the electric susceptibility of a polar gas related
to unit volume:

$$\kappa = \frac{\mathcal{P}}{|E|\,V} = \frac{N d_0^2}{3VkT}. \qquad (13.22)$$

Formula (13.22) is called the Langevin formula, since it was first obtained by
Langevin.


The electric susceptibility of the gas is connected with the directly
measured dielectric constant by the relation (4.14):

$$\varepsilon = 1 + 4\pi\kappa.$$

From the measurements of ε as a function of the temperature one can find
the dipole moment $d_0$ of the molecules. The values of $d_0$ found in such a
way are given in table 1. The numerical value of κ is small in comparison with
unity, so that in gases ε is always of the order of unity.
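A rough numerical estimate shows how small the susceptibility of a polar gas is. The sketch below applies (13.22) with assumed inputs: a representative dipole moment of 10⁻¹⁸ esu·cm and the number density of an ideal gas at 0 °C and atmospheric pressure (about 2.69 × 10¹⁹ cm⁻³). Only the orientational contribution is included.

```python
import math

K_BOLTZMANN = 1.380649e-16  # erg/K
LOSCHMIDT = 2.6868e19       # molecules per cm^3 at 273.15 K and 1 atm

def polar_susceptibility(number_density, d0, temperature):
    """Orientational susceptibility per unit volume, N d0^2 / (3 V k T)."""
    return number_density * d0**2 / (3.0 * K_BOLTZMANN * temperature)

kappa = polar_susceptibility(LOSCHMIDT, 1.0e-18, 273.15)
eps_minus_one = 4.0 * math.pi * kappa  # from eps = 1 + 4 pi kappa
```

Under these assumptions κ is of the order of 10⁻⁴ and ε differs from unity by a few thousandths.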

The dipole moment is an important characteristic of molecules. In
particular, it allows one to estimate the structure and geometrical form of the
molecules. The more asymmetric the molecule, the larger is its dipole
moment. This is seen, in particular, in the example of the molecules CCl₄,
CHCl₃, CH₂Cl₂, CH₃Cl and CH₄, whose susceptibilities are shown in fig.
IV.5. The value of $d_0$ is determined from the slope of the corresponding
straight line.

Carbon tetrachloride and methane are symmetric molecules having the
form of a tetrahedron; their dipole moments are nearly equal to zero. The
slope of the corresponding straight line is connected with the induced dipole
moment. For the chlorine derivatives of methane, which represent asymmetric
molecules, the dipole moments differ from zero. The largest value
occurs for the asymmetric molecule CH₃Cl.

Fig. IV.5. Susceptibilities of CCl₄, CHCl₃, CH₂Cl₂, CH₃Cl and CH₄ plotted
against 1/T.

The behaviour of molecules which possess no intrinsic dipole moment
is of less interest. Under the action of an applied electric field they become
polarized. The electron shells of the atoms or molecules are displaced with
respect to the nuclei, and an induced dipole moment arises in them. The
value of the induced dipole moment is proportional to the strength of the
applied field, so that

$$\mathbf{d} = \alpha\mathbf{E}$$

and the energy is

$$\varepsilon = \varepsilon^{(0)} - \tfrac{1}{2}\alpha|E|^2. \qquad (13.23)$$

Hence

$$Z = Z_0\exp\left(\frac{N\alpha|E|^2}{2kT}\right)$$

and

$$\mathcal{P} = N\alpha|E|. \qquad (13.24)$$

The polarization turns out to be independent of the temperature. In other 
words, the thermal motion has no effect on the polarization of the electron 
shells of atoms and molecules. 

The theory given here is valid only for very rarefied gases. For more dense 
gases and particularly for liquids the interaction between molecules plays an 
essential role in their electric properties. In such systems one can no longer 
consider the orienting effect of the field on an individual dipole. The orienta- 
tion of the dipoles will be determined not only by the external but also by 
the internal electric field which is produced by all the polar molecules. 

The theory of polar liquids turns out to be very complex, and we cannot 
develop it here. We note only that in this field things are not as clear as in the 
case of rarefied gases. 

The dielectric polarization of most solid dielectrics is due to the induced
polarization of the molecules of the dielectric. Even in crystals made up of
polar molecules no orientation polarization occurs, since the molecules in the
crystal interact strongly with their neighbours and the electric field is too
weak to overcome the forces of interaction and to turn the molecules. One
of a few typical exceptions is solid HCl. The measurements of I.V.
Kurchatov have shown that the molecules of HCl are turned under the action
of an electric field and an orientation polarization takes place in the crystal.
The values of the susceptibility and dielectric constant for solid dielectrics are
much larger than for gases. The dielectric constants of solid dielectrics have
values several times unity, and in a number of cases reach as much as 100.

In view of the large value of the dielectric constant in a dense medium it
is necessary to take into account that a molecule in the medium is acted
upon not by the external field $\mathbf{E}_0$ but by the effective internal field made
up of the external field and the field which arises in the medium because of
the polarization of its molecules.

The dielectric constant of solid dielectrics varies relatively weakly with the 
temperature. This means that the dielectric properties of solid bodies are 
associated with the change of the charge distribution inside the molecules 
but do not depend on their thermal motion. 





Direct Electric Current and the Magnetic Properties of Matter


§ 14. Ohm’s law


Having considered the basic properties of the electrostatic field of charges 
at rest, we can pass on to the study of the more complex case of the electro- 
magnetic field arising when there is a steady motion of free charges, i.e. in the 
presence of a current in conductors constant in time. This problem has two 
aspects which are to a considerable degree independent: 

(1) to find the electromagnetic field of direct currents, 
(2) the consideration of the mechanism of passage of the current in different 
media, i.e. the mechanism of electrical conductivity. 

Without going into a study of the mechanism of electrical conductivity,
we shall restrict ourselves to the assumption that the current density in a
homogeneous conductor is related to the field strength by Ohm’s law
($\mathbf{j} = \sigma\mathbf{E}$).

The value of the electrical conductivity σ is closely associated with the
mechanism of the passage of the current and varies over a very wide range for
different conductors.

In Part V we shall consider the microscopic meaning of the electrical 
conductivity and shall estimate its value for some important conductors. 

We shall further assume that the constancy of the electric current in time
is maintained by devices called current sources. Examples of such sources are
constant-current generators of various types, i.e. galvanic cells, accumulators,
thermocouples etc. Indeed, it is physically obvious that no combination of
charged or neutral conductors can ensure the passage of a constant current
in a system. By bringing conductors at different potentials into contact we
can cause a transient motion of free charges which will continue until the
potentials of all the conductors become the same.

The sources of a constant electric current must always have a non-elec- 
trostatic character. They can, for example, have an electrochemical character. 
In what follows we shall not go into details of the current sources. For us, 
only the fact that the current sources ensure the maintenance of a constant
current is of importance. 

Formally, without considering the mechanism of the current sources, we 
can include them in the composition of the system of conductors considered 
by changing the form of Ohm’s law. That is, observing that the current 
sources produce a current in the conductors independent of the direct action 
of the electric field, we write Ohm’s law in the generalized form 


$$\mathbf{j} = \sigma(\mathbf{E} + \mathbf{E}^{\mathrm{imp}}). \qquad (14.1)$$


The vector $\mathbf{E}^{\mathrm{imp}}$ (where imp stands for “impressed”), which depends on
the coordinates, formally characterizes the action of the current sources. In
those parts of the conductor where the current source acts, $\mathbf{j} \neq 0$ even for $\mathbf{E} = 0$.
Thus, for example, if the system is made up of a conductor connected to the
plates of a galvanic cell, then in the region of space occupied by the cell
$\mathbf{E}^{\mathrm{imp}} \neq 0$. The processes occurring inside the galvanic cell make it possible
to maintain its plates, which are connected to the conductor, at different 
potentials. This in its turn ensures the existence of an electric field constant 
in time inside the cell and, in correspondence with Ohm’s law, the existence 
of a constant current in the conductor. 

The quantity $\mathbf{E}^{\mathrm{imp}}$ is for historical reasons called, not quite correctly, the
impressed electromotive force ($\mathbf{E}^{\mathrm{imp}}$ does not have the dimensions of a force
and is not, in essence, the analogue of a force). The electromotive force is a
quantitative characteristic of the device which maintains the passage of a
constant current in conductors. We shall return to the interpretation of the
notion of the electromotive force in later sections.

Maxwell’s equations for a constant current in the presence of electromotive
forces have the form

$$\nabla \times \mathbf{E} = 0, \qquad \nabla \cdot \mathbf{D} = 4\pi\rho; \qquad (14.2)$$

454 DIRECT CURRENT. MAGNETIC PROPERTIES OF MATTER Ch. 3 


$$\nabla \times \mathbf{H} = \frac{4\pi}{c}\,\mathbf{j}, \qquad \nabla \cdot \mathbf{B} = 0. \qquad (14.3)$$


The equation of continuity for the passage of steady currents can be 
written in the form 


$$\nabla \cdot \mathbf{j} = 0. \qquad (14.4)$$


It is easily seen that the system of equations written above is complete. 
We see, first of all, that the electric field distribution does not depend on the 
magnetic field distribution. The latter is determined by defining the current 
density j over all space. 

At the interface of conducting media the following boundary conditions 
must be fulfilled: 


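$$E_t^{(1)} = E_t^{(2)}, \qquad (14.5)$$
or
$$\frac{j_t^{(1)}}{\sigma^{(1)}} = \frac{j_t^{(2)}}{\sigma^{(2)}} \qquad (14.6)$$
and
$$j_n^{(1)} = j_n^{(2)}. \qquad (14.7)$$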


The first of these is the same as (5.2), and the second is obtained from the 
equation of continuity in the same way as, for example, the condition (5.5). 

The boundary conditions (14.6) and (14.7) can be interpreted in an obvious
way as the refraction of lines of flow at the interface according to the law

$$\frac{\tan\alpha_1}{\tan\alpha_2} = \frac{\sigma^{(1)}}{\sigma^{(2)}}, \qquad (14.8)$$

where $\alpha_1$ and $\alpha_2$ are the angles between the line of flow and the normal to the surface in
the corresponding media.
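
The refraction law (14.8) is easy to evaluate numerically. The following is a minimal sketch, not part of the original text: the function name `refracted_angle` and its arguments are hypothetical, and angles are measured from the normal in degrees.

```python
import math

def refracted_angle(alpha1_deg, sigma1, sigma2):
    """Angle (degrees, from the normal) of the flow line in medium 2.

    Uses tan(alpha1) / tan(alpha2) = sigma1 / sigma2, i.e. eq. (14.8).
    The name and signature are illustrative assumptions.
    """
    if sigma1 <= 0 or sigma2 <= 0:
        raise ValueError("conductivities must be positive")
    if not 0.0 <= alpha1_deg < 90.0:
        raise ValueError("angle of incidence must lie in [0, 90) degrees")
    tan_alpha2 = math.tan(math.radians(alpha1_deg)) * sigma2 / sigma1
    return math.degrees(math.atan(tan_alpha2))

# Passing into a much better conductor bends the flow lines away from the normal.
angle_in_good_conductor = refracted_angle(45.0, 1.0, 100.0)
```

In this sketch a flow line entering a medium of much higher conductivity turns almost parallel to the interface, while equal conductivities leave the direction unchanged.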

At the interface conductor—dielectric the following boundary condition is 
satisfied: 


$$j_n = 0. \qquad (14.9)$$


Magnetic vectors satisfy the boundary conditions which have been con- 
sidered in §5. 


§ 15. A linear conductor carrying a constant current 


We shall first consider the very important case of a linear conductor with 
constant current. By a linear conductor we shall mean one whose length is 
very large compared with its transverse dimensions. Linear conductors are 
also often called wires. The current density vector in a linear conductor can, 
by virtue of the boundary condition (14.9) at its surface, be considered with 
a high degree of accuracy to be parallel to the vector dl which is tangential 
to the axis of the conductor. Thus, for each point of a linear conductor one 
can write 


$$\mathbf{j}\,dl = j\,d\mathbf{l}. \qquad (15.1)$$


We bring into consideration the total current $I$ flowing through the
cross-section of the linear conductor normal to the axis of the conductor (or,
what is the same, normal to the flow lines). By definition

$$I = \int \mathbf{j} \cdot d\mathbf{S} = \int \sigma(\mathbf{E} + \mathbf{E}^{\mathrm{imp}}) \cdot d\mathbf{S}, \qquad (15.2)$$


where the integration is carried out over a cross-section of the contour carry- 
ing the current. 


From the equation of continuity (14.4) and the boundary condition one
can write

$$\oint \mathbf{j} \cdot d\mathbf{S} = 0,$$

or

$$I = \int \mathbf{j} \cdot d\mathbf{S} = jS = \text{const}, \qquad (15.3)$$

where $S$ is the cross-section of the conductor at a given position.

The equation of continuity in integral form shows that the same current $I$
flows through any section of the linear conductor.





We integrate the formula of the generalized Ohm’s law (14.1) along the
conductor carrying the current. We have

$$\int_1^2 \frac{\mathbf{j} \cdot d\mathbf{l}}{\sigma} = \int_1^2 \mathbf{E} \cdot d\mathbf{l} + \int_1^2 \mathbf{E}^{\mathrm{imp}} \cdot d\mathbf{l}.$$


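We transform the first integral, writing

$$\int_1^2 \frac{\mathbf{j} \cdot d\mathbf{l}}{\sigma} = \int_1^2 \frac{j\,dl}{\sigma} = I \int_1^2 \frac{dl}{\sigma S} = I R_{12}, \qquad (15.4)$$

where $R_{12}$ represents the ohmic resistance of the conductor along the segment
(1, 2). Then we have

$$I R_{12} = \varphi_1 - \varphi_2 + \mathcal{E}_{12}^{\mathrm{imp}}, \qquad (15.5)$$

where $\varphi_1 - \varphi_2$ is the potential difference between the points 1 and 2, and

$$\mathcal{E}_{12}^{\mathrm{imp}} = \int_1^2 \mathbf{E}^{\mathrm{imp}} \cdot d\mathbf{l} \qquad (15.6)$$

is called the impressed electromotive force (e.m.f.) over the segment (1, 2).
If there is no electromotive force in a given segment of the conductor, i.e.
$\mathbf{E}^{\mathrm{imp}} = 0$, then (15.5) goes over into the simple Ohm’s law.

If the path is closed and the points 1 and 2 coincide, then, since by virtue of
(14.2) the field $\mathbf{E}$ has the character of a potential field, the integral
$\oint \mathbf{E} \cdot d\mathbf{l} = 0$ and

$$IR = \mathcal{E}^{\mathrm{imp}}, \qquad (15.7)$$

where $R$ is the resistance of the entire linear contour and

$$\mathcal{E}^{\mathrm{imp}} = \oint \mathbf{E}^{\mathrm{imp}} \cdot d\mathbf{l}. \qquad (15.8)$$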


The product of the current and the resistance of the current carrying 
filament is equal to the e.m.f. round the closed circuit. 

Let us now consider the energy relations for a current carrying filament. 

As we have stressed, the entire work done by the current in a direct 
current circuit goes into heat. Hence the total heat released in the linear 
conductor is 


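$$Q = \int \frac{j^2}{\sigma}\,dV = \int \mathbf{j} \cdot (\mathbf{E} + \mathbf{E}^{\mathrm{imp}})\,dV = -\int \mathbf{j} \cdot \nabla\varphi\,dV + \int \mathbf{j} \cdot \mathbf{E}^{\mathrm{imp}}\,dV$$
$$= \int \mathbf{j} \cdot \mathbf{E}^{\mathrm{imp}}\,dV - \int \nabla \cdot (\varphi\mathbf{j})\,dV + \int \varphi\,\nabla \cdot \mathbf{j}\,dV$$
$$= \int \mathbf{j} \cdot \mathbf{E}^{\mathrm{imp}}\,dV - \oint \varphi j_n\,dS = \int \mathbf{j} \cdot \mathbf{E}^{\mathrm{imp}}\,dV$$

by virtue of (14.2), (14.4) and (14.9).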


The total heat released in the circuit turns out to be equal to the work 
done by the electromotive forces. 
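
The relations of this section for a closed linear circuit can be checked on a small numerical example. The sketch below is an illustration under stated assumptions, not the author's method: the circuit is modelled as a closed chain of segments, each described by a hypothetical pair (resistance, impressed e.m.f.), and all function names are invented for the example.

```python
def loop_current(segments):
    """Current in a closed chain of (resistance, emf) segments, from IR = emf."""
    total_resistance = sum(r for r, _ in segments)
    total_emf = sum(e for _, e in segments)
    return total_emf / total_resistance

def potential_drops(segments):
    """phi_1 - phi_2 across each segment, from I*R_12 = phi_1 - phi_2 + emf_12."""
    current = loop_current(segments)
    return [current * r - e for r, e in segments]

def joule_heat(segments):
    """Heat released per unit time in all segments, sum of I^2 * R."""
    current = loop_current(segments)
    return sum(current ** 2 * r for r, _ in segments)

def emf_power(segments):
    """Work done per unit time by the impressed electromotive forces."""
    current = loop_current(segments)
    return sum(current * e for _, e in segments)

# One source segment (internal resistance 1, e.m.f. 12) and one passive load.
circuit = [(1.0, 12.0), (5.0, 0.0)]
```

For this circuit the potential drops add up to zero around the loop, as required by the potential character of $\mathbf{E}$, and the Joule heat equals the work of the electromotive forces.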


§ 16. Direct current in a conducting medium 


Another limiting case is the passage of current in a system of good con- 
ductors (for example, metal electrodes) immersed into a conducting medium. 
If it is assumed that electromotive forces are absent in the conducting 
medium, the equations for the electric field can be written in the form 


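$$\nabla \times \mathbf{E} = 0, \qquad \nabla \cdot \mathbf{j} = \nabla \cdot \sigma\mathbf{E} = 0.$$

In a homogeneous medium, for $\sigma = \text{const}$, the last expression assumes the
form

$$\nabla \cdot \mathbf{E} = 0.$$

Introducing the field potential $\varphi$, we find that it satisfies the equation

$$\nabla^2\varphi = 0. \qquad (16.1)$$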


At the surface of the conductors the boundary conditions (14.5) and
(14.7) are satisfied. They can be written in terms of the potential $\varphi$. The
condition (14.5) for the continuity of the tangential component of the field
goes over directly into the condition of the continuity of the potential at the
interface:

$$\varphi_1 = \varphi_2. \qquad (16.2)$$

The equality of the normal components of the current density $j_n = \sigma E_n$ gives

$$\sigma^{(1)} \left(\frac{\partial\varphi}{\partial n}\right)_1 = \sigma^{(2)} \left(\frac{\partial\varphi}{\partial n}\right)_2, \qquad (16.3)$$


where n is the normal to the interface. 

We see that the equation for the potential and the whole set of boundary 
conditions, which determine the current distribution in a conducting medium, 
are identical with the corresponding expressions of §8 which determine the 
electrostatic field distribution in two dielectric media. The only difference 
lies in the fact that the electric conductivities $\sigma^{(1)}$ and $\sigma^{(2)}$ of the media
appear in the boundary condition (16.3) instead of the dielectric constants
$\varepsilon^{(1)}$ and $\varepsilon^{(2)}$.

Hence the potential in a conducting medium is determined by the formulae
of electrostatics with the substitution of $\sigma$ for $\varepsilon$.

Let us consider the case of two electrodes in an infinite medium. We write 
the current flowing from an electrode in the form 


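$$I = \oint \mathbf{j} \cdot d\mathbf{S} = \oint j_n\,dS = -\sigma \oint \frac{\partial\varphi}{\partial n}\,dS.$$

Introducing the electrostatic capacitance $C$ according to formula (8.13), we
find

$$I = \frac{4\pi\sigma}{\varepsilon}\left[-\frac{\varepsilon}{4\pi}\oint \frac{\partial\varphi}{\partial n}\,dS\right] = \frac{4\pi\sigma}{\varepsilon}\,C(\varphi_1 - \varphi_2).$$

The resistance is equal to

$$R = \frac{\varphi_1 - \varphi_2}{I} = \frac{\varepsilon}{4\pi\sigma C}. \qquad (16.4)$$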


This last formula allows one to express the resistance of the system formally 
in terms of the capacities of an electrostatic system of conductors with 
analogous geometric characteristics. 

As an example a system of two spherical electrodes immersed into an 
infinite medium can be considered. We assume that the radii of the electrodes 
$a$ and $b$ are small in comparison to the distance between their centres.


The solution of Laplace’s equation for the two spheres has the form

$$\varphi = \varphi_1\frac{a}{r_1} + \varphi_2\frac{b}{r_2},$$

where $\varphi_1$ and $\varphi_2$ are the potentials at the surfaces of the spheres, and $r_1$ and
$r_2$ are the distances to the centres of the spheres.

The total current from the surface of the first sphere is equal to

$$I = -\sigma \oint \frac{\partial\varphi}{\partial r_1}\,dS \approx 4\pi a\sigma\varphi_1,$$

if the second term of $\varphi$, which for $r_1 = a$ is small in comparison with the first
term, is disregarded. Analogously, the total current to the second sphere
turns out to be equal to

$$I = -4\pi b\sigma\varphi_2.$$

The resistance is

$$R = \frac{\varphi_1 - \varphi_2}{I} \approx \frac{1}{4\pi\sigma}\left(\frac{1}{a} + \frac{1}{b}\right).$$





Thus the solution of the problem of the spatial current distribution reduces to 
the solution of the corresponding electrostatic problem. 
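
The correspondence between resistance and capacitance can be verified on the example of the two spheres. The sketch below works in Gaussian units and is illustrative only: the function names are hypothetical, and the capacitance of two widely separated spheres carrying opposite charges, $C = \varepsilon/(1/a + 1/b)$, is an assumption introduced for the example rather than a formula quoted from the text.

```python
import math

def resistance_from_capacitance(capacitance, sigma, eps=1.0):
    """Resistance of the medium from the capacitance of the analogous
    electrostatic system, R = eps / (4 pi sigma C)."""
    return eps / (4.0 * math.pi * sigma * capacitance)

def two_sphere_capacitance(a, b, eps=1.0):
    """Assumed capacitance of two distant spheres of radii a and b
    (charges +q and -q, mutual influence neglected)."""
    return eps / (1.0 / a + 1.0 / b)

def two_sphere_resistance(a, b, sigma):
    """Resistance between two small distant spherical electrodes,
    R = (1/a + 1/b) / (4 pi sigma)."""
    return (1.0 / a + 1.0 / b) / (4.0 * math.pi * sigma)

# The dielectric constant drops out of the final result, as it must.
r_via_capacitance = resistance_from_capacitance(
    two_sphere_capacitance(0.5, 2.0, eps=3.0), sigma=0.1, eps=3.0)
r_direct = two_sphere_resistance(0.5, 2.0, sigma=0.1)
```

Both routes give the same resistance, and the result is dominated by the smaller electrode.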

However, an important reservation should be made. If the conductor is 
partially contiguous to a conducting medium and partially to a non-conduct- 
ing medium, the formal analogy with electrostatics is meaningless. Indeed, in 
a non-conducting medium $\sigma = 0$, whereas in electrostatics there are no bodies
with a dielectric constant equal to zero. 

We shall find one more useful relation between the current density and 
total current in a conductor in the general case. Let the current density flow- 
ing along a conductor be distributed non-uniformly over the cross-section. 
We divide the conductor into arbitrarily fine current tubes. The solenoidal 
character of direct current always allows one to make such a division. Each 
current tube can be considered a linear conductor, and we can write 


$$dI_\alpha = \mathbf{j}_\alpha \cdot d\mathbf{S}_\alpha, \qquad (16.5)$$

where the index $\alpha$ denotes the ordinal number of the tube.


This equation has a simple meaning: although the current density distri- 
bution is determined by the physical properties of the conductor and by its 
geometry, the current density is proportional to the total current, other 
things being equal. 

If the current density were distributed uniformly over the cross-section, 
one could obviously write 


$$\frac{j}{I} = \frac{1}{S}, \qquad (16.6)$$

where $I$ and $S$ are the total current and the cross-section of the conductor
respectively. For a non-uniform current distribution one can always assume
that

$$\frac{j}{I} = \psi = \text{const} \quad \text{(along the length of the conductor)}, \qquad (16.7)$$

where the function $\psi$ characterizes the non-uniformity of the current
distribution. Hence for any current distribution we have

$$\mathbf{j} = I\psi\,\mathbf{l}, \qquad (16.8)$$

where $\mathbf{l}$ is the unit vector directed along the flow line.


§ 17. The magnetic fields of direct currents. The Biot-Savart law


Knowing the current density distribution one can find the magnetic field
distribution by integrating eqs. (14.3). Introducing the vector potential $\mathbf{A}$
into (14.3), according to formula (4.19) we obtain the equation

$$\nabla \times \left(\frac{1}{\mu}\nabla \times \mathbf{A}\right) = \frac{4\pi}{c}\,\mathbf{j}. \qquad (17.1)$$

For an infinite homogeneous and isotropic medium characterized by a
constant magnetic permeability $\mu$ one can write

$$\nabla^2\mathbf{A} = -\frac{4\pi\mu}{c}\,\mathbf{j}. \qquad (17.2)$$

The solution of this last equation

$$\mathbf{A} = \frac{\mu}{c}\int \frac{\mathbf{j}\,dV}{r} \qquad (17.3)$$

differs only by the factor $\mu$ from the solution of eq. (19.13) of Part I for the
vector potential in vacuum.

Since outside the conductors the current density $\mathbf{j}$ is equal to zero, the
integration can actually be carried out only with respect to the volume of the
conductors. However, we assume that $\mu$ has one and the same value both in
the conductors and in the medium which surrounds them. If the system
contains no ferromagnetic bodies, then the actual value of $\mu$ differs little from
unity. Hence $\mu$ can be assumed to have approximately the same value over all
space. In the presence of ferromagnetics the formula (17.3) becomes
meaningless.

For linear conductors one can substantially simplify formula (17.3), writing,
by virtue of the constancy of the total current $I$ over the cross-section of the
linear conductor,

$$\mathbf{A} = \frac{\mu}{c}\int \frac{\mathbf{j}\,dV}{r} = \frac{\mu I}{c}\oint \frac{d\mathbf{l}}{r}. \qquad (17.4)$$

In formula (17.4) $\mu$ denotes the magnetic permeability of the medium, which
is external to the conductor carrying the current. We see that all character- 
istics of the linear conductor itself, for example the current density distribu- 
tion in it, are eliminated in (17.4), and the conductor is considered as a 
purely geometric object. Hence the properties of the linear conductor, includ- 
ing also the magnetic properties of the material of which the wire is made, do 
not affect the value of the magnetic field. Formula (17.4) is valid for all 
conductors, including ferromagnetic ones. 

From the definition of the vector potential it follows that the magnetic
induction of a linear conductor carrying a current $I$ is equal to (see (19.15)
of Part I)

$$\mathbf{B} = \nabla \times \mathbf{A} = \nabla \times \left[\frac{\mu I}{c}\oint \frac{d\mathbf{l}}{r}\right] = \frac{\mu I}{c}\oint \left[\nabla\frac{1}{r} \times d\mathbf{l}\right] = \frac{\mu I}{c}\oint \frac{[d\mathbf{l} \times \mathbf{r}]}{r^3}. \qquad (17.5)$$

Formula (17.5) expresses the Biot-Savart law for a homogeneous and
isotropic medium. The induction (mean field) $\mathbf{B}$ turns out to be larger by a
factor of $\mu$ than the field of the same current in vacuum. Correspondingly
the field strength in the medium $\mathbf{H} = \mu^{-1}\mathbf{B}$ is the same as the field in a
vacuum.

The Biot-Savart law is often written in differential form

$$d\mathbf{B} = \frac{\mu I}{c}\,\frac{[d\mathbf{l} \times \mathbf{r}]}{r^3}, \qquad (17.6)$$


where dB is the contribution to the induction of the current element dl. It 
should be kept in mind that (17.5) cannot be resolved unambiguously into 
the elements (17.6). One can always add to (17.6) a vector function which 
reduces to zero in integrating over a closed contour carrying a direct current. 

As an example of the application of the formulae obtained, we shall cal- 
culate the magnetic field produced in the surrounding space by a current 
flowing in an infinite straight linear conductor. 

From considerations of symmetry it is clear that the field is directed along
tangents to circles which are concentric about the conductor. The Biot-Savart
law applied to the case considered gives

$$\mathbf{B} = \frac{\mu I}{c}\int \frac{[d\mathbf{l} \times \mathbf{n}]}{r^2},$$

where $\mathbf{n} = \mathbf{r}/r$. Introducing the shortest distance $\rho$ from the point of
observation to the conductor, we have for the field component $B_\varphi$ directed along a
tangent to the circle of radius $\rho$ concentric with respect to the conductor

$$B_\varphi = \frac{\mu I}{c}\int \frac{\sin(90^\circ - \alpha)}{r^2}\,dl = \frac{\mu I}{c\rho}\int_{-\pi/2}^{\pi/2} \cos\alpha\,d\alpha = \frac{2\mu I}{c\rho}, \qquad (17.7)$$

where we have set (fig. IV.6)

$$r\cos\alpha = \rho, \qquad \sin(90^\circ - \alpha)\,dl = r\,d\alpha = \frac{\rho\,d\alpha}{\cos\alpha}.$$
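
The result (17.7) can be checked by summing the contributions (17.6) of the current elements directly. The following sketch, with hypothetical function names, approximates the infinite conductor by a long straight segment and uses a simple midpoint rule; units are Gaussian, with $c$ and $\mu$ set to 1 by default purely for convenience.

```python
import math

def b_phi_straight_segment(current, rho, half_length, c=1.0, mu=1.0, n=100_000):
    """Midpoint-rule sum of the Biot-Savart elements for a straight segment
    of length 2*half_length, observed at distance rho from its midpoint."""
    dl = 2.0 * half_length / n
    total = 0.0
    for k in range(n):
        l = -half_length + (k + 0.5) * dl   # position of the element on the wire
        r2 = rho * rho + l * l              # squared distance to the observer
        # |dl x r| / r^3 = rho * dl / r^3 for an element parallel to the wire
        total += rho * dl / (r2 * math.sqrt(r2))
    return mu * current / c * total

def b_phi_infinite_wire(current, rho, c=1.0, mu=1.0):
    """Closed-form field of an infinite straight conductor, eq. (17.7)."""
    return 2.0 * mu * current / (c * rho)

numeric = b_phi_straight_segment(1.0, 1.0, 1000.0)
exact = b_phi_infinite_wire(1.0, 1.0)
```

For a segment a thousand times longer than the distance $\rho$ the numerical sum agrees with (17.7) to a few parts in ten million, which illustrates how quickly the distant elements cease to contribute.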


[Fig. IV.6. Fig. IV.7.]

Further, we calculate the magnetic field on the central axis, $z$, perpendicular
to the plane of a flat circular current-carrying conductor of radius $a$ with
centre on the axis $z$ (fig. IV.7). In this case one can write for the field
component

$$B_z = \frac{\mu I}{c}\oint \frac{[d\mathbf{l} \times \mathbf{r}]_z}{r^3} = \frac{\mu I}{c}\oint \frac{\sin\alpha}{r^2}\,dl = \frac{2\pi\mu I a}{cz^2}\cos^2\alpha\,\sin\alpha.$$

Since (fig. IV.7)

$$\cos\alpha = z(z^2 + a^2)^{-1/2}, \qquad \sin\alpha = a(z^2 + a^2)^{-1/2},$$

we find finally

$$B_z = \frac{2\mu}{c}\,\frac{IS}{(z^2 + a^2)^{3/2}}, \qquad (17.8)$$

where $S$ is the area of the circle formed by the current.

In particular, at a large distance from the conductor ($z \gg a$)

$$B_z = \frac{2\mu IS}{cz^3} = \frac{2\mu M_z}{z^3}, \qquad (17.9)$$

where the projection $M_z$ of the magnetic moment of the current is equal to
$IS/c$ (see §22 of Part I).

At the centre of the circle in the plane $z = 0$

$$B_z = \frac{2\mu IS}{ca^3} = \frac{2\pi\mu I}{ca}. \qquad (17.10)$$
ca? 

The field component $B_\perp$ perpendicular to the $z$-axis is equal to zero, as is
clear from considerations of symmetry: to opposite segments of the
current-carrying conductor there will correspond values of $B_\perp$ with opposite signs.
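
Both statements, the value of $B_z$ on the axis and the vanishing of the perpendicular component, can be confirmed by a direct summation of the elements (17.6) round the circle. The sketch below is an illustration with hypothetical names; the loop lies in the plane $z = 0$, the current circulates counterclockwise, and $c = \mu = 1$ by default.

```python
import math

def loop_field_on_axis(current, a, z, c=1.0, mu=1.0, n=3600):
    """Sum (mu I / c) [dl x r] / r^3 over n elements of a circle of radius a.

    Returns (Bx, By, Bz) at the point (0, 0, z) on the axis.
    """
    bx = by = bz = 0.0
    dphi = 2.0 * math.pi / n
    for k in range(n):
        phi = (k + 0.5) * dphi
        x, y = a * math.cos(phi), a * math.sin(phi)      # element position
        dlx = -a * math.sin(phi) * dphi                  # element vector dl
        dly = a * math.cos(phi) * dphi
        rx, ry, rz = -x, -y, z                           # from element to observer
        r3 = (rx * rx + ry * ry + rz * rz) ** 1.5
        # components of dl x r with dl_z = 0
        bx += dly * rz / r3
        by += -dlx * rz / r3
        bz += (dlx * ry - dly * rx) / r3
    factor = mu * current / c
    return factor * bx, factor * by, factor * bz

def b_z_on_axis(current, a, z, c=1.0, mu=1.0):
    """Closed form (17.8) with S = pi a^2."""
    return 2.0 * mu * current * math.pi * a * a / (c * (z * z + a * a) ** 1.5)

bx, by, bz = loop_field_on_axis(1.0, 1.0, 0.5)
```

The transverse components cancel pairwise between opposite elements, while the axial component reproduces (17.8); setting `z = 0` gives the value (17.10) at the centre.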


Finding the magnetic field in non-linear conductors carrying direct currents
presents a problem which is complex from the mathematical standpoint. The
calculation of the vector potential according to the general formula (17.3) can
be carried out only for individual cases. We shall therefore confine ourselves
to examples.

A very simple example is the magnetic field of an infinite straight
current-carrying conductor of circular cross-section of radius $a$. In this case the
integration of eq. (14.3) can be carried out directly. Owing to the cylindrical
symmetry the magnetic field inside as well as outside the conductor has only one
component, $H_\varphi$. The integration of (14.3) over the area of a circle of a radius
$\rho < a$ gives

$$2\pi\rho H_\varphi = \frac{4\pi}{c}\,jS,$$

where $S = \pi\rho^2$. This last formula can be written in the form

$$H_\varphi = \frac{2\pi}{c}\,j\rho = \frac{2I\rho}{ca^2} \qquad (\rho < a), \qquad (17.11)$$

or, in the vector form,

$$\mathbf{H} = \frac{2\pi}{c}[\mathbf{j} \times \boldsymbol{\rho}] = \frac{2I}{ca^2}[\mathbf{l} \times \boldsymbol{\rho}], \qquad (17.12)$$

where $\mathbf{l}$ is a unit vector along the axis of the cylinder. Analogously, outside
the conductor

$$H_\varphi = \frac{2I}{c\rho} \qquad (\rho > a). \qquad (17.13)$$


If the cross-section of the conductor is more complex, then the principle 
of superposition of fields sometimes proves useful. 
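
The field (17.11) and (17.13) of the cylindrical conductor can be collected in one piecewise function. This is a minimal sketch with a hypothetical function name, assuming, as in the derivation above, a current density uniform over the cross-section; units are Gaussian with $c = 1$ by default.

```python
def h_phi_cylinder(current, rho, a, c=1.0):
    """Azimuthal field of an infinite straight conductor of radius a.

    Inside (rho < a):  H = 2 I rho / (c a^2), eq. (17.11).
    Outside (rho >= a): H = 2 I / (c rho),    eq. (17.13).
    """
    if rho < 0 or a <= 0:
        raise ValueError("rho must be non-negative and a positive")
    if rho < a:
        return 2.0 * current * rho / (c * a * a)
    return 2.0 * current / (c * rho)

# The field grows linearly inside, is continuous at rho = a, and falls as 1/rho outside.
profile = [h_phi_cylinder(1.0, rho, 2.0) for rho in (0.0, 1.0, 2.0, 4.0)]
```

The maximum field strength is reached at the surface of the conductor, where the two expressions coincide.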


§18. The magnetization of magnetic materials and the magnetic moment


We now pass on to the consideration of the magnetic properties of matter. 

If a system of particles is placed in an external magnetic field, then the 
system will be magnetized. The mean magnetic moment of the system can 
be determined by the same relation as the mean dipole moment: 


$$\mathbf{M} = kT\,\frac{\partial \ln Z}{\partial \mathbf{H}} = -\frac{\partial F}{\partial \mathbf{H}}, \qquad (18.1)$$

where $\mathbf{H}$ is the strength of the external magnetic field.

Correspondingly one can write the expression for the change in the free
energy:

$$dF = dF_0 - \mathbf{M}V \cdot d\mathbf{H}.$$

One can also introduce the Gibbs thermodynamic potential

$$dG = dG_0 - \mathbf{M}V \cdot d\mathbf{H}. \qquad (18.2)$$


Their meanings are analogous to those of the corresponding quantities for the 
electric field. 

We shall not dwell on the analysis of the phenomena of magnetostriction 
and the magnetocaloric effect. They are also analogous to the electric effects 
which have been considered in § 13. 

The magnetic susceptibility can be calculated according to the formula 


$$\chi = \frac{|\mathbf{M}|}{|\mathbf{H}|} = \frac{kT}{V}\,\frac{1}{|\mathbf{H}|}\,\frac{\partial \ln Z}{\partial H}. \qquad (18.3)$$





Depending on the sign of $\chi$ one distinguishes between diamagnetic ($\chi < 0$) and
paramagnetic ($\chi > 0$) substances. In addition to diamagnetic and paramagnetic
substances there is a special group of ferromagnetic bodies, for which the
magnetic susceptibility is extremely large and depends strongly on the
magnetic field.

For diamagnetic substances the magnetic susceptibility is usually very
small in absolute value ($|\chi| \sim 10^{-6}$ per gram-mole) and does not depend on the
temperature. All inert gases, many gases whose molecules represent saturated
chemical compounds, almost all organic compounds, all simple insulators and
about one half of all metals (Cu, Ag, Au, Hg, Zn for instance) are
diamagnetic. Among the latter one encounters the so-called anomalous diamagnetics,
for which the susceptibility exceeds the normal value quoted above by a
factor of 10 to 100 and which have a number of other anomalous properties
(for example, $\chi$ depends on the temperature and on the field, as in the case of
Bi and Sb).



For normal paramagnetics the magnetic susceptibility depends on the
temperature according to the law

$$\chi = \frac{\text{const}}{T}.$$

In order of magnitude $\chi$ is about $10^{-4}$ to $10^{-6}$ per gram-mole. Among such
paramagnetic substances there are certain gases (O$_2$, NO, CO, for instance),
the crystalline hydrates of rare earth salts (for example, Gd$_2$(SO$_4$)$_3 \cdot$ 8H$_2$O), the
salts of the metals of the group of platinum, iron etc. For many normal
paramagnetics the dependence of $\chi$ on the temperature has the form

$$\chi = \frac{\text{const}}{T + \Delta},$$

where $\Delta$ is a constant. Another group of paramagnetic substances is made up
of paramagnetic metals possessing a small paramagnetic susceptibility
($\chi \approx 10^{-6}$ to $10^{-7}$ per gram-mole) which does not depend on the temperature.
There also exist the so-called anomalous paramagnetics, for which the para- 
magnetic susceptibility depends on the field (metamagnetics) or has a maxi- 
mum value at a certain temperature (antiferromagnetics). 

Finally, substances with a very large positive susceptibility (in order of 
magnitude reaching 10°) which depends in a complex way on the magnetic 
field strength and on the temperature and a number of other factors make up 
the group of ferromagnetics. 

We cannot here discuss in detail all aspects of the modern theory of the 
magnetic properties of matter. We shall confine ourselves only to certain 
general observations and to the exposition of the theory of the paramagnetism 
of normal paramagnetics. The theory of the diamagnetic properties of atoms 
will be substantiated in ch. 9 of Part V. 

Before passing on to the exposition of the modern theory of the magnetic
properties of matter it is necessary to dwell briefly on a statement which 
seems at first sight to be paradoxical. Namely, one can in the most general 
form prove that the magnetic moment of any body calculated by means of 
the laws of classical statistics is identically equal to zero. We shall present 
the simplest proof of this theorem. 

Any system in an external field can be looked on as a set of moving
charged particles. As is known from electrodynamics, when a charged particle
is moving in a uniform magnetic field directed along the $z$-axis the generalized
momentum of the particle has the form (§41 of Part I)

$$p_x = p_x^{(0)} - \frac{eH}{2c}\,y, \qquad p_y = p_y^{(0)} + \frac{eH}{2c}\,x, \qquad p_z = p_z^{(0)},$$

where $p_x^{(0)}$, $p_y^{(0)}$, $p_z^{(0)}$ are the components of the momentum in the absence
of field.

The Hamilton function is given by the expression (41.8) of Part I:

$$\mathcal{H} = \frac{[p_x + (eHy/2c)]^2 + [p_y - (eHx/2c)]^2 + p_z^2}{2m}.$$

The partition function of the system in the magnetic field has the form

$$Z = \left[\int e^{-\mathcal{H}/kT}\,\frac{d\mathbf{p}\,dV}{h^3}\right]^N.$$

Introducing instead of $p_x$ a new variable

$$p_x' = p_x + \frac{eH}{2c}\,y,$$

we see that upon integration with respect to $p_x'$ from minus infinity to plus
infinity the partition function turns out to be independent of the external
field $H$. The same holds also for $p_y$. By virtue of (18.3) the mean magnetic
moment is identically equal to zero.
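
The essential step of this proof, the independence of the momentum integral of a shift of the integration variable, can be illustrated numerically. The sketch below is a one-dimensional illustration only: the function name is hypothetical, the quantity `shift` stands for $eHy/2c$, and the product $mkT$ is set to 1 in arbitrary units.

```python
import math

def momentum_integral(shift, m_kt=1.0, p_max=40.0, n=20_000):
    """Midpoint-rule value of the integral of exp(-(p + shift)^2 / (2 m kT))
    over p from -p_max to p_max (wide enough to stand in for infinite limits)."""
    dp = 2.0 * p_max / n
    total = 0.0
    for k in range(n):
        p = -p_max + (k + 0.5) * dp
        total += math.exp(-(p + shift) ** 2 / (2.0 * m_kt)) * dp
    return total

# The integral is the same for every value of the field-dependent shift.
values = [momentum_integral(s) for s in (0.0, 1.0, 5.0)]
```

Each value equals $\sqrt{2\pi mkT}$ regardless of the shift, so the field drops out of the classical partition function and with it the magnetic moment.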

This seems to be particularly paradoxical because most books give the 
classical interpretation of diamagnetism and paramagnetism. The diamag- 
netic properties of matter are associated with the change in the orbital 
motion of electrons in the atom caused by the magnetic field. As is well 
known, the magnetic field induces a current flowing in a closed path in such 
a direction that the additional magnetic field of the current weakens the 
applied field. The induced magnetic moment of the current is directed 
against the field and is proportional to the strength of the latter as well as 
to the area embraced by the contour. Assuming the electron moving in the 
atom to be a current filament, one can obtain the following purely electro- 
dynamic expression for the magnetic moment of the particle moving in an 
orbit of radius $r_0$:

$$\mu = -\frac{e^2}{4mc^2}\,r_0^2 H.$$



As to paramagnetism, it is associated in classical electrodynamics with the 
presence of a magnetic moment of the electron moving in orbit and having 
an angular momentum different from zero. If L denotes the angular mo- 
mentum of the system, then in classical electrodynamics it turns out that the 
system possesses the magnetic moment 


$$\boldsymbol{\mu} = \frac{e}{2mc}\,\mathbf{L}. \qquad (18.4)$$

The magnetic moment defined by formula (18.4) is the analogue of the
electric dipole moment. Every atom is a sort of a small magnet. Hence the
above considerations and formula (18.1) are completely applicable to it. In
an atomic gas whose atoms possess magnetic moment $\boldsymbol{\mu}$ there must arise a
mean magnetic moment owing to the appearance of the preferential orienta- 
tion of magnetic moments along the field. 

The inconsistency of these arguments lies in the fact that one assumes 
beforehand the existence of stable electron orbits. However, it is well known 
that the existence of stable orbits cannot be conceived on the basis of classical 
concepts. The absence of a magnetic moment in classical physics is an ex- 
pression of the fact of the absence of stable motion in a system of elementary 
charges. The assumption of the existence of stationary electron orbits in
atoms or fixed moments of atoms, which is used in the “classical” theory of
magnetism, represents in essence the assumption of the quantization of states,
in an implicit form.

It is proved in quantum mechanics that the energy levels of an atomic
system change in a magnetic field. In the case of atoms or ions, for which 
the mean value of the angular momentum L of the system is equal to zero, 
for the energy of the ground state one obtains the following expression (see 
ch. 9 of Part V): 


$$\varepsilon = \varepsilon_0 + \frac{e^2H^2}{12mc^2}\sum_i (r_i^2)_{\mathrm{mean}}, \qquad (18.5)$$

where $\varepsilon_0$ is the value of the energy in the ground state, and $(r_i^2)_{\mathrm{mean}}$ is the
quantum-mechanical mean of the square of the radius vector of the $i$th electron in the
ground state. The summation is carried out over all electrons in the atom. To
this energy there corresponds a mean magnetic moment

$$\mu_{\mathrm{mean}} = -\frac{\partial\varepsilon}{\partial H} = -\frac{e^2H}{6mc^2}\sum_i (r_i^2)_{\mathrm{mean}}. \qquad (18.6)$$

Formula (18.6) is in form the same as the classical formula given above, but
has a different meaning: the quantity $(r_i^2)_{\mathrm{mean}}$ represents the
quantum-mechanical mean (see ch. 3 of Part V).

The atom or ion in the ground state has a diamagnetic susceptibility

$$\chi_0 = \frac{\mu_{\mathrm{mean}}}{H} = -\frac{e^2}{6mc^2}\sum_i (r_i^2)_{\mathrm{mean}}. \qquad (18.7)$$

We stress that in (18.7) no averaging over different states is carried out, and
the averaging has nothing in common with statistical averaging.

A system representing a set of N independent atoms or ions will, in the 
ground state, possess an induced magnetic moment 


$$M = -\frac{Ne^2H}{6mc^2}\sum_i (r_i^2)_{\mathrm{mean}}$$

and a diamagnetic susceptibility

$$\chi = -\frac{Ne^2}{6mc^2}\sum_i (r_i^2)_{\mathrm{mean}}. \qquad (18.8)$$

If one substitutes values for $(r_i^2)_{\mathrm{mean}}$ calculated by different methods (for
these methods see Part V) or the mean radii of atoms obtained from the
kinetic theory of gases, then one obtains for $\chi$ a value which agrees with
those found from measurements.
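
The order of magnitude given by (18.8) is easily estimated. The sketch below is illustrative only: the function name is hypothetical, the constants are the usual rounded Gaussian (CGS) values, and the input, a single electron with a mean square radius equal to the square of the Bohr radius, is an assumption chosen for the example and not a measured value for any particular substance.

```python
E_CHARGE = 4.8032e-10     # electron charge, esu
M_ELECTRON = 9.1094e-28   # electron mass, g
C_LIGHT = 2.9979e10       # speed of light, cm/s
N_AVOGADRO = 6.0221e23    # atoms per gram-mole
BOHR_RADIUS = 5.2918e-9   # cm

def diamagnetic_susceptibility(mean_sq_radii, n_atoms=N_AVOGADRO):
    """chi = -(N e^2 / 6 m c^2) * sum of mean square electron radii, eq. (18.8).

    mean_sq_radii: iterable of (r_i^2)_mean in cm^2, one entry per electron.
    """
    prefactor = n_atoms * E_CHARGE ** 2 / (6.0 * M_ELECTRON * C_LIGHT ** 2)
    return -prefactor * sum(mean_sq_radii)

# Assumed illustration: one electron with (r^2)_mean equal to the Bohr radius squared.
chi_per_mole = diamagnetic_susceptibility([BOHR_RADIUS ** 2])
```

With these assumed inputs the result is close to $-8 \times 10^{-7}$ per gram-mole, in line with the order of magnitude $10^{-6}$ quoted above for diamagnetic substances.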

If a system (an atom or an ion) in the ground state possesses an angular 
momentum different from zero, then the energy in a magnetic field will have 
a different form. 

It turns out that in quantum mechanics, if a molecular system possesses
an orbital angular momentum $L$, then its energy in a state $i$ is equal to

$$\varepsilon_i = \varepsilon_i^{(0)} - \frac{eh}{4\pi mc}\,L_zH + \frac{e^2H^2}{12mc^2}\sum (r^2)_{\mathrm{mean}}, \qquad (18.9)$$

where $L_z$ is the projection of the angular momentum onto the direction of
the magnetic field (the $z$-axis is chosen to be along the field). For the
derivation of (18.9) see ch. 9 of Part V.

Simple estimates show that the last term of formula (18.9) for all values of
the field strength $H$ is small in comparison with the second term. Exceptions
are very large organic molecules, for which $\sum (r^2)_{\mathrm{mean}}$ is very large.

In what follows we shall drop the last term of (18.9) and write the energy
in the form

$$\varepsilon_i = \varepsilon_i^{(0)} - \frac{eh}{4\pi mc}\,L_zH. \qquad (18.10)$$

It is shown in quantum mechanics (see ch. 3 of Part V) that the projection
of the angular momentum onto the $z$-axis takes on discrete values:

$$L_z = -L,\ -L+1,\ \ldots,\ L-1,\ L$$

(all together $2L+1$ values). Hence in a magnetic field the $i$th energy level
splits into $2L+1$ levels which possess the energies

$$\varepsilon_i^{(0)} - \frac{eh}{4\pi mc}\,LH, \quad \varepsilon_i^{(0)} - \frac{eh}{4\pi mc}\,(L-1)H, \quad \ldots, \quad \varepsilon_i^{(0)} + \frac{eh}{4\pi mc}\,(L-1)H, \quad \varepsilon_i^{(0)} + \frac{eh}{4\pi mc}\,LH.$$

The presence of an intrinsic angular momentum (spin) as well as the
angular momentum due to the orbital motion in a system of electrons leads to
the appearance of a spin magnetic moment in the system. If the resulting
spin $S$ of the system is different from zero, while the resulting orbital
momentum $L$ is equal to zero, the energy of the system in a magnetic field
turns out to be equal to (see ch. 8 of Part V)

$$\varepsilon_i = \varepsilon_i^{(0)} - \frac{eh}{2\pi mc}\,S_zH, \qquad (18.11)$$

where $S_z$ is the projection of the intrinsic angular momentum onto the
direction of the field taking on a discrete series of values:

$$S_z = -S,\ -S+1,\ \ldots,\ S-1,\ S.$$

For such a system the $i$th level splits into $2S+1$ sublevels:

$$\varepsilon_i^{(0)} - \frac{eh}{2\pi mc}\,SH, \quad \ldots, \quad \varepsilon_i^{(0)} + \frac{eh}{2\pi mc}\,SH. \qquad (18.12)$$

We shall not consider the general case where $L$ and $S$ simultaneously differ
from zero.

In a magnetic field the system in the $i$th state will possess a mean (in the
quantum-mechanical sense) magnetic moment

$$(\mu_i)_{\mathrm{mean}} = -\frac{\partial\varepsilon_i}{\partial H} = \frac{eh}{4\pi mc}\,L_z \qquad (18.13)$$

or

$$(\mu_i)_{\mathrm{mean}} = -\frac{\partial\varepsilon_i}{\partial H} = \frac{eh}{2\pi mc}\,S_z. \qquad (18.14)$$

We see that the magnetic moment of the system has a positive sign, i.e.
that the system having an intrinsic angular momentum is paramagnetic. The
ratio of the magnetic moment to the orbital momentum is equal to $eh/4\pi mc$.
For the spin moment this ratio is twice as large.


§ 19. Paramagnetic susceptibility 


Let us now consider the behaviour of a system containing a large number 
of atoms or molecules in an external magnetic field. The magnetic field has 
an orienting effect on the magnetic moments of the atoms, tending to set 
them along the field. The thermal motion disturbs such a regular arrange- 
ment of the moments. As a result of the competition of these processes a 
certain mean arrangement of the directions of the magnetic moments with 
respect to the direction of the field is established. To this mean arrangement 
of the elementary magnetic moments there corresponds the mean magnetic 
moment of the entire system. 

Let us find the resulting magnetic moment of a system of atoms according
to the general formula. We assume that any interaction between the magnetic
moments is absent and each magnetic moment turns freely in the external
field. In order that this assumption may be valid it is necessary that
the mean distance between the atoms be sufficiently large. Examples of such
systems will be given below. If the initial assumption is fulfilled, then each
particle can be considered as an individual subsystem having an energy in the
field given by formula (18.9). The mean value of the magnetic moment of
the particle can be found by means of formula (18.1).

For the partition function we have

$$z = \sum \exp\left[-\frac{\varepsilon_i^{(0)} - (eh/4\pi mc)L_zH}{kT}\right] g(\varepsilon_i). \qquad (19.1)$$


The summation is to be carried out over all values of the energy of the
subsystem. Since in the magnetic field 2L + 1 adjacent energy levels arise in
place of one energy level, the summation in (19.1) is carried out over all
energy levels i and, within the limits of a given level, also over all sub-levels
which are defined by formula (18.9) and which differ from each other by
the discrete values of the magnetic energy ehL_zH/4πmc, since L_z takes on
a discrete series of values. The expression (19.1) can be simplified considerably
if it is taken into account that in atomic systems the spacing between levels
is very large in comparison with the thermal energy kT. Because of this the
terms of the sum will decrease rapidly and one can confine oneself to the
first term, which refers to the ground state energy ε_0. In the magnetic field
the latter level splits into 2L + 1 or 2S + 1 sub-levels, depending on which
of the quantities, L or S, differs from zero in the ground state. In the first
case one can write


$$z = \exp\left[-\frac{\varepsilon_0}{kT}\right] \sum_{L_z} \exp\left[\frac{eh}{4\pi mc}\,\frac{L_z H}{kT}\right], \tag{19.2}$$








where the sum is taken over the sub-levels mentioned. 
In what follows we shall consider two limiting cases: 





$$\frac{eh}{4\pi mc}\,L_z H \gg kT \tag{19.3}$$
and
$$\frac{eh}{4\pi mc}\,L_z H \ll kT. \tag{19.4}$$


Because of the smallness of the magnetic moment, the condition (19.4) is 
fulfilled for available fields at a not too low temperature. If condition (19.4) 
is fulfilled, then the exponential in (19.2) can be expanded in a series and one
can restrict oneself to terms which are of low order in the field:

$$z = \exp\left[-\frac{\varepsilon_0}{kT}\right] \sum_{L_z} \left[1 + \frac{eh}{4\pi mc}\,\frac{L_z H}{kT} + \frac{1}{2}\left(\frac{eh}{4\pi mc}\right)^2 \left(\frac{L_z H}{kT}\right)^2 + \dots\right].$$
Upon summing, we have 


$$\sum_{L_z} 1 = 2L+1, \qquad \sum_{L_z} L_z = 0, \qquad \sum_{L_z} L_z^2 = \tfrac{1}{3}\,L(L+1)(2L+1).$$


We then obtain 


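$$z = \exp\left[-\frac{\varepsilon_0}{kT}\right](2L+1)\left[1 + \frac{1}{6}\left(\frac{eh}{4\pi mc}\right)^2 \frac{L(L+1)H^2}{(kT)^2}\right]. \tag{19.5}$$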


The partition function of an entire system consisting of N mutually
independent particles is obviously equal to Z = z^N.
According to (18.1) the mean magnetic moment of the entire system is 








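$$M = kT\,\frac{\partial \ln Z}{\partial H} = NkT\,\frac{\partial \ln z}{\partial H} = \frac{1}{3}\left(\frac{eh}{4\pi mc}\right)^2 \frac{L(L+1)}{kT}\,NH. \tag{19.6}$$
The paramagnetic susceptibility referred to N molecules has the form
$$\chi = \frac{M}{H} = \frac{1}{3}\left(\frac{eh}{4\pi mc}\right)^2 \frac{L(L+1)N}{kT}. \tag{19.7}$$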


If the magnetic moment is due not to the orbital momentum but to the
spin, then instead of (19.6) we have by virtue of (18.14)


$$M = \frac{1}{3}\left(\frac{eh}{2\pi mc}\right)^2 \frac{S(S+1)}{kT}\,NH, \tag{19.8}$$


while instead of (19.7) we obtain 





$$\chi = \frac{1}{3}\left(\frac{eh}{2\pi mc}\right)^2 \frac{S(S+1)N}{kT}. \tag{19.9}$$


If, finally, the system has both orbital momentum and spin, then one obtains
for χ an analogous expression but with the coefficient in the numerator
having a value which is intermediate between (19.7) and (19.9).

In all the cases considered the paramagnetic susceptibility turns out to be 
inversely proportional to the temperature T. Besides the temperature, the
magnetic susceptibilities (19.7) and (19.9) contain only a constant factor 
made up of universal constants and the value of the orbital momentum or 
spin respectively. 

Passing on to the second limiting case where the condition (19.3) is 
satisfied, we see that of all the terms of the sum in (19.1) it is necessary to 
retain only one term corresponding to the value of the magnetic part of the 
energy ehLH/4πmc. Other terms of the sum which contain the components
eh(L−1)H/4πmc, eh(L−2)H/4πmc and so on will be much smaller than the
first. The reason for this is clear: the inequality (19.3) means that the energy
of orientation in the magnetic field is large in comparison with the thermal
energy and that all magnetic moments will be oriented along the field, so that
their projection L_z will be equal to L, i.e. a total saturation will ensue.

Dropping all terms except the first in the sum (19.2), we find 














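$$z = \exp\left[-\frac{\varepsilon_0}{kT}\right] \exp\left[\frac{eh}{4\pi mc}\,\frac{LH}{kT}\right], \tag{19.10}$$
whence
$$M = NkT\,\frac{\partial \ln z}{\partial H} = \frac{ehN}{4\pi mc}\,L \tag{19.11}$$
or
$$M = \frac{ehN}{2\pi mc}\,S \tag{19.12}$$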


(depending on the nature of the magnetic moment). Thus total saturation 
ensues, and all moments are set up along the field. In the intermediate case 
one can obtain a general formula for the dependence of the magnetic moment 
on the field. 
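
The general formula for the intermediate case is not written out here, but it follows directly from the sum over sub-levels in (19.2). As a minimal sketch (the function name and the dimensionless variable x = (eh/4πmc)H/kT are choices made for this illustration, not notation from the text), the mean projection of the moment can be evaluated from the partition sum and compared with the two limiting cases:

```python
import math

def mean_projection(L, x):
    """Mean value of L_z for sub-levels L_z = -L, ..., L weighted by exp(L_z * x).

    x stands for the dimensionless ratio (eh/4*pi*m*c) * H / (k*T); this
    shorthand is an assumption of the sketch.
    """
    n_levels = int(round(2 * L + 1))
    projections = [-L + k for k in range(n_levels)]
    # Subtract L*x in the exponent so that large x does not overflow.
    weights = [math.exp((m - L) * x) for m in projections]
    z = sum(weights)
    return sum(m * w for m, w in zip(projections, weights)) / z

# Weak field, condition (19.4): the result is close to L(L+1)x/3, as in (19.6).
weak = mean_projection(2, 1e-3)
# Strong field, condition (19.3): the result tends to L (saturation), as in (19.11).
strong = mean_projection(2, 50.0)
print(weak, strong)
```

Multiplying the returned value by N(eh/4πmc) gives the total moment M; for a spin moment the same sum applies with L replaced by S and the elementary moment doubled, in line with (19.8).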

Considering now comparison with experiment, we note that all quantities
in formulae (19.7) and (19.9) are known. Hence calculated susceptibilities can
be compared directly with experimental values for systems for which the 
initial assumption (the absence of interaction between particles possessing 
a magnetic moment) is fulfilled. It should be noted that the number of such 
systems is rather small. Most atoms and molecules in the ground state have 
an orbital momentum and spin equal to zero (L=S=0). Substances which in 
the ground state possess a magnetic moment and a paramagnetic susceptibility 
also have a diamagnetic susceptibility. However, the latter constitutes a re- 
latively small (although sometimes quite appreciable) part of the paramag- 
netic susceptibility. To obtain the true value of the paramagnetic suscep- 
tibility it is necessary to add to its measured value the value of the diamag- 
netic susceptibility. It is not a simple problem to obtain atomic paramagnetic 
substances in the gaseous state and measure their susceptibilities. Nevertheless,
the susceptibility of the vapour of K has been measured, and these
measurements led to a value of the paramagnetic susceptibility of 0.38 T⁻¹,
which is in agreement with the theoretical value of 0.37 T⁻¹. The accuracy
of the measurements is not high, and they cannot be used for a complete 
verification of formula (19.9). 

The most suitable objects for the verification of the theory of paramag- 
netic susceptibility described above are: 

(1) Aqueous solutions or the solid crystalline hydrates of salts containing 
ions with an orbital angular momentum or spin differing from zero. Such ions 
are those of the elements of the rare-earth group and the transition elements 
of the iron group in solutions or crystalline hydrates. In aqueous solutions 
and crystalline hydrates paramagnetic ions are separated from each other by 
a large number of water molecules. Hence the energy of interaction between 
them is very small. The agreement of theory with experiment turns out to be 
excellent. This is confirmed in the example of gadolinium Gd³⁺. For this ion
L = 0 and S = 7/2. Its magnetic moment and susceptibility are expressed by
formulae (19.8) and (19.9). The theoretical dependence of the magnetic
moment on the quantity μH/kT is shown in fig. IV.8 by a solid line. The
measured values of M are shown by dots. 

(2) The molecules of paramagnetic gases (O₂, NO, etc.). The electric fields
of molecules do not have spherical symmetry; their angular momentum has no
fixed value and the mean value is equal to zero. If, however, a molecule
possesses a spin different from zero, then it has a magnetic moment which is
connected with the spin by the relation (18.14). The number of molecules
having a spin different from zero is relatively small. As an example one can
quote the oxygen molecule O₂, whose spin is S = 1. For the magnetic
susceptibility of 1 cm³ of oxygen in normal conditions we obtain from formula
(19.9) the value χ = 0.142 × 10⁻⁶. The measured value is equal to χ =
0.143 × 10⁻⁶. This agreement is excellent.
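
The two numerical comparisons quoted in this section can be reproduced from formula (19.9). The sketch below is only illustrative: it assumes that the potassium figure refers to one mole of atoms with S = 1/2, and that "normal conditions" for oxygen means an ideal gas at 1 atm and 20 °C. Neither assumption is stated in the text, and the constants are standard CGS reference values.

```python
import math

# CGS (Gaussian) constants; standard reference values.
E_CHARGE = 4.80320e-10      # esu
PLANCK_H = 6.62607e-27      # erg s
M_ELECTRON = 9.10938e-28    # g
C_LIGHT = 2.99792e10        # cm/s
K_BOLTZMANN = 1.380649e-16  # erg/K
N_AVOGADRO = 6.02214e23     # 1/mol

def spin_susceptibility(S, n_particles, T):
    """Formula (19.9): chi = (1/3) (eh / 2 pi m c)^2 S(S+1) N / (kT)."""
    moment = E_CHARGE * PLANCK_H / (2 * math.pi * M_ELECTRON * C_LIGHT)
    return moment**2 * S * (S + 1) * n_particles / (3 * K_BOLTZMANN * T)

# Potassium vapour, S = 1/2: coefficient of 1/T, ASSUMING one mole of atoms.
k_coefficient = spin_susceptibility(0.5, N_AVOGADRO, 1.0)

# Oxygen, S = 1: 1 cm^3 of ideal gas, ASSUMING 1 atm and 293.15 K.
T_room = 293.15
n_per_cm3 = 1.01325e6 / (K_BOLTZMANN * T_room)  # p / kT with p in dyn/cm^2
chi_oxygen = spin_susceptibility(1, n_per_cm3, T_room)
print(k_coefficient, chi_oxygen)
```

Under these assumptions the potassium coefficient comes out close to 0.37 and the oxygen susceptibility close to 0.142 × 10⁻⁶, in line with the theoretical values quoted above; a different choice of temperature or pressure shifts the oxygen figure accordingly.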

We cannot go into a discussion here of more complex cases of molecules 
having a non-zero projection of the angular momentum onto the symmetry 
axis, and cases where one cannot consider the spin of the molecule to be 
oriented freely in space *. 


§20. Ferromagnetism. Spontaneous magnetization and hysteresis 


Ferromagnetism represents a specific phenomenon which occurs only in 
the solid phase and for a relatively limited range of substances. 

As we have stressed earlier, the magnetic properties of ferromagnetics 
differ essentially from those of other bodies. For ferromagnetics not only is 
there no proportionality between the vectors B and H but the induction is a 
complex and ambiguous function of the field, 


$$\mathbf{B} = f(\mathbf{H}).$$


The value of f(H) depends on the history of magnetization. 
As the external field H increases the induction of the ferromagnetic 
sample increases according to a curve whose characteristic shape is shown 


* See, for example, E.Bloch, Molekularnaya teoriya magnetizma (Molecular theory 
of magnetism) (GTTI, 1936); S.V.Vonsovskii, Sovremenoe uchenie o magnetizme 
(Contemporary teaching on magnetism) (Gostekhizdat, Moscow, 1952); D.H.Martin, 
Magnetism in solids (Iliffe, London, 1967). 













Fig. IV.9 


in fig. IV.9. When the field is switched off the induction decreases but does
not reach zero, so that a residual magnetization remains in the body, which 
exists even in the absence of the external field. This magnetization can be 
reduced to zero by changing the direction of the external field. The repeti- 
tion of the process proceeds according to a characteristic cycle which is also 
shown in fig. IV.9 and which is called the hysteresis cycle. 

The existence of such a connection between B and H means that in the 
relation 


$$\mathbf{B} = \mathbf{H} + 4\pi\mathbf{M},$$


the mean magnetic moment (magnetization) of the body must be considered 
as a quantity determined by the state of the body as a whole. This means that 
the relation between M and H for a ferromagnetic body has a complex 
character and that the definition of H does not in itself determine the mag- 
nitude of the magnetization. We shall discuss below the properties of mag- 
netization for ferromagnetics and its dependence on the field H and on the 
temperature. However, it should first of all be stressed that, in contrast to 
diamagnetic and paramagnetic substances, the magnetization in ferromag- 
netic substances is not determined by the external field but is an internal 
parameter of the system. 

A spontaneous magnetization can exist in a ferromagnetic body in a state
of statistical equilibrium. The state of the body will be completely
characterized by the definition of its internal parameters, for example, the
temperature, pressure and spontaneous magnetization. In the equilibrium state the
thermodynamic potential G(p,T,M) must be a minimum.

In the presence of an external magnetic field the magnetization of the 
body will be a function of the applied field. The form of this function, for 
given properties of the body, is determined by equilibrium conditions in such 
a way that 


$$G(p,T,M(H)) \to \min. \tag{20.1}$$


The numerical value of the magnetization in ferromagnetic bodies is very 
large. If the magnetic susceptibility is formally defined as 


$$\kappa = \left(\frac{\partial M}{\partial H}\right)_{H\to 0}, \tag{20.2}$$


then it turns out that its values reach orders of magnitude of 10⁵–10⁶.

As we have already stressed, the phenomenon of ferromagnetism can occur 
only in solid bodies. It turns out that the magnetization in single crystals 
possesses an important property of anisotropy. The magnetization has 
different values in different directions in the crystal. In the so-called direc- 
tions of easy magnetization it has a large value, while in other directions it 
has a smaller value for a given value of the strength of the magnetizing field. 

The ferromagnetic properties of matter depend strongly on the temper- 
ature. As the temperature increases the spontaneous magnetization decreases 
and at a certain temperature, which is characteristic of a given substance, it 
reduces to zero. This temperature is called the magnetic transition temperature
or the Curie point θ. At the Curie point the substance loses its ferromagnetic
properties. At temperatures lying above the corresponding temperature
θ all ferromagnetics become paramagnetics. At T > θ the paramagnetic
susceptibility depends on the temperature according to the law


$$\chi = \frac{\text{const}}{T-\theta},$$


which is called the Curie—Weiss law. 

Finally, the most important property of ferromagnetic bodies is the fol- 
lowing: even if it is homogeneous from the macroscopic point of view, a 
ferromagnetic body is usually divided into regions of spontaneous magneti- 
zation or domains. The size of the domains is very large in comparison with
the size of molecules and each domain consists of a very large number of
particles. Within the limits of each domain there is a magnetization different 
from zero. This means that the magnetic moments of all atoms within the 
domain are spontaneously oriented preferentially in one direction, forming 
a macroscopic region of spontaneous magnetization. 

If the body were not exposed to the action of a field, then the magnetiza- 
tion of individual domains in the entire body would, on the average, be 
balanced, and the vector M for the body as a whole would be equal to zero. 

The action of the external field amounts to a re-orientation of the 
magnetization vectors of individual domains such that a magnetization of the 
entire body different from zero is produced. When the field is removed the 
moments of a part of the domains preserve the preferential orientation, 
“remember” the action of the field, and additional work by a field is needed 
to reform a system of domains with a random orientation of magnetic 
moments. 

It should be emphasized that the existence of domains is not a hypothesis
but represents a phenomenon which has been investigated experimentally
in detail.

All the properties of ferromagnetic bodies mentioned above fit well into 
the framework of the thermodynamic theory of ferromagnetism described 
below. However, the basic problem, the essence of the phenomenon of 
spontaneous magnetization, cannot of course find any explanation within 
the framework of the macroscopic theory. 

The microscopic quantum theory of the phenomenon of spontaneous 
magnetization is described in Part VI of this book. 

Passing on to the discussion of the thermodynamic theory * we have, 
first of all, to write an explicit expression for the thermodynamic potential 
of the system — the ferromagnetic single crystal in a state of statistical 
equilibrium in an external field H. 

We shall consider the behaviour of ferromagnetism near the Curie point, 
when the magnetization is relatively small. 

Let G_0(p,T,M) be the thermodynamic potential of the body in the
absence of an external field. The basic equality for the thermodynamic
potential in a field reads


$$dG = dG_0 - \mathbf{H}\cdot d\mathbf{M}. \tag{20.3}$$


*In the subsequent exposition in this section we follow L.D.Landau and
E.M.Lifshitz, Electrodynamics of continuous media (Pergamon Press, Oxford, 1960).
Our potential G corresponds to the quantity Φ̃ + (H²/8π) in the book quoted.










The difference between this relation and the analogous expression (18.2) lies 
in the fact that the independent variable is the mean magnetic moment of the 
ferromagnetic body, M. Hence, integrating (20.3) with respect to M for a 
given external field, one can write 


$$G(p,T,\mathbf{M}) = G_0(p,T,\mathbf{M}) - \mathbf{M}\cdot\mathbf{H}. \tag{20.4}$$


The thermodynamic potential G_0 depends on the magnitude as well as on the
orientation of the vector M with respect to the axis of the single crystal.

The effect of anisotropy is relatively small. Hence G_0(p,T,M) can be
written in the form


$$G_0(p,T,\mathbf{M}) = G_1(p,T,M) + G_2(p,T,\mathbf{M}), \tag{20.5}$$


where the second term takes into account the effect of anisotropy. The
first, basic, term depends only on the absolute value of the magnetization
vector.

We shall confine ourselves to the simplest case where the crystal has only
one symmetry axis which is the axis of easy magnetization. We choose for
this the z-axis. For the magnetization along the z-axis, i.e. M_z = M, the
thermodynamic potential G_2 must have its smallest value. In this case the
dependence of G_2 on M can be written in the form


$$G_2(p,T,\mathbf{M}) = \text{const}\cdot(M_x^2+M_y^2) = \text{const}\cdot(M^2-M_z^2) = \beta M^2\sin^2\theta, \tag{20.6}$$


where a term proportional to M², which does not depend on the angle, can
always be included in G_1.
In the small quantity G_2 we have written out only the first terms of the
expansion in powers of M. Here β is a positive constant, and θ is the angle
between the z-axis and the vector M. The linear term in this expansion must
drop out. Indeed, the magnetic moment, which is proportional to the
velocity of the particles, changes sign under the substitution t → −t. The
thermodynamic potential of a system in an equilibrium state is obviously
invariant under this substitution. Hence the expansion must contain a
combination of the quadratic terms M_x², M_y², M_z². This combination must
be chosen in such a way that G_2 should have a minimum for the magnetization
along the z-axis. Since the z-axis is the symmetry axis, the components
M_x² and M_y² must figure symmetrically in the expansion. The expression
(20.6) satisfies all these requirements. Since β > 0, the z-axis is indeed
the axis of easy magnetization. The potential G_2 has its minimum value
(G_2 = 0) if the vector M is directed along the z-axis (θ = 0).


The total thermodynamic potential in the external field has the form
$$G = G_1 - \mathbf{M}\cdot\mathbf{H} + \beta M^2\sin^2\theta. \tag{20.7}$$


This formula shows that the magnetization M will be oriented by the external
field on the one hand, and by the natural direction of easy magnetization
on the other hand. As a result, the vector M will set at a certain angle θ_min
with respect to the z-axis for which the thermodynamic potential has a
minimum. If the plane in which H and the axis of easy magnetization lie is
chosen as the (xz)-plane, then the vector M will obviously lie in this plane.
We write (20.7) in components


$$G = G_1 - MH_x\sin\theta - MH_z\cos\theta + \beta M^2\sin^2\theta.$$


We shall find the equilibrium orientation, determined by the condition of the
minimum of G, from the condition


$$\partial G/\partial\theta = 2\beta M^2\sin\theta\cos\theta - MH_x\cos\theta + MH_z\sin\theta = 0.$$
Hence we easily obtain
$$(2\beta M\sin\theta - H_x)^2\,(1-\sin^2\theta) = H_z^2\sin^2\theta. \tag{20.8}$$


This is an equation of the fourth degree with respect to the quantity sin θ.
It has either two or four real roots, depending on the values of H_x and H_z
(for given M and β).

In the first case one value of the root corresponds to the angle θ_min for
which G has its minimum value. This is the equilibrium orientation of M.
The second value of the root leads to the maximum value of G, i.e. to a
thermodynamically unstable orientation of the magnetization.

In the second case there are two minimum and two maximum values of
the angle θ.

One of these minima, θ_min, leads to the smallest value of the thermodynamic
potential. The corresponding orientation of the vector M is the
equilibrium orientation. The second minimum corresponds to a metastable
state of the crystal. For this orientation the thermodynamic potential is
smaller than for any neighbouring orientations but larger than that for the
orientation at the angle θ_min.

The existence of metastable states allows one to understand qualitatively 
the origin of residual magnetization and hysteresis. 
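
The distinction between the two cases can be checked numerically. The following is a minimal sketch, not part of the original exposition: it takes the angle-dependent part of (20.7) divided by βM², with the reduced fields hx = H_x/βM and hz = H_z/βM introduced here for convenience, and counts the local minima on a grid of angles.

```python
import math

def reduced_potential(theta, hx, hz):
    """(G - G_1) / (beta M^2) from (20.7); hx = H_x/(beta M), hz = H_z/(beta M)."""
    return math.sin(theta) ** 2 - hx * math.sin(theta) - hz * math.cos(theta)

def count_minima(hx, hz, n=3600):
    """Count local minima of the reduced potential on a uniform grid over [0, 2*pi)."""
    g = [reduced_potential(2 * math.pi * i / n, hx, hz) for i in range(n)]
    # g[i - 1] wraps around for i = 0, so the grid is treated as periodic.
    return sum(1 for i in range(n) if g[i] < g[i - 1] and g[i] < g[(i + 1) % n])

print(count_minima(0.2, 0.1))  # weak field
print(count_minima(0.0, 5.0))  # strong field along the easy axis
```

For the weak field the count is two (a stable and a metastable orientation), while for the strong field only one minimum survives, which is the situation in which no residual orientation can be retained.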







The equilibrium state corresponds to a total magnetization of the crystal 
equal to zero in the absence of an external field. On the contrary, in the 
metastable state the magnetic moment of the body as a whole is different 
from zero in the absence of a field. If in the process of magnetization the 
crystal is brought into the metastable state with a certain total magnetization
∫ M(H) dV and thereupon the external field is removed, then the system
will remain for a very long time (practically indefinitely) in the state with the
total magnetization ∫ M(H→0) dV. From formula (20.8) there also follows,
in principle, the possibility of the existence of domains. Assume that the
external field is oriented perpendicularly to the axis of easy magnetization,
i.e. H_z = 0. Then (20.8) goes over into the equation


$$2\beta M\sin\theta - H = 0. \tag{20.9}$$


If H < 2βM, then (20.9) has two solutions


$$\theta_1 = \arcsin\frac{H}{2\beta M}, \qquad \theta_2 = \pi - \arcsin\frac{H}{2\beta M}.$$
To these solutions there corresponds one and the same minimum value of the
thermodynamic potential, so that both of these are equilibrium solutions.
However, the components of the magnetization along the axis of easy
magnetization will in these cases be respectively


$$M_z^{(1)} = M\cos\theta_1, \qquad M_z^{(2)} = M\cos\theta_2 = -M\cos\theta_1 = -M_z^{(1)}. \tag{20.10}$$


Thus, in the equilibrium state two opposite orientations of the magnetization
component along the axis of easy magnetization are possible.

If the entire crystal is divided into alternating regions with magnetization
components M_z^{(1)} and M_z^{(2)}, then its thermodynamic potential will be a
minimum and its state will be an equilibrium state.

We see that from the formal thermodynamic theory there follows the 
possibility of the existence of residual magnetization (and hysteresis) and 
the domain structure of the crystal. 
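
The degeneracy that underlies the domain structure can be verified in a few lines. This sketch assumes, as in (20.7), that θ is measured from the axis of easy magnetization, so that the component of M along that axis is M cos θ; the reduced field h = H/βM and the value 0.8 are arbitrary choices for illustration.

```python
import math

def reduced_potential(theta, h):
    """(G - G_1)/(beta M^2) for a field perpendicular to the easy axis, h = H/(beta M)."""
    return math.sin(theta) ** 2 - h * math.sin(theta)

h = 0.8                       # any value below 2 satisfies H < 2 beta M
theta_1 = math.asin(h / 2)    # first root of (20.9)
theta_2 = math.pi - theta_1   # second root, with the same sin(theta)
same_energy = math.isclose(reduced_potential(theta_1, h), reduced_potential(theta_2, h))
mz_1, mz_2 = math.cos(theta_1), math.cos(theta_2)   # M_z / M in the two states
print(same_energy, mz_1, mz_2)
```

The two orientations have the same potential and the same component along the field, but opposite components along the easy axis, which is what allows alternating regions to coexist in equilibrium.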

The actual realization, magnitude and form of the domains depend on 
many factors. 

For details we refer the reader to the book of L.D.Landau and E.M. Lif- 
shitz and to the specialized literature of this field. 

It goes without saying that the thermodynamic theory does not give and 
cannot give any answer to the basic question of the nature of spontaneous 
magnetization. The existence of the anisotropy energy is also postulated in 
this theory. 




Let us now consider the dependence of the magnetic properties of ferro- 
magnetics on the temperature near the Curie point. At the Curie point the 
ferromagnetic and paramagnetic states of the substance are in statistical 
equilibrium. This equilibrium is a phase equilibrium, and each of the states 
represents a phase, respectively the ferromagnetic phase and the paramag- 
netic phase. 

The thermodynamic potential of the ferromagnetic phase is given by 
formula (20.5). Near the Curie point one can neglect the small term G_2 and
expand G_1 in a series in powers of a small quantity (the magnetization M)
which reduces to zero at the point T = θ. Then we have


$$G_{\text{ferro}}(p,T,M) = G_0(p,T) + aM^2 + bM^4 - \mathbf{M}\cdot\mathbf{H}, \tag{20.11}$$


to terms of the fourth order of a small quantity. 

The odd powers of M drop out of the expansion: under the substitution
t → −t, M changes sign whereas G does not.

At the Curie point T = θ the coefficient a reduces to zero. Near the Curie
point one can write


$$a \sim (T-\theta), \tag{20.12}$$
so that a> 0 above the Curie point and a < 0 below the Curie point. Assume 
that b > 0. As will be seen from what follows, for such a choice of the signs 
of a and b the thermodynamic theory well describes the ferromagnetic— 
paramagnetic phase transition. 

The equilibrium condition reads

$$\partial G/\partial M = 2aM + 4bM^3 - H = 0. \tag{20.13}$$
If there is no magnetic field, then in the equilibrium state either

$$M = 0, \tag{20.14}$$
or

$$a + 2bM^2 = 0. \tag{20.15}$$
Above the Curie point a > 0 and the condition of the minimum of the
thermodynamic potential is the equality (20.14). Below the Curie point the
equality (20.15) can be satisfied (since here a < 0) and

$$M^2 = -\frac{a}{2b}.$$



At the Curie point M reduces to zero and the thermodynamic potentials of 
the ferromagnetic and paramagnetic states are equal to each other. As is 
easily seen, the entropy does not change in the phase transition. Indeed, 
above the Curie point 


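$$S_{\text{para}} = -\left(\frac{\partial G_0}{\partial T}\right)_p.$$


Below the Curie point, by virtue of the relations (20.12) and (20.15), we
have
$$S_{\text{ferro}} = -\left(\frac{\partial G}{\partial T}\right)_p = -\left(\frac{\partial G_0}{\partial T}\right)_p - M^2\,\frac{\partial a}{\partial T} = -\left(\frac{\partial G_0}{\partial T}\right)_p - \text{const}\cdot(\theta-T) = S_{\text{para}} - \text{const}\cdot(\theta-T).$$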


Hence it is seen that at the Curie point, for T = θ, the entropy remains
continuous. However, the heat capacity of the system undergoes a jump.
Indeed,


$$(C_p)_{\text{ferro}} = T\,\frac{\partial S_{\text{ferro}}}{\partial T} = \text{const}\cdot T + (C_p)_{\text{para}}.$$


The jump of the heat capacity for T = θ is
$$\Delta C_p = \text{const}\cdot\theta,$$


the heat capacity of the ferromagnetic phase being higher than that of the 
paramagnetic phase (for b>0), which is in complete agreement with experi- 
ment. The phase transition considered is a typical example of a phase transi- 
tion of the second kind (see §63 of Part III). 
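
The continuity of the entropy and the jump of the heat capacity can be illustrated numerically. The sketch below assumes the explicit form a = α(T − θ) with a constant b, in which case the constant in the formulas above equals α²/2b; the values of α, b and θ are hypothetical and serve only as an example.

```python
import math

ALPHA, B, THETA = 1.0, 1.0, 100.0   # hypothetical values: a = ALPHA*(T - THETA), b = B

def equilibrium_magnetization(T):
    """Spontaneous M from (20.14)/(20.15): zero above THETA, sqrt(-a/2b) below."""
    a = ALPHA * (T - THETA)
    return math.sqrt(-a / (2 * B)) if a < 0 else 0.0

def excess_potential(T):
    """G - G_0 = a M^2 + b M^4 at equilibrium with H = 0."""
    a, M = ALPHA * (T - THETA), equilibrium_magnetization(T)
    return a * M**2 + B * M**4

def excess_entropy(T, dT=1e-4):
    """S_ferro - S_para = -d(G - G_0)/dT, by a central difference."""
    return -(excess_potential(T + dT) - excess_potential(T - dT)) / (2 * dT)

def excess_heat_capacity(T, dT=1e-3):
    """C_ferro - C_para = T d(S_ferro - S_para)/dT, by a central difference."""
    return T * (excess_entropy(T + dT) - excess_entropy(T - dT)) / (2 * dT)

print(excess_entropy(THETA - 1e-3), excess_heat_capacity(THETA - 1.0))
```

The excess entropy tends to zero as T approaches θ from below, whereas the excess heat capacity stays finite there and vanishes above θ, which is the discontinuity described in the text.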

For H ≠ 0 one can find the paramagnetic susceptibility from the condition
of equilibrium. Namely, for H ≠ 0 the condition (20.13) instead of (20.14)
must be fulfilled. Differentiating it with respect to H, we find


$$\frac{\partial}{\partial H}\left(\frac{\partial G}{\partial M}\right) = \frac{\partial M}{\partial H}\,(2a + 12bM^2) - 1 = 0,$$


whence


$$\frac{\partial M}{\partial H} = \frac{1}{2a + 12bM^2}.$$


For T > θ, M = 0, the paramagnetic susceptibility, in accordance with the
Curie–Weiss law, is equal to


$$\chi = \frac{1}{2a} \sim \frac{1}{T-\theta}. \tag{20.16}$$
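
As a check of the weak-field limit, one can solve the equilibrium condition (20.13) directly for a small field and compare M/H with 1/2a. The sketch below uses a simple bisection; the numerical values of a, b and H are hypothetical, chosen only so that a > 0 (a temperature above the Curie point).

```python
def magnetization_in_field(a, b, H, tol=1e-14):
    """Solve the equilibrium condition (20.13), 2aM + 4bM^3 = H, for a > 0 by bisection."""
    lo, hi = 0.0, H / (2 * a)      # for a, b, H > 0 the root lies in [0, H/2a]
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if 2 * a * mid + 4 * b * mid**3 < H:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

a, b, H = 0.5, 1.0, 1e-4           # hypothetical values with a > 0
chi_numeric = magnetization_in_field(a, b, H) / H
chi_formula = 1 / (2 * a)          # weak-field limit, formula (20.16)
print(chi_numeric, chi_formula)
```

The two values agree up to a correction of relative order bH²/a³, which is the contribution of the cubic term neglected in the limit M → 0.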


In conclusion we shall dwell briefly on the calculation of electromagnetic 
fields in the presence of ferromagnetics. The calculation of such fields is 
particularly important in technology, where ferromagnetic materials are 
widely used. 

At first sight it may seem that the non-linear relation between B and H 
makes the calculation of the magnetic field very difficult. In practice, 
however, this is not so. As a matter of fact, the calculation of the field 
distribution outside the ferromagnetic requires the definition of the boundary 
condition at its surface. The use of accurate boundary conditions would 
associate the external problem with the field distribution inside the ferro- 
magnetic, i.e. would make the problem very complex. However, if the
ferromagnetic is formally considered as a medium with a magnetic susceptibility
(20.2) and a magnetic permeability μ = 1 + 4πκ then one obtains for μ values
which are higher by 5–6 orders than those for the external medium. Hence
one can assume with a high degree of accuracy that μ → ∞, and the boundary
condition (5.5) at the surface of the ferromagnetic assumes the simple form


$$\mathbf{H}\times\mathbf{n} = 0,$$


i.e. the lines of the magnetic field are normal to the surface. A ferromagnetic 
in a constant magnetic field turns out to be similar to a conductor in a 
constant electrostatic field. The calculation of the magnetic field inside ferro- 
magnetic bodies turns out, as a rule, to be a very difficult problem. 


§21. Superconductivity 


As far back as 1911 Kamerlingh-Onnes established that the temperature
dependence of the resistance of mercury differs fundamentally from that of
normal metals.













As the temperature decreases the resistance ceases to depend on the tem- 
perature as in the case of normal metals, and its value is determined by the 
impurities which are present in the sample. However, when the temperature 
is reduced to T = 4.1 K the resistance of the metal falls abruptly (fig. IV.10)
to zero. This phenomenon, the discontinuous vanishing of the resistance, is 
called the transition of the metal into the superconducting state, or briefly, 
the appearance of superconductivity. The temperature of this transition is 
called the superconducting transition temperature. 

It is now established that superconductivity is a relatively widespread 
phenomenon. It is known that there are 23 metals which can go into the 
superconducting state. Superconductivity is observed also for a large number 
of alloys. 

There is no doubt that the reduction of the resistance to zero corresponds 
to the transition of the metal into a new state. The resistance of all metals in 
the superconducting state is no more than 10⁻¹⁰ of the resistance immediately
before the transition. Current in a ring made of a superconducting material 
circulates for an indefinitely long time without any sign of weakening. Hence 
it should be recognized that the resistance of superconductors is not simply 
very small but is exactly equal to zero. Electrons in superconductors can 
move without any hindrance. 

[Figure: resistance versus temperature T (K), with the transition marked at 4.1 K]

Fig. IV.10

Initially it was assumed that the metal in the superconducting state is an
ideal conductor, i.e. a body with an infinitely large conductivity. According
to Ohm's law, for the ideal conductor (i.e. a body with σ → ∞) a finite value
of the current density j corresponds to a field strength inside the conductor
equal to zero:


E=0. (21.1) 


The absence of an electric field inside the superconductor implies definite 
magnetic properties. That is, from the Maxwell equation 
$$\nabla\times\mathbf{E} = -\frac{1}{c}\,\frac{\partial\mathbf{B}}{\partial t}$$
and from the fact that the field E is equal to zero it follows that the mag- 
netic induction in the superconductor has a constant value 


B= const. (21.2) 
The value of this constant is equal to the value of the magnetic induction in 
the superconductor at the moment of the transition into the superconducting 
state. 

It turns out, however, that this inference is contrary to experiment. If a 
metal cylinder is placed in a magnetic field which is perpendicular to the axis 
of the cylinder, and if it is cooled below the transition temperature, then one 
can infer the character of the field inside the superconductor by its distri- 
bution near the sample. It turns out that the lines of the magnetic induction 
are pushed out of the superconductor (fig. IV.11). The induction inside the
superconductor is equal to zero: 


B=0. (21.3) 
































Fig. IV.11


From the boundary condition (5.5) the normal component of the external
magnetic field (H_e)_n (equal to the induction: (H_e)_n = (B_e)_n) is also equal to
zero at the surface of the superconductor. In other words, the external
magnetic field is tangential to a body which is in the superconducting state.

Eq. (1.16) shows that, if there is no magnetic induction in the body, then 
in the absence of an alternating electric field the total mean current in the 
volume of the body is zero: 


$$\overline{\rho\mathbf{v}} = 0. \tag{21.4}$$

Thus inside the superconductor the total current density is zero. This means
that in the surface layer of the superconductor a surface current circulates
having a value such that the magnetic induction in the body is reduced to
zero. In other words, the mean field of the surface currents compensates for
the external magnetic field applied to the superconductor.

From the condition B = 0 it follows that inside the superconductor the
equations

$$\nabla\times\mathbf{E} = 0, \qquad \nabla\cdot(\varepsilon\mathbf{E}) = 0$$
hold, so that the electric field inside the superconductor is zero. The surface 
current in the superconductor is not associated with the action of the electric 
field. To determine the surface current density use can be made of the 
boundary condition (5.3). Namely, since there is no magnetic field in the
superconductor, then (5.3) gives directly (for H^(e) = B^(e))


$$\mathbf{j}_s = \frac{c}{4\pi}\,\mathbf{n}\times\mathbf{B}^{(e)},$$


where B^(e) is the magnetic induction vector outside the superconductor. If
the body is simply connected, then the sum of the surface currents is equal 
to zero. On the other hand, in the case of a multiply connected body, for 
example a superconducting ring, the total current can be different from zero. 
The appearance of current in a superconductor may not be associated with 
the action of an impressed e.m.f. but can be produced by the action of a 
magnetic field. A surface current excited in a superconductor can circulate 
for an indefinitely long time without any weakening. 

Experiment has shown that, if the magnetic field strength in the space 
which surrounds the superconductor exceeds a certain critical value H_cr,
then the superconductivity in the sample vanishes. The destruction of the 
superconducting state and the appearance of a resistance take place in a 
discontinuous way. 




The critical value of the magnetic field strength turns out to be a function 
of the temperature, which is given approximately by the empirical formula 


$$H_{\text{cr}} = \text{const}\cdot(T_{\text{cr}}^2 - T^2).$$


For T = T_cr, H_cr = 0, i.e. any field destroys the superconductivity at T = T_cr.

The consideration of the superconductor as a metal in a special super- 
conducting state allows one to draw certain conclusions about the character 
of the transition into this state. Namely, the superconducting state should be 
considered as a particular phase of the substance, and the transition of the 
metal from the normal state into the superconducting state as a phase 
transition. 

Let us consider the normal-state to superconducting-state phase transition 
occurring in an external field. In the superconducting state the mean total 
current pV is equal to zero and it is impossible to separate from it the part 
which corresponds to the magnetic moment (as has been done in §3). Never- 
theless, we introduce formally the magnetic field strength H; and the mag- 
netic moment M of the superconductor by the relation 


$$ \mathbf{B}_i = 0 = \mathbf{H}_i + 4\pi\mathbf{M}. $$


One can easily connect the field $\mathbf{H}_i$ with the external magnetic field $\mathbf{H}_e$,
if one considers a long cylindrical superconductor whose axis is directed along
the field. Then, by virtue of the continuity of the tangential field component
(which is in this case the same as the total field),


$$ \mathbf{H}_i = \mathbf{H}_e $$

and

$$ \mathbf{M} = -\mathbf{H}_e/4\pi. \qquad (21.5) $$

We note that the equality (21.5) corresponds to the value $\kappa = -1/4\pi$.


Thus, the superconducting state corresponds to the magnetic permeability
$\mu = 1 + 4\pi\kappa = 0$, and the superconductor is ideally diamagnetic.

According to (18.2) the thermodynamic potential of a body in the super- 
conducting state can be written in the form 


$$ G_s(p,T,H) = G_s(p,T) - V\int_0^H \mathbf{M}(H)\cdot d\mathbf{H}, \qquad (21.6) $$







where $G_s(p,T)$ is the potential of the body in the superconducting state in
the absence of an external magnetic field. 

Substituting into (21.6) the value of M(H) according to formula (21.5), 
we obtain 


$$ G_s(p,T,H) = G_s(p,T) + \frac{V H^2}{8\pi}. \qquad (21.7) $$


When the external field reaches the critical value $H_{cr}$, the superconducting
state is destroyed. In this case the thermodynamic potential of the superconducting
state increases so much, on account of the second term of (21.7),
that it turns out to be equal to the thermodynamic potential of the normal state
$G_n(p,T)$. At the transition point the following relation is fulfilled:


$$ G_s(p,T) + \frac{V H_{cr}^2}{8\pi} = G_n(p,T). \qquad (21.8) $$





In the expression for the thermodynamic potential $G_n$ we shall omit the
additional term associated with the magnetic field, since it is very small for 
ordinary diamagnetic and paramagnetic metals. 

Differentiating eq. (21.8) with respect to the temperature and making use 
of the definition of entropy (29.10) of Part III, we find 








$$ S_s - S_n = \frac{V H_{cr}}{4\pi}\frac{\partial H_{cr}}{\partial T}, \qquad (21.9) $$


where $S_n$ and $S_s$ are the entropies of the normal and superconducting states
at the temperature $T$.

We see that, if $H_{cr} \neq 0$, i.e. if the transition occurs at $T < T_{cr}$, the entropy
of the system changes in a discontinuous way. The phase transition is
accompanied by the release of latent heat


$$ Q = T\,\Delta S = \frac{T V H_{cr}}{4\pi}\frac{\partial H_{cr}}{\partial T}. \qquad (21.10) $$

Experiment shows that in the transition from the normal state into the 
superconducting state heat is always released. 
If the phase transition into the superconducting state occurs without field 
(i.e. if $H_{cr} = 0$), then


$$ S_s = S_n $$

and the latent heat of transition is absent. 
We can easily find the discontinuity of the heat capacity if we differentiate 
(21.9) with respect to the temperature. We then have 


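$$ \left(\frac{\partial S_s}{\partial T}\right)_V - \left(\frac{\partial S_n}{\partial T}\right)_V = \frac{V}{4\pi}\left(\frac{\partial H_{cr}}{\partial T}\right)^2 + \frac{V}{4\pi}\,H_{cr}\frac{\partial^2 H_{cr}}{\partial T^2}. $$

In the absence of an external field we find a simple expression for the jump
of the heat capacity in the transition from the normal state into the superconducting
state:


$$ \Delta C = C_s - C_n = \frac{T_{cr} V}{4\pi}\left(\frac{\partial H_{cr}}{\partial T}\right)^2_{T = T_{cr}}. $$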
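
The thermodynamic relations of this section can be checked numerically. The sketch below assumes the parabolic law $H_{cr} = H_0(1 - T^2/T_{cr}^2)$ and evaluates the entropy difference $S_s - S_n = (V H_{cr}/4\pi)\,\partial H_{cr}/\partial T$, the latent heat $Q = T\,\Delta S$ and the jump of the heat capacity at $T_{cr}$. The function names and the sample numbers are assumptions made for illustration; Gaussian units are implied.

```python
import math


def h_cr(T, T_cr, H0):
    """Parabolic critical field (assumed empirical law)."""
    return H0 * (1.0 - (T / T_cr) ** 2)


def dh_cr_dT(T, T_cr, H0):
    """Temperature derivative of the parabolic critical field."""
    return -2.0 * H0 * T / T_cr ** 2


def entropy_jump(T, T_cr, H0, V):
    """S_s - S_n = V*H_cr/(4*pi) * dH_cr/dT; negative below T_cr."""
    return V * h_cr(T, T_cr, H0) * dh_cr_dT(T, T_cr, H0) / (4.0 * math.pi)


def latent_heat(T, T_cr, H0, V):
    """Q = T*(S_s - S_n); a negative value means that heat is released."""
    return T * entropy_jump(T, T_cr, H0, V)


def heat_capacity_jump(T_cr, H0, V):
    """C_s - C_n at T = T_cr, where H_cr = 0."""
    return T_cr * V * dh_cr_dT(T_cr, T_cr, H0) ** 2 / (4.0 * math.pi)


# Assumed sample values: T_cr in K, H0 in Oe, V in cm^3.
T_CR, H0, V = 7.2, 800.0, 1.0
print(entropy_jump(3.0, T_CR, H0, V), latent_heat(3.0, T_CR, H0, V))
print(heat_capacity_jump(T_CR, H0, V))
```

At $T = T_{cr}$ the entropy difference and the latent heat vanish while the heat capacity jump stays finite, which is the signature of a phase transition of the second kind.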


Thus, the phase transition considered represents a phase transition of the 
second kind. The derived macroscopic (thermodynamic) relations and a 
number of other relations which we shall not touch upon * are in good agree- 
ment with experimental data. However, although the thermodynamic theory 
of superconductivity successfully describes a number of properties of super- 
conductors, it leaves open the basic problem: the problem of the physical 
nature of superconductivity. 

Until recently all attempts to create a microscopic theory of super- 
conductivity in which the superconductivity would be associated with the 
properties of the electron gas in the metal failed. They were only crowned 
with success in 1958. The contemporary theory of superconductivity will 
be expounded in Part V. 

In the development of the theory of superconductivity an important role 
was played by experimental data which showed that there was a connection 
between the phenomenon of superconductivity and the character of inter- 
action of the electrons of the superconductor with its crystal lattice. This 
was the so-called isotope effect, which was discovered in 1950 by Maxwell, 
Reynolds and others. It was discovered that the critical temperature $T_{cr}$
depends on the isotope of a given element of which the lattice is made up,
i.e. on the mass of the ions of the lattice. The relation between the mass $M_{ion}$








* See L.D.Landau and E.M.Lifshitz, Electrodynamics of continuous media (Pergamon 
Press, Oxford, 1960). 




of the ions of the lattice and the critical temperature of the transition is 
given by the empirical formula 


$$ M_{ion}^{1/2}\,T_{cr} = \mathrm{const}, \qquad (21.13) $$
where the constant has a definite value for each element. 
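
As a small illustration of the isotope relation $M_{ion}^{1/2}\,T_{cr} = \mathrm{const}$, the sketch below rescales a critical temperature from one isotope mass to another. The function name and the numbers are hypothetical and serve only to show the inverse square-root dependence.

```python
import math


def scaled_critical_temperature(T_cr_ref, M_ref, M_new):
    """Critical temperature for ion mass M_new, given T_cr_ref at mass M_ref,
    from M**0.5 * T_cr = const (the constant is specific to the element)."""
    return T_cr_ref * math.sqrt(M_ref / M_new)


# Hypothetical numbers: a heavier isotope has a lower critical temperature.
print(scaled_critical_temperature(4.0, 100.0, 400.0))
```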

In conclusion we point to the analogy existing between the superconduct- 
ing state of a metal and the superfluid state of liquid helium II. The super- 
conductivity can be considered as the superfluidity of the electron gas which 
can move without hindrance in the crystal lattice of the metal (see Part V). 











Quasistationary Electromagnetic Fields 


§22. Conditions of quasistationarity 


We now go on to the study of electromagnetic fields changing in time. It 
turns out that there is a wide frequency range for which Maxwell’s equations 
allow an essential simplification. 

In §19 of Part I we considered the case of a slow movement of charges 
where it was possible to completely disregard the retardation in the system 
and to assume the electromagnetic field to propagate with an infinitely large 
velocity. 

In the results of §26 of Part I the condition for disregarding the retarda- 
tion reduced to the requirement 


$$ T \gg \tau = L/c, \qquad (22.1) $$


where $T$ is the period of motion in the system, $\tau$ is the delay time, and $L$ is
the geometrical extent of the region in which the electromagnetic perturba- 
tions are considered. 

In macroscopic electrodynamics there is a number of important problems 
in the treatment of which one can assume condition (22.1) to be fulfilled. In 
the approximation (22.1), the velocity of propagation of electromagnetic 




disturbances can be assumed to be infinitely large within the limits of the 
system considered. In this case the values of the fields at a given point will be 
in phase with those at any other point inside the system. It is clear that the 
possibility of neglecting the retardation in the system essentially simplifies 
the study of the corresponding electromagnetic fields. 

In Part I we have called electromagnetic fields in which the phenomenon 
of retardation can be disregarded quasistationary fields. In studying electromagnetic
phenomena in matter, quasistationary fields must, in addition to
the requirement (22.1), which we shall call the first condition of quasistationarity,
satisfy two more restrictions which we are going to derive.

The condition (22.1) obviously imposes a restriction upon the period T of 
the variation of the electromagnetic field (or upon the frequency $\omega = 2\pi/T$).
That is, the periods T must be sufficiently large (and the frequencies suffi- 
ciently low) for a given size of system. 

For relatively low frequencies of the electromagnetic fields in conductors 
the following condition is always fulfilled: 


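$$ \frac{4\pi}{c}\,\mathbf{j} \gg \frac{1}{c}\frac{\partial\mathbf{D}}{\partial t}. \qquad (22.2) $$


As a matter of fact, it can be rewritten in the form


$$ \sigma E \gg \varepsilon\frac{\partial E}{\partial t} = \varepsilon\omega E, $$


or

$$ T \gg \varepsilon/\sigma. \qquad (22.3) $$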


When the inequality (22.2), which should be called the second condition 
of quasistationarity, is satisfied in the region of space occupied by the con- 
ductors one can disregard the displacement current in comparison with the 
conduction current. 

The third condition of quasistationarity is the requirement that the 
quantities characterizing the properties of matter ($\sigma$, $\varepsilon$ and $\mu$) should have
the same value as in constant fields. We shall see in Part VI that the requirement
about $\sigma$ reduces to the need for the period $T$ to be substantially longer
than the mean time between collisions of electrons in the metal:


$$ T \gg \lambda/v, \qquad (22.4) $$


where $\lambda$ is the mean free path, and $v$ is the mean velocity of electrons in the
metal. For smaller values of $T$ the field manages to change its value appreciably
between collisions, which obviously has an effect on the free path and,
in the end, on the value of $\sigma$.

We shall consider later the dependence of $\varepsilon$ on the frequency of the
external field. 

Numerical estimates show that conditions of quasistationarity for ordinary 
macroscopic systems containing metals as conductors are satisfied up to 
frequencies lying in the infrared part of the spectrum. The whole set of 
conditions determining quasistationary fields turns out to be fulfilled for a 
wide range of phenomena which have the common name “alternating 
currents”. Alternating currents or low-frequency currents have a very wide 
application in technology and laboratory practice. This determines the 
practical importance of the theory of quasistationary processes. 
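
The three conditions $T \gg L/c$, $T \gg \varepsilon/\sigma$ and $T \gg \lambda/v$ lend themselves to a quick numerical estimate. The sketch below is an illustration only: the function names, the threshold `factor` standing for "much greater than", and the copper-like material constants (Gaussian units) are rough order-of-magnitude assumptions, not values given in the text.

```python
C = 2.998e10  # velocity of light, cm/s


def quasistationary_margins(period, size, sigma, eps, mean_free_path, v_electron):
    """Ratios of the period T to the three characteristic times.
    Each ratio must be much greater than 1 for the field to be quasistationary."""
    return {
        "retardation": period / (size / C),                     # T / (L/c)
        "displacement": period / (eps / sigma),                 # T / (eps/sigma)
        "collisions": period / (mean_free_path / v_electron),   # T / (lambda/v)
    }


def is_quasistationary(margins, factor=100.0):
    """'Much greater than' is taken, by assumption, as at least `factor` times."""
    return all(ratio >= factor for ratio in margins.values())


# Assumed copper-like constants: sigma ~ 5e17 1/s, eps ~ 1,
# mean free path ~ 4e-6 cm, electron velocity ~ 1.6e8 cm/s.
mains = quasistationary_margins(period=0.02, size=1.0e5, sigma=5.0e17, eps=1.0,
                                mean_free_path=4.0e-6, v_electron=1.6e8)
optical = quasistationary_margins(period=1.0e-15, size=1.0e5, sigma=5.0e17, eps=1.0,
                                  mean_free_path=4.0e-6, v_electron=1.6e8)
print(is_quasistationary(mains), is_quasistationary(optical))
```

With these assumed numbers a 50 Hz current in a circuit one kilometre in extent satisfies all three conditions, whereas a field with an optical period does not.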

The equations of the quasistationary electromagnetic field have the form 


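$$ \nabla\times\mathbf{H} = \frac{4\pi}{c}\,\mathbf{j}, \qquad \nabla\cdot\mathbf{B} = 0, \qquad (22.5) $$

$$ \nabla\times\mathbf{E} = -\frac{1}{c}\frac{\partial\mathbf{B}}{\partial t}, \qquad \nabla\cdot\mathbf{D} = 4\pi\rho. \qquad (22.6) $$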

In the equation for the curl of the magnetic field, on the basis of (22.2) we 


have dropped the term expressing the displacement current. The constitutive 
relations are: 


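$$ \mathbf{B} = \mu\mathbf{H}; \qquad \mathbf{D} = \varepsilon\mathbf{E}; \qquad \mathbf{j} = \sigma(\mathbf{E} + \mathbf{E}^{imp}). \qquad (22.7) $$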


The equation of continuity for quasistationary fields can be written in the 
form 


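$$ \frac{\partial\rho}{\partial t} + \nabla\cdot\mathbf{j} = \nabla\cdot\left(\frac{1}{4\pi}\frac{\partial\mathbf{D}}{\partial t} + \mathbf{j}\right) \approx \nabla\cdot\mathbf{j} = 0. \qquad (22.8) $$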


Thus, the difference between the equations of the quasistationary field 
and those of the field of stationary currents reduces only to taking account 




of the phenomena of electromagnetic induction. It will be shown below that 
the possibility of disregarding the retardation in the system allows one to 
obtain the equations of the electromagnetic field for the case of a system of 
linear conductors in the form of equations in total derivatives, and not partial
derivatives, with constant coefficients. 

For this purpose it is necessary to go over to Maxwell’s equations in 
integral form. 


§23. The law of induction in moving conductors and media 


First of all it is necessary to find the flux of induction appearing in Max- 
well’s equations written in integral form. For this we have to generalize the 
notion of the flux of induction to the case of moving current loops. 

Let a current loop be moving in a magnetic field. We assume the velocity 
of motion to be constant in space and small in comparison with the velocity 
of light c. Let us find the change in the flux of induction through a conduct- 


ing loop. 
We have, obviously, 
$$ \frac{d\Phi}{dt} = \frac{d}{dt}\int\mathbf{B}\cdot d\mathbf{S}. $$


By definition 
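$$ \frac{d}{dt}\int\mathbf{B}\cdot d\mathbf{S} = \lim_{\Delta t\to 0}\frac{1}{\Delta t}\left[\int_{S_2}\mathbf{B}(t+\Delta t)\cdot d\mathbf{S} - \int_{S_1}\mathbf{B}(t)\cdot d\mathbf{S}\right], \qquad (23.1) $$


where $\mathbf{B}(t+\Delta t)$ is the induction vector taken at the instant $t+\Delta t$, and $S_2$ is the
surface into which the surface $S_1$ goes over at the instant $t+\Delta t$. The vectors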
of the normal to the two surfaces are assumed to be oriented in one direction. 

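We apply the Gauss-Ostrogradsky theorem to a closed volume (fig. IV.12)
bounded by the surfaces $S_1$ and $S_2$ and a lateral surface $\Sigma$ which is formed when
the loop is displaced from the position $S_1$ to the position $S_2$. Since $\nabla\cdot\mathbf{B} = 0$,
i.e. $\mathbf{B}$ is a vector having no sources, one can write


$$ \oint\mathbf{B}(t+\Delta t)\cdot d\mathbf{S} = \int_{S_2}\mathbf{B}(t+\Delta t)\cdot d\mathbf{S} + \int_{\Sigma}\mathbf{B}(t+\Delta t)\cdot d\mathbf{\Sigma} - \int_{S_1}\mathbf{B}(t+\Delta t)\cdot d\mathbf{S} = 0. \qquad (23.2) $$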


Fig. IV.12 (labels: $\mathbf{v}\,\Delta t$, $d\mathbf{l}$)


Here the minus sign is associated with our choice of the orientation of the 


normal vector. For the lateral surface $\Sigma$ one can, obviously, write (see
fig. IV.12)


$$ d\mathbf{\Sigma} = (d\mathbf{l}\times\mathbf{v})\,\Delta t, $$


where v is the velocity of motion of the current loop, and dl is an element 
of its length. Hence 


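$$ \int_{\Sigma}\mathbf{B}\cdot d\mathbf{\Sigma} = \oint\mathbf{B}\cdot(d\mathbf{l}\times\mathbf{v})\,\Delta t = \Delta t\oint(\mathbf{v}\times\mathbf{B})\cdot d\mathbf{l}, \qquad (23.3) $$


where the integral is taken with respect to the curve bounding the surface
$S_1$ (i.e. with respect to the current loop). Further, to an accuracy of second
order infinitesimal quantities, one can write


$$ \int_{S_1}\mathbf{B}(t+\Delta t)\cdot d\mathbf{S} \approx \int_{S_1}\mathbf{B}(t)\cdot d\mathbf{S} + \Delta t\int_{S_1}\frac{\partial\mathbf{B}}{\partial t}\cdot d\mathbf{S}. \qquad (23.4) $$

From (23.3), (23.4) and (23.2) we find


$$ \int_{S_2}\mathbf{B}(t+\Delta t)\cdot d\mathbf{S} \approx \int_{S_1}\mathbf{B}(t)\cdot d\mathbf{S} + \Delta t\int_{S_1}\frac{\partial\mathbf{B}}{\partial t}\cdot d\mathbf{S} - \Delta t\oint(\mathbf{v}\times\mathbf{B})\cdot d\mathbf{l}. $$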


Substituting this expression into (23.1) and passing over to the limit, we find 





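$$ \frac{d\Phi}{dt} = \frac{d}{dt}\int\mathbf{B}\cdot d\mathbf{S} = \int\frac{\partial\mathbf{B}}{\partial t}\cdot d\mathbf{S} - \oint(\mathbf{v}\times\mathbf{B})\cdot d\mathbf{l}. \qquad (23.5) $$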


Formula (23.5) shows that a change with time in the flux of induction 
through the current loop can occur either owing to a change in the induction 
vector, or owing to the motion of the current loop at a non-zero angle with
respect to the lines of the field (so that the velocity vector is not parallel to
$\mathbf{B}$). In other words, if the moving current loop intersects the lines of the
induction vector $\mathbf{B}$, then a change in the induction flux $\Phi$ in time arises.
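
The two contributions to the change of flux can be verified on a simple case. The sketch below is a hedged illustration under assumed conditions: a rectangular loop in the $xy$ plane (normal along $+z$, circulation counterclockwise) moves with velocity $v$ along $x$ in a field $B_z(x,t) = B_0(1+Kx)(1+Gt)$. All names and constants are hypothetical. The numerically differentiated flux is compared with $\int(\partial\mathbf{B}/\partial t)\cdot d\mathbf{S} - \oint(\mathbf{v}\times\mathbf{B})\cdot d\mathbf{l}$.

```python
B0, K, G = 1.0, 0.5, 0.3   # assumed field B_z(x, t) = B0*(1 + K*x)*(1 + G*t)
V, X0 = 2.0, 0.0           # loop velocity along x and initial position of its rear side
A, W = 1.0, 1.0            # loop extent along x and along y


def b_z(x, t):
    return B0 * (1.0 + K * x) * (1.0 + G * t)


def db_z_dt(x, t):
    return B0 * (1.0 + K * x) * G


def integrate_over_loop(f, t, n=200):
    """Midpoint rule over the loop area at time t (f depends on x and t only)."""
    x1 = X0 + V * t
    dx = A / n
    return W * sum(f(x1 + (i + 0.5) * dx, t) for i in range(n)) * dx


def flux(t):
    return integrate_over_loop(b_z, t)


def dflux_dt_numeric(t, dt=1e-4):
    """Central difference of the flux through the moving loop."""
    return (flux(t + dt) - flux(t - dt)) / (2.0 * dt)


def dflux_dt_formula(t):
    """Surface integral of dB/dt minus the line integral of (v x B).dl."""
    x1 = X0 + V * t
    x2 = x1 + A
    field_change = integrate_over_loop(db_z_dt, t)
    # v x B = -v*B_z along y, so only the two sides parallel to y contribute;
    # minus the line integral equals v*W*(B_z(x2) - B_z(x1)).
    motion = V * W * (b_z(x2, t) - b_z(x1, t))
    return field_change + motion


print(dflux_dt_numeric(0.7), dflux_dt_formula(0.7))
```

The first term is non-zero only because the field varies in time, the second only because the loop crosses field lines of varying density.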

Passing, in the second integral, over to the integration with respect to the 
surface, we can write 


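$$ \frac{d\Phi}{dt} = \int\left[\frac{\partial\mathbf{B}}{\partial t} - \nabla\times(\mathbf{v}\times\mathbf{B})\right]\cdot d\mathbf{S}. \qquad (23.6) $$
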
We now write Faraday’s law of induction for the moving current loop in 

the form 


$$ \oint\mathbf{E}\cdot d\mathbf{l} = -\frac{1}{c}\frac{d\Phi}{dt}, \qquad (23.7) $$


where E is the field strength in the conductor. It is clear that the total change 
in the flux of induction must remain on the right-hand side of (23.7) irre- 
spective of the cause of this change. 

The continuity of the tangential component of E allows one to pass over 
from the conductor to a neighbouring contour lying in the medium outside 
the conductor. In this case one should not speak of the motion of the current 
loop but of the motion of the medium. The velocity v denotes then the 
velocity of motion of a given point of the medium. One can pass over from 
(23.7) to the Maxwell differential equation in a moving medium. 

Namely, writing by means of (23.5) 


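$$ \oint\mathbf{E}\cdot d\mathbf{l} = \int(\nabla\times\mathbf{E})\cdot d\mathbf{S} = -\frac{1}{c}\int\frac{\partial\mathbf{B}}{\partial t}\cdot d\mathbf{S} + \frac{1}{c}\int\nabla\times(\mathbf{v}\times\mathbf{B})\cdot d\mathbf{S}, $$


we have


$$ \nabla\times\mathbf{E} = -\frac{1}{c}\left[\frac{\partial\mathbf{B}}{\partial t} - \nabla\times(\mathbf{v}\times\mathbf{B})\right]. \qquad (23.8) $$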




We transform the expression in parentheses, writing on the basis of (1.45) 
and (1.18) that 


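$$ \frac{\partial\mathbf{B}}{\partial t} - \nabla\times(\mathbf{v}\times\mathbf{B}) = \frac{\partial\mathbf{B}}{\partial t} - (\mathbf{B}\cdot\nabla)\mathbf{v} + (\mathbf{v}\cdot\nabla)\mathbf{B} - \mathbf{v}(\nabla\cdot\mathbf{B}) + \mathbf{B}(\nabla\cdot\mathbf{v}) = \frac{\partial\mathbf{B}}{\partial t} + (\mathbf{v}\cdot\nabla)\mathbf{B} = \frac{d\mathbf{B}}{dt} $$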


by virtue of (22.5) and the constancy of v. 
Then we find finally the value of $\nabla\times\mathbf{E}$ in the moving medium:

$$ \nabla\times\mathbf{E} = -\frac{1}{c}\frac{d\mathbf{B}}{dt}. $$


The importance of the results obtained and, in particular, Faraday’s law 
of induction in the form (23.7), is associated with the fact that the motion 
of conductors in a magnetic field is in practice one of the basic methods of 
producing an e.m.f. (for example, in the dynamo). 


§ 24. Maxwell’s equations for quasistationary fields in integral form and their 
integration for the case of linear conductors 


We can now consider the Maxwell system of equations for quasistationary 
fields in integral form by considering the general case of moving media: 


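$$ \oint\mathbf{E}\cdot d\mathbf{l} = -\frac{1}{c}\frac{d\Phi}{dt}, \qquad (24.1) $$
$$ \oint\mathbf{B}\cdot d\mathbf{S} = 0, \qquad (24.2) $$
$$ \oint\mathbf{H}\cdot d\mathbf{l} = \frac{4\pi}{c}\int\mathbf{j}\cdot d\mathbf{S}, \qquad (24.3) $$
$$ \oint\mathbf{D}\cdot d\mathbf{S} = 4\pi\int\rho\,dV. \qquad (24.4) $$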


According to (22.8), the continuity equation is written in the form 


$$ \oint\mathbf{j}\cdot d\mathbf{S} = 0. \qquad (24.5) $$



Consider the case of linear conductors carrying current. For simplicity we 
confine ourselves first to one conducting loop. We assume that the impressed 
e.m.f. in the loop is defined; the value of the e.m.f. can depend on the time. 
We write the generalized Ohm’s law in integral form as follows: 


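$$ \oint\frac{\mathbf{j}\cdot d\mathbf{l}}{\sigma} = \oint\mathbf{E}\cdot d\mathbf{l} + \oint\mathbf{E}^{imp}\cdot d\mathbf{l}. \qquad (24.6) $$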


Since we are considering a linear contour and the continuity equation is 
valid, we can write it by means of (24.1), (15.4) and (15.6) in the form 


$$ IR = -\frac{1}{c}\frac{d\Phi}{dt} + \mathcal{E}(t). \qquad (24.7) $$

Eq. (24.7), which connects the current in a loop with a given impressed
e.m.f. $\mathcal{E}(t)$, is usually called Ohm's law for an alternating-current circuit.
If we now consider a system consisting of an arbitrary number N of loops 
carrying alternating current, then for each of these we can write 


$$ I_iR_i = -\frac{1}{c}\frac{d\Phi_i}{dt} + \mathcal{E}_i(t), \qquad (24.8) $$


where $\Phi_i$ is the flux of induction through the $i$th conducting loop.

The magnetic field is determined from eqs. (24.2) and (24.3) which do not 
differ from the equations for direct current. Hence the relations obtained for 
direct current remain valid for all magnetic quantities. However, the equations 
for the magnetic field distribution and the current distribution in the circuit 
turn out to be interrelated: the flux of magnetic induction through the ith 
current loop is determined by the magnetic field distribution; the magnetic 
field distribution is determined by the currents in all the loops. 

It is easily shown, however, that in quasistationary fields the fluxes of in- 
duction turn out to be connected with the current densities by linear relations 
with constant coefficients of the type 


$$ \Phi_i = c\sum_{k=1}^{N} L_{ik}I_k \qquad (i = 1, 2, \ldots, N). \qquad (24.9) $$


For the proof we shall consider the simplest case of two current loops in a 
medium with magnetic permeability u. 
The current in the second loop (fig. IV.13) produces in the first loop a flux 


Fig. IV.13

$$ \Phi_1(2) = \mu\int\mathbf{H}_2\cdot d\mathbf{S}_1 = \int(\nabla\times\mathbf{A}_2)\cdot d\mathbf{S}_1 = \oint\mathbf{A}_2\cdot d\mathbf{l}_1, \qquad (24.10) $$


where the index 1 refers to the first loop. 
The vector potential of the magnetic field of the linear conductor carrying 
current $I$ is given by formula (17.4). Hence


$$ \Phi_1(2) = \frac{\mu I_2}{c}\oint\!\!\oint\frac{d\mathbf{l}_1\cdot d\mathbf{l}_2}{r_{12}}, \qquad (24.11) $$


i.e. the flux of induction in the first loop, which is produced by the current in
the second loop, is proportional to the current $I_2$ in the latter. The factor of
proportionality $L_{12}$ turns out to be a constant which depends only on the
properties of the medium and the relative position of the loops:


$$ L_{12} = \frac{\mu}{c^2}\oint\!\!\oint\frac{d\mathbf{l}_1\cdot d\mathbf{l}_2}{r_{12}}, \qquad (24.12) $$


where $r_{12}$ is the distance between the elements of length $d\mathbf{l}_1$ and $d\mathbf{l}_2$. The
quantity $L_{12}$ is called the coefficient of mutual inductance.
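
The double line integral of the type $\oint\oint d\mathbf{l}_1\cdot d\mathbf{l}_2/r_{12}$ is readily evaluated numerically. The sketch below does this for two coaxial circular loops, a geometry chosen here purely as an assumption for illustration; the function names are hypothetical. For widely separated loops the result is compared with the leading far-field term $2\pi^2a^2b^2/d^3$ of the same integral.

```python
import math


def neumann_factor(a, b, d, n=120):
    """Double sum approximating the integral of dl1.dl2/r for two coaxial
    circles of radii a and b whose planes are a distance d apart."""
    dth = 2.0 * math.pi / n
    total = 0.0
    for i in range(n):
        th1 = (i + 0.5) * dth
        for k in range(n):
            th2 = (k + 0.5) * dth
            cos12 = math.cos(th1 - th2)   # dl1.dl2 = a*b*cos(th1 - th2)*dth**2
            r = math.sqrt(a * a + b * b - 2.0 * a * b * cos12 + d * d)
            total += a * b * cos12 / r
    return total * dth * dth


def mutual_inductance(a, b, d, mu=1.0, c=2.998e10):
    """L_12 = (mu/c**2) times the geometric factor (Gaussian units, lengths in cm)."""
    return mu * neumann_factor(a, b, d) / c ** 2


far_field = 2.0 * math.pi ** 2 / 50.0 ** 3   # leading term for a = b = 1, d = 50
print(neumann_factor(1.0, 1.0, 50.0), far_field)
print(neumann_factor(1.0, 2.0, 3.0), neumann_factor(2.0, 1.0, 3.0))
```

The second line of output illustrates that interchanging the two loops leaves the geometric factor unchanged.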

Besides the flux of magnetic induction $\Phi_1(2)$, the flux produced by the
current in the first loop itself


$$ \Phi_1(1) = cL_{11}I_1, \qquad (24.13) $$


passes through the first loop. The factor of proportionality $L_{11}$ is called the
coefficient of self-inductance. The meaning of the coefficient of self-inductance
is the following. The flux of magnetic induction through a certain
segment of the wire is obtained by summing the fluxes from its other segments.
In all segments of the wire there flows the same current $I$.







It is clear, however, that formula (24.12) cannot be used for the calculation
of the coefficient of self-inductance. When the segment $d\mathbf{l}_1$ approaches
the segment $d\mathbf{l}_2$ in the same conductor, $r_{12} \to 0$ and the integral (24.12)
diverges. The cause of this divergence lies in the fact that, if the segment $d\mathbf{l}_1$
is close to $d\mathbf{l}_2$, the transverse dimensions of the conductor become comparable
with the distance $r_{12}$ between these segments. This, in its turn, does not
allow one to make use of the formula (17.4) which was found for the approximation
of linear conductors. In other words, in considering the interrelation
between nearby segments of one conductor the linear conductor approximation
is undoubtedly incorrect. In later sections we shall see how one can find
the coefficient of self-inductance without having recourse to the idea of the
linear character of the conductor. In the meantime we can write the total
flux of magnetic induction through the first conductor in the form


$$ \Phi_1 = cL_{11}I_1 + cL_{12}I_2. \qquad (24.14) $$


Analogously, the flux of magnetic induction through the second conductor 
is 


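$$ \Phi_2 = cL_{22}I_2 + cL_{21}I_1. $$


It is easily shown that the equality $L_{21} = L_{12}$ holds. Indeed,


$$ L_{21} = \frac{\mu}{c^2}\oint\!\!\oint\frac{d\mathbf{l}_2\cdot d\mathbf{l}_1}{r_{21}} = \frac{\mu}{c^2}\oint\!\!\oint\frac{d\mathbf{l}_1\cdot d\mathbf{l}_2}{r_{12}} = L_{12}. \qquad (24.15) $$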


The equation (24.7) of the generalized Ohm’s law for the first loop assumes 
the form 


$$ I_1R_1 = -L_{11}\frac{dI_1}{dt} - L_{12}\frac{dI_2}{dt} + \mathcal{E}_1. \qquad (24.16) $$


An analogous expression can be written for the second loop. 

In the general case of an arbitrary number of loops the flux of magnetic 
induction through the loop $i$ is expressed according to formula (24.9) in terms
of currents in all the loops, the coefficient of self-inductance $L_{ii}$ and the
coefficients of mutual inductance $L_{ik}$. Substituting (24.9) into Ohm's law for
alternating-current circuits, we find the whole set of equations which determine
the current $I_i$ in each of the $N$ loops:


$$ I_iR_i = -\sum_{k=1}^{N} L_{ik}\frac{dI_k}{dt} + \mathcal{E}_i(t) \qquad (i = 1, 2, \ldots, N). \qquad (24.17) $$


If the impressed e.m.f. $\mathcal{E}_i$ and the geometrical properties of the system of
loops determining the whole set of coefficients $L_{ik}$ are defined, then the
integration of the system of linear differential equations (24.17) with constant
coefficients allows one to determine the currents $I_i$ in the loops. In this case
the magnetic fields will be determined according to formulae (24.2) and 
(24.3). 

Thus, corresponding to what was said at the end of §22, the equations of 
the quasistationary electromagnetic field for the case of a system of linear 
conductors reduce to equations in total derivatives with constant coefficients. 
The latter, however, as is known from the general theory of differential 
equations, reduce to a system of algebraic equations. We shall carry this out 
in following sections. 
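
To make the last statement concrete, the circuit equations $I_iR_i = -\sum_k L_{ik}\,dI_k/dt + \mathcal{E}_i(t)$ can be integrated numerically for two loops. The sketch below uses a classical Runge-Kutta scheme; the function names, the step size and the dimensionless sample coefficients are assumptions made for illustration only.

```python
import math


def derivatives(t, I, L, R, emf):
    """Solve sum_k L[i][k]*dI_k/dt = emf_i(t) - R_i*I_i for two loops (Cramer's rule)."""
    e1, e2 = emf(t)
    r1 = e1 - R[0] * I[0]
    r2 = e2 - R[1] * I[1]
    det = L[0][0] * L[1][1] - L[0][1] * L[1][0]
    return ((r1 * L[1][1] - L[0][1] * r2) / det,
            (L[0][0] * r2 - L[1][0] * r1) / det)


def integrate(I0, L, R, emf, t_end, dt=1e-2):
    """Fourth-order Runge-Kutta integration of the two currents from t = 0 to t_end."""
    def shifted(I, k, h):
        return tuple(i + h * ki for i, ki in zip(I, k))

    t, I = 0.0, tuple(I0)
    for _ in range(int(round(t_end / dt))):
        k1 = derivatives(t, I, L, R, emf)
        k2 = derivatives(t + dt / 2, shifted(I, k1, dt / 2), L, R, emf)
        k3 = derivatives(t + dt / 2, shifted(I, k2, dt / 2), L, R, emf)
        k4 = derivatives(t + dt, shifted(I, k3, dt), L, R, emf)
        I = tuple(i + dt / 6 * (a + 2 * b + 2 * c + d)
                  for i, a, b, c, d in zip(I, k1, k2, k3, k4))
        t += dt
    return I


def constant_emf(t):
    """Assumed constant e.m.f. switched on at t = 0 in the first loop only."""
    return (4.0, 0.0)


coupled = ((1.0, 0.5), (0.5, 1.0))   # assumed inductance coefficients
print(integrate((0.0, 0.0), coupled, (2.0, 1.0), constant_emf, 1.0))
print(integrate((0.0, 0.0), coupled, (2.0, 1.0), constant_emf, 40.0))
```

Shortly after switching on, a current is induced in the second loop although it contains no e.m.f.; at long times the currents tend to the direct-current values $\mathcal{E}_i/R_i$.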

The densities of volume charges arising in certain cases in conductors are 
usually calculated by means of formula (24.4) from a given distribution of the 
electric field. 

As an example of the applications of formulae (24.11) and (24.12) we
shall calculate the coefficient of mutual inductance of two straight parallel
linear conductors of length $l$. For definiteness we assume the currents in the
conductors to be directed in one direction. We denote by $h$ the distance
between the conductors. Then, according to formula (24.12), we have


$$ L_{12} = \frac{\mu}{c^2}\int\!\!\int\frac{d\mathbf{l}_1\cdot d\mathbf{l}_2}{r_{12}} = \frac{\mu}{c^2}\int_0^l dx_1\int_0^l\frac{dx_2}{[(x_2-x_1)^2+h^2]^{1/2}} = \frac{\mu}{c^2}\int_0^l dx_1\,\ln\frac{l-x_1+[(l-x_1)^2+h^2]^{1/2}}{-x_1+[x_1^2+h^2]^{1/2}}. $$


The calculation of the second integral is somewhat cumbersome. Integrating
by parts, we obtain


$$ L_{12} = \frac{2\mu}{c^2}\left[l\,\ln\frac{l+(l^2+h^2)^{1/2}}{h} - (l^2+h^2)^{1/2} + h\right]. $$


We can simplify this expression for two limiting cases: when $h \ll l$ or $h \gg l$.







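In the first case we easily find, with an accuracy to the order of magnitude
of $h^2/l^2$,


$$ L_{12} \approx \frac{2\mu l}{c^2}\left(\ln\frac{2l}{h} - 1 + \frac{h}{l}\right). $$


In the other limiting case, with an accuracy to the order of magnitude of
$l^2/h^2$,


$$ L_{12} \approx \frac{\mu l^2}{c^2h}. $$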
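
The result for two parallel wires can be checked by direct numerical integration. In the sketch below the geometric factor $\int_0^l\int_0^l dx_1\,dx_2/[(x_2-x_1)^2+h^2]^{1/2}$, which multiplied by $\mu/c^2$ gives $L_{12}$, is evaluated in three ways: from the closed form $2[\,l\ln((l+\sqrt{l^2+h^2})/h) - \sqrt{l^2+h^2} + h\,]$, by numerical quadrature, and from the two limiting expressions. The function names are hypothetical, and the inner integral is written through the inverse hyperbolic sine, which is an equivalent form chosen here for numerical stability.

```python
import math


def wires_factor_closed(l, h):
    """Closed form of the double integral for two parallel wires of length l."""
    root = math.hypot(l, h)
    return 2.0 * (l * math.log((l + root) / h) - root + h)


def wires_factor_numeric(l, h, n=2000):
    """Midpoint rule over x1; the inner integral over x2 is
    asinh((l - x1)/h) + asinh(x1/h), i.e. the logarithm in the text."""
    dx = l / n
    total = 0.0
    for i in range(n):
        x = (i + 0.5) * dx
        total += math.asinh((l - x) / h) + math.asinh(x / h)
    return total * dx


def wires_factor_close(l, h):
    """Limit h << l, accurate to order h**2/l**2."""
    return 2.0 * l * (math.log(2.0 * l / h) - 1.0 + h / l)


def wires_factor_distant(l, h):
    """Limit h >> l, accurate to order l**2/h**2."""
    return l * l / h


print(wires_factor_closed(1.0, 0.5), wires_factor_numeric(1.0, 0.5))
print(wires_factor_closed(1.0, 1e-3), wires_factor_close(1.0, 1e-3))
print(wires_factor_closed(1.0, 1e3), wires_factor_distant(1.0, 1e3))
```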

In order to compare different methods of calculating $L_{12}$ we shall find the
coefficient of mutual inductance of two solenoids wound around a common 
toroidal core. 

We assume the diameter D of the ring (proportional to the length of the 
solenoids) to be large in comparison with the diameter of the turns. The 
magnetic field inside the toroidal solenoid can be assumed to be uniform, and 
one can write 


$$ \oint\mathbf{H}\cdot d\mathbf{l} = 2\pi DH = \frac{4\pi}{c}\,n_1I_1, $$


where $n_1$ is the number of turns in the first solenoid, and $I_1$ is the current in
it. Whence


$$ H = \frac{2n_1I_1}{cD}. \qquad (24.18) $$


The flux of induction through $n_2$ turns of the second solenoid (assuming that
there is no leakage and that all lines of flux of the magnetic field produced by
the first solenoid permeate the second solenoid) will be equal to


$$ \Phi_{21} = n_2\,\mu HS = \frac{2\mu n_1n_2S}{cD}\,I_1, $$


where $S$ is the cross-section of the solenoid. From (24.11) we find

$$ L_{21} = \frac{2\mu n_1n_2S}{c^2D}. $$


§25. The energy of the magnetic field of a system of quasistationary currents 


Let us find the energy of the magnetic field of a system of currents in the 
approximation of quasistationary fields. 


As we have seen in §7, the total energy (more precisely the free energy) of 
the magnetic field is 


$$ T_{magn} = \frac{1}{8\pi}\int\mathbf{B}\cdot\mathbf{H}\,dV. $$


Here it has been assumed that there are no ferromagnetics in the volume V. 
The integration is carried out over all space. 


Expressing B in terms of the vector potential and making use of the first 
of eqs. (22.5), we have 


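$$ \mathbf{B}\cdot\mathbf{H} = \mathbf{H}\cdot(\nabla\times\mathbf{A}) = \mathbf{A}\cdot(\nabla\times\mathbf{H}) + \nabla\cdot(\mathbf{A}\times\mathbf{H}) = \frac{4\pi}{c}(\mathbf{A}\cdot\mathbf{j}) + \nabla\cdot(\mathbf{A}\times\mathbf{H}). $$

Hence


$$ T_{magn} = \frac{1}{2c}\int\mathbf{A}\cdot\mathbf{j}\,dV + \frac{1}{8\pi}\int\nabla\cdot(\mathbf{A}\times\mathbf{H})\,dV. \qquad (25.1) $$


In integrating over all space the last integral reduces to zero, and we find
finally


$$ T_{magn} = \frac{1}{2c}\int\mathbf{A}\cdot\mathbf{j}\,dV. \qquad (25.2) $$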


In formula (25.2) A represents the vector potential produced at a given point 
by all the currents, and j is the current density at the same point. 

The integration over the entire volume can now be replaced by the integration
over the volumes occupied by the current carrying conductors. Outside
these volumes $\mathbf{j} = 0$ and the integrand of (25.2) vanishes. Hence


$$ T_{magn} = \frac{1}{2c}\int\mathbf{A}\cdot\mathbf{j}\,dv, \qquad (25.3) $$


where dv is a volume element of the conductor. 
Let there be X conductors in the entire system. We divide the integral 
(25.3) into two parts, writing 


$$ T_{magn} = \frac{1}{2c}\sum_i\int\mathbf{A}_i\cdot\mathbf{j}_i\,dv_i + \frac{1}{2c}\sum_i\sum_{k\neq i}\int\mathbf{A}_k\cdot\mathbf{j}_i\,dv_i. \qquad (25.4) $$


In the integral contained in the first sum, $\mathbf{A}_i$ denotes the vector potential
of the magnetic field produced in the volume element $dv_i$ of a given conductor
by the currents which are present in other elements of the same conductor.

In the integral in the second sum, $\mathbf{A}_k$ represents the vector potential of
the field produced in the volume element $dv_i$ by the currents in the other
conductors. Consequently, the first sum represents the sum of the self energies
of all the conductors. The second sum gives the mutual energy of these
conductors.

An actual calculation of the energy Tmagn encounters great difficulties in 
the case of conductors which are placed in a medium which can be mag- 
netized. Since the magnetic permeability of the conductors, generally speak- 
ing, differs from the permeability of the medium, it is necessary to know the 
distribution of the vector potential in a non-homogeneous medium in order to 
calculate the energy of the currents according to formula (25.4). Finding this 
distribution presents great difficulties. We shall therefore confine ourselves 
to the case where $\mu \approx \mu_0 \approx 1$ in the conductors as well as in the medium
which surrounds them. In other words, we shall consider the conductors 
together with the medium which surrounds them as a quasi-homogeneous 
system. In this case the vector potential at any point of space is given by 
formula (17.4). 

Substituting (17.4) into (25.4), we find 





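$$ T_{magn} = \frac{\mu}{2c^2}\sum_i\int\!\!\int\frac{\mathbf{j}_i\cdot\mathbf{j}_i'}{r}\,dv_i\,dv_i' + \frac{\mu}{2c^2}\sum_i\sum_{k\neq i}\int\!\!\int\frac{\mathbf{j}_i\cdot\mathbf{j}_k}{r}\,dv_i\,dv_k, \qquad (25.5) $$

where $\mathbf{j}'$ is the density of the current in the volume element $dv'$ producing
the field in the volume $dv$, and $r$ is the distance between the two volume elements.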

Just as the energy of the electrostatic field in the case of a system of con- 
ductors reduces to the sum of the energies of the latter, the energy of the 
magnetic field given by formula (25.5) can be considered as the energy of 
the quasistationary currents. 

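We define the coefficients of self-inductance $L_{ii}$ and the coefficients of
mutual inductance $L_{ik}$ in such a way that the following equality holds:


$$ T_{magn} = \frac{\mu}{2c^2}\sum_i\int\!\!\int\frac{\mathbf{j}_i\cdot\mathbf{j}_i'}{r}\,dv_i\,dv_i' + \frac{\mu}{2c^2}\sum_i\sum_{k\neq i}\int\!\!\int\frac{\mathbf{j}_i\cdot\mathbf{j}_k}{r}\,dv_i\,dv_k = \sum_i\frac{L_{ii}I_i^2}{2} + \sum_i\sum_{k\neq i}\frac{L_{ik}I_iI_k}{2}. \qquad (25.6) $$
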


The first sum represents the sum of the self energies of the currents in all the 
conductors, while the second sum represents the sum of the mutual energies. 
Formula (25.6) is equally valid for linear and non-linear conductors. 
In the case of linear conductors the definition of the coefficients of mutual 
inductance is the same as (24.12). Indeed, in this case 


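$$ \frac{\mu}{2c^2}\sum_i\sum_{k\neq i}\int\!\!\int\frac{\mathbf{j}_i\cdot\mathbf{j}_k}{r}\,dv_i\,dv_k = \frac{\mu}{2c^2}\sum_i\sum_{k\neq i}I_iI_k\oint\!\!\oint\frac{d\mathbf{l}_i\cdot d\mathbf{l}_k}{r_{ik}} = \frac{1}{2}\sum_i\sum_{k\neq i}L_{ik}I_iI_k, $$


where the definition of $L_{ik}$ is the same as (24.12).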

The new definition of the coefficients of self-inductance is of fundamental 
importance for non-linear conductors. It is free of the difficulties associated 
with an indefinite increase of the integral as $r \to 0$ (see §24). As $r \to 0$ the
volume elements $dv\,dv'$ decrease more rapidly than $r^{-1}$ grows, and the
contribution of this region to the integral tends
to zero. For this reason, as we shall see below in a practical example, formula
(25.6) serves as the definition of the coefficients of self-inductance. 

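The energy $T_{magn}$ of a set of linear conductors can be written in another
form. Namely, in the case of linear currents the first term of (25.4) can be
written in the form


$$ T_{magn} = \frac{1}{2c}\sum_i\int\mathbf{A}_i\cdot\mathbf{j}_i\,dS_i\,dl_i = \frac{1}{2c}\sum_iI_i\oint\mathbf{A}_i\cdot d\mathbf{l}_i = \frac{1}{2c}\sum_iI_i\int(\nabla\times\mathbf{A}_i)\cdot d\mathbf{S}_i = \frac{1}{2c}\sum_iI_i\int\mathbf{B}_i\cdot d\mathbf{S}_i = \frac{1}{2c}\sum_iI_i\Phi_i, $$


where $\Phi_i = \int\mathbf{B}_i\cdot d\mathbf{S}_i$ is the flux of magnetic induction through the $i$th conductor.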


Let us find the coefficient of self-inductance for certain simple systems. 










As the first example we shall consider the coefficient of self-inductance of a 
toroidal solenoid. The field inside the solenoid is given by formula (24.18). 
The energy of the field is equal to 


$$ \frac{\mu H^2V}{8\pi} = \frac{\mu n^2I^2S\cdot 2\pi D}{2\pi c^2D^2} = \frac{LI^2}{2}, $$

hence


$$ L = \frac{2\mu n^2S}{c^2D}. $$

Let us now consider the coefficient of self-inductance of two coaxial 
cylinders along which identical and oppositely directed currents of intensity 
$I$ are flowing. Let the radii of the cylinders be $R_1$ and $R_2$, their length $l$, and
the thickness of the walls negligibly small. It is obvious that the magnetic
field inside the small cylinder and outside the large cylinder is equal to zero.
The field in the gap between the cylinders has the strength


$$ H = \frac{2I}{cr}. $$


The energy of the field is equal to


$$ \int\frac{H^2}{8\pi}\,dV = \frac{I^2l}{c^2}\int_{R_1}^{R_2}\frac{r\,dr}{r^2} = \frac{I^2l}{c^2}\ln\frac{R_2}{R_1}. $$

Hence

$$ L = \frac{2l}{c^2}\ln\frac{R_2}{R_1}. $$
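
The energy method is easy to reproduce numerically. The sketch below integrates the energy density $H^2/8\pi$ with $H = 2I/cr$ over the gap between the cylinders and recovers the self-inductance from the energy $LI^2/2$. The function names, the sample dimensions and the assumption $\mu = 1$ in the gap are illustrative choices; Gaussian units are used.

```python
import math

C = 2.998e10  # velocity of light, cm/s


def coaxial_field_energy(I, l, R1, R2, n=2000):
    """Midpoint-rule integral of H**2/(8*pi) over the gap R1 < r < R2,
    with H = 2*I/(c*r) and volume element 2*pi*r*l*dr."""
    dr = (R2 - R1) / n
    energy = 0.0
    for i in range(n):
        r = R1 + (i + 0.5) * dr
        H = 2.0 * I / (C * r)
        energy += H * H / (8.0 * math.pi) * 2.0 * math.pi * r * l * dr
    return energy


def coaxial_self_inductance(l, R1, R2):
    """Closed form L = (2*l/c**2)*ln(R2/R1)."""
    return 2.0 * l / C ** 2 * math.log(R2 / R1)


I, l, R1, R2 = 1.0, 100.0, 0.1, 1.0   # assumed sample values
L_from_energy = 2.0 * coaxial_field_energy(I, l, R1, R2) / I ** 2
print(L_from_energy, coaxial_self_inductance(l, R1, R2))
```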


§26. Coefficients of self-inductance and mutual inductance for non-linear 
conductors 


In calculating coefficients of self-inductance and mutual inductance for 
non-linear conductors we shall confine ourselves to the case where the mag- 
netic permeability of the conductors and the medium is close to unity (i.e. 




where there are no ferromagnetic materials in the system). In this case it can 
be assumed approximately that $\mu$ has the same value $\mu_0$ over all space. In
other words, we shall assume the entire system to be homogeneous in its 
magnetic properties. 

For brevity of notation we shall assume that the system is made up of two 
non-linear conductors. We shall assume the geometrical form and the current 
density distribution inside the conductors to be invariable in time, and the 
conductors to be at rest and non-deformable. 

We divide the two non-linear conductors into a set of current tubes. The 
possibility of such a division follows from the solenoidal character of currents 
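in quasistationary fields. For each tube $dS_q$ one can write the relation (24.6),
i.e.


$$ \oint\frac{\mathbf{j}_q\cdot d\mathbf{l}_q}{\sigma} = \oint\mathbf{E}_q\cdot d\mathbf{l}_q + \oint\mathbf{E}_q^{imp}\cdot d\mathbf{l}_q. \qquad (26.1) $$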


The relations (16.5) — (16.7), which have been obtained for any solenoidal 
current, are valid for quasistationary currents. Multiplying both sides of the 
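eq. (26.1) by $I^{-1}\,dI_q$, a quantity which is constant along the conductor, and
integrating over the entire section of the conductor, we have


$$ \int\frac{dI_q}{I}\oint\frac{\mathbf{j}_q\cdot d\mathbf{l}_q}{\sigma} = \int\frac{dI_q}{I}\oint\mathbf{E}_q\cdot d\mathbf{l}_q + \int\frac{dI_q}{I}\oint\mathbf{E}_q^{imp}\cdot d\mathbf{l}_q. \qquad (26.2) $$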





We transform the integrals in (26.2) for example for the first conductor in the 
following way: 


$$ \int\frac{dI_1}{I_1}\oint\frac{\mathbf{j}_1\cdot d\mathbf{l}_1}{\sigma} = \frac{1}{I_1}\int\frac{j_1^2}{\sigma}\,dv_1 = I_1\int\frac{\gamma_1^2}{\sigma S_1^2}\,dv_1 = I_1R_1, \qquad (26.3) $$


where $R_1$ is the total ohmic resistance of the first conductor. Here we have
made use of (16.7) and (16.8). For a linear conductor $\gamma = 1$ and the expression
for $R$ is the same as (15.4). Further, on the basis of (23.7) and
(24.10) we have


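$$ \int\frac{dI_1}{I_1}\oint\mathbf{E}_1\cdot d\mathbf{l}_1 = -\frac{1}{c}\int\frac{dI_1}{I_1}\frac{d\Phi_1}{dt} = -\frac{1}{c}\frac{d}{dt}\int\frac{dI_1}{I_1}\oint\mathbf{A}_1\cdot d\mathbf{l}_1 = -\frac{1}{c}\frac{d}{dt}\int\frac{\mathbf{j}_1\cdot\mathbf{A}_1}{I_1}\,dv_1. \qquad (26.4) $$


Here $\mathbf{A}_1$ is the vector potential of the magnetic field at a point in the volume
element $dv_1$; the ratio $\mathbf{j}_1/I_1$ does not depend on the time, since the current
density distribution is assumed to be invariable.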

We shall now, for definiteness, consider the first conductor. Since by 
assumption the conductors and the medium are homogeneous in respect to 
their magnetic properties, we can make use of the expression (17.3) for $\mathbf{A}_1$,
which for our purposes is conveniently written in the form


$$ \mathbf{A}_1 = \frac{\mu_0}{c}\int\frac{\mathbf{j}\,dV'}{r} = \frac{\mu_0}{c}\int\frac{\mathbf{j}_1'\,dv_1'}{r_{11'}} + \frac{\mu_0}{c}\int\frac{\mathbf{j}_2\,dv_2}{r_{12}}. \qquad (26.5) $$




The first term gives the vector potential produced by the currents in the 
volume of the first conductor, the second term gives the vector potential 
produced by the currents in the volume of the second conductor, $r_{11'}$ denotes
the distance between the points to which the element $dv_1$ and the element
$dv_1'$ are referred respectively, and $r_{12}$ has an analogous meaning.

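Substituting the value of $\mathbf{A}_1$ into (26.4), we have

$$\int I_1^{-1}dI_q \oint \mathbf{E}_q^{\rm ind}\cdot d\mathbf{l}_q = -\frac{\mu_0}{c^2}\frac{d}{dt}\left(\frac{1}{I_1}\iint \frac{\mathbf{j}_1\cdot\mathbf{j}_1'}{r_{11'}}\,dv_1\,dv_1'\right) - \frac{\mu_0}{c^2}\frac{d}{dt}\left(\frac{1}{I_1}\iint \frac{\mathbf{j}_1\cdot\mathbf{j}_2}{r_{12}}\,dv_1\,dv_2\right). \tag{26.6}$$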
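We now introduce the coefficients of self-inductance and mutual inductance, defining them by the formulae:

$$L_{11} = \frac{\mu_0}{c^2 I_1^2}\iint \frac{\mathbf{j}_1\cdot\mathbf{j}_1'}{r_{11'}}\,dv_1\,dv_1' , \tag{26.7}$$

$$L_{12} = \frac{\mu_0}{c^2 I_1 I_2}\iint \frac{\mathbf{j}_1\cdot\mathbf{j}_2}{r_{12}}\,dv_1\,dv_2 , \tag{26.8}$$

$$L_{21} = L_{12} . \tag{26.9}$$

It is easily seen that $L_{11}$ and $L_{12}$ do not depend on the current intensities $I_1$ and $I_2$. Indeed, from (16.8) we have

$$L_{11} = \frac{\mu_0}{c^2}\iint \frac{\gamma_1\gamma_1'\,(\mathbf{l}_1\cdot\mathbf{l}_1')}{S_1 S_1'\, r_{11'}}\,dv_1\,dv_1' , \tag{26.10}$$

$$L_{12} = \frac{\mu_0}{c^2}\iint \frac{\gamma_1\gamma_2\,(\mathbf{l}_1\cdot\mathbf{l}_2)}{S_1 S_2\, r_{12}}\,dv_1\,dv_2 . \tag{26.11}$$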

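Then, finally,

$$\int I_1^{-1}dI_q \oint \mathbf{E}_q^{\rm ind}\cdot d\mathbf{l}_q = -L_{11}\frac{dI_1}{dt} - L_{12}\frac{dI_2}{dt} , \tag{26.12}$$

and analogously for the second conductor.

In the integral $\int I^{-1}dI_q \oint \mathbf{E}^{\rm imp}\cdot d\mathbf{l}$ one can assume that the impressed e.m.f. is the same over the whole section of the conductor and write

$$\int I_1^{-1}dI_q \oint \mathbf{E}^{\rm imp}\cdot d\mathbf{l}_q = \mathcal{E}_1 . \tag{26.13}$$

Substituting (26.3), (26.12) and (26.13) into (26.2), we find finally

$$I_1 R_1 = -L_{11}\frac{dI_1}{dt} - L_{12}\frac{dI_2}{dt} + \mathcal{E}_1 , \tag{26.14}$$

and analogously

$$I_2 R_2 = -L_{21}\frac{dI_1}{dt} - L_{22}\frac{dI_2}{dt} + \mathcal{E}_2 . \tag{26.15}$$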

Eqs. (26.14) and (26.15) are the same as eqs. (24.17) for linear currents. 
A difference between them lies in the fact that for non-linear conductors 
finding the resistances and coefficients of inductance is a very complex 
problem. All these quantities involve current distributions over the cross- 
sections of the conductors. Therefore the practical importance of the equa- 
tions obtained is not great. To make up for it we have obtained a definition 
of coefficients of inductance (26.7) — (26.9) which is not based (in contrast 
to (24.12) and (24.13)) on the assumption of linearity of the conductors. 





In conclusion we stress that, as is easily seen, the coefficient of mutual inductance defined by formula (26.8) for a linear conductor ($\gamma_1 = \gamma_2 = 1$) is the same as (24.12).


§27. Lagrange’s equations for a system of quasistationary currents 


Up to now we have assumed that the relative position of the current carry- 
ing conductors is given. In practice one often has to consider the more general 
case of moving conductors or conductors changing their shape, relative 
position etc., which are placed in an electromagnetic field. The capacitances, self-inductances, mutual inductances and other quantities, which we have before assumed to be constant, turn out in this case to be functions of certain parameters $q_i$ which characterize the configuration of the system. In considering such systems it is very convenient to make use of Lagrange’s method in which, as will be seen from what follows, the electromagnetic and mechanical quantities characterizing the system come in as formally equivalent quantities. It turns out that the quoted parameters $q_i$, as well as the values of charges $Q_i$ in the conductors, can be chosen as generalized coordinates.

Let us work out the Lagrange equations characterizing the system in generalized coordinates $q_i$ and $Q_i$. The generalized velocities corresponding to the charges $Q_i$ are the quantities $\dot{Q}_i = I_i$, i.e. currents flowing in the conductors. If the energy of the magnetic field is included in the kinetic energy of the system, then the total kinetic energy is equal to

$$T = T_{\rm mech} + T_{\rm magn} = \frac{1}{2}\sum_{i,k} L_{ik}(q_i)\,I_i(t)\,I_k(t) + T_{\rm mech} = \frac{1}{2}\sum_{i,k} L_{ik}\dot{Q}_i\dot{Q}_k + T_{\rm mech} , \tag{27.1}$$

where $T_{\rm mech}$ is the kinetic energy of the mechanical motion of the conductors.

Analogously, we assume the potential energy of the system to be made up of the energy of the electric field (i.e. the energy of the capacitances present in the system) and the mechanical potential energy:

$$U = \sum_i \frac{Q_i^2}{2C_i} + U_{\rm mech} . \tag{27.2}$$

Then the Lagrange function will have the form

$$L = \frac{1}{2}\sum_{i,k} L_{ik}\dot{Q}_i\dot{Q}_k - \sum_i \frac{Q_i^2}{2C_i} + (T_{\rm mech} - U_{\rm mech}) . \tag{27.3}$$

We also introduce the dissipative function

$$F = \frac{1}{2}\sum_i R_i(q_i,t)\,\dot{Q}_i^2 \tag{27.4}$$

and arbitrary “external” forces acting on the system, the impressed e.m.f. $\mathcal{E}_i$. We recall that external forces $F_{\rm ext}$ in the Lagrange equations represent forces which depend not only on the generalized coordinates but also on other parameters bearing no relation to the given system. These forces are defined by the usual relation $\delta A = F_{\rm ext}\,\delta q$, where $\delta A$ is the virtual work done in the virtual displacement $\delta q$.

Then the Lagrange equations for the generalized coordinates $Q_i$ assume the form

$$\frac{d}{dt}\frac{\partial L}{\partial \dot{Q}_i} - \frac{\partial L}{\partial Q_i} = -\frac{\partial F}{\partial \dot{Q}_i} + F_{\rm ext} . \tag{27.5}$$

Upon substituting $L$ into (27.5) and assuming that the current loops are motionless and non-deformable (constant inductances and capacitances) we obtain

$$\sum_k L_{ik}\ddot{Q}_k + \frac{Q_i}{C_i} = -R_i\dot{Q}_i + \mathcal{E}_i , \tag{27.6}$$

which shows that the expressions for the kinetic and potential energies are appropriately chosen.

From the Lagrange equation (27.6) there follow the momentum conservation law and the energy conservation law. For a system which is not acted upon by any external forces (i.e. $F_{\rm ext} = 0$) and in which no energy dissipation occurs (i.e. $F = 0$) eq. (27.5) assumes the form

$$\frac{d}{dt}\frac{\partial L}{\partial \dot{Q}_i} - \frac{\partial L}{\partial Q_i} = 0 .$$

If there is no capacitance in the $i$th current loop, then the corresponding coordinate is cyclic and the following equality holds for it:

$$\frac{\partial L}{\partial Q_i} = 0 .$$

In this case the corresponding generalized momentum is conserved:

$$\frac{\partial L}{\partial \dot{Q}_i} = \sum_k L_{ik}\dot{Q}_k = \text{const} .$$


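Let us now write the energy conservation law for a system of conductors at rest. According to general rules, the energy of the system is

$$E = \sum_i \dot{Q}_i\frac{\partial L}{\partial \dot{Q}_i} - L = \frac{1}{2}\sum_{i,k} L_{ik}\dot{Q}_i\dot{Q}_k + \sum_i \frac{Q_i^2}{2C_i} . \tag{27.7}$$

In the presence of dissipative processes

$$\frac{dE}{dt} = -2F + \sum_i F_{\rm ext}\,\dot{Q}_i .$$

Here $2F$ is the energy dissipated, and $\sum_i F_{\rm ext}\,\dot{Q}_i$ is the work done by external forces per unit time.

Substituting the values of $E$ and $F$, we write the energy conservation law in the form

$$\frac{d}{dt}\left(\frac{1}{2}\sum_{i,k} L_{ik}\dot{Q}_i\dot{Q}_k + \sum_i \frac{Q_i^2}{2C_i}\right) = -\sum_i R_i\dot{Q}_i^2 + \sum_i \mathcal{E}_i\dot{Q}_i . \tag{27.8}$$

Rewriting it in the form

$$\frac{d}{dt}\left(T_{\rm magn} + U_{\rm el}\right) = \sum_i \left(\mathcal{E}_i I_i - R_i I_i^2\right) , \tag{27.9}$$

we see that in a closed system which is not acted upon by any external forces ($\mathcal{E}_i = 0$) and in which there are no dissipative forces ($R_i = 0$) the energy of the system is conserved:

$$\frac{d}{dt}\left(T_{\rm magn} + U_{\rm el}\right) = 0 . \tag{27.10}$$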







In the case where $\mathcal{E}_i \neq 0$, $R_i \neq 0$, formula (27.9) shows that the difference between the work done by the impressed e.m.f. and the Joule heat released goes into an increase in the energy of the magnetic and electric fields as well as into the increase in the mechanical energy of the system in the case of a system of moving current contours.

In the particular case of one motionless current loop, (27.9) can be rewritten in the form

$$\frac{d}{dt}\left(\frac{1}{2}LI^2 + \frac{Q^2}{2C}\right) = \mathcal{E}I - RI^2 .$$
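The energy balance for a single motionless loop can be checked numerically. The following is a minimal sketch, not part of the original text: it integrates $L\,dI/dt + RI + Q/C = \mathcal{E}(t)$ with a fourth-order Runge-Kutta step and compares the change of the stored energy $\frac12 LI^2 + Q^2/2C$ with the time integral of $\mathcal{E}I - RI^2$. The function name `rlc_energy_balance`, the integrator and all numerical values are illustrative assumptions.

```python
import math


def rlc_energy_balance(L, R, C, emf, q0=0.0, i0=0.0, dt=1e-3, steps=5000):
    """Integrate L*dI/dt + R*I + Q/C = emf(t) with classical RK4.

    Returns (change of stored energy, time integral of emf*I - R*I**2).
    Hypothetical helper for illustration only.
    """

    def rhs(t, q, i):
        dq = i                                   # dQ/dt = I
        di = (emf(t) - R * i - q / C) / L        # circuit equation solved for dI/dt
        dw = emf(t) * i - R * i * i              # power input minus Joule heat
        return dq, di, dw

    t, q, i, w = 0.0, q0, i0, 0.0
    for _ in range(steps):
        k1 = rhs(t, q, i)
        k2 = rhs(t + dt / 2, q + dt / 2 * k1[0], i + dt / 2 * k1[1])
        k3 = rhs(t + dt / 2, q + dt / 2 * k2[0], i + dt / 2 * k2[1])
        k4 = rhs(t + dt, q + dt * k3[0], i + dt * k3[1])
        q += dt / 6 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0])
        i += dt / 6 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1])
        w += dt / 6 * (k1[2] + 2 * k2[2] + 2 * k3[2] + k4[2])
        t += dt

    def stored(qv, iv):
        return 0.5 * L * iv * iv + qv * qv / (2 * C)

    return stored(q, i) - stored(q0, i0), w


if __name__ == "__main__":
    delta_e, work = rlc_energy_balance(1.0, 0.5, 1.0, lambda t: math.cos(2.0 * t))
    print(delta_e, work)  # the two numbers should agree to integration accuracy
```

The work integral is carried as a third state variable so that both sides of the balance are evaluated along the same numerical trajectory.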





The relations obtained are valid for a system of loops which are coupled with each other inductively (i.e. in the presence of coefficients $L_{ik}$ differing from zero) as well as for a system of branched loops. In the latter case Kirchhoff's law allows one to reduce the number of independent currents $I_i$ or coordinates $Q_i$.

In conclusion we note that the results of this section justify to a certain 
degree the term “impressed e.m.f.”, since this quantity indeed plays the role 
of a generalized force acting on the system. 

We apply the results obtained to a single alternating-current circuit with resistance, inductance and capacitance in series (the so-called series RLC circuit). Eq. (27.6) assumes the form

$$L\ddot{Q} + R\dot{Q} + \frac{Q}{C} = \mathcal{E}(t) . \tag{27.11}$$

If the source of the alternating current (an impressed e.m.f.) represents a harmonic function of time

$$\mathcal{E}(t) = \mathcal{E}_0\,e^{i\omega t} ,$$

then the particular solution of (27.11) has the form

$$Q = Q_0\,e^{i\omega t} .$$

We are interested in the dependence of the current on time, which also can be written in the form

$$I = I_0\,e^{i\omega t} .$$

Substituting this into (27.11), with $Q = I/i\omega$, we obtain

$$I_0 = \mathcal{E}_0/Z^* ,$$

where the quantity $Z^*$, called the impedance, is equal to

$$Z^* = R + i\left(\omega L - \frac{1}{\omega C}\right) . \tag{27.12}$$

Passing over from the complex expression to the real expression for the current, we find

$$I = \frac{\mathcal{E}_0\cos(\omega t - \varphi)}{\left[R^2 + (\omega L - 1/\omega C)^2\right]^{1/2}} , \qquad \tan\varphi = \frac{\omega L - 1/\omega C}{R} . \tag{27.13}$$


We see that forced current oscillations with frequency $\omega$, which are shifted in phase by an angle $\varphi$ with respect to the impressed e.m.f., arise in the circuit.
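The forced response of the series circuit can be evaluated directly from the complex impedance $Z^* = R + i(\omega L - 1/\omega C)$. The sketch below is only an illustration; the function name `series_rlc_response` and the numerical values are assumptions, not part of the text.

```python
import cmath
import math


def series_rlc_response(emf0, omega, R, L, C):
    """Return (Z, current amplitude, phase lag phi) for a series RLC circuit.

    Z = R + i*(omega*L - 1/(omega*C)); the current amplitude is emf0/|Z| and
    phi = arg Z, so that tan(phi) = (omega*L - 1/(omega*C))/R.
    """
    z = complex(R, omega * L - 1.0 / (omega * C))
    amplitude = abs(emf0 / z)
    phi = cmath.phase(z)
    return z, amplitude, phi


if __name__ == "__main__":
    # At omega = 1/sqrt(L*C) the reactance vanishes and the current is emf0/R.
    L_, C_ = 1.0, 0.25
    omega_res = 1.0 / math.sqrt(L_ * C_)
    print(series_rlc_response(10.0, omega_res, 2.0, L_, C_))
```

Working with the complex number `z` keeps amplitude and phase in a single object; the real current is then `amplitude * cos(omega*t - phi)`.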

Besides the forced oscillations there can exist free or natural oscillations. If the impressed e.m.f. is $\mathcal{E} = 0$, then from (27.11) we easily find

$$I = I_0\,e^{i\omega_0 t} ,$$

where the frequency of the natural oscillations is

$$\omega_0 = \frac{iR}{2L} \pm \left[\frac{1}{LC} - \left(\frac{R}{2L}\right)^2\right]^{1/2} . \tag{27.14}$$

For $R/2L < (CL)^{-1/2}$ damped oscillations occur, the damping being characterized by the time $\tau = 2L/R$. For $R/2L > (CL)^{-1/2}$ the discharge has an aperiodic character.
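The two regimes of the free discharge can be told apart by comparing the damping rate $R/2L$, which follows from the characteristic equation of $L\ddot Q + R\dot Q + Q/C = 0$, with $(LC)^{-1/2}$. The following sketch is illustrative only; `free_oscillation` and its return format are hypothetical.

```python
import math


def free_oscillation(R, L, C):
    """Classify the free discharge of a series RLC circuit.

    The characteristic equation L*s**2 + R*s + 1/C = 0 has roots
    s = -R/(2L) +/- sqrt((R/(2L))**2 - 1/(L*C)).
    """
    gamma = R / (2.0 * L)                 # damping rate
    disc = 1.0 / (L * C) - gamma ** 2     # positive -> oscillatory regime
    tau = math.inf if gamma == 0 else 1.0 / gamma  # decay time 2L/R
    if disc > 0:
        return {"regime": "damped oscillations",
                "frequency": math.sqrt(disc),
                "decay_time": tau}
    return {"regime": "aperiodic", "frequency": 0.0, "decay_time": tau}


if __name__ == "__main__":
    print(free_oscillation(1.0, 1.0, 1.0))
    print(free_oscillation(4.0, 1.0, 1.0))
```

The boundary case of a vanishing discriminant is classed as aperiodic here, which is a choice of this sketch.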

If the system contains several loops which are connected with each other 
by corresponding coefficients of mutual inductance, then the system of 
eqs. (27.6) can be solved according to the general rules of solution of a system 
of linear equations with constant coefficients. For a detailed analysis of such 
problems, which is of special interest in electrical engineering, we refer the 
reader to specific literature *. 


* See, for example, W.R.Smythe, Static and dynamic electricity (McGraw-Hill, New 
York, 1950). 




§28. The generalized ponderomotive forces in a system with moving current 
loops 


In the preceding section we have confined ourselves to the case of motion- 
less loops. We shall now consider moving current loops. 

We shall find, first of all, an expression for the generalized force corresponding to the mechanical generalized coordinate $q_i$ characterizing the spatial configuration of the $i$th current loop. By definition

$$F_{q_i} = \frac{\partial L}{\partial q_i} .$$

It can usually be assumed that the capacitances which are present in the system do not change when the conductors are moving. Then substituting $L$ from (27.3), we find

$$F_{q_i} = \frac{\partial L}{\partial q_i} = \frac{\partial T_{\rm magn}}{\partial q_i} . \tag{28.1}$$
We see that the energy of the magnetic field plays a double role: with respect to the coordinates of electromagnetic character, $Q_i$, it represents a kinetic energy, while with respect to the space coordinates $q_i$ it represents a potential energy taken with the opposite sign. Ponderomotive forces act in a direction which corresponds to an increase of the magnetic energy of the field.

We shall apply formula (28.1) to certain practical cases.
Let us first consider mechanical forces acting on a single current loop. Let $q$ be the generalized coordinate characterizing the size of the loop. The force acting on the loop on the part of its own magnetic field is

$$F_q = \frac{\partial}{\partial q}\left(\frac{1}{2}L_{11}I^2\right) = \frac{I^2}{2}\frac{\partial L_{11}}{\partial q} . \tag{28.1'}$$

The force acting on the current loop increases with increasing $\partial L_{11}/\partial q$. Since the self-inductance increases with increasing size of the loop, this means that the magnetic field of the current loop tends to deform the conductor in such a way that its size would increase.

For a system of conductors we shall first consider one moving loop whose position is characterized by the coordinate $q_a$. We assume that the configuration and currents in all other loops in the system are given. Then the force acting on the moving loop is

$$F_a = \frac{\partial}{\partial q_a}\,\frac{1}{2}\sum_{i,k} L_{ik}(q_i,q_k)\,I_i I_k = \frac{1}{2}\sum_k \frac{\partial L_{ak}}{\partial q_a}I_a I_k + \frac{1}{2}\sum_i \frac{\partial L_{ia}}{\partial q_a}I_i I_a = I_a\sum_k \frac{\partial L_{ak}}{\partial q_a}I_k = \frac{I_a}{c}\frac{\partial \Phi_a}{\partial q_a} . \tag{28.2}$$

The fixed positions in space of all the loops except the $a$th correspond to a given value of the external field. In this derivation we have made use of the equality (24.9): the definition of the flux of induction.

If, in particular, the generalized coordinate $q_a$ represents the angle $\theta$ which characterizes the position of a flat current loop in an external field, then the generalized force $F_\theta$ represents the torque acting on the loop:

$$M = \frac{I}{c}\frac{\partial \Phi}{\partial \theta} .$$

The flux of induction through a non-deformable flat current loop in an external field $B_0$ is equal to

$$\Phi = B_0 S\cos\theta ,$$

where $S$ is the area of the contour, and $\theta$ is the angle between $\mathbf{B}_0$ and the normal to the plane. Hence

$$M = -\frac{I}{c}\,S B_0\sin\theta . \tag{28.3}$$
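The torque on a flat loop can be checked against a numerical derivative of the flux $\Phi = B_0 S\cos\theta$. This is a small sketch in Gaussian units; the function names and sample values are assumptions made for illustration.

```python
import math

C_LIGHT = 2.99792458e10  # speed of light, cm/s


def flux(b0, area, theta):
    """Flux of induction through a flat loop, Phi = B0 * S * cos(theta)."""
    return b0 * area * math.cos(theta)


def torque_closed_form(current, b0, area, theta):
    """Torque M = -(I/c) * S * B0 * sin(theta)."""
    return -(current / C_LIGHT) * area * b0 * math.sin(theta)


def torque_from_flux(current, b0, area, theta, h=1e-6):
    """Torque M = (I/c) * dPhi/dtheta with a central difference for the derivative."""
    dphi = (flux(b0, area, theta + h) - flux(b0, area, theta - h)) / (2 * h)
    return (current / C_LIGHT) * dphi


if __name__ == "__main__":
    # Illustrative values: current in statamperes, field in gauss, area in cm^2.
    args = (3.0e9, 1.0e4, 10.0, 0.7)
    print(torque_closed_form(*args), torque_from_flux(*args))
```

The negative sign means the torque tends to decrease $\theta$, i.e. to turn the normal of the loop toward the field.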


We note that the generalized force (28.2) reduces, in the case of one current loop in an external magnetic field, to the Lorentz force. Indeed, writing for $\delta\Phi$

$$\delta\Phi = \mathbf{B}\cdot(\delta\mathbf{q}\times\delta\mathbf{l}) = (\delta\mathbf{l}\times\mathbf{B})\cdot\delta\mathbf{q} ,$$

we have

$$d\mathbf{F} = c^{-1}I\,(d\mathbf{l}\times\mathbf{B}) = c^{-1}(d\mathbf{l}\times\mathbf{B})(\mathbf{j}\cdot d\mathbf{S}) = c^{-1}(\mathbf{j}\times\mathbf{B})\,dV , \tag{28.4}$$

where $dV = d\mathbf{S}\cdot d\mathbf{l}$ is a volume element of the current carrying conductor. The force (28.4) is the averaged Lorentz force in which the mean magnetic field in the medium $\mathbf{B}$ appears instead of the field in vacuum $\mathbf{H}$.

Let us now consider a system of two current carrying conductors and find the force of interaction between them. In this case one has to choose the distance $r_{12}$ between the elements $d\mathbf{l}_1$ and $d\mathbf{l}_2$ of the two conductors as the generalized coordinate $q$.

From the definition of the coefficient of mutual inductance (24.12), (24.11) and (28.2) we have

$$F = I_1 I_2\frac{\partial L_{12}}{\partial r_{12}} = -\frac{\mu_0 I_1 I_2}{c^2}\oint\oint \frac{d\mathbf{l}_1\cdot d\mathbf{l}_2}{r_{12}^2} . \tag{28.5}$$

In particular, let the two currents flow in the same direction in parallel linear conductors. Then $d\mathbf{l}_1 \parallel d\mathbf{l}_2$ and the force $F$ corresponds to the attraction between the two current carrying conductors.

On the other hand, in the case of currents flowing in opposite directions, 
a repulsion arises between the conductors. 

We see that two currents in the same direction tend to approach each 
other and thus intensify their common magnetic field. It is in this respect 
that they differ from static charges of the same sign, which tend to separate 
and thus weaken their electrostatic field. 

We shall consider in addition the problem of the work which is done by the force displacing a current loop. According to (28.2), this work is equal to

$$\delta W = F_a\,\delta q_a = c^{-1} I_a\,\delta\Phi_a .$$

At first sight we arrive at a paradoxical result: the work $\delta W$ is done by the force due to the magnetic field acting on the charges. This force, however, is perpendicular to the velocity of the charges and cannot do any work on them. In fact, however, in writing the formula for the work we have not taken into account induction phenomena which occur in a conductor moving in a magnetic field. As the conductor moves, an e.m.f. $\mathcal{E}^{\rm ind}$ is induced in it, and work $\delta W'$ equal to

$$\delta W' = \delta t\int \mathbf{j}\cdot\mathbf{E}^{\rm ind}\,dV = I\,\delta t\oint \mathbf{E}^{\rm ind}\cdot d\mathbf{l} = -c^{-1} I_a\,\delta\Phi_a$$

is done on the charges. The total work done by the magnetic field on the current loop is

$$\delta W + \delta W' = 0 ,$$

as was to be expected.

This result is of a general character and can also be applied to non-linear 
conductors. 

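When we have spoken about the work of magnetization and the change in the thermodynamic potentials of the system which is associated with it, we have meant by this the work done on the currents by the electric field induced when the magnetic field is switched on. We write it in another, more general form:

$$\delta W'' = \delta t\int \mathbf{j}\cdot\mathbf{E}\,dV = \frac{c}{4\pi}\,\delta t\int \mathbf{E}\cdot(\nabla\times\mathbf{H})\,dV$$

$$= -\frac{c}{4\pi}\,\delta t\int \nabla\cdot(\mathbf{E}\times\mathbf{H})\,dV + \frac{c}{4\pi}\,\delta t\int \mathbf{H}\cdot(\nabla\times\mathbf{E})\,dV$$

$$= -\frac{c}{4\pi}\,\delta t\oint (\mathbf{E}\times\mathbf{H})\cdot d\mathbf{S} - \frac{1}{4\pi}\,\delta t\int \mathbf{H}\cdot\frac{\partial\mathbf{B}}{\partial t}\,dV$$

$$= -\frac{1}{4\pi}\int \mathbf{H}\cdot\left(\delta t\,\frac{\partial\mathbf{B}}{\partial t}\right)dV = -\frac{1}{4\pi}\int \mathbf{H}\cdot\delta\mathbf{B}\,dV .$$

Hence it follows that the body which is magnetized gets an additional energy $-\delta W''$.

Therefore the free energy per unit volume can be written in the form

$$dF = dF_0 + \frac{1}{4\pi}\,\mathbf{H}\cdot d\mathbf{B} .$$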





We define the thermodynamic potential $G$ as

$$G = F + TS - \mathbf{B}\cdot\mathbf{H}/4\pi .$$

For $\mathbf{B} = \mu\mathbf{H}$ we have

$$G = F + TS - \mu H^2/8\pi .$$

For $dG$ we have

$$dG = dG_0 - \mathbf{B}\cdot d\mathbf{H}/4\pi .$$

Hence

$$\mathbf{B} = -4\pi\,(\partial G/\partial\mathbf{H})_T .$$

To obtain (20.3) use can be made of the last formula, expressing $\mathbf{B}$ in terms of $\mathbf{M}$:

$$\mathbf{B} = \mathbf{H} + 4\pi\mathbf{M} = -4\pi\,(\partial G/\partial\mathbf{H})_T .$$

Integrating with respect to $\mathbf{H}$ and assuming $\mathbf{M}$ to be the independent variable, we obtain

$$G = G_0 - \mathbf{M}\cdot\mathbf{H} - H^2/8\pi .$$

The quantity $\int (H^2/8\pi)\,dV$ represents the energy of the external field in the volume of the body.

In conclusion we shall consider an example which illustrates in an obvious 
way the merits of the Lagrange method. 

Let there be two loops: the first carrying a current $I_1$ and rotating inside the second under the action of a force $F$, the second loop being at rest. A current, $I_2$, is induced in the second loop. In both loops there is no capacitance. It is clear that such an arrangement represents an alternating-current machine. The first loop is called the rotor, and the second loop is called the stator. The Lagrange function of the system has the form

$$L = \frac{1}{2}L_{11}I_1^2 + L_{12}I_1 I_2 + \frac{1}{2}L_{22}I_2^2 + \frac{1}{2}I_0\left(\frac{d\alpha}{dt}\right)^2 - U_{\rm mech} ,$$

where $\alpha$ is the angle of rotation, and $I_0$ is the moment of inertia of the rotor. The coefficient of mutual inductance depends on the orientation of the rotor and stator, i.e. on the angle $\alpha$.

The equation of motion for the generalized coordinate, the angle $\alpha$, has the form

$$I_0\ddot{\alpha} - I_1 I_2\frac{dL_{12}}{d\alpha} = M_0 , \tag{28.6}$$

where $M_0 = -\partial U_{\rm mech}/\partial\alpha$ is the moment of the rotational force.





The equation of motion for the generalized coordinate, the charge $Q_2$, i.e. the equation of the current in the stator, according to (27.5), has the form

$$\frac{d}{dt}\frac{\partial L}{\partial \dot{Q}_2} = -R_2 I_2 ,$$

or

$$\frac{d}{dt}\left(L_{22}I_2 + L_{12}I_1\right) = -R_2 I_2 .$$

Hence

$$L_{22}\frac{dI_2}{dt} + L_{12}\frac{dI_1}{dt} + I_1\frac{dL_{12}}{dt} = -R_2 I_2 . \tag{28.7}$$


The state of the system is to be found from the simultaneous solution of eqs. 
(28.6) and (28.7). 

Assuming the resistance $R_2$ to be very large (a large load resistance in the circuit of the stator), one can write approximately

$$I_2 \approx -\frac{I_1}{R_2}\frac{dL_{12}}{dt} = -\frac{I_1}{R_2}\frac{dL_{12}}{d\alpha}\frac{d\alpha}{dt} . \tag{28.8}$$

We have neglected small terms in (28.7) which do not contain $R_2$. Substituting (28.8) into (28.6), we obtain

$$I_0\frac{d^2\alpha}{dt^2} + \frac{I_1^2}{R_2}\left(\frac{dL_{12}}{d\alpha}\right)^2\frac{d\alpha}{dt} = M_0 .$$

For the quasistationary state one can disregard the term with the angular acceleration $d^2\alpha/dt^2$, since it is small. Then for the angular velocity of rotation we find

$$\frac{d\alpha}{dt} = \frac{M_0 R_2}{I_1^2}\left(\frac{dL_{12}}{d\alpha}\right)^{-2} = \text{const} . \tag{28.9}$$





The expressions (28.8) and (28.9) give the solution of the problem. In practice one usually makes use of more complex schemes, for example machines with self-excitation in which the induced current $I_2$ is introduced into the first loop.

We have not considered the important properties of real machines associated with the presence of magnetizable cores. The example discussed illustrates only the merits of the Lagrange method in the case of systems in which the mechanical motion and currents are directly connected with each other.
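The approach of the rotor to its steady angular velocity can be sketched numerically. The code below integrates $I_0\,d\omega/dt + \kappa\omega = M_0$ with $\kappa = (I_1^2/R_2)(dL_{12}/d\alpha)^2$ and compares the result with $M_0 R_2/[I_1^2 (dL_{12}/d\alpha)^2]$. It is only an illustration under the loudly stated assumption that $dL_{12}/d\alpha$ and $I_1$ are treated as constants; the function names and numbers are hypothetical.

```python
def steady_angular_velocity(m0, r2, i1, dl12_dalpha):
    """Steady angular velocity M0*R2 / (I1**2 * (dL12/dalpha)**2)."""
    return m0 * r2 / (i1 ** 2 * dl12_dalpha ** 2)


def spin_up(m0, r2, i1, dl12_dalpha, inertia, dt=1e-3, steps=5000):
    """Integrate inertia*domega/dt = M0 - kappa*omega from rest with RK4.

    kappa = I1**2 * (dL12/dalpha)**2 / R2 is taken as constant (assumption).
    """
    kappa = i1 ** 2 * dl12_dalpha ** 2 / r2

    def accel(w):
        return (m0 - kappa * w) / inertia

    omega = 0.0
    for _ in range(steps):
        k1 = accel(omega)
        k2 = accel(omega + dt / 2 * k1)
        k3 = accel(omega + dt / 2 * k2)
        k4 = accel(omega + dt * k3)
        omega += dt / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
    return omega


if __name__ == "__main__":
    params = dict(m0=2.0, r2=50.0, i1=1.0, dl12_dalpha=10.0)
    print(spin_up(inertia=0.5, **params), steady_angular_velocity(**params))
```

The relaxation toward the steady value takes a time of order $I_0/\kappa$, which is why the inertial term can be dropped in the quasistationary state.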


§ 29. Fluctuations in conductors and the Nyquist formula 


As a result of fluctuation processes in an electric circuit current fluctu- 
ations arise which in practice are called noise. Physically the appearance of 
fluctuation currents in a conductor (in the absence of impressed e.m.f.) is 
associated with the fluctuations of the number of electrons moving in a 
particular direction. In the presence of an impressed e.m.f., fluctuation 
currents are superimposed upon the stationary or quasistationary current. 

Current fluctuations in a radio set are of very great importance in radio 
engineering. The noise background determines the limit of sensitivity of 
reception of a single signal. A further increase in sensitivity can be achieved 
only by repeated measurements. 

We shall consider the theory of fluctuations in an electric circuit with inductance $L$ and ohmic resistance $R$. Fluctuation processes in the circuit can be characterized by a random fluctuating e.m.f. $\mathcal{E}(t)$. The variations of the e.m.f. $\mathcal{E}(t)$ occur in a time which is very small in comparison with the relaxation time $T = L/R$ of the circuit.

It is natural to try to describe the processes occurring in the circuit by means of the generalized Ohm’s law

$$L\frac{di}{dt} + Ri = \mathcal{E}(t) . \tag{29.1}$$


Eq. (29.1) is based on the rather natural assumption that between the random current in a circuit and the random e.m.f. producing it there is the same relation as between an ordinary current and the ordinary e.m.f. The current relaxation is characterized by a constant resistance $R$. It should be stressed that at high frequencies the resistance $R$ (or $\sigma \sim R^{-1}$; see §14) turns out to be a function of the frequency. This fact does not however affect the general results of the theory.

The random e.m.f. $\mathcal{E}(t)$ has, obviously, the following properties:

$$\overline{\mathcal{E}(t)} = 0 , \qquad \overline{[\mathcal{E}(t)]^2} \neq 0 . \tag{29.2}$$





We integrate the linear equation (29.1) with respect to time. We then have for the random current:

$$i(t) = i_0\,e^{-t/T} + e^{-t/T}\int_0^t e^{t'/T}\,\mathcal{E}(t')\,dt' . \tag{29.3}$$

(The integration of (29.1) gives the second term with a factor $L^{-1}$; in (29.3) and in the formulae which follow from it this factor is not written out, i.e. $\mathcal{E}$ there stands for $\mathcal{E}/L$.) It should be noted that the integration of eq. (29.1) with the random function on the right-hand side requires some justification from the purely mathematical point of view. For the details of this problem we refer the reader to more specialized literature *.

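By means of formula (5.3) of Part III one can work out the correlation function:

$$\langle i(t)\,i(t+\tau)\rangle = \left\langle\left[i_0\,e^{-t/T} + e^{-t/T}\int_0^t e^{t'/T}\mathcal{E}(t')\,dt'\right]\left[i_0\,e^{-(t+\tau)/T} + e^{-(t+\tau)/T}\int_0^{t+\tau} e^{t''/T}\mathcal{E}(t'')\,dt''\right]\right\rangle$$

$$= i_0^2\,e^{-(2t+\tau)/T} + e^{-(2t+\tau)/T}\int_0^t\!\!\int_0^{t+\tau} e^{(t'+t'')/T}\,\langle\mathcal{E}(t')\,\mathcal{E}(t'')\rangle\,dt'\,dt'' . \tag{29.4}$$

Terms containing the random function $\mathcal{E}(t)$ in the first power reduce to zero in averaging.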
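The double integral in (29.4) can be calculated in the following way. Introducing new variables $x = t' + t''$, $y = t' - t''$, we have

$$K(t,\tau) = \tfrac{1}{2}\,e^{-(2t+\tau)/T}\int_0^{2t} e^{x/T}\,dx\int_{-\tau}^{t}\left\langle\mathcal{E}\!\left(\tfrac{1}{2}x - \tfrac{1}{2}y\right)\mathcal{E}\!\left(\tfrac{1}{2}x + \tfrac{1}{2}y\right)\right\rangle dy .$$

The correlation function $\langle\mathcal{E}(\tfrac{1}{2}x - \tfrac{1}{2}y)\,\mathcal{E}(\tfrac{1}{2}x + \tfrac{1}{2}y)\rangle$ cannot, according to (5.4) of Part III, depend on the choice of the variable $x$. Assuming that $x = y$, we have

$$\left\langle\mathcal{E}\!\left(\tfrac{1}{2}x - \tfrac{1}{2}y\right)\mathcal{E}\!\left(\tfrac{1}{2}x + \tfrac{1}{2}y\right)\right\rangle = \langle\mathcal{E}(0)\,\mathcal{E}(y)\rangle .$$

Since the correlation function rapidly decreases with increasing $y$, the range of integration can be extended to infinity, so that

$$K(t,\tau) = \tfrac{1}{2}\,e^{-(2t+\tau)/T}\int_0^{2t} e^{x/T}\,dx\int_{-\infty}^{\infty}\langle\mathcal{E}(0)\,\mathcal{E}(y)\rangle\,dy = \tfrac{1}{2}\,T\,e^{-\tau/T}\left(1 - e^{-2t/T}\right)\int_{-\infty}^{\infty}\langle\mathcal{E}(0)\,\mathcal{E}(y)\rangle\,dy .$$

Finally we find

$$\langle i(t)\,i(t+\tau)\rangle = i_0^2\,e^{-(2t+\tau)/T} + \tfrac{1}{2}\,T\,e^{-\tau/T}\left(1 - e^{-2t/T}\right)\int_{-\infty}^{\infty}\langle\mathcal{E}(0)\,\mathcal{E}(y)\rangle\,dy . \tag{29.5}$$

* See, for example, S. Chandrasekhar, Stochastic problems in physics and astronomy,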


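Assuming $t = 0$, we obtain from (29.5) the correlation function for the current

$$\langle i(0)\,i(\tau)\rangle = i_0^2\,e^{-\tau/T} . \tag{29.6}$$

Making use of the law of equipartition, one can write for the mean square of the fluctuation current in the circuit $\tfrac{1}{2}L\,\overline{i_0^2} = \tfrac{1}{2}kT = \tfrac{1}{2}\theta$ (in $kT$ the symbol $T$ denotes the absolute temperature, not the relaxation time), so that finally the autocorrelation function for the current assumes the form

$$\langle i(0)\,i(\tau)\rangle = \theta L^{-1}\,e^{-|\tau|/T} . \tag{29.7}$$

On the other hand, assuming $\tau = 0$, we find

$$\langle i(t)^2\rangle = i_0^2\,e^{-2t/T} + \tfrac{1}{2}\,T\left(1 - e^{-2t/T}\right)\int_{-\infty}^{\infty}\langle\mathcal{E}(0)\,\mathcal{E}(y)\rangle\,dy .$$

For large values of $t$, $t \gg T$, a statistical equilibrium must be established in the system. Dropping as small the quantities with $e^{-2t/T}$, we find

$$\theta L^{-1} = \tfrac{1}{2}\,T\int_{-\infty}^{\infty}\langle\mathcal{E}(0)\,\mathcal{E}(y)\rangle\,dy .$$

Hence we find, substituting $T = LR^{-1}$:

$$R = \tfrac{1}{2}\,L^2\theta^{-1}\int_{-\infty}^{\infty}\langle\mathcal{E}(0)\,\mathcal{E}(y)\rangle\,dy . \tag{29.8}$$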


Formula (29.8) relates the autocorrelation function of the random e.m.f. and the resistance of the circuit. This relation has a deep meaning: it connects a characteristic of reversible random processes, the correlation function $\langle\mathcal{E}(0)\,\mathcal{E}(y)\rangle$, with a characteristic of the irreversible dissipative process, the relaxation time $T$ or the resistance $R$:

$$T^{-1} = \tfrac{1}{2}\,L\,\theta^{-1}\int_{-\infty}^{\infty}\langle\mathcal{E}(0)\,\mathcal{E}(y)\rangle\,dy . \tag{29.9}$$


If one makes use of the Wiener—Khinchin formula (5.16) of Part III, then one 
can express the correlation function in terms of the spectral density. Then we 


have 


tr) f (e@eoyay= ff ei gw) dwdy = (0), 
me = (29.10) 
and, consequently, 
R=4L20-! 9(0). (29.11) 


This formula is a particular case of an important fluctuation—dissipation 
theorem establishing a connection between the characteristics of fluctuation 
processes and dissipative processes. 

In formula (29.1) we have restricted ourselves to the case where the 
resistance can be assumed to be independent of the frequency. The problem 
of the connection between fluctuation processes and irreversible processes 
will be considered in detail in Part VI which is devoted to physical kinetics. 

We return to the autocorrelation function for the current and, making use of the Wiener-Khinchin theorem, we find the spectral density of random currents in the circuit. Namely, according to (5.16) of Part III and (29.11), we can write

$$g(\omega) = \pi^{-1}\int_{-\infty}^{\infty}\langle i(0)\,i(\tau)\rangle\,e^{i\omega\tau}\,d\tau , \tag{29.12}$$

where the spectral density of the current $g(\omega)$ is defined by the formula

$$\overline{i^2} = \int_0^{\infty} g(\omega)\,d\omega . \tag{29.13}$$

Substituting the expression for $\langle i(0)\,i(\tau)\rangle$ from (29.7) into (29.12), we find

$$g(\omega) = \frac{2\theta}{\pi L}\int_0^{\infty} e^{-\tau/T}\cos\omega\tau\,d\tau = \frac{2\theta}{\pi R}\,\frac{1}{1 + (\omega L/R)^2} . \tag{29.14}$$


Usually in metals $T = LR^{-1} \sim 10^{-13}$ sec, and one can write to a good approximation that

$$g(\omega) \approx 2\theta\,\pi^{-1}R^{-1} . \tag{29.15}$$
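The frequency dependence of the current spectrum rests on the elementary integral $\int_0^\infty e^{-\tau/T}\cos\omega\tau\,d\tau = T/(1+\omega^2T^2)$, which is nearly constant as long as $\omega T \ll 1$. The sketch below checks the integral by a trapezoidal sum; the function names, the cut-off of the integration range and the step count are illustrative assumptions.

```python
import math


def lorentzian_closed_form(omega, T):
    """Closed form of the integral: T / (1 + (omega*T)**2)."""
    return T / (1.0 + (omega * T) ** 2)


def lorentzian_numeric(omega, T, n=200000, cutoff=40.0):
    """Trapezoidal estimate of int_0^inf exp(-tau/T) cos(omega*tau) dtau.

    The upper limit is cut at cutoff*T, where the integrand is negligible.
    """
    t_max = cutoff * T
    h = t_max / n
    total = 0.5 * (1.0 + math.exp(-t_max / T) * math.cos(omega * t_max))
    for k in range(1, n):
        tau = k * h
        total += math.exp(-tau / T) * math.cos(omega * tau)
    return total * h


if __name__ == "__main__":
    for w in (0.0, 1.0, 3.0):
        print(w, lorentzian_numeric(w, 1.0), lorentzian_closed_form(w, 1.0))
```

For a relaxation time of order $10^{-13}$ sec the factor $1/(1+\omega^2T^2)$ differs from unity only at frequencies approaching $10^{13}$ sec$^{-1}$, which is why the spectrum can be taken as flat in circuit calculations.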


Thus, the mean-square value of the current in a closed circuit generated by a random e.m.f. in a frequency interval $\omega$, $\omega + d\omega$ can be written in the form

$$\overline{i^2}_\omega\,d\omega = 2\theta\,\pi^{-1}R^{-1}\,d\omega . \tag{29.16}$$

In the case of an open circuit, instead of (29.16) one can write for the spontaneous fluctuations of e.m.f. the relation

$$\overline{\mathcal{E}^2}_\omega\,d\omega = 2\theta R\,\pi^{-1}\,d\omega . \tag{29.17}$$


Formulae (29.17) and (29.16) are called the Nyquist formulae. They allow one to take into account fluctuation phenomena in calculations on quasistationary circuits, for example, the fluctuation e.m.f., besides other macroscopic characteristics.

According to (29.16) the fluctuation current is proportional to $\theta^{1/2}$ and inversely proportional to $R^{1/2}$. This has a simple meaning. The resistance $R \sim \sigma^{-1} \sim n^{-1}$, where $n$ is the number of electrons per cm$^3$. In correspondence with the general formula (3.7) of Part III, $(\overline{i^2})^{1/2} \sim n^{1/2}$.

The above derivation was made under an essential restriction: the resistance has been calculated without taking into account the frequency. In Part VI we shall return to the Nyquist formula for the region of high frequencies. A detailed calculation will be made of the quantum effects at frequencies $\omega > kT/\hbar = \theta/\hbar$.

In semiconductors an additional noise-producing mechanism arises which 
will also be discussed in Part VI. 







§30. Skin effect 


We have, up to now, studied quasistationary currents in linear circuits, 
considering the conductors to be infinitesimally thin. We shall now consider 
the distribution of alternating current over the cross-section of the conductor. 
Assuming, as before, that the conditions of quasistationarity are fulfilled, we 
write Maxwell’s equations in a homogeneous conducting medium: 


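$$\nabla\times\mathbf{E} = -\frac{\mu}{c}\frac{\partial\mathbf{H}}{\partial t} ,$$

$$\nabla\times\mathbf{H} = \frac{4\pi\sigma}{c}\,\mathbf{E} ,$$

$$\nabla\cdot\mathbf{H} = 0 ,$$

$$\nabla\cdot\mathbf{E} = 0 .$$

It is easy to obtain separate equations for the electric and magnetic fields. Taking the curl of curl $\mathbf{E}$, we find

$$\nabla\times(\nabla\times\mathbf{E}) = \nabla(\nabla\cdot\mathbf{E}) - \nabla^2\mathbf{E} = -\nabla^2\mathbf{E} = -\frac{\mu}{c}\frac{\partial}{\partial t}(\nabla\times\mathbf{H}) ,$$

or

$$\nabla^2\mathbf{E} = \frac{4\pi\mu\sigma}{c^2}\frac{\partial\mathbf{E}}{\partial t} . \tag{30.1}$$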


A similar equation is obtained for the magnetic field. 

Eq. (30.1) and the analogous equation for the vector H determine the 
dependence of the fields on time and coordinates in the space occupied by 
the conductor. At the boundary of this space, i.e. at the surface of the 
conductor, the field vectors satisfy the usual boundary conditions. 

We restrict ourselves to the solution of the field equations in the simple case of alternating current flowing in a conductor which occupies a half-space $z > 0$. The direction of the current is along the $x$-axis and its dependence on time is assumed to be defined:

$$j_x = j(z)\,e^{i\omega t} ; \qquad j_y = j_z = 0 .$$

We note that $j_x$ cannot depend on the coordinate $x$ by virtue of the continuity equation which gives

$$\frac{\partial j_x}{\partial x} = 0 .$$

In correspondence with Ohm’s law we seek the electric field satisfying eq. (30.1) in the form

$$E_x = E(z)\,e^{i\omega t} ; \qquad E_y = E_z = 0 . \tag{30.2}$$
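The substitution of (30.2) into (30.1) gives

$$\frac{d^2E(z)}{dz^2} = i\,\frac{4\pi\mu\sigma\omega}{c^2}\,E(z) . \tag{30.3}$$

The general solution of the last equation is

$$E(z) = A\,e^{-kz} + B\,e^{kz} ,$$

where

$$k = \left(i\,\frac{4\pi\mu\sigma\omega}{c^2}\right)^{1/2} = (1 + i)\,\frac{(2\pi\mu\sigma\omega)^{1/2}}{c} .$$

We introduce the notation

$$\delta = \frac{c}{(2\pi\mu\sigma\omega)^{1/2}} . \tag{30.4}$$

Then

$$E(z) = A\,e^{-z/\delta}\,e^{-iz/\delta} + B\,e^{z/\delta}\,e^{iz/\delta} .$$

The value of $B$ must, obviously, be equal to zero, in order that the field may have a finite value everywhere.

Thus, finally,

$$E_x = A\,e^{-z/\delta}\,e^{-i(z/\delta - \omega t)} . \tag{30.5}$$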


Formula (30.5) shows that the electric field strength decreases exponentially inside a conductor.

An effective decrease in the field strength (decrease by a factor of $e$) takes place at a distance $\delta$ from the surface.

Knowing the electric field distribution in the conductor, one can find the 
magnetic field distribution. 

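We have, obviously,

$$-i\,\frac{\mu\omega}{c}\,H_y = (\nabla\times\mathbf{E})_y = \frac{\partial E_x}{\partial z} = -\frac{1 + i}{\delta}\,E_x , \qquad H_x = H_z = 0 .$$

Hence

$$H_y = \frac{(1 - i)\,c}{\mu\omega\delta}\,A\,e^{-i(z/\delta - \omega t)}\,e^{-z/\delta} . \tag{30.6}$$

The magnetic field turns out to be perpendicular to the electric field. It decreases inside the conductor according to the same law as the electric field. In absolute value $|H_y| \sim (c/\omega\delta)\,|E_x|$, i.e. it is larger than the electric field by a factor $\lambda/\delta$.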

Thus, the electromagnetic field and correspondingly the entire current in a conductor turn out to be localized in a thin surface layer of thickness $\delta$. The localization of the field in the thin surface layer is called the skin effect, and the quantity $\delta$ is called the skin depth.

It is easily seen that the entire Joule heat $j^2/\sigma$ is released in the region of the skin depth.

From the definition of $\delta$ it is clear that its numerical value can vary within a wide range for different frequencies $\omega$ and conductivities $\sigma$. In order to get an idea of the order of magnitude of this quantity, we point out that
for copper ô ~ 1 cm for a frequency of 50 hertz and ô ~ 3 X 1073 cm for fre- 
quencies ~ 10° hertz. 
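The order of magnitude of the skin depth can be reproduced from $\delta = c/(2\pi\mu\sigma\omega)^{1/2}$ in Gaussian units. The sketch below is illustrative only: the conductivity of copper, `SIGMA_CU = 5.4e17` sec$^{-1}$, is an assumed round value (roughly $6\times10^7$ S/m converted to Gaussian units), and the function names are hypothetical.

```python
import cmath
import math

C_LIGHT = 2.99792458e10   # speed of light, cm/s
SIGMA_CU = 5.4e17         # assumed Gaussian conductivity of copper, 1/s


def skin_depth(sigma, omega, mu=1.0):
    """Skin depth delta = c / sqrt(2*pi*mu*sigma*omega), in cm."""
    return C_LIGHT / math.sqrt(2.0 * math.pi * mu * sigma * omega)


def field_profile(z, delta, amplitude=1.0):
    """Spatial factor of the field, A*exp(-z/delta)*exp(-i*z/delta)."""
    return amplitude * cmath.exp(-(1.0 + 1.0j) * z / delta)


if __name__ == "__main__":
    omega_50hz = 2.0 * math.pi * 50.0          # angular frequency for 50 hertz
    d = skin_depth(SIGMA_CU, omega_50hz)
    print("skin depth at 50 Hz, cm:", d)
    print("|E(delta)| / |E(0)|:", abs(field_profile(d, d)))
```

The profile function makes explicit that the amplitude falls by a factor of $e$ over one skin depth while the phase lags by one radian over the same distance.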

As $\omega \to 0$, i.e. in the transition to direct current, $\delta \to \infty$ and the skin effect vanishes. The current is uniformly distributed over the entire cross-section of the conductor. On the other hand, in the transition to the limit $\sigma \to \infty$ the thickness of the skin layer tends to zero. This represents the case of the ideal conductor.

The result which we have obtained for the case of a simplified model of a conductor is of general character. For any geometric configuration of conductors the field in them turns out to be localized in the skin depth.

If the distribution depends on the geometric properties of a conductor, 
then the problem of finding the field in it becomes only slightly more com- 
plicated than in the example discussed. Thus, in the case of a current flowing 
along a long cable of circular cross-section, solid or hollow, the current 
density vector is directed parallel to the generatrix of the cable. Hence the 
general trend of the solution of the problem of finding the field distribution 
in the cable is the same as in the example which we have discussed. For the 
skin depth one obtains the same numerical value. The geometric distribution 
of the field differs relatively little from an exponential decrease, particularly 
at high frequencies, when the radius of the cable is large in comparison with 
the skin depth. In the general case of a solid conductor of arbitrary form 
finding the alternating-field distribution represents a problem which is com- 
plex from the mathematical standpoint *. However, irrespective of the form 
of the conductor and even of the mechanism of excitation of the field in it 
(by means of an impressed e.m.f., an external alternating magnetic field etc.) 
the general inference remains valid: the alternating electromagnetic field 
penetrates into the conductor to the depth of the skin layer $\delta$. In this case the magnetic field is larger than the electric field in the ratio $\lambda/\delta$.

Analogous results can be obtained in the consideration of another problem. 
Let the metal be acted upon by an alternating electromagnetic field depend- 
ing on time according to the law eS. 

If the magnetic field at the surface of the conductor z = 0 has. a value Hp, 


the boundary condition (5.4) allows one to formulate the boundary value 
problem 


02H(z,t) _ 4nou H(z, 1) 





3z2 c2 ðt a 
H= Hy for z=0, 
H> 0 for ZOO 


The solution of the boundary value problem can be written, analogously to 
(30.5), in the form 


H(z, t) = Hy e7% elot , 


where the thickness of the skin layer ô is given by formula (30.4). 


* See W.R.Smythe, Static and dynamic electricity (McGraw-Hill, New York, 1950).







It should be emphasized that the skin effect becomes more pronounced in 
going to high frequencies. We shall see in the next chapter that in high-fre- 
quency fields, when the fields can no longer be considered as quasistationary, 
there is as before a skin effect, although the penetration depth turns out, as a 
rule, to be different (see §33). The skin effect plays an important role in 
alternating-current engineering. It allows one to use hollow cables or cables 
covered with a layer of metal of a particularly high conductivity, which 
reduces the expenditure of material and power. 





High-frequency Fields 


§31. Electromagnetic waves in a homogeneous isotropic medium 


We consider the propagation of the electromagnetic field in a spatially
homogeneous and isotropic medium characterized by material constants
$\varepsilon_0$, $\mu_0$ and $\sigma_0$, i.e. in a medium without spatial
dispersion. The subscript zero denotes that the material constants have a
static value and are referred to the frequency $\omega = 0$. We shall discuss
below the conditions of applicability of this assumption.

We assume the medium to be non-ferromagnetic ($\mu_0 \approx 1$). We have seen
above that for metals the penetration of the field is very small, even at
relatively low frequencies. Therefore it makes no sense to consider the
propagation of the electromagnetic field inside the metal. However, such a
consideration is not out of place for media having conductivities which are
lower than those for metals (e.g. semi-conductors or solutions of electrolytes).
In the limiting case $\sigma \to 0$ we pass over to the case of ideal dielectrics.
We can also assume that the magnetic permeability of the medium is equal to
unity. Maxwell's equations then have the form
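
$$\nabla\times\mathbf{H} = \frac{4\pi\sigma}{c}\,\mathbf{E} + \frac{\varepsilon_0}{c}\,\frac{\partial\mathbf{E}}{\partial t}, \tag{31.1}$$
$$\nabla\times\mathbf{E} = -\frac{1}{c}\,\frac{\partial\mathbf{H}}{\partial t}, \tag{31.2}$$
$$\nabla\cdot\mathbf{E} = 0, \tag{31.3}$$
$$\nabla\cdot\mathbf{H} = 0. \tag{31.4}$$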


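Taking the curl of the first equation and making use of the last, we have

$$\nabla\times(\nabla\times\mathbf{H}) = -\nabla^2\mathbf{H} = \frac{4\pi\sigma}{c}\,(\nabla\times\mathbf{E}) + \frac{\varepsilon_0}{c}\,\frac{\partial}{\partial t}(\nabla\times\mathbf{E}),$$

or, using (31.2),

$$\nabla^2\mathbf{H} - \frac{\varepsilon_0}{c^2}\,\frac{\partial^2\mathbf{H}}{\partial t^2} - \frac{4\pi\sigma}{c^2}\,\frac{\partial\mathbf{H}}{\partial t} = 0. \tag{31.5}$$

Analogously, for the electric field,

$$\nabla^2\mathbf{E} - \frac{\varepsilon_0}{c^2}\,\frac{\partial^2\mathbf{E}}{\partial t^2} - \frac{4\pi\sigma}{c^2}\,\frac{\partial\mathbf{E}}{\partial t} = 0. \tag{31.6}$$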


We seek solutions of eqs. (31.5) and (31.6) in the form of monochromatic 
plane waves propagating along the x-axis. 

Consequently, we assume that the solutions of (31.5) and (31.6) have the 
form 


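$$\mathbf{H}(x,t) = \mathbf{H}_0(x)\,e^{i\omega t}, \tag{31.7}$$
$$\mathbf{E}(x,t) = \mathbf{E}_0(x)\,e^{i\omega t}. \tag{31.8}$$

Substituting (31.7) into (31.5), we find

$$\frac{d^2\mathbf{H}_0(x)}{dx^2} + k^2\,\mathbf{H}_0(x) = 0, \tag{31.9}$$

where k denotes the complex quantity

$$k = \frac{\omega}{c}\left(\varepsilon_0 - i\,\frac{4\pi\sigma}{\omega}\right)^{1/2}. \tag{31.10}$$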


We introduce the important notion of the complex dielectric permittivity
$\varepsilon(\omega)$ in a conducting medium, defining it by the relation

$$k = \frac{\omega}{c}\,[\varepsilon(\omega)]^{1/2} = k_0\,[\varepsilon(\omega)]^{1/2}, \tag{31.11}$$

where $k_0 = \omega/c$ is the wave number in vacuum.
From (31.10) it follows that

$$\varepsilon(\omega) = \varepsilon_0 - i\,\frac{4\pi\sigma}{\omega}. \tag{31.12}$$

This important formula establishes the relation between the dielectric
permittivity and the conductivity. Further, we write the complex quantity
$[\varepsilon(\omega)]^{1/2}$ in the form

$$[\varepsilon(\omega)]^{1/2} = n - i\kappa. \tag{31.13}$$
The quantities n and $\kappa$ are called, respectively, the refractive index and the
absorption coefficient. The sign of $\kappa$ is chosen in such a way that the
imaginary part of k is essentially negative (for $\kappa > 0$). The reason for
such a choice is obvious from what follows.
The solutions (31.7) have the form

$$\mathbf{H} = \mathbf{A}_1\, e^{-k_0\kappa x}\, e^{i(\omega t - k_0 n x)}, \tag{31.14}$$

where the vector $\mathbf{A}_1$ is the complex amplitude. Analogously, for the electric
field one can write

$$\mathbf{E} = \mathbf{A}_2\, e^{-k_0\kappa x}\, e^{i(\omega t - k_0 n x)}. \tag{31.15}$$

In the case of arbitrary direction of propagation the field vectors can be
written in the form

$$\mathbf{H} = \mathbf{A}_1\, e^{i(\omega t - \mathbf{k}\cdot\mathbf{r})}, \tag{31.16}$$
$$\mathbf{E} = \mathbf{A}_2\, e^{i(\omega t - \mathbf{k}\cdot\mathbf{r})}, \tag{31.17}$$


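where $\mathbf{k}$ is the wave vector, $\mathbf{k} = k\,\mathbf{k}_0$, and $\mathbf{k}_0$ is a unit vector
in the direction of propagation of the wave. The relation between the vectors
H and E can be found from (31.2):

$$\nabla\times\mathbf{E} = -i(\mathbf{k}\times\mathbf{E}) = -ik(\mathbf{k}_0\times\mathbf{E}) = -i\,\frac{\omega}{c}\,[\varepsilon(\omega)]^{1/2}(\mathbf{k}_0\times\mathbf{E}) = -i\,\frac{\omega}{c}\,\mathbf{H}.$$

Hence we find

$$\mathbf{H} = [\varepsilon(\omega)]^{1/2}\,(\mathbf{k}_0\times\mathbf{E}). \tag{31.18}$$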


In contrast to the case of the propagation of electromagnetic waves in 
vacuum, the amplitudes of the electric and magnetic field here turn out to be 
different. However, as in a vacuum, electromagnetic waves in a homogeneous 
and isotropic medium are transverse. Indeed, substituting (31.16) and (31.17) 
into (31.3) and (31.4), we find 


k-E=k-H=0. (31.19) 


Formulae (31.16) — (31.19) show that Maxwell’s equations allow solutions 
in the form of transverse plane waves with wave number k and an arbitrary 
frequency $\omega$ in a homogeneous and isotropic medium, as in vacuum. However,
damping of the wave according to an exponential law occurs in the medium.
The effectiveness of the damping is determined by the quantity $\kappa$. We find
the values of n and $\kappa$ from (31.12) and (31.13). Squaring (31.13), substituting
in (31.12) and separating the real and imaginary parts, we obtain

$$n^2 - \kappa^2 = \varepsilon_0, \qquad \kappa n = 2\pi\sigma/\omega. \tag{31.20}$$

The solution of (31.20) gives

$$n = \left[\tfrac{1}{2}\left(\varepsilon_0^2 + 16\pi^2\sigma^2/\omega^2\right)^{1/2} + \tfrac{1}{2}\varepsilon_0\right]^{1/2}, \tag{31.21}$$
$$\kappa = \left[\tfrac{1}{2}\left(\varepsilon_0^2 + 16\pi^2\sigma^2/\omega^2\right)^{1/2} - \tfrac{1}{2}\varepsilon_0\right]^{1/2}. \tag{31.22}$$

The signs of the roots are chosen in such a way that n and $\kappa$ have real values
and, in addition, that $\kappa$ is positive.
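As a cross-check of (31.21) and (31.22), the sketch below computes n and the absorption coefficient in two ways: from the closed formulae, and directly as the real and (negated) imaginary parts of the complex square root of (31.12). The numerical values of the permittivity, conductivity and frequency are hypothetical, and the function names are introduced here only for illustration.

```python
import cmath
import math


def n_kappa_closed(eps0, sigma, omega):
    """n and kappa from the closed formulae (31.21) and (31.22)."""
    root = math.sqrt(eps0 ** 2 + 16.0 * math.pi ** 2 * sigma ** 2 / omega ** 2)
    n = math.sqrt(0.5 * (root + eps0))
    kappa = math.sqrt(0.5 * (root - eps0))
    return n, kappa


def n_kappa_complex(eps0, sigma, omega):
    """n and kappa from sqrt(eps0 - i*4*pi*sigma/omega) = n - i*kappa."""
    s = cmath.sqrt(complex(eps0, -4.0 * math.pi * sigma / omega))
    return s.real, -s.imag


# Hypothetical values in Gaussian units, chosen only for illustration.
eps0, sigma, omega = 4.0, 1.0e9, 1.0e10

n, kappa = n_kappa_closed(eps0, sigma, omega)
print("closed formulae:", n, kappa)
print("complex root   :", n_kappa_complex(eps0, sigma, omega))
print("n^2 - kappa^2  :", n * n - kappa * kappa)
print("n * kappa      :", n * kappa, " vs ", 2.0 * math.pi * sigma / omega)
```

The principal branch of the complex square root has a positive real part, which corresponds to the choice of signs described above (n real and positive, absorption coefficient positive).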
Formulae (31.21) and (31.22) determine the law of dispersion in a conducting
medium. It should be noted that at sufficiently high frequencies the
conductivity $\sigma$ is also dependent on the frequency.
Let us consider the limiting cases of formulae (31.21) and (31.22). If the
inequality

$$\sigma \ll \varepsilon_0\,\omega/4\pi \tag{31.23}$$

is valid, then this means that the conduction current $\sigma\mathbf{E}$ is small in
comparison with the displacement current
$(\varepsilon_0/4\pi)(\partial\mathbf{E}/\partial t) \sim (\varepsilon_0\omega/4\pi)\mathbf{E}$. This holds for
ideal dielectrics (for which $\sigma \to 0$) as well as for real dielectrics possessing a
very low conductivity, and also for conductors of non-metallic type
(semi-conductors, electrolytes) at sufficiently high frequencies.

In this case one can simplify formulae (31.21) and (31.22), writing them 
in the form 


$$n \approx \varepsilon_0^{1/2}, \tag{31.24}$$
$$\kappa \approx 2\pi\sigma/\omega\varepsilon_0^{1/2}. \tag{31.25}$$

By virtue of the inequality (31.23), the following inequality also holds:

$$n \gg \kappa.$$
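The quality of the approximations (31.24) and (31.25) can be judged from a short numerical sketch. It compares the exact n and absorption coefficient with the approximate ones for several conductivities; all numerical values are hypothetical, and the absorption coefficient is evaluated from the second relation of (31.20) rather than from (31.22) simply to avoid round-off in a difference of nearly equal numbers.

```python
import math


def n_kappa(eps0, sigma, omega):
    """Exact n from (31.21); kappa from kappa * n = 2*pi*sigma/omega, eq. (31.20)."""
    root = math.sqrt(eps0 ** 2 + 16.0 * math.pi ** 2 * sigma ** 2 / omega ** 2)
    n = math.sqrt(0.5 * (root + eps0))
    return n, 2.0 * math.pi * sigma / (omega * n)


# Hypothetical weakly conducting dielectric (Gaussian units).
eps0, omega = 2.25, 1.0e12

for sigma in (1.0e6, 1.0e8, 1.0e10):
    small = 4.0 * math.pi * sigma / (eps0 * omega)  # must be << 1 for (31.23)
    n, kappa = n_kappa(eps0, sigma, omega)
    n_approx = math.sqrt(eps0)                                        # (31.24)
    kappa_approx = 2.0 * math.pi * sigma / (omega * math.sqrt(eps0))  # (31.25)
    print(f"4*pi*sigma/(eps0*omega) = {small:.2e}  "
          f"n/n_approx = {n / n_approx:.8f}  "
          f"kappa/kappa_approx = {kappa / kappa_approx:.8f}")
```

The relative error of both approximations is of the order of the square of the small parameter, so they deteriorate only when the conduction current becomes comparable with the displacement current.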
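If one can completely disregard the quantity $\kappa$ which, strictly speaking, is
equal to zero only in an ideal dielectric, then the medium is called
transparent. For a transparent medium the formulae for the field vectors are
simplified and assume the form

$$\mathbf{E} = \mathbf{A}\exp\left[i\left(\omega t - \frac{\omega}{c}\,\varepsilon_0^{1/2}\,\mathbf{k}_0\cdot\mathbf{r}\right)\right] = \mathbf{A}\exp[i(\omega t - \mathbf{k}\cdot\mathbf{r})], \tag{31.26}$$
$$\mathbf{H} = \varepsilon_0^{1/2}(\mathbf{k}_0\times\mathbf{A})\exp\left[i\left(\omega t - \frac{\omega}{c}\,\varepsilon_0^{1/2}\,\mathbf{k}_0\cdot\mathbf{r}\right)\right] = \varepsilon_0^{1/2}(\mathbf{k}_0\times\mathbf{A})\exp[i(\omega t - \mathbf{k}\cdot\mathbf{r})], \tag{31.27}$$
$$\mathbf{k} = \frac{\omega}{c}\,\varepsilon_0^{1/2}\,\mathbf{k}_0 = \frac{\omega}{v}\,\mathbf{k}_0, \tag{31.28}$$

where v is the phase velocity of propagation of the waves

$$v = c\,\varepsilon_0^{-1/2}. \tag{31.29}$$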
The last formula justifies the name of the quantity n. Indeed, as is well known
(see §37), the refractive index means the ratio of the velocity of propagation
of electromagnetic waves in vacuum to that in the medium:

$$n = \frac{c}{v} = \varepsilon_0^{1/2}. \tag{31.30}$$

Electromagnetic waves in a non-conducting medium differ from those in
vacuum only in the velocity of propagation, which is smaller by a factor of
$\varepsilon_0^{1/2}$ than the velocity of light in vacuum c. Moreover, the amplitudes of the
electric and magnetic fields are in the ratio

$$\frac{|\mathbf{H}|}{|\mathbf{E}|} = \varepsilon_0^{1/2} = n. \tag{31.31}$$
The above ratios, established by Maxwell, played an important role in the 
history of the development of the theory of the electromagnetic field. In 
particular, formula (31.30) established a relation between electromagnetic 
and optical phenomena, which before the work of Faraday and Maxwell 
were considered to be completely independent. 

Formula (31.30) was subjected to extensive experimental test for a large 
number of liquids and gases. Good agreement with experiment was found 
for a number of liquids and gases in the visible and infrared parts of the 
spectrum. However, the relation (31.30) is not satisfied at all for liquids 
(e.g. water) and gases whose molecules have a considerable intrinsic dipole 
moment. Other restrictions on the applicability of (31.30) are associated 
with the existence of absorption and dispersion.

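In the case of weak absorption it is convenient to write $\varepsilon^{1/2}$ in the form

$$[\varepsilon(\omega)]^{1/2} = (n^2 + \kappa^2)^{1/2}\exp[-i\arctan(\kappa/n)] \tag{31.32}$$

and to write the formulae for the field vectors in the form

$$\mathbf{E} = \mathbf{A}\exp\left[-\kappa\,\frac{\omega}{c}\,\mathbf{k}_0\cdot\mathbf{r}\right]\exp\left[i\left(\omega t - n\,\frac{\omega}{c}\,\mathbf{k}_0\cdot\mathbf{r}\right)\right], \tag{31.33}$$
$$\mathbf{H} \approx n\,(\mathbf{k}_0\times\mathbf{E})\exp[-i\arctan(\kappa/n)]. \tag{31.34}$$

The magnetic field strength differs in phase from the electric field strength
by an amount $\arctan(\kappa/n)$.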
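A small sketch may help to see how this phase difference depends on the conductivity. It evaluates the angle between the magnetic and electric field oscillations from n and the absorption coefficient of (31.20) and (31.21); the parameter values are hypothetical, and the function name `phase_lag` is introduced here for the example only.

```python
import math


def phase_lag(eps0, sigma, omega):
    """Phase difference arctan(kappa/n) between H and E, in radians."""
    root = math.sqrt(eps0 ** 2 + 16.0 * math.pi ** 2 * sigma ** 2 / omega ** 2)
    n = math.sqrt(0.5 * (root + eps0))            # (31.21)
    kappa = 2.0 * math.pi * sigma / (omega * n)   # from kappa * n = 2*pi*sigma/omega
    return math.atan2(kappa, n)


eps0, omega = 2.25, 1.0e10   # hypothetical values, Gaussian units
for sigma in (0.0, 1.0e7, 1.0e9, 1.0e11, 1.0e15):
    print(f"sigma = {sigma:8.1e}   phase lag = {math.degrees(phase_lag(eps0, sigma, omega)):8.4f} deg")
```

For an ideal dielectric the lag vanishes, while for increasing conductivity n and the absorption coefficient approach each other and the lag tends to 45 degrees.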


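In addition we shall find the time average of the electromagnetic field
energy density in a weakly absorbing medium. We have, obviously,

$$\overline{\frac{\varepsilon_0 E^2}{8\pi}} = \frac{\varepsilon_0 |\mathbf{A}|^2}{16\pi}\, e^{-2\kappa(\omega/c)\,\mathbf{k}_0\cdot\mathbf{r}},$$
$$\overline{\frac{H^2}{8\pi}} = \frac{(n^2+\kappa^2)\,|\mathbf{A}|^2}{16\pi}\, e^{-2\kappa(\omega/c)\,\mathbf{k}_0\cdot\mathbf{r}} \approx \frac{\varepsilon_0 |\mathbf{A}|^2}{16\pi}\, e^{-2\kappa(\omega/c)\,\mathbf{k}_0\cdot\mathbf{r}},$$

since $n^2 + \kappa^2 \approx n^2 \approx \varepsilon_0$. As in vacuum, the energies of the electric and
magnetic fields in the wave are nearly equal to each other.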

Let us now consider the opposite limiting case where the conduction 
current is large in comparison with the displacement current. Formally, this 
case is obtained immediately from the dispersion formulae for low frequencies 
and large conductivity. We have, obviously, 


$$\kappa = (2\pi\sigma/\omega)^{1/2}, \tag{31.35}$$
$$n = \kappa = (2\pi\sigma/\omega)^{1/2}. \tag{31.36}$$


As is shown by comparison of (31.35) with (30.4), the attenuation of the 
field in a medium takes place over the thickness of its skin depth. This repre- 
sents the case of strong absorption. 

It is clear, however, that these formulae have a somewhat relative charac- 
ter, since it is difficult to speak of the propagation of waves if they are 
damped over the thickness of the skin layer. 
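The statement that the damping length coincides with the skin depth can be checked numerically. In the sketch below the damping length is taken as $c/\omega\kappa$, the distance over which the factor $e^{-k_0\kappa x}$ of (31.14) falls by e, and it is compared with the skin depth, for which the usual Gaussian form $c/(2\pi\sigma\omega)^{1/2}$ with unit permeability is assumed (formula (30.4) is not reproduced in this passage). The metal-like parameter values are illustrative assumptions.

```python
import math

C_LIGHT = 2.998e10  # cm/s


def damping_length(eps0, sigma, omega):
    """Distance c/(omega*kappa) over which the wave amplitude falls by a factor e."""
    root = math.sqrt(eps0 ** 2 + 16.0 * math.pi ** 2 * sigma ** 2 / omega ** 2)
    n = math.sqrt(0.5 * (root + eps0))            # (31.21)
    kappa = 2.0 * math.pi * sigma / (omega * n)   # from (31.20)
    return C_LIGHT / (omega * kappa)


def skin_depth(sigma, omega):
    """Skin depth, assuming delta = c / sqrt(2*pi*sigma*omega) with mu = 1."""
    return C_LIGHT / math.sqrt(2.0 * math.pi * sigma * omega)


eps0, sigma = 1.0, 5.0e17   # illustrative metal-like values (Gaussian units)
for omega in (1.0e4, 1.0e8, 1.0e12):
    print(f"omega = {omega:.1e}  damping length = {damping_length(eps0, sigma, omega):.6e} cm"
          f"  skin depth = {skin_depth(sigma, omega):.6e} cm")
```

The two lengths agree as long as the conduction current dominates; they separate only when the displacement current ceases to be negligible.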

In conclusion, let us write the energy conservation law for a medium with 
frequency dispersion. Assuming the medium to be in a state of thermody- 
namic equilibrium at constant temperature, we can write for the change of 
internal energy per unit volume 





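$$\frac{dU_0}{dt} = \frac{dQ}{dt} + \frac{dW}{dt} = \frac{dQ}{dt} + \frac{1}{4\pi}\,\mathbf{E}\cdot\frac{\partial\mathbf{D}}{\partial t}. \tag{31.37}$$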


We average eq. (31.37) over time, assuming that the temperature is main- 
tained constant. Then, on the average, the energy of the medium in an 
external periodic field remains constant, so that for the heat released we 
obtain 


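$$\overline{\frac{dQ}{dt}} = -\frac{1}{4\pi}\,\overline{\mathbf{E}\cdot\frac{\partial\mathbf{D}}{\partial t}}.$$

Setting

$$\mathbf{D} = [\varepsilon_1(\omega) + i\varepsilon_2(\omega)]\,(\mathbf{E}_0\cos\omega t + i\mathbf{E}_0\sin\omega t),$$

we have

$$\overline{\frac{dQ}{dt}} = \frac{\varepsilon_2\,\omega E_0^2}{8\pi}. \tag{31.38}$$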


Thus release of heat determined by the quantity $\varepsilon_2(\omega)$ occurs in the medium.
Formula (31.38) makes sense if the heat released can be removed and the
temperature of the medium maintained constant. Since heat is emitted only,
Q is always non-negative and, correspondingly,

$$\varepsilon_2 > 0. \tag{31.39}$$

The sign of $\varepsilon_1$ can be either positive or negative.


§32. Dispersion relations (the Kramers—Kronig formulae) 


We have already pointed out in §6 that the phenomenon of dispersion 
occurs at high frequencies, when the frequency of the radiation is com- 
parable with characteristic atomic frequencies and its wavelength is compara- 
ble with the dimensions of spatial non-uniformities in the medium. 

In the case of a homogeneous and isotropic medium the general formula 
(6.5) can be written in the form 


$$\mathbf{D}(\mathbf{r},t) = \int_{-\infty}^{t} dt' \int \varepsilon(|\mathbf{r}-\mathbf{r}'|,\, t-t')\,\mathbf{E}(\mathbf{r}',t')\,dV'. \tag{32.1}$$

The function $\varepsilon(|\mathbf{r}-\mathbf{r}'|,\, t-t')$ depends only on the difference between the
coordinates, since in a homogeneous medium there are no unique points. A
contribution to D is given by the change in polarization which is transferred
in a time $t - t'$ over a distance $|\mathbf{r}-\mathbf{r}'|$. The integral is taken only with respect
to the past, in order to satisfy the principle of causality. It is obvious that
$|\mathbf{r}-\mathbf{r}'| < c(t-t')$, but for simplicity we shall not take into account this
restriction *.
We expand the electric field and induction in Fourier integrals:


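$$\mathbf{E}(\mathbf{r},t) = \frac{1}{(2\pi)^4}\int \mathbf{E}(\mathbf{k},\omega)\, e^{i(\omega t - \mathbf{k}\cdot\mathbf{r})}\, d\mathbf{k}\, d\omega, \tag{32.2}$$
$$\mathbf{D}(\mathbf{r},t) = \frac{1}{(2\pi)^4}\int \mathbf{D}(\mathbf{k},\omega)\, e^{i(\omega t - \mathbf{k}\cdot\mathbf{r})}\, d\mathbf{k}\, d\omega. \tag{32.3}$$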


Then, substituting into (32.1), we find

$$\mathbf{D}(\mathbf{k},\omega) = \varepsilon(\mathbf{k},\omega)\,\mathbf{E}(\mathbf{k},\omega), \tag{32.4}$$


* See M.A.Leontovich, Soviet Physics JETP 13 (1961) 634. 




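where

$$\varepsilon(\mathbf{k},\omega) = \int_{-\infty}^{t} dt' \int \varepsilon(|\mathbf{r}-\mathbf{r}'|,\, t-t')\, e^{i\mathbf{k}\cdot(\mathbf{r}-\mathbf{r}')}\, e^{-i\omega(t-t')}\, dV' = \int_{0}^{\infty} d\tau \int \varepsilon(R,\tau)\, e^{i(\mathbf{k}\cdot\mathbf{R}-\omega\tau)}\, d\mathbf{R}. \tag{32.5}$$

Here new variables $\mathbf{R} = \mathbf{r} - \mathbf{r}'$ and $\tau = t - t'$ are introduced. The function
$\varepsilon(\mathbf{k},\omega)$ is not the Fourier transform of $\varepsilon(R,\tau)$, since the integral with respect
to $\tau$ is taken in the range from zero to infinity. It is the function $\varepsilon(\mathbf{k},\omega)$ (and
not $\varepsilon(R,\tau)$) which is of basic importance, because, according to (32.4), it
relates the vectors D and E. It is called the dielectric permittivity, and depends
on the frequency and the wave vector.

We note that the dependence of $\varepsilon(\mathbf{k},\omega)$ on the wave vector $\mathbf{k}$ is associated
with the dependence of $\varepsilon(R,\tau)$ on the space coordinates $\mathbf{R}$, i.e. with the
taking into account of spatial dispersion.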

If the wavelength is large compared with the spatial non-uniformities,
then one can assume the dielectric constant to be a function of $t - t'$ only
and, instead of (32.5), one can write the simpler relation

$$\varepsilon(\omega) = \int_{0}^{\infty} \varepsilon(\tau)\, e^{-i\omega\tau}\, d\tau, \tag{32.6}$$

in which the dielectric constant depends only on the frequency. In this case
the relation (32.4) is simplified and becomes

$$\mathbf{D}(\omega) = \varepsilon(\omega)\,\mathbf{E}(\omega). \tag{32.7}$$


We note that if the dielectric constant depends on the frequency, then only
eq. (32.7) represents the correct form of notation of the constitutive
equation. The relation

$$\mathbf{D}(\mathbf{r},t) = \varepsilon(\omega)\,\mathbf{E}(\mathbf{r},t)$$

(where D and E are the vectors themselves and not their Fourier amplitudes,
as in (32.4)), which is often presented in old text-books, makes no sense. If
$\varepsilon$ is a function of the frequency, then eq. (32.4) must contain quantities
referring to this frequency.

The function $\varepsilon(\mathbf{k},\omega)$ is, in general, complex and can be written in the form

$$\varepsilon(\mathbf{k},\omega) = \varepsilon^{\mathrm{re}}(\mathbf{k},\omega) + i\,\varepsilon^{\mathrm{im}}(\mathbf{k},\omega).$$
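It is obvious that the real and imaginary parts are

$$\varepsilon^{\mathrm{re}}(\mathbf{k},\omega) = \frac{\varepsilon(\mathbf{k},\omega) + \varepsilon^{*}(\mathbf{k},\omega)}{2}, \qquad \varepsilon^{\mathrm{im}}(\mathbf{k},\omega) = \frac{\varepsilon(\mathbf{k},\omega) - \varepsilon^{*}(\mathbf{k},\omega)}{2i}. \tag{32.8}$$

Moreover, from the definition (32.5) it follows that

$$\varepsilon(\mathbf{k},\omega) = \varepsilon^{*}(-\mathbf{k},-\omega). \tag{32.9}$$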


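Substituting the expression (32.5) into (32.8) and taking into account (32.9),
we find

$$\varepsilon^{\mathrm{re}}(\mathbf{k},\omega) = \frac{1}{2}\int_{-\infty}^{\infty} d\tau \int \varepsilon(|\mathbf{R}|,|\tau|)\, e^{i(\mathbf{k}\cdot\mathbf{R}-\omega\tau)}\, d\mathbf{R}. \tag{32.10}$$

Analogously,

$$\begin{aligned}
\varepsilon^{\mathrm{im}}(\mathbf{k},\omega) &= \frac{1}{2i}\int_{0}^{\infty} d\tau \int \varepsilon(|\mathbf{R}|,|\tau|)\, e^{i(\mathbf{k}\cdot\mathbf{R}-\omega\tau)}\, d\mathbf{R} - \frac{1}{2i}\int_{-\infty}^{0} d\tau \int \varepsilon(|\mathbf{R}|,|\tau|)\, e^{i(\mathbf{k}\cdot\mathbf{R}-\omega\tau)}\, d\mathbf{R} \\
&= \frac{1}{2i}\int_{-\infty}^{\infty} d\tau \int \varepsilon(|\mathbf{R}|,|\tau|)\, e^{i(\mathbf{k}\cdot\mathbf{R}-\omega\tau)}\, \mathrm{sgn}\,\tau\; d\mathbf{R},
\end{aligned} \tag{32.11}$$

where the sign function $\mathrm{sgn}\,\tau$ is determined according to formula (III.16).
Since the real and imaginary parts of $\varepsilon(\mathbf{k},\omega)$ are expressed in terms of
$\varepsilon(|\mathbf{R}|,|\tau|)$ according to formulae (32.10) and (32.11), they can be interrelated.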

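That is, since formula (32.11) represents the Fourier transform of
$(\mathrm{sgn}\,\tau\cdot\varepsilon)$, then taking the inverse of this transform we have

$$\mathrm{sgn}\,\tau\cdot\varepsilon(|\mathbf{R}|,|\tau|) = \frac{2i}{(2\pi)^4}\int \varepsilon^{\mathrm{im}}(\mathbf{k}',\omega')\, e^{-i(\mathbf{k}'\cdot\mathbf{R}-\omega'\tau)}\, d\mathbf{k}'\, d\omega'. \tag{32.12}$$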


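Substituting the value of $\varepsilon(|\mathbf{R}|,|\tau|)$ from (32.12) into (32.10), we find

$$\begin{aligned}
\varepsilon^{\mathrm{re}}(\mathbf{k},\omega) &= \frac{i}{(2\pi)^4}\int d\tau \int d\omega' \int d\mathbf{k}' \int d\mathbf{R}\; \varepsilon^{\mathrm{im}}(\mathbf{k}',\omega')\, \mathrm{sgn}\,\tau\; e^{i(\mathbf{k}-\mathbf{k}')\cdot\mathbf{R}}\, e^{-i(\omega-\omega')\tau} \\
&= \frac{i}{(2\pi)^4}\int d\omega' \int d\mathbf{k}'\, \varepsilon^{\mathrm{im}}(\mathbf{k}',\omega') \int e^{-i(\omega-\omega')\tau}\, \mathrm{sgn}\,\tau\; d\tau \int e^{i(\mathbf{k}-\mathbf{k}')\cdot\mathbf{R}}\, d\mathbf{R} \\
&= \frac{i}{2\pi}\int d\omega' \int d\tau\; \mathrm{sgn}\,\tau\; e^{-i(\omega-\omega')\tau} \int \varepsilon^{\mathrm{im}}(\mathbf{k}',\omega')\, \delta(\mathbf{k}-\mathbf{k}')\, d\mathbf{k}' \\
&= \frac{i}{2\pi}\int d\omega'\, \varepsilon^{\mathrm{im}}(\mathbf{k},\omega') \int_{-\infty}^{\infty} e^{-i(\omega-\omega')\tau}\, \mathrm{sgn}\,\tau\; d\tau.
\end{aligned} \tag{32.13}$$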


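The inner integral can be calculated by making use of formula (III.20) for
the sign function (cf. Volume 1, Appendix III)

$$\frac{1}{2\pi}\int_{-\infty}^{\infty} \mathrm{sgn}\,\tau\; e^{-i(\omega-\omega')\tau}\, d\tau = -\frac{i}{\pi}\,\mathrm{P}\,\frac{1}{\omega-\omega'}.$$

Hence

$$\varepsilon^{\mathrm{re}}(\mathbf{k},\omega) = \frac{1}{\pi}\,\mathrm{P}\!\int_{-\infty}^{\infty} \frac{\varepsilon^{\mathrm{im}}(\mathbf{k},\omega')}{\omega-\omega'}\, d\omega'. \tag{32.14}$$

The integral in formula (32.14) is taken in the sense of a principal value.
Analogous calculations (the inversion of formula (32.10) and the substitution
into (32.11)) give

$$\varepsilon^{\mathrm{im}}(\mathbf{k},\omega) = -\frac{1}{\pi}\,\mathrm{P}\!\int_{-\infty}^{\infty} \frac{\varepsilon^{\mathrm{re}}(\mathbf{k},\omega')}{\omega-\omega'}\, d\omega'. \tag{32.15}$$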


Formulae (32.14) and (32.15) are called the Kramers-Kronig formulae or
dispersion relations. They relate the real and imaginary parts of the dielectric
constant, in other words, the characteristics of the processes of dispersion
and absorption. All the above calculations can be applied to the constitutive
equation of $\mu(\mathbf{k},\omega)$, and the same relations can be written for the real and
imaginary parts of the magnetic permeability. Thus, the general principle
according to which the real and imaginary parts of the basic quantities which
characterize the properties of a medium (the permittivity $\varepsilon(\mathbf{k},\omega)$ and the
permeability $\mu(\mathbf{k},\omega)$) are correlated is established. Since, according to
(31.12), the electrical conductivity is connected with the dielectric constant,
the result obtained holds equally for electrical conductivity.
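The dispersion relations can be tested numerically on a simple model. The sketch below assumes a purely hypothetical response kernel $e^{-\gamma\tau}$ and the time factor $e^{i\omega t}$, so that by (32.6) the model permittivity is $1/(\gamma + i\omega)$, with real part $\gamma/(\gamma^2+\omega^2)$ and imaginary part $-\omega/(\gamma^2+\omega^2)$. The relations are taken in the form: real part equals $(1/\pi)\,\mathrm{P}\!\int \varepsilon^{\mathrm{im}}(\omega')/(\omega-\omega')\,d\omega'$, and imaginary part equals minus the same transform of the real part. All function names are introduced here for illustration only.

```python
import math

GAMMA = 1.0  # decay rate of the hypothetical kernel exp(-GAMMA * tau)


def eps_re(w):
    """Real part of 1/(GAMMA + i*w)."""
    return GAMMA / (GAMMA ** 2 + w ** 2)


def eps_im(w):
    """Imaginary part of 1/(GAMMA + i*w)."""
    return -w / (GAMMA ** 2 + w ** 2)


def pv_transform(f, w, n=20000):
    """(1/pi) * principal-value integral of f(w') / (w - w') over the whole axis.

    The points w' = w - u and w' = w + u are paired, which removes the
    singularity at w' = w; the substitution u = t/(1 - t) maps 0 < u < infinity
    onto 0 < t < 1, where the midpoint rule is applied.
    """
    h = 1.0 / n
    total = 0.0
    for j in range(n):
        t = (j + 0.5) * h
        u = t / (1.0 - t)
        total += (f(w - u) - f(w + u)) / u / (1.0 - t) ** 2
    return total * h / math.pi


for w in (0.0, 0.7, 3.0):
    print(f"w = {w:4.1f}  re: {eps_re(w):+.6f} vs {pv_transform(eps_im, w):+.6f}"
          f"   im: {eps_im(w):+.6f} vs {-pv_transform(eps_re, w):+.6f}")
```

In practice the same transform is applied to a measured absorption spectrum in order to reconstruct the dispersive part, which is the use of the relations described in the text.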

The Kramers—Kronig formulae are among the most general relations of 
electrodynamics. Indeed, in deriving them only two assumptions have been 
made: 

(1) the existence of the causal relationship: the value of the function D at 
an instant ¢ can be determined by polarization processes which occurred only 
at preceding instants of time, 

(2) the possibility of the expansion of all the functions in a Fourier integral. 

The latter assumption is in practice fulfilled for all functions describing
physical processes. Thus, the Kramers-Kronig formulae establish the relation
between quantities which characterize dispersion processes (the real part of
$\varepsilon$, $\mu$ or $\sigma$) and absorption processes (the imaginary part of these quantities)
in the most general form. Dispersion processes and absorption processes turn
out to be interrelated. Illustrations of this general proposition are encountered
in many different branches of physics, in particular in quantum mechanics
and the theory of elementary particles (see Part V). Besides the importance
of this principle, the Kramers-Kronig formulae are of great practical
importance. The imaginary part of the dielectric constant is connected with the
absorption of energy in the medium (see (31.13)) and can be found relatively
simply by experiment, while the real part $\varepsilon^{\mathrm{re}}$ is found from the
Kramers-Kronig formula.


§33. The electromagnetic field in a medium with spatial and time dispersion 


The solution of Maxwell’s equations which we have already obtained in the 
form of transverse electromagnetic waves turns out to be not only qualita- 
tively but also quantitatively inapplicable in the case of the propagation of 
electromagnetic waves in a medium with spatial dispersion. 

As we have emphasized in §6, the phenomenon of spatial dispersion 
proves to be important in the case where the wavelength becomes comparable 
with the size of the spatial non-uniformities of the medium. Then the value 
of the induction at a given point r of space turns out, according to (6.5), to 
be dependent on the value of the field vector and the properties of the 
medium in the region of space surrounding it (the whole set of points r’). 

In this case one usually speaks of the “non-local connection” between the 
corresponding quantities D and E. 

Until recently it was assumed that in macroscopically homogeneous media 




the phenomenon of spatial dispersion could be disregarded. However, it turns 
out that spatial dispersion plays an important role in a very large class of 
macroscopically homogeneous media called plasma-like media. This class 
comprises the plasma itself (whose properties will be discussed in ch. 6) as 
well as conductors (metals) and semi-conductors with a high electric conduc- 
tivity, when one has to investigate their interaction with high-frequency fields. 

In such media there are characteristic non-uniformities of a relatively
large scale. As an example one can mention the mean free path of electrons
in metals, which will be calculated in Part VI. The mean free path amounts
to about $7\times10^{-6}$ cm – $10^{-5}$ cm.

Another example of spatial non-uniformity in connection with a plasma 
will be presented in § 46. 

The important part played by plasma-like media in contemporary physics 
has made the study of the properties of media with spatial dispersion very 
relevant. 

Let us consider a macroscopically homogeneous and isotropic,
non-magnetizable ($\mu \approx 1$) medium without currents or free charges. We seek a
solution of Maxwell’s equations in the form of expansions in a Fourier 
integral. This method has now become one of the most popular and fre- 
quently used methods. We write the expansions in the form 





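$$\mathbf{D}(\mathbf{r},t) = \frac{1}{(2\pi)^4}\int e^{i(\mathbf{k}\cdot\mathbf{r}-\omega t)}\,\mathbf{D}(\mathbf{k},\omega)\, d\mathbf{k}\, d\omega, \tag{33.1}$$
$$\mathbf{E}(\mathbf{r},t) = \frac{1}{(2\pi)^4}\int e^{i(\mathbf{k}\cdot\mathbf{r}-\omega t)}\,\mathbf{E}(\mathbf{k},\omega)\, d\mathbf{k}\, d\omega, \tag{33.2}$$
$$\mathbf{B}(\mathbf{r},t) = \frac{1}{(2\pi)^4}\int e^{i(\mathbf{k}\cdot\mathbf{r}-\omega t)}\,\mathbf{B}(\mathbf{k},\omega)\, d\mathbf{k}\, d\omega, \tag{33.3}$$
$$\mathbf{H}(\mathbf{r},t) = \frac{1}{(2\pi)^4}\int e^{i(\mathbf{k}\cdot\mathbf{r}-\omega t)}\,\mathbf{H}(\mathbf{k},\omega)\, d\mathbf{k}\, d\omega. \tag{33.4}$$

Substituting these expansions into Maxwell's equations, we find

$$i\,[\mathbf{k}\times\mathbf{H}(\mathbf{k},\omega)] = -i\,\frac{\omega}{c}\,\mathbf{D}(\mathbf{k},\omega), \tag{33.5}$$
$$i\,[\mathbf{k}\times\mathbf{E}(\mathbf{k},\omega)] = i\,\frac{\omega}{c}\,\mathbf{B}(\mathbf{k},\omega), \tag{33.6}$$
$$\mathbf{k}\cdot\mathbf{D} = 0, \tag{33.7}$$
$$\mathbf{k}\cdot\mathbf{B} = 0. \tag{33.8}$$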


Thus the problem of integration of the system of equations in partial deri- 
vatives is reduced to the problem of the solution of a system of algebraic 
equations and the subsequent inversion of the Fourier transformation 
formulae. 

In order that these relations assume a practical meaning, one has to write 
constitutive equations for D(k,w) and E(k,w), and B(k,w) and H(k,w) 
respectively. 

We introduce the dielectric constant connecting D(k,w) and E(k,w). We 
note that, since the dielectric constant depends on k, even for an isotropic 
medium it is a tensor and not a scalar. 

That is, since it can depend on the vector k, and the direction of the
vector k is the only direction defined in the isotropic medium, in order to
determine the dielectric constant we have to form a symmetric tensor of the
second rank containing only the tensor $\delta_{ij}$ and the components of the vector k.

The only second-rank symmetric tensor $\varepsilon_{ij}$ which can be made up of these
quantities has the form

$$\varepsilon_{ij}(\mathbf{k},\omega) = \varepsilon_{\perp}(\mathbf{k},\omega)\left[\delta_{ij} - \frac{k_i k_j}{k^2}\right] + \varepsilon_{\parallel}(\mathbf{k},\omega)\,\frac{k_i k_j}{k^2}. \tag{33.9}$$

If the direction of the wave vector is chosen as the z-axis, then one can write
for $\varepsilon_{ij}$ the explicit expression

$$\varepsilon_{ij} = \begin{pmatrix} \varepsilon_{\perp} & 0 & 0 \\ 0 & \varepsilon_{\perp} & 0 \\ 0 & 0 & \varepsilon_{\parallel} \end{pmatrix}. \tag{33.10}$$

An isotropic medium with spatial dispersion is characterized by two dielectric
constants: the longitudinal one $\varepsilon_{\parallel}$ and the perpendicular one $\varepsilon_{\perp}$. The
reason for such a terminology will be clear from what follows. It goes without
saying that $\varepsilon_{\perp}$ and $\varepsilon_{\parallel}$ are functions of k and $\omega$.

If spatial dispersion is absent, then the dielectric constant $\varepsilon(t-t')$ depends
only on the time. Correspondingly, $\varepsilon_{ij}$ in (33.9) must depend only on $\omega$ and
not on k. It can then be assumed that

$$\varepsilon_{\perp}(\omega) = \varepsilon_{\parallel}(\omega) = \varepsilon(\omega). \tag{33.11}$$

Then

$$\varepsilon_{ij} = \varepsilon(\omega)\,\delta_{ij}. \tag{33.12}$$

The constitutive equation for the Fourier components will have the form

$$D_i(\mathbf{k},\omega) = \varepsilon_{ij}(\mathbf{k},\omega)\,E_j(\mathbf{k},\omega). \tag{33.13}$$
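Forming the vector product of (33.6) and k, we have

$$-\mathbf{k}\times(\mathbf{k}\times\mathbf{E}) = k^2\mathbf{E} - \mathbf{k}(\mathbf{k}\cdot\mathbf{E}) = -\frac{\omega}{c}\,(\mathbf{k}\times\mathbf{B}).$$

Substituting into the right-hand side its value from (33.5), we find

$$k^2\mathbf{E} - \mathbf{k}(\mathbf{k}\cdot\mathbf{E}) = \frac{\omega^2}{c^2}\,\mathbf{D}, \tag{33.14}$$

or

$$k^2 E_i - k_i k_j E_j = \frac{\omega^2}{c^2}\,D_i.$$

Making use of the constitutive equation (33.13), we finally obtain

$$\left[k^2\delta_{ij} - k_i k_j - \frac{\omega^2}{c^2}\,\varepsilon_{ij}\right]E_j = 0. \tag{33.15}$$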
Formula (33.15) determines a system of linear algebraic equations. In order
that this system should have a non-trivial solution it is necessary that the
following condition be fulfilled:

$$\left|\,k^2\delta_{ij} - k_i k_j - \frac{\omega^2}{c^2}\,\varepsilon_{ij}\,\right| = 0. \tag{33.16}$$

This expression determines the dependence of $\varepsilon_{ij}$ on k and $\omega$, i.e. represents
a dispersion equation. We note, first of all, that taking into account the
definition (33.9) and choosing the direction of the vector k as the z-axis
($k_x = k_y = 0$, $k_z = k$), one can write the equation in the form of a system of
algebraic equations

$$\frac{\omega^2}{c^2}\,\varepsilon_{\perp}(\mathbf{k},\omega) - k^2 = 0, \tag{33.17}$$
$$\varepsilon_{\parallel}(\mathbf{k},\omega) = 0. \tag{33.18}$$


Eqs. (33.18) and (33.17) are independent dispersion equations for electro- 
magnetic waves which can propagate in a medium with spatial dispersion. We 
see that in such a medium the existence of two independent wave processes 
is possible: transverse waves, for which the dispersion law (33.17) is valid, 
and longitudinal waves, for which the dispersion law is given by formula 
(33.18). For transverse waves the electric field vector E has components
$E_x$ and $E_y$ which are different from zero; for longitudinal waves the field
has only one component: $E_z$. In order to avoid misunderstanding, we point
out that for longitudinal waves the condition (33.7) is fulfilled, so that the 
vector D is perpendicular to the vector k, i.e. to the direction of propagation. 
However, the vectors D and E, by virtue of (33.14), are not parallel to each 
other. The appearance of longitudinal electromagnetic waves, which are 
often called polarization waves, appears to be a specific effect associated 
with the spatial dispersion of a medium. Longitudinal electromagnetic waves 
in a medium with spatial dispersion have a simple meaning. Consider a 
medium with an equilibrium but non-uniform charge distribution and assume 
that such a medium is acted upon by an electromagnetic field with a wave- 
length which is comparable to the size of the non-uniformity. The field 
causes a displacement of the charges (i.e. produces a polarization), violating 
the equilibrium distribution. As a result, charge oscillations which are very 
similar to elastic sound waves in isotropic media arise in the medium. 
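The splitting of the dispersion equation (33.16) into the two independent conditions (33.17) and (33.18) can be verified by a short calculation. The sketch below builds the tensor (33.9) for an arbitrary direction of the wave vector, forms the matrix of the system (33.15) and compares its determinant with the product $-(k^2 - q\varepsilon_{\perp})^2\, q\,\varepsilon_{\parallel}$, where q is shorthand (introduced here) for $\omega^2/c^2$. The numerical values are arbitrary test numbers, not data for any real medium.

```python
def eps_tensor(k, eps_perp, eps_par):
    """Tensor (33.9) for a wave vector k = (kx, ky, kz)."""
    k2 = sum(c * c for c in k)
    return [[eps_perp * ((i == j) - k[i] * k[j] / k2) + eps_par * k[i] * k[j] / k2
             for j in range(3)] for i in range(3)]


def det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def dispersion_det(k, q, eps_perp, eps_par):
    """Left-hand side of (33.16); q stands for omega^2 / c^2."""
    k2 = sum(c * c for c in k)
    eps = eps_tensor(k, eps_perp, eps_par)
    m = [[k2 * (i == j) - k[i] * k[j] - q * eps[i][j] for j in range(3)]
         for i in range(3)]
    return det3(m)


k, q = (1.0, 2.0, 2.0), 0.5            # arbitrary test values, k^2 = 9
eps_perp, eps_par = 3.0, 2.0
k2 = sum(c * c for c in k)

print("determinant        :", dispersion_det(k, q, eps_perp, eps_par))
print("factorized product :", -(k2 - q * eps_perp) ** 2 * q * eps_par)
print("longitudinal root  :", dispersion_det(k, q, eps_perp, 0.0))
print("transverse root    :", dispersion_det(k, q, k2 / q, eps_par))
```

The determinant vanishes either when the longitudinal permittivity is zero or when the transverse condition is met, independently of the direction chosen for the wave vector.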

A more concrete quantitative description of the propagation of longitudinal
waves will be developed in §46 for the case of a plasma. There we
shall obtain explicit expressions for $\varepsilon_{\perp}$ and $\varepsilon_{\parallel}$ and, by means of eqs. (33.18)
and (33.17), the law of dispersion in a plasma, and also obtain $\omega_{\perp}(\mathbf{k})$ and
$\omega_{\parallel}(\mathbf{k})$ for transverse and longitudinal waves.

In the case of a spatially homogeneous medium the dispersion equation
goes over into the equation

$$\frac{\omega^2}{c^2}\,\varepsilon(\omega) - k^2 = 0, \tag{33.19}$$

which is the same as (31.11) and which determines unambiguously the wave
frequency $\omega(k)$.

Since, by virtue of (33.11), in such a medium $\varepsilon_{\parallel} = \varepsilon(\omega) \neq 0$, the only
possible solution of eq. (31.6) reads

$$E_{\parallel} = E_z = 0.$$


Thus, in correspondence with the results of §31, only transverse waves can 
propagate in a medium without spatial dispersion. 

We note that we have obtained the dispersion equations and other basic 
properties of waves without finding the field vectors as functions of the 
coordinates and time. The latter require the inversion of the Fourier integrals, 
which, according to (33.1) — (33.4) and (33.13), can actually be carried out 
if the dispersion law is known in explicit form. In practice it is often just the 
dispersion law which is of basic interest. The calculation of field vectors 
according to formulae (33.1) — (33.4), which presents a major difficulty, 
can often be avoided. This fact represents the major advantage of using the 
Fourier integral method. 

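In conclusion we shall find the relation between the tensor $\varepsilon_{ij}$ and the
electrical conductivity tensor $\sigma_{ij}$.
Since the relation

$$\frac{\partial\mathbf{D}}{\partial t} = \frac{\partial\mathbf{E}}{\partial t} + 4\pi\,\mathbf{j}$$

is of universal character, then, substituting into it the expansions of the
corresponding vectors in the Fourier integral and making use of the
expression (33.13) and the definition of the electrical conductivity tensor we
obtain

$$\varepsilon_{ij} = \delta_{ij} + i\,\frac{4\pi\sigma_{ij}}{\omega}. \tag{33.20}$$

This formula is a generalization of formula (31.12) which was obtained for
the case where spatial dispersion is absent; the sign of the imaginary term
differs from that in (31.12) because the time dependence in the expansions
(33.1) – (33.4) is taken as $e^{-i\omega t}$ rather than $e^{i\omega t}$.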


§34. The dispersion of light 


We have discussed the effect of the scattering of electromagnetic waves by 
free and bound electrons in §36 of Part I. We shall now consider this effect 
from a somewhat different point of view. Namely, we shall calculate the 







dielectric constant of a medium containing electrons which scatter the 
radiation. 

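As the simplest medium we shall consider a rarefied gas in which the
polarization P is equal to

$$\mathbf{P} = N\mathbf{d},$$

where N is the number of charges per unit volume which produce the
scattering, and d is the dipole moment acquired by each of these. According to
(36.2) of Part I, for d we have

$$\mathbf{d} = \frac{e^2}{m}\,\frac{\mathbf{E}}{\omega_0^2 - \omega^2 + i\omega\gamma},$$

where $\omega_0$ is the natural frequency of the oscillator.
The field E in a rarefied gas is equal to the external field. Hence

$$\mathbf{D} = \mathbf{E} + 4\pi\mathbf{P} = \mathbf{E}\left(1 + \frac{4\pi N e^2}{m}\,\frac{1}{\omega_0^2 - \omega^2 + i\gamma\omega}\right) \tag{34.1}$$

and, consequently, making use of the definition (31.13), one can write for the
dielectric constant

$$\varepsilon = (n - i\kappa)^2 = 1 + \frac{4\pi e^2 N}{m}\,\frac{1}{\omega_0^2 - \omega^2 + i\gamma\omega}. \tag{34.2}$$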

For a sufficiently rarefied gas the second term of (34.2), proportional to 
the number of electrons per unit volume, is small in absolute value, compared 
with unity, at all frequencies. Hence one can write approximately 


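$$n - i\kappa \approx 1 + \frac{2\pi e^2 N}{m}\,\frac{1}{\omega_0^2 - \omega^2 + i\gamma\omega}.$$

Separating the real and imaginary parts, we find

$$n = 1 + \frac{2\pi e^2 N}{m}\,\frac{\omega_0^2 - \omega^2}{(\omega_0^2 - \omega^2)^2 + \gamma^2\omega^2}, \tag{34.3}$$
$$\kappa = \frac{2\pi e^2 N}{m}\,\frac{\gamma\omega}{(\omega_0^2 - \omega^2)^2 + \gamma^2\omega^2}. \tag{34.4}$$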


The formulae obtained above determine clearly measurable quantities:
the refractive index and the absorption coefficient of the radiation. If we
took a system of oscillators with natural frequencies $\omega_{01}, \omega_{02}, \ldots, \omega_{0i}, \ldots$ as
a model of the gas, then in formulae (34.3) and (34.4) we would have to carry
out the summation over different kinds of oscillators.
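The accuracy of the rarefied-gas expressions (34.3) and (34.4) can be examined with a short sketch that compares them with the exact values obtained from the complex square root of (34.2). The parameter `wp2` is shorthand introduced here for the combination $4\pi e^2 N/m$, and the frequencies are given in arbitrary units; all values are hypothetical.

```python
import cmath


def eps_oscillator(omega, omega0, gamma, wp2):
    """Permittivity (34.2); wp2 stands for 4*pi*e^2*N/m (shorthand, an assumption)."""
    return 1.0 + wp2 / (omega0 ** 2 - omega ** 2 + 1j * gamma * omega)


def n_kappa_exact(omega, omega0, gamma, wp2):
    """n and kappa from the complex square root, eps^(1/2) = n - i*kappa."""
    s = cmath.sqrt(eps_oscillator(omega, omega0, gamma, wp2))
    return s.real, -s.imag


def n_kappa_rarefied(omega, omega0, gamma, wp2):
    """Approximate n and kappa from (34.3) and (34.4)."""
    d = (omega0 ** 2 - omega ** 2) ** 2 + gamma ** 2 * omega ** 2
    n = 1.0 + 0.5 * wp2 * (omega0 ** 2 - omega ** 2) / d
    kappa = 0.5 * wp2 * gamma * omega / d
    return n, kappa


omega0, gamma, wp2 = 1.0, 0.1, 1.0e-4   # hypothetical values, arbitrary units
for omega in (0.5, 0.9, 1.0, 1.1, 2.0):
    ne, ke = n_kappa_exact(omega, omega0, gamma, wp2)
    na, ka = n_kappa_rarefied(omega, omega0, gamma, wp2)
    print(f"omega = {omega:3.1f}  n - 1: {ne - 1.0:+.3e} / {na - 1.0:+.3e}"
          f"   kappa: {ke:.3e} / {ka:.3e}")
```

For a system of several kinds of oscillators the two functions would simply sum the corresponding terms over the oscillator index before the square root is taken.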

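Let us discuss the dependence of n and $\kappa$ on the frequency of the radiation
in more detail. We assume, first of all, that $\omega$ differs sufficiently from the
natural frequencies $\omega_{0i}$. Then, neglecting $\gamma^2\omega^2$ in the denominator, we obtain

$$n^2 - \kappa^2 = 1 + \frac{4\pi e^2}{m}\sum_i \frac{N_i}{\omega_{0i}^2 - \omega^2}, \tag{34.5}$$
$$2n\kappa = \frac{4\pi e^2}{m}\sum_i \frac{N_i\,\gamma_i\,\omega}{(\omega_{0i}^2 - \omega^2)^2}. \tag{34.6}$$

We see that the absorption coefficient $\kappa$ in the region $\omega \neq \omega_{0i}$ is very small
and that for n one can write

$$n \approx 1 + \frac{2\pi e^2}{m}\sum_i \frac{N_i}{\omega_{0i}^2 - \omega^2}. \tag{34.7}$$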


Formula (34.7) expresses the dispersion law in a rarefied gas. The
dependence of the refractive index on the frequency is shown in fig. IV.14. As
$\omega$ approaches $\omega_0$, n increases sharply. For values of $\omega$ which are greater
than $\omega_0$ the index of refraction takes on a very small value (see below) and
again increases with frequency. Each substance has its proper set of
characteristic frequencies $\omega_{0i}$. As an example we have in fig. IV.14 restricted
ourselves to three frequencies.

[Fig. IV.14. Refractive index n as a function of the frequency $\omega$, with three
characteristic frequencies $\omega_1$, $\omega_2$, $\omega_3$ marked on the frequency axis.]

We have already pointed out in Part I that real microscopic radiators, 
atoms and molecules, do not obey the laws of classical electrodynamics. 
Nevertheless, the harmonic oscillator, as a classical model of radiating sys- 
tems, does yield a number of the basic properties of real emitters. This is 
seen particularly clearly in the example of dispersion. In quantum mechanics 
(see §108, Part V) it will be shown that the dispersion law is expressed by
the formula

$$n = 1 + \frac{2\pi e^2 N}{m}\sum_r \frac{f_r}{\omega_r^2 - \omega^2}, \tag{34.8}$$

where the $\omega_r$'s are the transition frequencies $\omega_r = (E_r - E_0)/\hbar$ between the
rth and the lowest energy levels of the emitter, and the $f_r$'s are coefficients
satisfying the condition $\sum_r f_r = 1$.

The resemblance between the accurate formula (34.8) and the formula 
based on the classical model (34.7) is obvious, although the quantities con- 
tained in the latter have a completely different meaning. 

In the case of high frequencies satisfying the condition $\omega \gg \omega_r$, the
quantum formula and the classical formula for the refractive index both
assume the form

$$n = 1 - \frac{2\pi e^2 N}{m\omega^2}. \tag{34.9}$$


The refractive index turns out to be smaller than unity and to be independent 
of the properties of the substance, except for the number of electrons per 
unit volume. The meaning of this result is very simple: for the scattering of 
sufficiently hard radiation the binding of electrons in atoms ceases to play a 
significant part. Atomic electrons scatter the radiation just as free electrons. 
The refractive index n defined by formula (34.9) appears to be that of a
medium which represents a gas of free electrons with density N. Formula
(34.9) is applicable in the ultraviolet region to elements which are at the 
beginning of the periodic system and have a small binding energy for the 
electrons in the atoms, and it is applicable in the X-ray region to elements 
of the middle part and end of the periodic system. 




A medium with n<1 is optically less dense than vacuum. The phase 
velocity of electromagnetic waves in it is v=c/n>c (see §7 of Part II). 

When sufficiently short electromagnetic waves are incident on the surface 
of a body, the phenomenon of total internal reflection (which will be 
described later, in §35) can occur. 

It is interesting to calculate the group velocity of electromagnetic waves 
in such a medium 


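$$v_g = \frac{d\omega}{dk} = \left(\frac{dk}{d\omega}\right)^{-1} .$$

We calculate $dk/d\omega$ making use of (34.9):

$$\frac{dk}{d\omega} = \frac{1}{c}\frac{d(\omega n)}{d\omega} = \frac{1}{c}\left(n + \omega\frac{dn}{d\omega}\right) = \frac{1}{c}\left(1 + \frac{2\pi e^2 N}{m\omega^2}\right) .$$

Hence

$$v_g = \frac{c}{1 + \dfrac{2\pi e^2 N}{m\omega^2}} \approx c\left(1 - \frac{2\pi e^2 N}{m\omega^2}\right) = cn .$$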

The group velocity, as is to be expected, turns out to be smaller than the 
velocity of light. 
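
The comparison of the two velocities can be checked numerically. The following is a minimal sketch, assuming Gaussian units and purely illustrative values of the electron density and frequency (the names `N_EXAMPLE` and `OMEGA_EXAMPLE` are hypothetical and do not come from the text). It evaluates the refractive index of formula (34.9), the phase velocity c/n and the group velocity obtained from dk/dω.

```python
import math

C = 2.998e10          # speed of light in vacuum, cm/s (Gaussian units)
E_CHARGE = 4.803e-10  # electron charge, esu
M_E = 9.109e-28       # electron mass, g


def delta(n_electrons: float, omega: float) -> float:
    """Return 2*pi*e^2*N/(m*omega^2), the deviation of n from unity in (34.9)."""
    return 2.0 * math.pi * E_CHARGE**2 * n_electrons / (M_E * omega**2)


def refractive_index(n_electrons: float, omega: float) -> float:
    """Refractive index of the free-electron gas, formula (34.9)."""
    return 1.0 - delta(n_electrons, omega)


def phase_velocity(n_electrons: float, omega: float) -> float:
    """Phase velocity c/n; larger than c because n < 1."""
    return C / refractive_index(n_electrons, omega)


def group_velocity(n_electrons: float, omega: float) -> float:
    """Group velocity from dk/domega = (1/c)(1 + delta), i.e. c/(1 + delta)."""
    return C / (1.0 + delta(n_electrons, omega))


# Illustrative (assumed) values: 1e23 electrons per cm^3, omega = 1e19 rad/s.
N_EXAMPLE = 1.0e23
OMEGA_EXAMPLE = 1.0e19

if __name__ == "__main__":
    print("n   =", refractive_index(N_EXAMPLE, OMEGA_EXAMPLE))
    print("v/c =", phase_velocity(N_EXAMPLE, OMEGA_EXAMPLE) / C)
    print("u/c =", group_velocity(N_EXAMPLE, OMEGA_EXAMPLE) / C)
```

For these values the phase velocity exceeds c while the group velocity stays below it, and the group velocity agrees with cn up to terms of second order in the small correction.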

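We now come back to the general case of dispersion and consider the
frequency range $\omega \approx \omega_0$. In this case formula (34.5) is inadequate and one has
to return to expressions (34.3) and (34.4). For $\omega \approx \omega_0$ one can write

$$\omega_0^2 - \omega^2 \approx (\omega_0 - \omega)\, 2\omega_0 .$$

We introduce a new variable

$$x = \frac{2(\omega - \omega_0)}{\gamma} .$$

Then

$$n - 1 = -\frac{2\pi e^2 N}{m\gamma\omega_0}\,\frac{x}{1 + x^2} , \qquad \kappa = \frac{2\pi e^2 N}{m\gamma\omega_0}\,\frac{1}{1 + x^2} .$$

Fig. IV.15
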
The curves $\kappa(x)$ and $n(x) - 1$ are shown in fig. IV.15. We see that maximum
absorption occurs for $x = 0$ (i.e. $\omega = \omega_0$). This absorption or, more precisely,
attenuation, is associated with a large value of the cross-section for scattering
(see §36 of Part I). It decreases sharply with increasing $|x|$, reaching half of
its maximum value for $x = \pm 1$. The refractive index in the region $-1 < x < 1$
displays a dependence on $x$ which is completely different from that for
$|x| > 1$ (shown in fig. IV.15). For $|x| > 1$, $n$ increases with increasing $x$, while
for $x = \mp 1$ it reaches respectively its maximum and minimum values. In the
interval $-1 < x < 1$, $n(x)$ is not an increasing but a decreasing function of
$x$, reaching the value $n = 1$ at $x = 0$. This region, in which the index of refraction
decreases with increasing frequency, is called the region of anomalous
dispersion. In the region of anomalous dispersion a rarefied gas possesses
its highest absorption (minimum transparency).
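
The properties listed above can be verified with a short numerical sketch. It assumes the usual Lorentzian line shapes near resonance, with the absorption proportional to 1/(1 + x²) and n - 1 proportional to -x/(1 + x²); both profiles are normalised to the peak absorption, and the function names are illustrative only.

```python
def absorption_profile(x: float) -> float:
    """Absorption relative to its peak value: 1 / (1 + x^2)."""
    return 1.0 / (1.0 + x * x)


def index_profile(x: float) -> float:
    """(n - 1) in units of the peak absorption: -x / (1 + x^2)."""
    return -x / (1.0 + x * x)


# Scan the dimensionless detuning x on a uniform grid and locate the extrema of n.
xs = [i / 1000.0 for i in range(-5000, 5001)]
x_at_max_n = max(xs, key=index_profile)
x_at_min_n = min(xs, key=index_profile)

if __name__ == "__main__":
    print("absorption at x = 1 relative to peak:", absorption_profile(1.0))
    print("n is largest at x =", x_at_max_n, "and smallest at x =", x_at_min_n)
```

Under these assumptions the absorption falls to half its peak at x = ±1, and the refractive index has its maximum at x = -1 and its minimum at x = +1, so that n decreases with frequency throughout the interval between them.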




We shall come back to the problems of dispersion in quantum mechanics 
where it will be shown that besides scattering without a change in frequency 
scattering with a change in frequency (so-called Raman scattering) is possible. 


§35. Geometrical optics 


In the above we have found the law of the propagation of plane electro- 
magnetic waves in homogeneous transparent media. A characteristic of a plane 
wave is the fact that its equiphase surface is infinitely large and plane. 

Plane electromagnetic waves are particularly simple because their ampli- 
tude and wave vector remain constant in space and time. It turns out that, 
under certain conditions and with a certain degree of accuracy, analogous 
properties are possessed by arbitrary electromagnetic waves. As will be shown 
below, it is necessary for this that the curvature of the wave surface is suffi- 
ciently small over spatial regions which are large compared with the wave- 
length. 

The approximate replacement of the wave surface by a plane is of very 
great practical importance, since in the case of an arbitrary form of the surface 
the laws of propagation of the waves turn out to be very complex. 

From the aforesaid it is clear that such a substitution is allowed, in any
case, if the wavelength is very small. Therefore the limiting case of
electromagnetic waves whose wavelength $\lambda \to 0$ deserves particular consideration.

In practice the region of visible light already corresponds to this case.
Indeed, the wavelength of visible light is about $5 \times 10^{-5}$ cm. It is always very
small in comparison with the size of macroscopic bodies and optical devices.
For this reason the study of the laws of propagation of electromagnetic waves
in the limiting case $\lambda \to 0$ is called geometrical optics.

We shall below, for the same reason, sometimes call an electromagnetic 
wave a light wave, and we shall speak of the propagation of light. This should 
not lead to misunderstanding. 

In a transparent medium each of the components of the field vectors E
and H satisfies the wave equation

$$\nabla^2 F - \frac{\varepsilon}{c^2}\frac{\partial^2 F}{\partial t^2} = 0 .$$


Here F denotes an arbitrary component of the field. 
In a bounded medium, in addition to the wave equation the components
of the field vectors must also satisfy the boundary conditions. Therefore the
components of the vectors E and H are connected with each other by certain
relations and are not quite independent. However, at a sufficiently large
distance from the interface of media this fact can be disregarded and all the
components of E and H can be considered as independent. Then the wave
field can be characterized by the scalar quantity F.

Considering a monochromatic wave, we can write

$$F = f(x,y,z)\, e^{-i\omega t} .$$

The function $f$ obviously satisfies the equation

$$\nabla^2 f + \frac{\varepsilon\omega^2}{c^2} f = 0 .$$

Introducing the wave number $k = \omega/c$ and the refractive index $n = \varepsilon^{1/2}$, one
can rewrite the above equation in the form

$$\nabla^2 f + n^2 k^2 f = 0 . \tag{35.1}$$

To a plane wave there corresponds the solution of eq. (35.1)

$$f = a\, e^{ink(\boldsymbol{\kappa}_0\cdot\mathbf{r})} = a\, e^{ikl} , \tag{35.2}$$

where $a$ is a constant amplitude, and $l = n(\boldsymbol{\kappa}_0\cdot\mathbf{r})$ is a quantity which is
usually called the optical path length and is a linear function of coordinates.
It differs from the geometrical path by the factor $n$.

We have to find the solutions of eq. (35.1) in a general form, without
confining ourselves to a plane wave. However, we shall consider the limiting
case $\lambda \to 0$ or $k \to \infty$. We shall try to write the solution of eq. (35.1) in the
limiting case of very large $k$ as

$$f = a\, e^{i\psi(x,y,z)} . \tag{35.3}$$

Here the quantity $\psi$, called the eikonal, has a form which approaches as
closely as possible that of the phase of the plane wave

$$\psi = k\mathcal{L}(x,y,z) ,$$

where $\mathcal{L}(x,y,z)$ is a function of the coordinates which is sufficiently close to a
linear function. Thus, we shall try to find the solution of (35.1) in the form

$$f = a\, e^{ik\mathcal{L}(x,y,z)} . \tag{35.4}$$

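We calculate the derivatives of the function $f$:

$$\nabla f = a\nabla e^{ik\mathcal{L}} = ika\, e^{ik\mathcal{L}}\, \nabla\mathcal{L} ,$$

$$\nabla^2 f = \nabla\cdot\nabla\left(a\, e^{ik\mathcal{L}}\right) = ika\, e^{ik\mathcal{L}}\, \nabla^2\mathcal{L} - k^2 a\, e^{ik\mathcal{L}} (\nabla\mathcal{L})^2 .$$

Hence eq. (35.1) assumes the form

$$ika\, e^{ik\mathcal{L}}\, \nabla^2\mathcal{L} - k^2 a\, e^{ik\mathcal{L}} (\nabla\mathcal{L})^2 + n^2 k^2 a\, e^{ik\mathcal{L}} = 0 ,$$

or

$$(\nabla\mathcal{L})^2 - (i/k)\nabla^2\mathcal{L} = n^2 . \tag{35.5}$$

The quantity $\mathcal{L}$ does not depend on $k$. Therefore, if for $k \to \infty$ the inequality

$$|\nabla\mathcal{L}|^2 \gg k^{-1}|\nabla^2\mathcal{L}| \tag{35.6}$$

is fulfilled, then one can pass to the limit $k \to \infty$ in eq. (35.5) and drop the
second term. Then eq. (35.5) assumes the form

$$|\nabla\mathcal{L}|^2 = n^2 ,$$

or

$$(\nabla\psi)^2 = n^2 k^2 . \tag{35.7}$$

In the coordinate representation this last equation has the form

$$\left(\frac{\partial\psi}{\partial x}\right)^2 + \left(\frac{\partial\psi}{\partial y}\right)^2 + \left(\frac{\partial\psi}{\partial z}\right)^2 = n^2 k^2 . \tag{35.7'}$$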


Eq. (35.7) is called the eikonal equation. In deriving it we have not made 
any assumptions about the values of the refractive index n, which can be an 
arbitrary function of the coordinates. 

Knowing the solutions of eq. (35.7), we find an approximate solution of
the wave equation (35.1) with the assumption that $k$ is a sufficiently large
number. In the limit $k \to \infty$ this solution is the same as a plane wave propagating
in the direction of n. The condition of applicability of eq. (35.7) is (35.6).
Later we shall discuss the question of the physical conditions to which this
inequality corresponds and when it may not be satisfied.

If $\psi$ is a solution of the eikonal equation, then

$$\psi = k\mathcal{L}(x,y,z) = \text{const}$$

is the equation of an equiphase surface. The propagation of electromagnetic
waves takes place in the direction of the normal to the constant-eikonal
surface, i.e. in the direction of the vector $\nabla\mathcal{L}$. This direction of propagation
of waves is called the light ray direction.
Over a small region of space the eikonal $\psi$ can be expanded in a series,
writing

$$\psi \approx \frac{\partial\psi}{\partial\mathbf{r}}\cdot\mathbf{r} = k\,\frac{\partial\mathcal{L}}{\partial\mathbf{r}}\cdot\mathbf{r} . \tag{35.8}$$

For $f$, according to (35.3), we have

$$f = a\, e^{ik(\nabla\mathcal{L})\cdot\mathbf{r}} = a\, e^{ikl} . \tag{35.9}$$

The optical path $l$ turns out to be equal to $(\nabla\mathcal{L})\cdot\mathbf{r}$.

Comparing (35.9) with (35.2), we see that, within the framework of the 
applicability of geometrical optics, over a small region of space each wave 
can be considered as a plane wave with its wave vector directed along the 
normal to the constant-eikonal surface. In particular, for a homogeneous 
medium with a constant refractive index, it follows from (35.7) that 


$$|\nabla\psi| = \text{const} .$$


This means that adjacent equi-eikonal surfaces are at equal distances from 
each other. The normals to these surfaces are straight lines. This statement 
has a simple meaning: in a homogeneous medium light rays are propagated 
rectilinearly. 

We shall not dwell on the application of geometrical optics to optical and
electro-optical systems *. We shall discuss only the question of the meaning
of the inequality (35.6) which serves as the condition of applicability of the
approximation of geometrical optics.

* See, for example, L.D.Landau and E.M.Lifshitz, The classical theory of fields
(Pergamon Press, London, 1962) §§55-57; M.Born and E.Wolf, Principles of optics
(Pergamon Press, London, 1959, 1965).

As is known from geometry, the quantity $\nabla^2\psi$ determines the mean
radius of curvature of a slightly curved surface $\psi$ = const. Thus, the inequality
(35.6) has a simple geometrical meaning: it means that the mean
radius of curvature of the constant-eikonal surface must be large in comparison
with the wavelength $\lambda$. In other words, the light perturbation must vary
smoothly from point to point. The conditions of applicability of the approximation
of geometrical optics are undoubtedly violated in the immediate
vicinity of the emitter. Here the variation of the wave field from point to
point is very rapid. The radius of curvature determined by $\nabla^2\psi$ can be of the
same order of magnitude as the wavelength.
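
As a rough illustration of the last remark, one may take a point emitter in a homogeneous medium and assume (this example is not worked out in the text) that the function $\mathcal{L}$ in $\psi = k\mathcal{L}$ is that of a spherical wave, $\mathcal{L} = nr$, ignoring the slow $1/r$ variation of the amplitude. With $k = 2\pi/\lambda$, the condition (35.6) then reads:

```latex
\begin{align*}
\mathcal{L} &= n r, \qquad
|\nabla\mathcal{L}|^2 = n^2, \qquad
\nabla^2\mathcal{L} = \frac{2n}{r}, \\
|\nabla\mathcal{L}|^2 \gg k^{-1}\,|\nabla^2\mathcal{L}|
&\;\Longleftrightarrow\; n^2 \gg \frac{2n}{kr}
\;\Longleftrightarrow\; r \gg \frac{2}{nk} = \frac{\lambda}{\pi n} .
\end{align*}
```

Under this assumption geometrical optics becomes applicable only at distances from the emitter that are large compared with the wavelength, in agreement with the qualitative statement above.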

Consider a second important case where it turns out that it is impossible 
to make use of the concepts of geometrical optics. 

Let a completely non-transparent body, or screen, be placed in the path
of light rays. In the approximation of geometrical optics the light rays do not
penetrate the region behind the screen (fig. IV.16). Behind the opaque screen
there is a region of shadow.

Let us consider the boundary separating the illuminated region and the
shadow (straight lines AB and CD in fig. IV.16). Here a discontinuous change
in the amplitude from the value a to zero occurs. Hence the derivatives of f
with respect to the coordinates become infinite at the boundary between
light and shadow. Here the conditions for applicability of geometrical optics
are again violated. In the next section we shall discuss physical phenomena
which arise in this case.


Fig. IV.16

§36. Diffraction 


As we have just shown, at the boundary separating the illuminated and the
shaded regions the approximation of geometrical optics and the concept of
rectilinear light rays become inadmissible.

Naturally, there cannot exist a sharp boundary between these regions. 
Near the edge of the non-transparent screen the propagation of light cannot 
in reality be described by a simplified wave equation. The solution of the 
correct equation, taking into account the boundary conditions at the surface 
of the screen, leads to an intensity distribution corresponding to a gradual 
transition from the region of total illumination to that of shadow. At the 
boundary between the illuminated and shaded regions departure from the 
rectilinear propagation of light occurs, as if the rays were bent. This is called 
diffraction. 

The solution of the boundary value problem of diffraction turns out to be 
very complicated. Therefore special methods for a simplified calculation of 
diffraction phenomena have been developed. We shall restrict ourselves to the 
discussion of the most important and at the same time simplest case. 

Let a plane parallel beam of light be incident on a screen having an opening
of arbitrary shape. The light beam comes from a source of intensity $I_0$ which
is distant from the screen. At the edge of the screen diffraction of the light
occurs. We are interested in the light intensity distribution at a large distance
from the opening (large compared with the size of the opening). Diffraction
in the case where both the light source and the point of observation are
located at infinitely large distances from the aperture which produces the
diffraction, or from the screen, is called Fraunhofer diffraction.

For Fraunhofer diffraction the waves which fall on the screen and those
which pass through the opening at a large distance from its edge can be
assumed to be plane waves. We shall call the wave incident on the screen
the primary wave. This wave, falling on the opaque screen, is completely
absorbed. However, part of it passes through the opening in the screen. The
opening in the screen becomes an emitter of electromagnetic waves into the
space behind the screen.

Each point of the area of the opening is a source of secondary spherical
waves which reach the point of observation with a certain retardation.

If the value of any component of the field at a point r located in the
area of the opening is $f_0 = \text{const}\cdot e^{-i\omega t}$, then the perturbation coming from
this point to the point of observation N will be of the order of magnitude of
$R^{-1} f_0(\tau) = \text{const}\cdot R^{-1} e^{-i\omega\tau}$, where $R$ is the distance from this point
to N and $\tau = t - R/c$ is the delay time (see §23 of Part I).




The total field at the point of observation is obtained on the basis of the 
principle of superposition by summing over all the emitters, i.e. over all 
points of the opening in the screen. 

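Thus, at the point of observation N we have

$$f_N = \text{const}\int \frac{e^{-i\omega(t - R/c)}}{R}\, ds = \text{const}\cdot e^{-i\omega t}\int \frac{e^{i\omega R/c}}{R}\, ds = \text{const}\cdot e^{-i\omega t}\int \frac{e^{ikR}}{R}\, ds . \tag{36.1}$$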


Here f denotes any component of the electric or magnetic field. In the same 
approximation as in the preceding section we ignore polarization phenomena 
and assume all components of the field to be independent. 

Formula (36.1) is of a general character. In calculating the integral (36.1) 
we shall bear in mind that the distance to the point of observation is large 
in comparison with the size of the opening (Fraunhofer diffraction!). 

We introduce an origin, locating it at a point of the opening (fig. IV.17).
Then one can, as usual (see, for example, §15 or §26 of Part I), write

$$R = |\mathbf{r}_0 - \mathbf{r}| \approx r_0 - \mathbf{n}_0\cdot\mathbf{r} ,$$

where $\mathbf{n}_0 = \mathbf{r}_0/r_0$ is the unit vector in the direction of $\mathbf{r}_0$.

Fig. IV.17

The quantity $|\mathbf{r}|$, which varies within the limits of the opening, is small in
comparison with $|\mathbf{r}_0|$, but may be large compared with the wavelength. Therefore
the term $\mathbf{n}_0\cdot\mathbf{r}$ can be dropped in the denominator of formula (36.1)
but must be retained in the exponent. Hence we have


$$f = \frac{\text{const}}{r_0}\exp[-i\omega t]\exp[ikr_0]\int \exp[-ik\,\mathbf{n}_0\cdot\mathbf{r}]\, ds = A\int \exp[-i\mathbf{q}\cdot\mathbf{r}]\, ds , \tag{36.2}$$

where $\mathbf{q} = k\mathbf{n}_0$, and $A$ denotes the whole set of factors in front of the integral.
The value of this constant will be expressed in terms of the intensity of the
incident wave.

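Choosing the plane of the opening to be the plane $z = 0$, one can write
the quantity $\mathbf{q}\cdot\mathbf{r}$ in the form

$$\mathbf{q}\cdot\mathbf{r} = q_1 x + q_2 y = k\frac{x_0}{r_0}x + k\frac{y_0}{r_0}y = kx\cos\alpha + ky\cos\beta . \tag{36.3}$$

Here $\cos\alpha$ and $\cos\beta$ are the direction cosines of the vector $\mathbf{n}_0$.
If the wave vector of the incident wave is normal to the plane of the
opening and the angle of deflection is small, then it is easily seen that

$$\cos\alpha \approx \sin\theta_1 \approx \theta_1 , \qquad \cos\beta \approx \sin\theta_2 \approx \theta_2 ,$$

where $\theta_1$ and $\theta_2$ are the angles of diffraction in the xz-plane and yz-plane.
In this case

$$q_1 \approx k\theta_1 , \qquad q_2 \approx k\theta_2 . \tag{36.4}$$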


The integration in (36.2) is carried out over the area of the opening. If
one introduces the function

$$\xi(x,y) = \begin{cases} 1 & \text{for } (x,y) \text{ within the limits of the opening,} \\ 0 & \text{for } (x,y) \text{ outside the limits of the opening,} \end{cases}$$

then one can write (36.2) in the form

$$f = A\int\!\!\int_{-\infty}^{\infty} \xi(x,y)\exp[-i(q_1 x + q_2 y)]\, dx\, dy , \tag{36.5}$$

where the integration is carried out over the entire area of the screen
(including the opening).

In order to determine the constant $A$ we note that (36.5) represents the
Fourier integral of the function $\xi(x,y)$. Hence

$$\xi(x,y) = \frac{1}{(2\pi)^2 A}\int\!\!\int \exp[i(q_1 x + q_2 y)]\, f(q_1,q_2)\, dq_1\, dq_2 .$$

From formula (II.9) it follows that

$$\int\!\!\int |\xi(x,y)|^2\, dx\, dy = \frac{1}{(2\pi)^2 A^2}\int\!\!\int |f(q_1,q_2)|^2\, dq_1\, dq_2 .$$

By the definition of $\xi$ the integral on the left is equal to unity multiplied by
the area $S$ of the opening. The integral $\int\!\!\int |f(q_1,q_2)|^2\, dq_1\, dq_2$ by definition
represents the total intensity behind the screen. The latter is equal, obviously,
to the total intensity of light passing through the opening in the screen, i.e. to

$$\int\!\!\int |f(q_1,q_2)|^2\, dq_1\, dq_2 = I_0 S ,$$

where $I_0$ is the intensity of the incident light. Hence we find the value of $A$:

$$A = I_0^{1/2}/2\pi \tag{36.6}$$

and, finally, we have

$$f(q_1,q_2) = (I_0^{1/2}/2\pi)\int\!\!\int \exp[-i(q_1 x + q_2 y)]\,\xi(x,y)\, dx\, dy . \tag{36.7}$$

Formula (36.7) gives the general solution of the problem posed: the distribution
of the wave field behind a screen having an aperture at a large distance
from it.

The intensity distribution is given by the quantity

$$dI = |f(q_1,q_2)|^2\, dq_1\, dq_2 . \tag{36.8}$$

From the meaning of the quantities $q_1$ and $q_2$ it is clear that $|f|^2\, dq_1\, dq_2$
represents the intensity of the diffracted light for given values of the diffraction
angles $\theta_1$ and $\theta_2$:

$$dI = |f(q_1,q_2)|^2 k^2\, d\Omega . \tag{36.8'}$$


Up to now we have considered diffraction from a small opening in a screen. 
We now assume that light falls on a small screen whose size and form are the 
same as those of the opening in the screen previously considered. Such a 
screen is called a complementary screen with respect to the opening. The 
diffraction at the edge of the complementary screen is easily found from 
the following considerations. We match the complementary screen with the 
opening. Then behind the new screen, which is continuous, there will be no 
field of light. This latter fact can be expressed as follows: as a result of the 
superposition of the fields diffracted from the opening and from the com- 
plementary screen the total field is reduced to zero. Hence it follows that 
the field behind the complementary screen can be connected with the field 
behind the opening by the relation 


$$f_{\text{compl}} = -f ,$$

or

$$|f_{\text{compl}}|^2 = |f|^2 .$$


The opening and its complementary screen give identical intensity distribu- 
tions of the diffracted light. This proposition is called Babinet’s principle. 

We shall apply the general expression for the intensity distribution to three 
cases which are of great practical importance. 

The first of these is the diffraction at an infinitely long slit in an opaque
screen. Let the width of the slit be $a$. We locate the origin in the middle of
the slit, and direct the y-axis parallel to its edge. The direction of the incident
light coincides with the direction of the z-axis. It is obvious that the diffraction
comes only from the two edges of the slit, i.e. in the integral (36.7) one
has to integrate only with respect to the coordinate x. For an infinitely long
slit, no diffraction of rays falling on the screen occurs in the direction of y.
Hence we shall refer all quantities to unit length of the slit. In the
one-dimensional case a calculation analogous to that which has been carried
out before gives the value $A = (I_0/2\pi)^{1/2}$. Formula (36.7) in the one-dimensional
case has the form

$$f = (I_0/2\pi)^{1/2}\int_{-a/2}^{a/2} e^{-iqx}\, dx = (2I_0/\pi)^{1/2}\,\frac{\sin\tfrac12 aq}{q} = (2I_0/\pi)^{1/2}\,\frac{\sin\tfrac12 ka\theta}{k\theta} . \tag{36.10}$$


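The intensity of light diffracted at an angle between $\theta$ and $\theta + d\theta$ is equal to

$$dI = I_0\,\frac{2\sin^2\tfrac12 ka\theta}{\pi(k\theta)^2}\, dq = \frac{I_0 k a^2}{2\pi}\,\frac{\sin^2\tfrac12 ka\theta}{(\tfrac12 ka\theta)^2}\, d\theta . \tag{36.11}$$

The quantity $I_0 a$ represents the total intensity of light incident on unit length
of the slit.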


As a result of diffraction, a system of light and dark bands arises parallel 
to the slit. 

The intensity distribution as a function of $x = \tfrac12 ka\theta$ is shown in fig. IV.18.
In the direction $\theta = 0$ there is a principal maximum of the intensity. At points
$\tfrac12 ka\theta = \tfrac12(2n+1)\pi$ ($n = 1, 2, \ldots$) the intensity has a number of secondary maxima.
In these maxima the intensity is considerably lower than in the principal
maximum, and decreases with increasing order of the maximum (the number
$n$).

Fig. IV.18
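
The single-slit result can be checked numerically. The sketch below is a minimal illustration, assuming unit incident intensity and unit slit width by default; the function names and the integration cut-off `q_max` are hypothetical choices. It evaluates the squared amplitude from (36.10) and integrates it over q to confirm that the total diffracted intensity is close to the incident intensity per unit length of the slit.

```python
import math


def slit_intensity(q: float, a: float, i0: float = 1.0) -> float:
    """|f|^2 from (36.10): (2*I0/pi) * sin^2(q*a/2) / q^2, per unit length of slit."""
    if abs(q) < 1e-12:
        return i0 * a * a / (2.0 * math.pi)  # limiting value for q -> 0
    return (2.0 * i0 / math.pi) * math.sin(0.5 * q * a) ** 2 / (q * q)


def total_intensity(a: float, i0: float = 1.0,
                    q_max: float = 500.0, steps: int = 200_000) -> float:
    """Midpoint-rule estimate of the integral of |f|^2 over q in [-q_max, q_max]."""
    h = 2.0 * q_max / steps
    return h * sum(slit_intensity(-q_max + (j + 0.5) * h, a, i0)
                   for j in range(steps))


if __name__ == "__main__":
    a = 1.0
    print("integrated intensity / (I0 * a):", total_intensity(a) / a)
    print("first zero, q = 2*pi/a:", slit_intensity(2.0 * math.pi / a, a))
    ratio = slit_intensity(3.0 * math.pi / a, a) / slit_intensity(0.0, a)
    print("intensity at x = 3*pi/2 relative to the principal maximum:", ratio)
```

The finite cut-off leaves out a small tail of the integral, so the integrated value falls slightly short of I0·a. The relative intensity at x = 3π/2, near the first secondary maximum, is 4/(9π²), or about 4.5 per cent of the principal maximum.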


One can in an analogous way find the diffraction from a circular opening
of radius $a$ in the screen. We locate the origin at the centre of the circle. Then

$$f = (I_0^{1/2}/2\pi)\int\!\!\int e^{-i\mathbf{q}\cdot\mathbf{r}}\, dx\, dy = (I_0^{1/2}/2\pi)\int_0^a\!\!\int_0^{2\pi} e^{-iqr\cos\varphi}\, r\, dr\, d\varphi = (I_0^{1/2}/2\pi)\int_0^a r\, dr\int_0^{2\pi} e^{-iqr\cos\varphi}\, d\varphi .$$

From the theory of Bessel functions it is known that

$$\int_0^{2\pi} e^{-iqr\cos\varphi}\, d\varphi = 2\pi J_0(qr) ,$$

where $J_0$ is the Bessel function of zero order *. Hence

$$f = I_0^{1/2}\int_0^a J_0(qr)\, r\, dr .$$

In the theory of Bessel functions it is proved that

$$\int_0^a J_0(qr)\, r\, dr = \frac{a}{q}\, J_1(qa) .$$

Hence, finally,

$$f = I_0^{1/2}\,\frac{a}{q}\, J_1(aq) . \tag{36.12}$$

* This formula is most simply obtained by expanding the exponent in a series and
integrating term by term. See V.I.Smirnov, Course of higher mathematics, Vol. III, Part 2
(Pergamon Press, Oxford, 1964).




The intensity of radiation diffracted into a solid angle $d\Omega$ is equal to

$$dI = |f|^2 k^2\, d\Omega = I_0 a^2\,\frac{J_1^2(ka\theta)}{\theta^2}\, d\Omega = I_0\pi a^2\,\frac{J_1^2(ka\theta)}{\pi\theta^2}\, d\Omega ,$$

where $I_0\pi a^2$ is the total intensity of radiation incident on the opening.

As a result of diffraction at a circular opening, a system of concentric 
circles of illumination arises. 

The angular distribution of the diffracted radiation given by the function
$[J_1(ka\theta)/\theta]^2$ is very similar to the distribution shown in fig. IV.18. The
height of the principal maximum is relatively greater, but the spacings between
secondary maxima are close to those which are shown in fig. IV.18.
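
The position of the first dark ring can be estimated with a short sketch. It assumes Bessel's integral representation J1(x) = (1/π)∫ cos(τ - x sin τ) dτ over 0 ≤ τ ≤ π, evaluated by the midpoint rule, and uses a profile normalised to unity in the forward direction; the function names are illustrative and the normalisation is a choice made here, not in the text.

```python
import math


def bessel_j1(x: float, steps: int = 2000) -> float:
    """J1(x) from Bessel's integral, (1/pi) * int_0^pi cos(tau - x sin tau) dtau."""
    h = math.pi / steps
    total = 0.0
    for j in range(steps):
        tau = (j + 0.5) * h
        total += math.cos(tau - x * math.sin(tau))
    return total * h / math.pi


def airy_profile(x: float) -> float:
    """[2 J1(x) / x]^2 with x = k*a*theta, normalised to 1 at x = 0."""
    if abs(x) < 1e-8:
        return 1.0
    return (2.0 * bessel_j1(x) / x) ** 2


def first_dark_ring(lo: float = 3.0, hi: float = 4.5, tol: float = 1e-10) -> float:
    """Bisection for the first zero of J1, i.e. the first zero of the intensity."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if bessel_j1(lo) * bessel_j1(mid) <= 0.0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    x0 = first_dark_ring()
    print("first dark ring at k*a*theta =", x0)
    print("theta in units of lambda/(2a):", x0 / math.pi)
```

The first zero lies near kaθ ≈ 3.83, which corresponds to the familiar estimate θ ≈ 1.22 λ/(2a) for the angular radius of the central bright disc.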

It should be recalled that identical formulae give the intensity distribution
for diffraction from the complementary screens. Naturally, the intensity
maxima for diffraction from the aperture correspond to the maxima for
diffraction from the complementary opaque screen.

In conclusion we shall dwell on the diffraction which occurs when light 
is passing through a set of N infinitely long thin slits which are located at 
equal distances from each other (a diffraction grating). 

Let the distance between the centres of neighbouring slits be $d$, and let the
grating be oriented along the x-axis. The coordinates of the centres of the
slits are $x_n = nd$, where the $n$ are integers $n = 0, 1, 2, \ldots, N-1$. Then (36.7) gives

$$f = (I_0/2\pi)^{1/2}\sum_{n=0}^{N-1} e^{-iqx_n}\int_{-a/2}^{a/2} e^{-iqx}\, dx = f_1\sum_{n=0}^{N-1} e^{-iqnd} = f_1\,\frac{1 - e^{-iqNd}}{1 - e^{-iqd}} ,$$

where $f_1$ is defined by formula (36.10).
The intensity of radiation diffracted from the system of slits is given by
the formula

$$dI = |f|^2\, dq = I_0\,\frac{a^2}{2\pi}\left(\frac{\sin\tfrac12 Nqd}{\sin\tfrac12 qd}\right)^2\left(\frac{\sin\tfrac12 qa}{\tfrac12 qa}\right)^2 dq . \tag{36.14}$$

The interference pattern which arises corresponds to the superposition of the 


diffraction from different slits, but not to a simple sum of intensities. 
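
The statement that the amplitudes, and not the intensities, are added can be illustrated numerically. The following minimal sketch (function names are illustrative) sums the phase factors of N slits directly and compares the squared modulus with the closed form sin²(Nqd/2)/sin²(qd/2).

```python
import cmath
import math


def grating_sum(q: float, d: float, n_slits: int) -> complex:
    """Direct sum of the phase factors exp(-i*q*n*d) for n = 0, ..., N-1."""
    return sum(cmath.exp(-1j * q * n * d) for n in range(n_slits))


def grating_factor(q: float, d: float, n_slits: int) -> float:
    """sin^2(N*q*d/2) / sin^2(q*d/2), with the limit N^2 at the principal maxima."""
    s = math.sin(0.5 * q * d)
    if abs(s) < 1e-12:
        return float(n_slits ** 2)
    return (math.sin(0.5 * n_slits * q * d) / s) ** 2


if __name__ == "__main__":
    d, n_slits = 1.3, 8
    for q in (0.7, 2.0 * math.pi / d):
        direct = abs(grating_sum(q, d, n_slits)) ** 2
        print(q, direct, grating_factor(q, d, n_slits))
```

At a principal maximum, qd = 2πm, the interference factor equals N², whereas a simple sum of intensities would give only N.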


An important part is played in modern physics by the phenomenon of 
diffraction of X-rays (as well as electrons, neutrons etc.) by crystals (see 
Part V). 




We shall discuss below the simplest theory of X-ray diffraction by crystals. 

Let a crystal lattice be formed by a set of atoms of the same kind (i.e. 
atoms which are identical in their properties). We assume that absorption in 
the lattice is absent. 

We further assume that an X-ray beam propagating in the direction $\mathbf{n}_1$ falls
on the lattice. The coordinate of the point of the crystal at which the $m$th
atom of the lattice is located can be written in the form

$$\mathbf{r}_m = l\mathbf{a} + n\mathbf{b} + p\mathbf{c} , \tag{36.15}$$

where $l$, $n$, $p$ are integers (positive and negative), and the vectors a, b, c are
the unit vectors of the lattice (for more detail see §109 of Part V; we restrict
ourselves here to the case of the so-called primitive lattice).

A wave with amplitude

$$f = f_0\exp\left[-i\left(\omega t - \frac{\omega}{c}\,\mathbf{r}_m\cdot\mathbf{n}_1\right)\right] ,$$

where $f$ is an arbitrary component of the field vectors and $f_0$ is a constant,
arrives at the point $\mathbf{r}_m$. The atom which is located at the point $\mathbf{r}_m$ scatters the
wave, and a wave with an amplitude

$$f_N = \frac{f_0}{R_m}\exp\left[-i\left(\omega t - \frac{\omega}{c}\,\mathbf{r}_m\cdot\mathbf{n}_1 - \frac{\omega}{c}R_m\right)\right]$$

arrives at the point of observation N which is located at a distance $R_m$ from
the $m$th atom.
Assuming that

$$R_m \approx r_0 - \mathbf{n}\cdot\mathbf{r}_m ,$$

where $r_0$ is the distance from the point N to the origin, we find for the
amplitude of the wave scattered by the $m$th atom

$$f_m = f_0 r_0^{-1}\exp[-i\omega t]\exp[ikr_0]\exp[-ik\{\mathbf{r}_m\cdot(\mathbf{n} - \mathbf{n}_1)\}] .$$


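If one disregards the secondary action of the wave scattered by the $m$th
atom on other atoms of the lattice, then one can write the total intensity at
the point of observation in the form

$$I \sim \left|\sum \exp[-ik\{\mathbf{r}_m\cdot(\mathbf{n} - \mathbf{n}_1)\}]\right|^2 , \tag{36.16}$$

where the summation is carried out over all the atoms of the crystal (the
numbers $l$, $n$, $p$).
Taking into account the expression (36.15), one can write

$$\begin{aligned} k[\mathbf{r}_m\cdot(\mathbf{n} - \mathbf{n}_1)] &= kl[\mathbf{a}\cdot(\mathbf{n} - \mathbf{n}_1)] + kn[\mathbf{b}\cdot(\mathbf{n} - \mathbf{n}_1)] + kp[\mathbf{c}\cdot(\mathbf{n} - \mathbf{n}_1)] \\ &= l[\mathbf{a}\cdot(\mathbf{k} - \mathbf{k}_1)] + n[\mathbf{b}\cdot(\mathbf{k} - \mathbf{k}_1)] + p[\mathbf{c}\cdot(\mathbf{k} - \mathbf{k}_1)] , \end{aligned}$$

where $\mathbf{k} = k\mathbf{n}$ and $\mathbf{k}_1 = k\mathbf{n}_1$.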
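Then each of the sums can be calculated directly. For example,

$$\sum_{l=0}^{N_1 - 1}\exp[-il\{\mathbf{a}\cdot(\mathbf{k} - \mathbf{k}_1)\}] = \frac{1 - \exp[-iN_1\{\mathbf{a}\cdot(\mathbf{k} - \mathbf{k}_1)\}]}{1 - \exp[-i\{\mathbf{a}\cdot(\mathbf{k} - \mathbf{k}_1)\}]} ,$$

where $N_1$ is the number of lattice points along the first edge of the crystal.
Hence it follows immediately that

$$I \sim \frac{\sin^2\tfrac12 N_1[\mathbf{a}\cdot(\mathbf{k} - \mathbf{k}_1)]}{\sin^2\tfrac12[\mathbf{a}\cdot(\mathbf{k} - \mathbf{k}_1)]}\;\frac{\sin^2\tfrac12 N_2[\mathbf{b}\cdot(\mathbf{k} - \mathbf{k}_1)]}{\sin^2\tfrac12[\mathbf{b}\cdot(\mathbf{k} - \mathbf{k}_1)]}\;\frac{\sin^2\tfrac12 N_3[\mathbf{c}\cdot(\mathbf{k} - \mathbf{k}_1)]}{\sin^2\tfrac12[\mathbf{c}\cdot(\mathbf{k} - \mathbf{k}_1)]} . \tag{36.17}$$

Formula (36.17) is called the Laue formula. It is a generalization of the
formula of the diffraction grating to the case of three dimensions with
unequal spacings.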

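The principal maxima lie in directions determined by a set of equations
which are called Bragg's conditions:

$$\begin{aligned} \mathbf{a}\cdot(\mathbf{n} - \mathbf{n}_1) &= 2\pi h_1/k = h_1\lambda , \\ \mathbf{b}\cdot(\mathbf{n} - \mathbf{n}_1) &= 2\pi h_2/k = h_2\lambda , \\ \mathbf{c}\cdot(\mathbf{n} - \mathbf{n}_1) &= 2\pi h_3/k = h_3\lambda , \end{aligned} \tag{36.18}$$

where $h_1$, $h_2$, $h_3$ are integers.
Each of these expressions is the condition for the reduction to zero of the
corresponding factor in the denominator of formula (36.17).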
It should be emphasized that for given wavelength and direction of the
incident beam $\mathbf{n}_1$ the set of three equations (36.18) has to determine only
two independent components of the unit vector n. Being overdetermined, the
system of eqs. (36.18) has a solution only for a definite wavelength. The reflection of X-rays
from the crystal is said to have a selective character. The difference between
one-dimensional and two-dimensional diffraction gratings and crystals lies in
this fact.

In practice, when the wavelength which can be selectively reflected from
the lattice is unknown, then the crystal is either irradiated by a continuous
X-ray spectrum or the orientation of the diffraction crystal (the direction of
$\mathbf{n}_1$) is changed. In both cases the crystal itself "selects" conditions (the
wavelength or orientation) under which the selective reflection is possible, i.e.
under which the equations (36.18) are simultaneously fulfilled.
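
The selective character of the reflection can be made explicit in a simple special case. The sketch below assumes a simple cubic lattice with lattice constant a and basis vectors along the coordinate axes (an assumption made here for illustration; the text treats a general primitive lattice). Then eqs. (36.18) give n = n₁ + λh/a, and the requirement that n be a unit vector fixes the wavelength for each triple of integers h. The function names are hypothetical.

```python
def laue_wavelength(n1, h, a):
    """Wavelength for which a*(n - n1) = h*lambda has a unit-vector solution n.

    Assumes a simple cubic lattice of constant a and a unit vector n1.
    From |n1 + lambda*h/a|^2 = 1 it follows that lambda = -2*a*(n1.h)/|h|^2.
    """
    n1_dot_h = sum(x * y for x, y in zip(n1, h))
    h_sq = sum(y * y for y in h)
    if h_sq == 0:
        raise ValueError("h = (0, 0, 0) corresponds to the undeflected beam")
    return -2.0 * a * n1_dot_h / h_sq


def scattered_direction(n1, h, a):
    """Unit vector n of the diffracted beam for the selected wavelength."""
    lam = laue_wavelength(n1, h, a)
    return tuple(x + lam * y / a for x, y in zip(n1, h))


if __name__ == "__main__":
    n1 = (0, 0, 1)   # incident beam along z
    h = (0, 1, -1)   # integers h1, h2, h3
    a = 2.0          # lattice constant in arbitrary units
    print("selected wavelength:", laue_wavelength(n1, h, a))
    print("diffracted direction:", scattered_direction(n1, h, a))
```

For a fixed direction of incidence each set of integers selects its own wavelength (only positive values are physical), which is why a continuous spectrum or a rotation of the crystal is needed to observe the reflection.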


§37. The reflection and refraction of electromagnetic waves at the boundary 
between media 


Up to now we have studied the propagation of electromagnetic waves in a 
homogeneous medium. We shall now consider the electromagnetic wave inci- 
dent on the boundary between two dielectric media which differ from each 
other in the values of their dielectric constants. For simplicity of calculation 
we shall assume that the magnetic permeabilities of the two media are equal 
to unity. 

We assume the boundary to be plane. It is clear that near the boundary the 
physical properties of the media may differ from those within their volumes, 
so that a thin transition layer exists at the boundary. However, if the thickness 
of the latter is small compared with the wavelength, its effect can be neglected 
and the boundary can be considered as a mathematical interface having no 
thickness. 

We assume that a monochromatic plane wave is incident on this surface, 
which we choose as the plane z = 0. For simplicity we assume that the vector 
k characterizing the direction of propagation of the incident wave lies in the 
xz-plane. We shall call the medium in which the incident wave is propagating 
the first medium. 

We write the vectors of the electric and magnetic fields of the incident
wave in the first medium in the form

$$\mathbf{E}^{\text{inc}} = \mathbf{A}\exp[i(\omega t - \mathbf{k}\cdot\mathbf{r})] = \mathbf{A}\exp[i(\omega t - kx\cos\alpha - kz\cos\gamma)] ,$$

$$\mathbf{H}^{\text{inc}} = \mathbf{C}\exp[i(\omega t - \mathbf{k}\cdot\mathbf{r})] = \mathbf{C}\exp[i(\omega t - kx\cos\alpha - kz\cos\gamma)] ,$$

where $\cos\alpha$ and $\cos\gamma$ are the direction cosines of the wave vector. Since the
wave vector lies in the xz-plane, $\cos\beta = 0$. The wave number $k$ is connected with
the frequency by the relation

$$k = \omega\varepsilon_1^{1/2}/c .$$

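At the interface the vectors E and H must satisfy the boundary
conditions (5.2) and (5.4). The quantities E and H are not independent but
are perpendicular to each other, and their absolute values are related by the
expression (31.31). Hence two independent boundary conditions can be
satisfied simultaneously only by assuming that the incident wave is partially
transmitted into the second medium and partially reflected from the interface.
We shall call the wave transmitted into the second medium the refracted
wave, because its direction of propagation does not, as a rule, coincide with
the direction of propagation of the incident wave. Let

$$\mathbf{E}^{\text{refl}} = \mathbf{A}^{\text{refl}}\exp[i(\omega^{\text{refl}}t - \mathbf{k}^{\text{refl}}\cdot\mathbf{r})] ,$$

$$\mathbf{E}^{\text{refr}} = \mathbf{A}^{\text{refr}}\exp[i(\omega^{\text{refr}}t - \mathbf{k}^{\text{refr}}\cdot\mathbf{r})]$$

denote the electric vectors of the reflected and refracted waves. The total field
in the first medium is characterized by the vector

$$\mathbf{E}_1 = \mathbf{E}^{\text{inc}} + \mathbf{E}^{\text{refl}} = \mathbf{A}\exp[i(\omega t - \mathbf{k}\cdot\mathbf{r})] + \mathbf{A}^{\text{refl}}\exp[i(\omega^{\text{refl}}t - \mathbf{k}^{\text{refl}}\cdot\mathbf{r})] .$$

The boundary condition (5.2) at the interface of two media gives

$$\begin{aligned} A_t\exp[i(\omega t - kx\cos\alpha)] &+ A_t^{\text{refl}}\exp[i(\omega^{\text{refl}}t - k^{\text{refl}}x\cos\alpha^{\text{refl}} - k^{\text{refl}}y\cos\beta^{\text{refl}})] \\ &= A_t^{\text{refr}}\exp[i(\omega^{\text{refr}}t - k^{\text{refr}}x\cos\alpha^{\text{refr}} - k^{\text{refr}}y\cos\beta^{\text{refr}})] , \end{aligned}$$

where the subscript $t$ denotes the component tangential to the interface.
This equation must hold for arbitrary values of time $t$ and coordinates $x$ and
$y$. This is possible only if

$$\omega = \omega^{\text{refl}} = \omega^{\text{refr}} , \tag{37.1}$$

$$k\cos\alpha = k^{\text{refl}}\cos\alpha^{\text{refl}} = k^{\text{refr}}\cos\alpha^{\text{refr}} ,$$

$$\cos\beta^{\text{refl}} = \cos\beta^{\text{refr}} = 0 .$$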




The first condition means that the reflection and refraction occur without any
change in the frequency. The third equality shows that the reflected and refracted
waves lie in the same plane as the incident wave. Finally, taking into
account that $k = \omega/v = \varepsilon^{1/2}\omega/c$, one can rewrite the second relation in the form

$$(\varepsilon_1^{1/2}\omega/c)\cos\alpha = (\varepsilon_1^{1/2}\omega/c)\cos\alpha^{\text{refl}} ,$$

$$(\varepsilon_1^{1/2}\omega/c)\cos\alpha = (\varepsilon_2^{1/2}\omega/c)\cos\alpha^{\text{refr}} ,$$

whence we find that

$$\alpha = \alpha^{\text{refl}} , \tag{37.2}$$

i.e. that the angle of incidence is equal to the angle of reflection, and

$$\frac{\cos\alpha}{\cos\alpha^{\text{refr}}} = \left(\frac{\varepsilon_2}{\varepsilon_1}\right)^{1/2} . \tag{37.3}$$

Instead of the direction cosines $\cos\alpha$ and $\cos\alpha^{\text{refr}}$ one usually introduces
$\sin\theta$ and $\sin\theta^{\text{refr}}$, where $\theta$ and $\theta^{\text{refr}}$ are the angles formed by the incident
and refracted waves with the normal to the plane $z = 0$, and which are
called the angle of incidence ($\theta$) and the angle of refraction ($\theta^{\text{refr}}$). For the
latter we find

$$\frac{\sin\theta}{\sin\theta^{\text{refr}}} = n_{12} = \left(\frac{\varepsilon_2}{\varepsilon_1}\right)^{1/2} . \tag{37.4}$$

Here $n_{12}$ denotes the refractive index of the second medium
with respect to the first medium.
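
Relation (37.4) is easily turned into a small calculation. The following minimal sketch (the function name is illustrative) returns the angle of refraction for given dielectric constants, and returns `None` when (37.4) has no real solution for the angle of refraction.

```python
import math


def refraction_angle(theta_inc: float, eps1: float, eps2: float):
    """Angle of refraction from sin(theta)/sin(theta_refr) = (eps2/eps1)**0.5.

    Angles are in radians. Returns None if the required sine exceeds unity,
    i.e. if (37.4) cannot be satisfied by a real angle of refraction.
    """
    n12 = math.sqrt(eps2 / eps1)
    s = math.sin(theta_inc) / n12
    if abs(s) > 1.0:
        return None
    return math.asin(s)


if __name__ == "__main__":
    # Illustrative values: vacuum (eps1 = 1) to a medium with eps2 = 2.25, n12 = 1.5.
    theta_r = refraction_angle(math.radians(30.0), 1.0, 2.25)
    print("angle of refraction, degrees:", math.degrees(theta_r))
    # Reversed order of the media, steep incidence: no real angle of refraction.
    print(refraction_angle(math.radians(60.0), 2.25, 1.0))
```

For incidence at 30° from vacuum onto a medium with dielectric constant 2.25 the sine of the angle of refraction is 1/3, i.e. the refracted ray lies closer to the normal.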

We arrive at the laws of reflection and refraction of light. However, the
value of the refractive index turns out to be related to the dielectric constants
of the media. In particular, if the first medium is vacuum so that $\varepsilon_1 = 1$, then

$$n = \varepsilon_2^{1/2}$$

is called the refractive index of the given medium.

The expression for the refractive index justifies the terminology which we 
have introduced in §33. 

In considering the phenomena of reflection and refraction of electromag- 
netic waves, which had been extensively investigated for light even before the 


§37 REFLECTION AND REFRACTION 573 


establishment of its electromagnetic nature, the optical terminology is 
adopted. A medium with a higher refractive index is said to be optically more 
dense than a medium with a lower refractive index. 

Having established the directions of propagation of the reflected and re- 
fracted waves, we go on to the calculation of their amplitudes. Taking into 
account eqs. (37.1) we obtain, using the boundary condition (5.2), for the 
amplitudes of the reflected and refracted waves 


$$A_{tg} + A_{tg}^{\rm refl} = A_{tg}^{\rm refr}.$$

In order to find the two quantities $A^{\rm refl}$ and $A^{\rm refr}$, it is necessary to have a
second equation. This is given by the boundary condition for the magnetic 
field. We restrict ourselves to the case of normal incidence in the xz-plane, 
which requires no cumbersome calculations. For normal incidence we have 


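$$E_x = A\exp[i(\omega t - kz)], \qquad E_y = E_z = 0,$$
$$H_y = -\varepsilon_1^{1/2} A\exp[i(\omega t - kz)], \qquad H_x = H_z = 0.$$

For the reflected wave we have, analogously,

$$E_x^{\rm refl} = A^{\rm refl}\exp[i(\omega t + kz)], \qquad E_y = E_z = 0,$$
$$H_y^{\rm refl} = \varepsilon_1^{1/2} A^{\rm refl}\exp[i(\omega t + kz)], \qquad H_x = H_z = 0.$$

$H_y^{\rm refl}$ differs from $H_y$ in sign, since the reflected wave propagates in the oppo-
site direction and the projection $H_y^{\rm refl}$ is oriented in the positive direction of
the $y$-axis. For the refracted wave

$$E_x^{\rm refr} = A^{\rm refr}\exp[i(\omega t - k^{\rm refr}z)], \qquad E_y = E_z = 0,$$
$$H_y^{\rm refr} = -\varepsilon_2^{1/2} A^{\rm refr}\exp[i(\omega t - k^{\rm refr}z)], \qquad H_x = H_z = 0.$$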


The conditions of the continuity of the tangential components of the vectors
E and H for $z = 0$ are written in the form

$$A + A^{\rm refl} = A^{\rm refr},$$

$$\varepsilon_1^{1/2}(A - A^{\rm refl}) = \varepsilon_2^{1/2} A^{\rm refr}.$$




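Expressing the quantities $A^{\rm refl}$ and $A^{\rm refr}$ in terms of $A$, we find

$$A^{\rm refl} = A\,\frac{1-(\varepsilon_2/\varepsilon_1)^{1/2}}{1+(\varepsilon_2/\varepsilon_1)^{1/2}} = A\,\frac{\varepsilon_1^{1/2}-\varepsilon_2^{1/2}}{\varepsilon_1^{1/2}+\varepsilon_2^{1/2}}, \tag{37.5}$$

$$A^{\rm refr} = \frac{2A}{1+(\varepsilon_2/\varepsilon_1)^{1/2}} = A\,\frac{2\varepsilon_1^{1/2}}{\varepsilon_1^{1/2}+\varepsilon_2^{1/2}}. \tag{37.6}$$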


Formulae (37.5) and (37.6) represent a particular case (normal incidence) 
of the well-known Fresnel formulae derived by Fresnel in 1820 from the 
general concept of light as a wave motion. 

Knowing the amplitudes of the reflected and refracted waves, one can 
find the time averages of the energy fluxes reflected from the interface and 
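transmitted into the second medium. They are given by the quantities

$$|\boldsymbol\sigma^{\rm refl}| = \frac{c}{4\pi}\left|\overline{[\mathbf E^{\rm refl}\times\mathbf H^{\rm refl}]}\right| = \frac{c}{8\pi}\,\varepsilon_1^{1/2}(A^{\rm refl})^2 = \left(\frac{\varepsilon_1^{1/2}-\varepsilon_2^{1/2}}{\varepsilon_1^{1/2}+\varepsilon_2^{1/2}}\right)^2|\boldsymbol\sigma^0|,$$

$$|\boldsymbol\sigma^{\rm refr}| = \frac{c}{4\pi}\left|\overline{[\mathbf E^{\rm refr}\times\mathbf H^{\rm refr}]}\right| = \frac{c}{8\pi}\,\varepsilon_2^{1/2}(A^{\rm refr})^2 = \left(\frac{\varepsilon_2}{\varepsilon_1}\right)^{1/2}\left(\frac{2\varepsilon_1^{1/2}}{\varepsilon_1^{1/2}+\varepsilon_2^{1/2}}\right)^2|\boldsymbol\sigma^0|,$$

where $\boldsymbol\sigma^0$ is the time-averaged Poynting vector of the incident wave, $|\boldsymbol\sigma^0| = (c/8\pi)\,\varepsilon_1^{1/2}A^2$.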








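The ratio
$$R = \frac{|\boldsymbol\sigma^{\rm refl}|}{|\boldsymbol\sigma^0|} = \left(\frac{\varepsilon_1^{1/2}-\varepsilon_2^{1/2}}{\varepsilon_1^{1/2}+\varepsilon_2^{1/2}}\right)^2 = \left(\frac{1-n_{12}}{1+n_{12}}\right)^2 \tag{37.7}$$
is called the reflectivity, while the ratio
$$D = \frac{|\boldsymbol\sigma^{\rm refr}|}{|\boldsymbol\sigma^0|} = \left(\frac{\varepsilon_2}{\varepsilon_1}\right)^{1/2}\left(\frac{2\varepsilon_1^{1/2}}{\varepsilon_1^{1/2}+\varepsilon_2^{1/2}}\right)^2 = 1 - R \tag{37.8}$$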
ISMN ee 


is called the transmissivity. 
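
The normal-incidence formulae (37.5) to (37.8) can be checked numerically. The sketch below evaluates the amplitude ratios, the reflectivity and the transmissivity, and confirms both the boundary condition $A + A^{\rm refl} = A^{\rm refr}$ and energy conservation $R + D = 1$. The function name and the sample dielectric constants are illustrative assumptions.

```python
import math

def normal_incidence(eps1, eps2):
    """Amplitude ratios and energy coefficients for normal incidence.

    Returns (A_refl/A, A_refr/A, R, D) following (37.5)-(37.8).
    """
    r1, r2 = math.sqrt(eps1), math.sqrt(eps2)
    a_refl = (r1 - r2) / (r1 + r2)       # (37.5)
    a_refr = 2.0 * r1 / (r1 + r2)        # (37.6)
    R = a_refl ** 2                      # reflectivity (37.7)
    D = (r2 / r1) * a_refr ** 2          # transmissivity (37.8)
    return a_refl, a_refr, R, D

# Hypothetical example: vacuum (eps1 = 1) against a dielectric with eps2 = 2.25.
a_refl, a_refr, R, D = normal_incidence(1.0, 2.25)
print(R, D, R + D)
```

With these values about 4% of the incident energy flux is reflected, which illustrates the statement that reflection is weak when the two dielectric constants are of the same order.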

Formula (37.7) shows that for $\varepsilon_2 \approx \varepsilon_1$ the effect of reflection of the wave
from the interface is small. On the contrary, for $\varepsilon_2 \gg \varepsilon_1$ or $\varepsilon_1 \gg \varepsilon_2$ (i.e. for
$n_{12} \gg 1$ or $n_{12} \ll 1$), $R \approx 1$ and the wave is almost completely reflected. The
considerable reflection of the wave from a dielectric with a large dielectric 
constant is associated with the fact that in such a dielectric the incident wave 
excites a large displacement current. Because of this, screening of the external
electric field takes place, and its strength turns out to be close to zero. In the
limit, as $\varepsilon_2 \to \infty$, the electric field vector has a node at the interface, while
the magnetic field vector has an antinode. However, for any finite value of
$\varepsilon_2$ the transmissivity $D$ for normal incidence has a small but finite value.

In the case of incidence of the electromagnetic wave at an angle to the 
normal of the plane of the interface the expressions for the amplitudes 
(Fresnel formulae) become more complex. However, the general picture is 
not changed, with the exception of the case $n_{12} < 1$ which calls for special
consideration. 

If a plane wave falls at an angle $\theta$ on the surface of an optically less dense
medium, then for a sufficiently small value of $n_{12}$ eq. (37.4) can only be
valid for an imaginary value of the angle $\theta^{\rm refr}$. The limiting value of the angle
of incidence $\theta_0$ for which $\theta^{\rm refr}$ can still have a real value is determined by the
condition

$$\sin\theta_0 = n_{12}. \tag{37.9}$$

The angle $\theta_0$ is called the critical angle for total reflection. In this case
$\theta^{\rm refr} = \tfrac{1}{2}\pi$, i.e. the refracted ray slides along the plane of the interface of the
media.

For $\theta > \theta_0$ the angle $\theta^{\rm refr}$ and, consequently, the value of $\cos\theta^{\rm refr}$ turn
out to be imaginary. In the exponent of the exponential term of the refracted
wave the following term arises:

$$i\,(k^{\rm refr}\cos\theta^{\rm refr})\,z = i\,\frac{\omega\varepsilon_2^{1/2}}{c}\,(1-\sin^2\theta^{\rm refr})^{1/2}\,z = -\left[\frac{\omega\varepsilon_2^{1/2}}{c}\left(\frac{\sin^2\theta}{n_{12}^2}-1\right)^{1/2}\right]z.$$

This means that in the optically less dense medium damping takes place
according to the exponential law

$$E^{\rm refr} \sim \exp\left[-\frac{\omega\varepsilon_2^{1/2}}{c}\left(\frac{\sin^2\theta}{n_{12}^2}-1\right)^{1/2} z\right]. \tag{37.10}$$

The effective penetration distance of the electromagnetic field into the
optically less dense medium is of the order of magnitude of $\delta$, where

$$\delta \sim \bar\lambda\left(\frac{\sin^2\theta}{n_{12}^2}-1\right)^{-1/2}, \tag{37.11}$$

and $\bar\lambda = (\omega\varepsilon_2^{1/2}/c)^{-1}$ is the wavelength divided by $2\pi$.

Since there is no absorption in the ideal dielectric considered, the weaken- 
ing of the field in the optically less dense medium can be associated only with 
the release of electromagnetic waves backward into the first medium. A direct 
calculation confirms this. The reflectivity $R$ for $\theta > \theta_0$ turns out to be unity.
The effect described is called the phenomenon of total internal reflection. 
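
The critical angle (37.9) and the penetration distance (37.11) are easy to evaluate. The following sketch assumes Gaussian units with a default value for $c$ and uses hypothetical function names; the sample case (a medium with dielectric constant 2.25 bounded by vacuum) is only an illustration.

```python
import math

def critical_angle(eps1, eps2):
    """Critical angle for total reflection, sin(theta_0) = n_12 (37.9)."""
    n12 = math.sqrt(eps2 / eps1)
    if n12 >= 1.0:
        raise ValueError("total reflection requires n_12 < 1")
    return math.asin(n12)

def penetration_depth(theta, eps1, eps2, omega, c=2.998e10):
    """Order-of-magnitude penetration distance delta from (37.11).

    omega is the angular frequency in s^-1; c defaults to cm/s (Gaussian units).
    """
    n12 = math.sqrt(eps2 / eps1)
    x = (math.sin(theta) / n12) ** 2 - 1.0
    if x <= 0.0:
        raise ValueError("theta must exceed the critical angle")
    lam_bar = c / (omega * math.sqrt(eps2))   # wavelength / 2 pi in medium 2
    return lam_bar / math.sqrt(x)

theta0 = critical_angle(2.25, 1.0)
omega = 3.0e15                                  # illustrative optical frequency
lam_bar = 2.998e10 / omega
delta = penetration_depth(math.radians(60.0), 2.25, 1.0, omega)
print(math.degrees(theta0), delta / lam_bar)
```

For this example the critical angle is close to 41.8 degrees, and at 60 degrees the field penetrates to a depth of roughly $1.2\,\bar\lambda$, i.e. a fraction of a wavelength.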

We cannot, within the framework of this book, dwell on the problem 
of the appearance of polarization under certain conditions of the reflection 
of unpolarized waves *. 

In conclusion we shall consider briefly the reflection of electromagnetic 
waves from the surface of conductors. An electromagnetic wave incident on 
the surface of a conductor induces in it a considerable conduction current. 
Free charges in the field scatter the wave strongly. Inside the conductor the 
field is rapidly attenuated. Therefore it is natural to expect the surface of the 
conductor to possess appreciable reflecting properties. However, part of the 
energy of the electromagnetic field will be dissipated in the conductor in such 
a way that the coefficient of reflection will have a value which is somewhat 
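less than unity. If in the formula for $R$ we formally replace the refractive
index by its value as given by formula (31.13), then we easily find, taking
into account that $\varepsilon_1 = 1$ and $\varepsilon_2^{1/2} = n_{12} = n - i\kappa$ with $n \approx \kappa \gg 1$:

$$R = \left|\frac{1-n+i\kappa}{1+n-i\kappa}\right|^2 = \frac{1-2n+n^2+\kappa^2}{1+2n+n^2+\kappa^2} \approx \frac{1-2n+2n^2}{1+2n+2n^2} \approx 1-\frac{2}{n} = 1-2\left(\frac{\omega}{2\pi\sigma}\right)^{1/2}. \tag{37.12}$$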


This formula, which is called the Hagen—Rubens formula, is in good agree- 
ment with experimental data on the reflection from metal conductors of 
electromagnetic waves lying in the range of radio and infrared frequencies of 
the spectrum. For higher frequencies the relations of classical macroscopic 
electrodynamics turn out to be inapplicable. 
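
To see how good the approximation (37.12) is, one can compare it with the exact expression for $|(1-n+i\kappa)/(1+n-i\kappa)|^2$. The sketch below does this for $n = \kappa$; the relation $n = (2\pi\sigma/\omega)^{1/2}$ used in the last helper is read off from the final form of (37.12), and all function names are illustrative assumptions.

```python
import math

def reflectivity_exact(n, kappa):
    """|(1 - n + i kappa)/(1 + n - i kappa)|^2 for a complex index n - i kappa."""
    return ((1.0 - n) ** 2 + kappa ** 2) / ((1.0 + n) ** 2 + kappa ** 2)

def reflectivity_hagen_rubens(n):
    """Approximation R = 1 - 2/n of (37.12), valid for n = kappa >> 1."""
    return 1.0 - 2.0 / n

def index_from_conductivity(sigma, omega):
    """n = kappa = (2 pi sigma / omega)^(1/2), Gaussian units (sigma in s^-1)."""
    return math.sqrt(2.0 * math.pi * sigma / omega)

n = 100.0
R_exact = reflectivity_exact(n, n)
R_hr = reflectivity_hagen_rubens(n)
print(R_exact, R_hr)
```

For $n = \kappa = 100$ the two values differ by about $2\times10^{-4}$, which shows that the correction to $1 - 2/n$ is of order $1/n^2$.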

The characteristic metallic glitter is due to the high value of the coeffi- 
cient of reflection in the optical range. In the limiting case of a conductor 
with an infinitely large conductivity the coefficient of reflection is R = 1. This 
means that a high-frequency electromagnetic field does not penetrate at all 


* See, for example, J.A.Stratton, Electromagnetic theory (McGraw-Hill, New York, 
1941); L.D.Landau and E.M.Lifshitz, Electrodynamics of continuous media (Pergamon 
Press, Oxford, 1960). 







into a conductor with an infinitely large conductivity. We shall call such a 
conductor an ideal conductor. In the range of high frequency fields the ideal 
conductor is an analogue of the conductor in electrostatics. 

Let us write the boundary conditions for the field vectors at the surface of 
an ideal conductor. Since electric and magnetic fields are absent inside an 
ideal conductor, it follows from (5.2)-(5.6) that at the surface of an ideal conductor

$$E_{tg} = 0, \qquad H_n = 0, \qquad H_{tg} = \frac{4\pi}{c}\,i_S, \qquad E_n = \frac{4\pi\omega_S}{\varepsilon}, \tag{37.13}$$

where the vectors E and H refer to the field in vacuum, and $i_S$ and $\omega_S$ are
respectively the densities of the surface current and charge. Thus, at the
dielectric—ideal conductor interface the electric field vector is directed 
normally to the surface, whereas the magnetic field vector is directed 
parallel to the surface. Although in nature there are no ideal conductors, the 
approximation of an ideal conductor often yields adequately the character 


of the behaviour of the electromagnetic field at the surface of bodies with a 
high conductivity. 


§38. Wave-guides 


The transmission of electromagnetic energy over relatively small distances 
plays an important role in present day communication engineering. It is 
brought about by means of the excitation of the electromagnetic field in 
tubes with metallic walls of various forms and cross-sections, called wave- 
guides. In this book we cannot dwell on methods for the excitation of fields, 
and shall confine ourselves only to the study of the process of propagation of 
electromagnetic waves in wave-guides. 

As will be clear from what follows, the propagation of electromagnetic 
waves in wave-guides differs essentially from the propagation of plane elec- 
tromagnetic waves which are not limited in space. Therefore the problem 


of propagation of waves in wave-guides is not only of practical but also of 
theoretical interest. 




To avoid complicated calculations, we shall restrict ourselves to consider- 
ing a wave-guide of rectangular cross-section with walls made of an ideal 
conductor. Let the sides of the rectangle be equal to a and b; for definiteness 
we assume that a> b. We assume that a monochromatic plane electromag- 
netic wave moves in the wave-guide (in the plane z = 0). It is natural to 
assume that field vectors inside the wave-guide depend on time and the coor- 
dinate along the wave-guide according to the law 


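$$\mathbf E,\ \mathbf H \sim \exp[i(\omega t - k_z z)]. \tag{38.1}$$

Substituting (38.1) into the wave equations, we find

$$\frac{\partial^2\mathbf E}{\partial x^2} + \frac{\partial^2\mathbf E}{\partial y^2} = \left(k_z^2 - \frac{\omega^2}{v^2}\right)\mathbf E, \tag{38.2}$$

$$\frac{\partial^2\mathbf H}{\partial x^2} + \frac{\partial^2\mathbf H}{\partial y^2} = \left(k_z^2 - \frac{\omega^2}{v^2}\right)\mathbf H. \tag{38.3}$$

Here and below E and H are unknown functions of the coordinates x and y,
since their dependence on z is already taken into account. The connection
between the vectors E and H is determined by the equations for $\nabla\times\mathbf E$ and
$\nabla\times\mathbf H$. Substituting (38.1) into these equations, we find

$$\frac{\partial E_z}{\partial y} + ik_zE_y = -\frac{i\omega\mu}{c}H_x, \tag{38.4}$$
$$-ik_zE_x - \frac{\partial E_z}{\partial x} = -\frac{i\omega\mu}{c}H_y, \tag{38.5}$$
$$\frac{\partial E_y}{\partial x} - \frac{\partial E_x}{\partial y} = -\frac{i\omega\mu}{c}H_z, \tag{38.6}$$
$$\frac{\partial H_z}{\partial y} + ik_zH_y = \frac{i\omega\varepsilon}{c}E_x, \tag{38.7}$$
$$-ik_zH_x - \frac{\partial H_z}{\partial x} = \frac{i\omega\varepsilon}{c}E_y, \tag{38.8}$$
$$\frac{\partial H_y}{\partial x} - \frac{\partial H_x}{\partial y} = \frac{i\omega\varepsilon}{c}E_z. \tag{38.9}$$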



















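Formulae (38.4)-(38.9) allow the vector components $E_x$, $E_y$, $H_x$ and $H_y$
to be expressed in terms of $E_z$ and $H_z$. A simple calculation gives

$$E_x = \frac{ic}{c^2k_z^2 - \varepsilon\mu\omega^2}\left(ck_z\frac{\partial E_z}{\partial x} + \mu\omega\frac{\partial H_z}{\partial y}\right), \tag{38.10}$$
$$E_y = \frac{ic}{c^2k_z^2 - \varepsilon\mu\omega^2}\left(ck_z\frac{\partial E_z}{\partial y} - \mu\omega\frac{\partial H_z}{\partial x}\right), \tag{38.11}$$
$$H_x = -\frac{ic}{c^2k_z^2 - \varepsilon\mu\omega^2}\left(\varepsilon\omega\frac{\partial E_z}{\partial y} - ck_z\frac{\partial H_z}{\partial x}\right), \tag{38.12}$$
$$H_y = \frac{ic}{c^2k_z^2 - \varepsilon\mu\omega^2}\left(\varepsilon\omega\frac{\partial E_z}{\partial x} + ck_z\frac{\partial H_z}{\partial y}\right). \tag{38.13}$$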


At the surface of the wave-guide the boundary conditions for an ideal 
conductor (see §37) must be fulfilled. 


We first try to find the solution of eqs. (38.2) and (38.3) in the form of
transverse plane waves, i.e. we assume

$$E_z = H_z = 0.$$

Then from formulae (38.10)-(38.13) it is clear that all components of the
field are equal to zero, provided $c^2k_z^2 - \varepsilon\mu\omega^2 \neq 0$. If, on the contrary,
$c^2k_z^2 = \varepsilon\mu\omega^2$, as is the case of a monochromatic plane wave in a non-
limited medium propagating with a velocity $c/(\varepsilon\mu)^{1/2}$, then (38.3) gives

$$\frac{\partial^2\mathbf H(x,y)}{\partial x^2} + \frac{\partial^2\mathbf H(x,y)}{\partial y^2} = 0,$$

i.e. the magnetic field satisfies the two-dimensional Laplace equation, and
over the entire closed boundary of the region (at the edges of the rectangle
$x = 0, a$; $y = 0, b$) the magnetic field by virtue of (37.13) is directed along
a tangent to the boundary. In the mathematical theory it is shown * that the
only solution of such a boundary value problem is $\mathbf H = 0$. If there is no


* See, for example, V.I.Smirnov, A course of higher mathematics (Pergamon Press, 
Oxford, 1964). 




magnetic field in the wave, then, obviously, the electric field is also equal to 
zero. 

Thus we see that transverse electromagnetic waves cannot propagate in a 
rectangular wave-guide with ideally conducting walls. It should be stressed 
that this conclusion is not associated with the form of the wave-guide, but 
refers to any wave-guide made in the form of a simple tube of any cross-sec- 
tion whose walls are ideally conducting. In the case where the boundaries of 
the region are open, for example, in the case of the gap between two infinite 
ideally conducting plane surfaces, or in the case of multiply connected space, 
for example a wave-guide formed by two concentric cylinders, this conclusion 
is no longer valid. In such systems the propagation of transverse electromag- 
netic waves is in principle possible. 

The meaning of this result is very simple. In a transverse wave in a wave- 
guide the lines of the magnetic field must be directed along a tangent to the 
walls and must have the character of closed curves. In this case the lines of the 
field do not enter into the ideal conductor and therefore do not enclose any 
tubes of conduction current. A longitudinal component of displacement 
current is absent, so that the lines of the magnetic field do not enclose any 
tubes of displacement current either. However, from general considerations 
it is clear that the lines of a magnetic field which do not enclose any current 
tubes cannot exist. 

This kind of reasoning also allows one to understand why the existence of 
transverse waves is possible in the open wave-guide or in a wave-guide formed 
by concentric cylinders. In an unbounded space the lines of the magnetic 
field can be open and go off to infinity. In a wave-guide formed by concentric 
cylinders they can enclose the tubes of current flowing on the surface of the 
internal cylinder. In both cases the existence of magnetic and electric fields 
travelling along the wave-guide in the form of transverse waves is possible. 

It turns out, however, that the absence of transverse waves in a wave-guide 
does not mean that the propagation of an electromagnetic field in it is im- 
possible. As will now be shown, the formation of longitudinal waves, which 
cannot exist in unbounded space, is possible in a wave-guide. By longitudinal 
waves we understand waves which have a non-zero field component in the 
direction of propagation of the wave. From formulae (38.10) — (38.13) it is 
clear that two independent possibilities must be considered: 


(1) $E_z \neq 0$, $H_z = 0$,

(2) $H_z \neq 0$, $E_z = 0$.







In the first case the magnetic field in the wave has two components, $H_x$
and $H_y$, and is purely transverse. Such waves are usually called transverse
magnetic waves or, briefly, TM-waves. The electric field has a longitudinal 
component and two transverse components. 

In the second case the electric field of the wave has a transverse character 
and the wave is called correspondingly a transverse electric wave or, briefly, 
TE-wave. 

In the case of TM-waves, on the basis of (38.2) we have 


$$\frac{\partial^2E_z}{\partial x^2} + \frac{\partial^2E_z}{\partial y^2} = \left(k_z^2 - \frac{\omega^2}{v^2}\right)E_z \tag{38.14}$$
and
$$E_z = 0 \quad\text{for}\quad x = 0, a;\ y = 0, b \tag{38.15}$$

by virtue of (37.13). Below we shall show that it is sufficient to satisfy only
these boundary conditions. In order to satisfy eq. (38.14) and the boundary
conditions (38.15), one must write for $E_z$

$$E_z = A\sin(k_xx)\sin(k_yy)\exp[i(\omega t - k_zz)], \tag{38.16}$$
where
$$k_x^2 + k_y^2 + k_z^2 = \omega^2/v^2 \tag{38.17}$$
and, moreover,
$$k_x = \pi m/a, \qquad k_y = \pi n/b, \tag{38.18}$$


where m and n are integers which are not equal to zero. This solution has the 
form of waves travelling in the positive direction of the z-axis and standing 
waves in the planes z = const. 
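
The transverse profile of the solution (38.16) can be verified numerically: with $k_x = \pi m/a$ and $k_y = \pi n/b$ the function $\sin(k_xx)\sin(k_yy)$ vanishes on the walls and its two-dimensional Laplacian equals $-(k_x^2+k_y^2)$ times the function, which is (38.14) combined with (38.17). The sketch below uses a central finite-difference Laplacian; the helper names, the step size and the sample dimensions are assumptions for illustration.

```python
import math

def ez_profile(x, y, m, n, a, b):
    """Transverse profile of the TM_mn wave (38.16), amplitude set to 1."""
    return math.sin(math.pi * m * x / a) * math.sin(math.pi * n * y / b)

def laplacian_xy(f, x, y, h=1e-4):
    """Central finite-difference approximation of d2f/dx2 + d2f/dy2."""
    return (f(x + h, y) + f(x - h, y) + f(x, y + h) + f(x, y - h)
            - 4.0 * f(x, y)) / (h * h)

a, b, m, n = 2.0, 1.0, 1, 1
f = lambda x, y: ez_profile(x, y, m, n, a, b)
k_perp_sq = (math.pi * m / a) ** 2 + (math.pi * n / b) ** 2   # k_x^2 + k_y^2

x0, y0 = 0.7, 0.3
residual = laplacian_xy(f, x0, y0) + k_perp_sq * f(x0, y0)
print(residual)                 # close to zero
print(f(0.0, y0), f(a, y0))     # E_z vanishes on the walls x = 0 and x = a
```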

The transverse magnetic wave corresponding to the numbers m and n is 
usually denoted by TM$_{mn}$. If the value of $E_z$ is known, then from (38.10)-
(38.13) one can easily find the remaining components of the field. We shall
not write down the corresponding formulae, and shall only point out the 
following fact. It is easily understood that the boundary conditions for all 
components of the electric and magnetic fields are automatically fulfilled. 
For example, in the plane y =0 the normal component of the magnetic 
field is 



$$H_n = H_y \sim \frac{\partial E_z}{\partial x} = 0;$$

analogously, in the plane $x = 0$,

$$H_n = H_x \sim \frac{\partial E_z}{\partial y} = 0.$$


Hence, as we have already pointed out, it is sufficient to satisfy the boundary 
conditions (38.15). Thus, at the walls of the wave-guide the lines of the 
magnetic field are tangential to the surface. They form closed curves em- 
bracing the longitudinal lines of the electric field. 

Let us discuss formula (38.17) in more detail. We rewrite it in the form 








$$k_z = \left(\frac{\omega^2}{v^2} - \frac{\pi^2m^2}{a^2} - \frac{\pi^2n^2}{b^2}\right)^{1/2}. \tag{38.19}$$

For given values of $m$ and $n$ the quantity $k_z$ has a real value only for

$$\frac{1}{\lambda^2} = \frac{\omega^2}{4\pi^2v^2} \geq \frac{\omega_{\rm crit}^2}{4\pi^2v^2} = \frac{1}{4\pi^2}\left(\frac{\pi^2m^2}{a^2} + \frac{\pi^2n^2}{b^2}\right) = \frac{1}{\lambda_{\rm crit}^2}, \tag{38.20}$$

where $\omega_{\rm crit}$ is a certain critical frequency, and $\lambda_{\rm crit}$ is the corresponding
wavelength.
If $k_z$ turns out to be imaginary, which corresponds to $\omega < \omega_{\rm crit}$, then we
arrive at an exponentially decreasing expression for $E_z$ and the other field
components instead of a wave travelling along the $z$-axis. Thus, only waves
with $\omega > \omega_{\rm crit}$ or $\lambda < \lambda_{\rm crit}$ can propagate through the wave-guide.
The largest value of the wavelength of the propagating wave, $\lambda^{\rm TM}_{\max}$, is
obtained for the TM$_{11}$-wave ($m = 1$ and $n = 1$), which is the wave of TM-type
of lowest order. Namely,

$$\lambda^{\rm TM}_{\max} = \frac{2ab}{(a^2+b^2)^{1/2}}.$$
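The phase velocity of the TM$_{mn}$-wave is obviously equal to

$$v_{\rm phase} = \frac{\omega}{k_z} = \frac{c}{(\varepsilon\mu)^{1/2}\{1 - \lambda^2[(m/2a)^2 + (n/2b)^2]\}^{1/2}} = \frac{v}{(1 - \lambda^2/\lambda_{\rm crit}^2)^{1/2}}. \tag{38.21}$$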





Since $\lambda < \lambda_{\rm crit}$ always, the phase velocity of waves in a wave-guide is always
larger than the velocity of light $v$ in the corresponding medium, and for
$\lambda \to \lambda_{\rm crit}$ it increases indefinitely. In particular, if no dielectric is introduced
inside the wave-guide (i.e. $\varepsilon = \mu = 1$), $v_{\rm phase}$ is always larger than the velocity
of light in vacuum $c$.

The group velocity can easily be found from formula (38.17):

$$v_{\rm group} = \frac{d\omega}{dk_z} = \frac{c}{(\varepsilon\mu)^{1/2}}\left(1 - \frac{\lambda^2}{\lambda_{\rm crit}^2}\right)^{1/2}. \tag{38.22}$$

The group velocity is always smaller than the velocity $v = c/(\varepsilon\mu)^{1/2}$ and tends
to zero as $\lambda$ approaches $\lambda_{\rm crit}$.
From (38.21) and (38.22) it is clear that $v_{\rm phase}$ and $v_{\rm group}$ satisfy the
general relation $v_{\rm phase}v_{\rm group} = v^2$.
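
The relations (38.20) to (38.22) lend themselves to a short numerical check. The sketch below computes the critical wavelength of a mode and the phase and group velocities in units of $v$; function names and the sample dimensions $a = 2$, $b = 1$ are assumptions made for illustration only.

```python
import math

def cutoff_wavelength(m, n, a, b):
    """Critical wavelength from (38.20): 1/lambda_crit^2 = (m/2a)^2 + (n/2b)^2."""
    return 1.0 / math.sqrt((m / (2.0 * a)) ** 2 + (n / (2.0 * b)) ** 2)

def phase_and_group_velocity(lam, m, n, a, b, v=1.0):
    """(v_phase, v_group) from (38.21) and (38.22); None if the wave cannot propagate.

    lam is the wavelength 2 pi v / omega in the unbounded medium.
    """
    lam_crit = cutoff_wavelength(m, n, a, b)
    if lam >= lam_crit:
        return None
    root = math.sqrt(1.0 - (lam / lam_crit) ** 2)
    return v / root, v * root

a, b = 2.0, 1.0
lam_tm11 = cutoff_wavelength(1, 1, a, b)          # equals 2ab/(a^2+b^2)^(1/2)
v_phase, v_group = phase_and_group_velocity(1.0, 1, 1, a, b)
print(lam_tm11, v_phase, v_group, v_phase * v_group)
```

The product of the two velocities printed last equals $v^2$ (here 1) to rounding accuracy, while a wavelength above the critical one returns `None`, corresponding to an exponentially damped field.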
Let us now consider TE-waves. For such waves we have 


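$$\frac{\partial^2H_z}{\partial x^2} + \frac{\partial^2H_z}{\partial y^2} = \left(k_z^2 - \frac{\omega^2}{v^2}\right)H_z. \tag{38.23}$$

By means of (38.10) and (38.11) the boundary conditions for the tangential
component of the electric field can be expressed in terms of the derivatives of
$H_z$. Then instead of

$$E_x = 0 \quad\text{for}\quad y = 0, b$$
we have
$$\partial H_z/\partial y = 0 \quad\text{for}\quad y = 0, b. \tag{38.24}$$

Analogously, instead of

$$E_y = 0 \quad\text{for}\quad x = 0, a$$
we have
$$\partial H_z/\partial x = 0 \quad\text{for}\quad x = 0, a. \tag{38.25}$$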


The solution of eq. (38.23) for the boundary conditions (38.24) and (38.25) 
can be written in the form of a TE-wave: 





$$H_z = B\cos(k_xx)\cos(k_yy)\exp[i(\omega t - k_zz)], \tag{38.26}$$
where
$$k_x = \pi m/a, \qquad k_y = \pi n/b.$$


Here m and n are integers. In contrast to the case of TM-waves, each of these 
numbers (but not both of them simultaneously) can take on the value zero. 
Formulae (38.19) — (38.22) remain valid for TE-waves. In them, however, 
one of the numbers $m$ or $n$ can be set equal to zero. The TE$_{10}$-wave is the
TE-wave of lowest order. For the TE$_{10}$-wave the critical wavelength is

$$\lambda^{\rm TE}_{\max} = 2a.$$
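
The ordering of the modes by critical wavelength can be tabulated directly from (38.20), remembering that TE-waves admit one zero index while TM-waves require both indices to be non-zero. The sketch below is an illustration with hypothetical names and with sample dimensions $a = 2$, $b = 1$ (so that $a > b$).

```python
import math

def list_modes(a, b, max_index=2):
    """Return (lambda_crit, type, m, n) for low-order modes, longest wavelength first."""
    modes = []
    for m in range(max_index + 1):
        for n in range(max_index + 1):
            if m == 0 and n == 0:
                continue                      # both indices zero is excluded
            lam_crit = 1.0 / math.sqrt((m / (2.0 * a)) ** 2 + (n / (2.0 * b)) ** 2)
            modes.append((lam_crit, "TE", m, n))
            if m >= 1 and n >= 1:             # TM-waves need both indices non-zero
                modes.append((lam_crit, "TM", m, n))
    modes.sort(key=lambda mode: -mode[0])
    return modes

a, b = 2.0, 1.0
modes = list_modes(a, b)
for lam_crit, kind, m, n in modes[:5]:
    print(kind, m, n, lam_crit)
```

The first entry is the TE$_{10}$-wave with critical wavelength $2a$; the TM$_{11}$-wave appears further down with $2ab/(a^2+b^2)^{1/2}$.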

The most important conclusion which can be drawn from the theory 
described is that the transverse character of plane waves is closely associated 
with the unlimited dimensions of the medium in which they are propagating. 
When waves are propagating in a limited region, purely transverse waves can 
exist only under particular conditions (multiply connected regions). Under 
usual conditions the waves do have a longitudinal component of the electric 
or magnetic field. We recall also, that in an unlimited space, but near the 
emitter, the electromagnetic wave has non-zero longitudinal (radial) field 
components $E_r$ and $H_r$ (see §25 and §26 of Part I).

In the calculations for a wave-guide of the simplest form which we have 
presented above we have not taken into account effects which in practice 
play an important part. Such an effect is, first of all, the power loss asso- 
ciated with the non-ideal character of the walls of the wave-guide and the 
dielectric which fills it. Also we have not considered the theory of wave- 
guides of a more complex form and wave-guides which are inhomogeneous 
over their length. 

Neither have we touched upon numerous problems associated with various 
applications of wave-guides, including their use in linear accelerators for 
charged particles. For all these problems we refer the reader to the com- 
prehensive specific literature *. 


* See, for example, L.A.Vainshtein, Elektromagnitnye volny (Electromagnetic waves), 
Sov. Radio, 1957; S.Ramo, J.R.Whinnery and Th.Van Duzer, Fields and waves in 
communication electronics (Wiley, New York, 1965). 





§39. The passage of fast particles through matter 


The problem of the energy loss and radiation by fast particles moving in 
matter is of great importance in contemporary physics. 

As we shall see below, there are several different mechanisms of energy 
loss by particles moving in matter. 

First of all it should be recalled that, as we have seen in §43 of Part I and 
§26 of Part II, particles which undergo collisions with atoms and are de- 
flected by the nuclear field emit bremsstrahlung. According to eq. (26.15) of 
Part II the bremsstrahlung of ultra-relativistic particles is inversely propor- 
tional to the mass of the particle and plays an important part for light 
particles (electrons). When heavy charged particles (protons, ions) pass 
through matter, other sources of energy loss usually play a greater part. The 
charged particle moving in matter interacts with the atoms of the substance 
and polarizes them. In other words, the moving particle produces a field in 
the substance. This field induces a reaction on the particle itself which 
decelerates it. The energy loss of the particle associated with this deceleration 
is obviously equal to the work done by the decelerating force. This is called 
the polarization energy loss, since the decelerating field is the polarization 
field produced by the moving particle. The medium which is polarized by the 
particle can however emit transverse electromagnetic waves. To do this it is 
necessary that the polarization caused by the particle moving in the medium 
does not follow the particle. The moving particle leaves the medium in a 
polarized state and the medium emits the excess energy in the form of trans- 
verse electromagnetic waves. This source of energy loss is the Cerenkov— 
Vavilov radiation or briefly the Cerenkov radiation. 

We stress that Cerenkov—Vavilov radiation is not associated with any 
acceleration, since it is radiation emitted by the medium and not by the 
moving charge. On the other hand, as will be shown in the following, 
Cerenkov radiation is possible only in the case of motion of a particle with a 
velocity v which is larger than the velocity of propagation of the field in the 
medium c/n. This restriction expresses the above-mentioned condition of the 
lag of the polarization field behind the moving particle. We now go on to the 
simultaneous calculation of the polarization losses and Cerenkov losses of one 
particle moving in a macroscopically homogeneous and isotropic medium. 
Our purpose is to find the field produced in the medium by a single charge 
moving with a velocity v. 

We disregard the decrease in velocity during the motion and consider the 
velocity to be constant in magnitude and direction. We assume the medium to 
be transparent, so that the imaginary part of the dielectric constant is $\varepsilon'' \to 0$.
The magnetic permeability u is assumed to be equal to unity. 







The current density in the medium corresponding to a uniform motion of 
one particle can be written in the form 


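$$\mathbf j = e\mathbf v\,\delta(\mathbf r - \mathbf vt). \tag{39.1}$$

Hence Maxwell's equations will have the form

$$\nabla\times\mathbf E = -\frac{1}{c}\frac{\partial\mathbf B}{\partial t}, \tag{39.2}$$

$$\nabla\times\mathbf H = \frac{1}{c}\frac{\partial\mathbf D}{\partial t} + \frac{4\pi e}{c}\,\mathbf v\,\delta(\mathbf r - \mathbf vt). \tag{39.3}$$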


We shall discuss below the conditions under which one can assume a 
medium to be continuous and disregard its atomic structure describing it by 
macroscopic quantities. 

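We seek the solution of Maxwell's equations by means of expansion in a
Fourier integral. Writing formulae (33.1)-(33.4) for the field vectors and
for the vector $\mathbf j$ the expression

$$\mathbf j(\mathbf r,t) = \int\exp[i(\mathbf k\cdot\mathbf r - \omega t)]\,\mathbf j(\mathbf k,\omega)\,d\mathbf k\,d\omega, \tag{39.4}$$
we write the system (39.2)-(39.3) in the form

$$i[\mathbf k\times\mathbf E(\mathbf k,\omega)] = i\,\frac{\omega}{c}\,\mathbf B(\mathbf k,\omega), \tag{39.5}$$

$$i[\mathbf k\times\mathbf H(\mathbf k,\omega)] = -i\,\frac{\omega}{c}\,\mathbf D(\mathbf k,\omega) + \frac{4\pi}{c}\,\mathbf j(\mathbf k,\omega). \tag{39.6}$$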


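Forming the vector product of (39.5) and $\mathbf k$, thereupon eliminating $\mathbf k\times\mathbf H$
with the aid of (39.6), and making use of the constitutive equation (33.4), we arrive at
the equation

$$\left[k^2\delta_{ij} - k_ik_j - \frac{\omega^2}{c^2}\varepsilon_{ij}\right]E_j = \frac{4\pi i\omega}{c^2}\,j_i, \tag{39.7}$$

which differs from (33.15) only in the presence of the right-hand side. The
quantity $j_i$ in (39.7) denotes the Fourier component of the current density
$\mathbf j(\mathbf k,\omega)$. For the isotropic medium we substitute for $\varepsilon_{ij}$ its value from
formula (33.9). We then find

$$\{k^2\delta_{ij} - k_ik_j\}E_j - \frac{\omega^2}{c^2}\left[\varepsilon_\perp\left(\delta_{ij} - \frac{k_ik_j}{k^2}\right) + \varepsilon_\parallel\frac{k_ik_j}{k^2}\right]E_j = \frac{4\pi i\omega}{c^2}\,j_i$$

or, in the vector form

$$\left(k^2 - \frac{\omega^2\varepsilon_\perp}{c^2}\right)\left(\mathbf E - \frac{\mathbf k(\mathbf k\cdot\mathbf E)}{k^2}\right) - \frac{\omega^2\varepsilon_\parallel}{c^2}\,\frac{\mathbf k(\mathbf k\cdot\mathbf E)}{k^2} = \frac{4\pi i\omega}{c^2}\,\mathbf j. \tag{39.8}$$

Forming the scalar product of (39.8) and the vector $\mathbf k$, we obtain

$$\mathbf k\cdot\mathbf E = -i\,\frac{4\pi}{\omega\varepsilon_\parallel}\,(\mathbf k\cdot\mathbf j). \tag{39.9}$$

Substituting (39.9) into (39.8), we find

$$\left(k^2 - \frac{\omega^2\varepsilon_\perp}{c^2}\right)\mathbf E = \frac{4\pi i\omega}{c^2}\left[\mathbf j - \frac{c^2}{\omega^2\varepsilon_\parallel}\left(k^2 - \frac{\omega^2\varepsilon_\perp}{c^2}\right)\frac{(\mathbf k\cdot\mathbf j)\mathbf k}{k^2} - \frac{\mathbf k(\mathbf k\cdot\mathbf j)}{k^2}\right].$$

Hence

$$\mathbf E(\omega,\mathbf k) = -\frac{4\pi\omega i}{k^2}\left\{\frac{(\mathbf k\cdot\mathbf j(\omega,\mathbf k))\,\mathbf k}{\omega^2\varepsilon_\parallel} + \frac{\mathbf k(\mathbf k\cdot\mathbf j(\omega,\mathbf k)) - k^2\,\mathbf j(\omega,\mathbf k)}{c^2(k^2 - \omega^2c^{-2}\varepsilon_\perp)}\right\}. \tag{39.10}$$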


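Going from the Fourier component to the electric field, we have

$$\mathbf E(\mathbf r,t) = \int\mathbf E(\omega,\mathbf k)\exp[i(\mathbf k\cdot\mathbf r - \omega t)]\,d\mathbf k\,d\omega = \mathbf E_1(\mathbf r,t) + \mathbf E_2(\mathbf r,t). \tag{39.11}$$

The first term $\mathbf E_1(\mathbf r,t)$ has the form

$$\mathbf E_1(\mathbf r,t) = -4\pi i\int\frac{(\mathbf k\cdot\mathbf j)\,\mathbf k}{\omega\varepsilon_\parallel k^2}\exp[i(\mathbf k\cdot\mathbf r - \omega t)]\,d\mathbf k\,d\omega. \tag{39.12}$$

We now substitute the value of $\mathbf j(\omega,\mathbf k)$, making use of formula (III.8'). We
have, obviously,

$$\mathbf j(\omega,\mathbf k) = \frac{e\mathbf v}{(2\pi)^3}\,\delta(\omega - \mathbf k\cdot\mathbf v). \tag{39.13}$$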




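Then $\mathbf E_1$ assumes the form

$$\mathbf E_1 = -\frac{4\pi ie}{(2\pi)^3}\int d\omega\,d\mathbf k\,\frac{\mathbf k(\mathbf k\cdot\mathbf v)}{\omega k^2\varepsilon_\parallel(\omega,\mathbf k)}\,\delta(\omega - \mathbf k\cdot\mathbf v)\exp[i(\mathbf k\cdot\mathbf r - \omega t)] = -\frac{ie}{2\pi^2}\int d\mathbf k\,\frac{\mathbf k}{k^2\varepsilon_\parallel(\mathbf k\cdot\mathbf v,\mathbf k)}\exp[i(\mathbf k\cdot\mathbf r - \mathbf k\cdot\mathbf vt)], \tag{39.14}$$

where $\varepsilon_\parallel(\mathbf k\cdot\mathbf v,\mathbf k)$ is the value of $\varepsilon_\parallel(\omega,\mathbf k)$ for $\omega = \mathbf k\cdot\mathbf v$.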
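The field $\mathbf E_1$ produced by the moving charged particle in the medium
induces a reaction on it. The particle is acted upon by a force

$$\mathbf F = e(\mathbf E_1)_{\mathbf r=\mathbf vt} + \frac{e}{c}\,[\mathbf v\times\mathbf H]_{\mathbf r=\mathbf vt},$$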


where the index shows that the value of the field at each instant of time must 
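be taken at the point $\mathbf r = \mathbf vt$ at which the particle is found. The energy loss of
the particle (we shall relate it to unit path in matter) is equal to the work
done by this force per unit path. Since the magnetic part of the Lorentz
force $ec^{-1}[\mathbf v\times\mathbf H]$ does no work, the work done by the force $\mathbf F$ per unit path
is equal to

$$W_1 = e\left(\frac{\mathbf v}{v}\cdot\mathbf E_1\right)_{\mathbf r=\mathbf vt} = -\Delta\mathcal E_1 = {\rm Re}\left\{-\frac{ie^2}{2\pi^2v}\int d\mathbf k\,\big[\exp i(\mathbf k\cdot\mathbf r - \mathbf k\cdot\mathbf vt)\big]_{\mathbf r=\mathbf vt}\,\frac{\mathbf k\cdot\mathbf v}{k^2\varepsilon_\parallel(\mathbf k\cdot\mathbf v,\mathbf k)}\right\} = {\rm Re}\left\{-\frac{ie^2}{2\pi^2v}\int d\mathbf k\,d\omega\,\frac{\omega\,\delta(\omega - \mathbf k\cdot\mathbf v)}{k^2\varepsilon_\parallel(\omega,\mathbf k)}\right\}. \tag{39.15}$$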


The notation Re is introduced in order to show that the expression for the
work $W_1$ is real. This work has a clear physical meaning. It is determined by
the value of $\varepsilon_\parallel$ and represents the work done on the particle by the longi-
tudinal polarization produced by the particle. Thus, $dW_1/dr$ is a measure of
the energy loss $\Delta\mathcal E_1$ of the particle per unit path. This energy goes into the
production of the longitudinal polarization. Later, in calculating $\Delta\mathcal E_1$, we
shall discuss the question as to what processes actually occur in a medium
when a moving particle produces a longitudinal polarization in it.

We now come back to the second term of (39.11). According to formulae 
(39.11) and (39.8), we have 


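
$$\mathbf E_2(\mathbf r,t) = \frac{ie}{2\pi^2c^2}\int d\mathbf k\,d\omega\,\exp[i(\mathbf k\cdot\mathbf r - \omega t)]\,\frac{\omega\,[k^2\mathbf v - \mathbf k(\mathbf k\cdot\mathbf v)]}{k^2[k^2 - \omega^2c^{-2}\varepsilon_\perp(\omega,\mathbf k)]}\,\delta(\omega - \mathbf k\cdot\mathbf v) = \frac{ie}{2\pi^2c^2}\int d\mathbf k\,\exp[i(\mathbf k\cdot\mathbf r - \mathbf k\cdot\mathbf vt)]\,\frac{(\mathbf k\cdot\mathbf v)\,[k^2\mathbf v - \mathbf k(\mathbf k\cdot\mathbf v)]}{k^2[k^2 - (\mathbf k\cdot\mathbf v)^2c^{-2}\varepsilon_\perp(\mathbf k\cdot\mathbf v,\mathbf k)]}. \tag{39.16}$$

Correspondingly, the work done on the charge due to the field $\mathbf E_2$ and the
energy loss $\Delta\mathcal E_2$ of the particle per unit path are equal to

$$W_2 = -\Delta\mathcal E_2 = {\rm Re}\left\{\frac{ie^2}{2\pi^2vc^2}\int d\mathbf k\,\frac{(\mathbf k\cdot\mathbf v)\,[v^2 - (\mathbf k\cdot\mathbf v)^2k^{-2}]}{k^2 - (\mathbf k\cdot\mathbf v)^2c^{-2}\varepsilon_\perp(\mathbf k\cdot\mathbf v,\mathbf k)}\right\}_{\mathbf r=\mathbf vt} = {\rm Re}\left\{\frac{ie^2}{2\pi^2vc^2}\int d\mathbf k\,d\omega\,\frac{\omega\,(v^2 - \omega^2k^{-2})\,\delta(\omega - \mathbf k\cdot\mathbf v)}{k^2 - \omega^2c^{-2}\varepsilon_\perp(\omega,\mathbf k)}\right\}. \tag{39.17}$$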


The energy loss $\Delta\mathcal E_2$ determines the Cerenkov loss of the particle. This is
associated with the excitation of a transverse electromagnetic field in the 
medium, i.e. with the radiation of the polarized medium. 

Taking account of spatial dispersion allows us to separate clearly the two 
types of energy loss: polarization loss and Cerenkov loss. 

In what follows we shall restrict ourselves to the calculation of losses in a 
medium without spatial dispersion *. For this, according to (33.11), we have 
to set in the formulae obtained 


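$$\varepsilon_\parallel(\omega,\mathbf k) = \varepsilon_\perp(\omega,\mathbf k) = \varepsilon(\omega). \tag{39.18}$$

We pass on to the calculation of the integrals in formulae (39.15) and (39.17).
We have, taking into account (39.18),

$$\Delta\mathcal E_1 = {\rm Re}\left\{\frac{ie^2}{2\pi^2v}\int d\mathbf k\,d\omega\,\frac{\omega\,\delta(\omega - \mathbf k\cdot\mathbf v)}{k^2\varepsilon(\omega)}\right\}. \tag{39.19}$$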





* For losses taking into account spatial dispersion see, for example, V.P.Silin and 
A.A.Rukhadze, Elektromagnitnye svoistva plazmy i plazmopodobnykh sred (Electro-
magnetic properties of plasma and plasma-like media), Moscow, 1961; V.D.Shafranov,
Elektromagnitnye volny v plazme (Electromagnetic waves in plasma), collected papers
Voprosy fiziki plazmy (Problems of plasma physics), Vol. II, Moscow, 1963.


V.L.Ginzburg, Propagation of electromagnetic waves in plasma (North-Holland Publ. 
Co., Amsterdam, 1961). 




We have already pointed out above that all our considerations will refer to 
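transparent media, more precisely to media with a negligibly small absorption.
This means that in the expression for the dielectric constant $\varepsilon(\omega) =
\varepsilon'(\omega) + i\varepsilon''(\omega)$ we have to go to the limit $\varepsilon''(\omega) \to 0$. This allows us to greatly
simplify all subsequent calculations. That is, we can write

$${\rm Re}\,\frac{i}{\varepsilon(\omega)} = {\rm Re}\,\frac{i}{\varepsilon' + i\varepsilon''} = {\rm Re}\,\frac{i(\varepsilon' - i\varepsilon'')}{(\varepsilon')^2 + (\varepsilon'')^2} = \frac{\varepsilon''}{(\varepsilon')^2 + (\varepsilon'')^2}.$$

In the limit $\varepsilon'' \to 0$, we can make use of formula (III.4'), Vol. 1, and write

$$\lim_{\varepsilon''\to0}{\rm Re}\,\frac{i}{\varepsilon(\omega)} = \lim_{\varepsilon''\to0}\left[\frac{\varepsilon''}{(\varepsilon')^2 + (\varepsilon'')^2}\right] = \pi\delta[\varepsilon(\omega)]. \tag{39.20}$$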
Then the formula for polarization loss assumes the form

$$\Delta\mathcal E_1 = \frac{e^2}{2\pi v}\int\frac{(\mathbf k\cdot\mathbf v)}{k^2}\,\delta[\varepsilon(\mathbf k\cdot\mathbf v)]\,d\mathbf k, \tag{39.21}$$

where we have made use of the fact that only the frequency $\omega = \mathbf k\cdot\mathbf v$ gives a
contribution to the loss. If the explicit form of $\varepsilon(\omega)$ is known, then the
integral in (39.21) can easily be calculated. In §46 such a calculation will be
carried out for the case of a plasma.

We note that formula (39.21) does not contain the mass of the particle. 
This is quite natural, since the energy losses are associated with the polariza- 
tion of the medium. Polarization losses are therefore the main source of 
energy loss for heavy particles moving in matter for a wide energy range. 
Light particles, for example electrons of relatively high energies, lose most 
of their energy due to bremsstrahlung. 

We now go on to the calculation of the Cerenkov losses, which can be 
carried out by the same method. We have 


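$$\Delta\mathcal E_2 = -{\rm Re}\left\{\frac{ie^2}{2\pi^2vc^2}\int d\mathbf k\,\frac{(\mathbf k\cdot\mathbf v)\,[v^2 - (\mathbf k\cdot\mathbf v)^2k^{-2}]}{k^2 - (\mathbf k\cdot\mathbf v)^2c^{-2}\varepsilon(\mathbf k\cdot\mathbf v)}\right\}. \tag{39.22}$$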


We choose the direction of motion of the particles as the z-axis and introduce 
new variables 


$$\omega = \mathbf k\cdot\mathbf v = k_zv, \qquad q^2 = k_x^2 + k_y^2, \qquad k^2 = q^2 + \frac{\omega^2}{v^2}.$$
Then

$$d\mathbf k = dk_x\,dk_y\,dk_z = q\,dq\,d\varphi\,\frac{d\omega}{v}.$$
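In the new variables we have

$$\Delta\mathcal E_2 = -{\rm Re}\left\{\frac{ie^2}{2\pi^2v^2c^2}\int_0^{2\pi}d\varphi\int\omega\,d\omega\int q\,dq\,\frac{q^2v^2}{(q^2 + \omega^2v^{-2})\,\{q^2 + \omega^2[v^{-2} - \varepsilon(\omega)c^{-2}]\}}\right\} = -{\rm Re}\left\{\frac{ie^2}{\pi c^2}\int\omega\,d\omega\int\frac{q^3\,dq}{q^2 + \omega^2v^{-2}}\,\frac{1}{q^2 + \omega^2v^{-2} - \omega^2c^{-2}\varepsilon(\omega)}\right\}.$$

For a non-absorbing medium we can write, analogously to (39.20),

$$\lim_{\varepsilon''\to0}{\rm Re}\left[\frac{i}{q^2 + \omega^2v^{-2} - \omega^2c^{-2}\varepsilon(\omega)}\right] = -\pi\delta\left[q^2 + \omega^2\left(\frac{1}{v^2} - \frac{\varepsilon(\omega)}{c^2}\right)\right].$$

Hence

$$\Delta\mathcal E_2 = \frac{e^2}{c^2}\int\omega\,d\omega\int\frac{q^3\,dq}{q^2 + \omega^2v^{-2}}\,\delta\left[q^2 + \omega^2\left(\frac{1}{v^2} - \frac{\varepsilon(\omega)}{c^2}\right)\right].$$

Introducing the new variable $u = q^2$, we have

$$\int_0^\infty\frac{q^3\,dq}{q^2 + \omega^2v^{-2}}\,\delta\left[q^2 + \omega^2\left(\frac{1}{v^2} - \frac{\varepsilon(\omega)}{c^2}\right)\right] = \frac{1}{2}\int_0^\infty\frac{u\,du}{u + \omega^2v^{-2}}\,\delta\left[u - \left(\frac{\varepsilon(\omega)}{c^2} - \frac{1}{v^2}\right)\omega^2\right] = \begin{cases}\dfrac{1}{2}\left(1 - \dfrac{c^2}{v^2\varepsilon(\omega)}\right), & v > c/n,\\[2mm] 0, & v < c/n,\end{cases}$$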








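
where $n = \varepsilon^{1/2}(\omega)$.
Hence we arrive finally at the following expression for the energy loss per
unit path due to the Cerenkov-Vavilov transverse electromagnetic radiation:

$$\Delta\mathcal E_2 = \frac{e^2}{2c^2}\int_{-\infty}^{\infty}\left(1 - \frac{c^2}{v^2\varepsilon(\omega)}\right)|\omega|\,d\omega = \frac{e^2}{c^2}\int_0^\infty\left(1 - \frac{c^2}{v^2\varepsilon(\omega)}\right)\omega\,d\omega, \tag{39.23}$$

since $\varepsilon(\omega)$ is an even function of its argument (see (32.9)); the integration extends only over those frequencies for which $v > c/n(\omega)$.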

Thus we obtain the following basic properties of the Cerenkov—Vavilov 
radiation: 
(1) it arises only for particles moving with a velocity which is larger than
$v_{\text{cutoff}} = c/n$ (cut-off of radiation),
(2) it depends on the charge of the particles, but not on their mass, 
(3) the radiation lies in the visible spectrum and in part of the ultraviolet 
spectrum. For shorter waves n < 1 (see §34) and the radiation is no longer 
possible, 
(4) the energy emitted per unit path in unit frequency interval has a charac- 
teristic spectral distribution (39.23), 
(5) the radiation arising at a given point of the trajectory propagates over the
surface of a cone with the vertex at this point and the axis coinciding with
the direction of motion of the particle. The angle $\theta$ of the cone is determined
by the condition

$$ \cos\theta = \frac{c}{nv}. \tag{39.24} $$


In conclusion we stress that the energy loss due to Cerenkov—Vavilov 
radiation is very small and amounts to only about 0.1% of the energy loss of 
very fast particles in matter. The rest is determined by bremsstrahlung and a 
number of other phenomena. The importance of Cerenkov radiation lies in 
the fact that new detectors of fast particles are based on the use of it. These 
detectors, which are called Cerenkov counters, have now become one of the 
chief working instruments in the field of high-energy particle physics. By
observing the Cerenkov radiation one can not only record the passage of
particles but also, according to (39.24), determine directly the magnitude
and direction of their velocity, according to (39.23) determine their charge,
and separate particles with the same momentum but a different mass (making
use of the existence of the cut-off $v_{\text{cutoff}} = p_{\text{cutoff}}/m$), and so on.
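
The cut-off condition $v > c/n$ and the cone condition (39.24) are easy to evaluate numerically. The following is a minimal sketch, not part of the original text; the function names and the sample refractive index (n = 1.33, roughly that of water) are assumptions made for illustration only.

```python
import math

C_LIGHT = 2.998e10  # speed of light, cm/s


def cherenkov_cutoff_velocity(n, c=C_LIGHT):
    """Cut-off velocity c/n below which no Cerenkov radiation is emitted."""
    return c / n


def cherenkov_angle(beta, n):
    """Cone angle theta (radians) from cos(theta) = c/(n v) = 1/(n beta).

    Returns None when the particle is below the cut-off velocity.
    """
    cos_theta = 1.0 / (n * beta)
    if cos_theta > 1.0:
        return None
    return math.acos(cos_theta)


n_sample = 1.33  # assumed refractive index, for illustration only
for beta in (0.70, 0.80, 0.99):
    theta = cherenkov_angle(beta, n_sample)
    if theta is None:
        print(f"beta = {beta:.2f}: below cut-off, no radiation")
    else:
        print(f"beta = {beta:.2f}: cone angle = {math.degrees(theta):.1f} degrees")
```

Because the cut-off is a condition on the velocity alone, two particles of equal momentum but different mass can fall on opposite sides of it, which is the basis of the mass separation mentioned above.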

In conclusion we shall discuss the problem of the limits of applicability of 
the macroscopic treatment. Fast particles actually interact with individual
atoms or, more precisely, with atomic electrons. For the macroscopic de-
scription of the process to be valid it is necessary that during the time of 
flight of the particle past the atom the intra-atomic electrons do not undergo 
an appreciable displacement. The particle in flight is moving in the quasi- 
stationary field of many atoms. 

Thus the velocity of the particle must be large compared with the charac- 
teristic velocity of atomic electrons. 

An interesting source of energy loss due to radiation by a uniformly
moving particle is the so-called transition radiation. It arises when a particle
passes from one medium to another. In order to simplify the formulae we
assume that a particle which is moving uniformly in vacuum with a velocity
$v \approx c$ falls on the boundary of a medium with a dielectric constant $\epsilon(\omega)$.
We assume the boundary of the medium to be the plane $z = 0$, and the
direction of incidence of the particle to be normal to the boundary. We choose the
instant of impact of the particle on the boundary to be the time $t = 0$, and
find the vector potential of the field of the particle according to formula
(32.10) of Part I.

Obviously, for the Fourier component $\mathbf{A}_\omega$ at a large distance from the
charge we have


we zed Jo exp [i(k-r(t)—oort)] dt = 


ev 


0 co 
= = Jep li(k-vez—w)] dr+ f exp [it(k-v—w)] ar] = 
0 





2ncr 
-co 


co 


0 1 
-A ge ball ve? cos@ s v cos | 
ape | fexp| iof SoS" 1) art f exp ioe( >S 1} |de} . 


—co 


0 


Making use of the definition of the $\delta_+$ and $\delta_-$ functions (see Appendix III, Vol. 1),
we find





5 5 
ev f = + i (39.25) 


cole T ma T 2-1 
2ncwr i |} _ye-lez cosg 1 —ve* cos 


The intensity emitted according to formula (32.14) of Part I is equal to

$$ dI_\omega = \frac{e^2v^2}{4\pi^2c^3}\left[\frac{1}{1-vc^{-1}\epsilon^{1/2}\cos\theta}-\frac{1}{1-vc^{-1}\cos\theta}\right]^2\sin^2\theta\,d\Omega. \tag{39.26} $$


The meaning of this result is quite simple. In motion in a medium the role of
the relative velocity $v/c$ is played by the quantity $v\epsilon^{1/2}/c$, which determines the
optical path in the medium. When the particle passes from one medium to
another (or to vacuum) the quantity $v\epsilon^{1/2}/c$ changes discontinuously. This
change is equivalent to an abrupt change in the velocity, i.e. to an accelera-
tion of the particle. Formulae (39.25) and (39.26) show that the radiation is
directed mainly forward ($\theta \approx 0$) and that it contains very high frequencies
(high harmonics). We shall not dwell on the calculation of the total intensity,
which can be accomplished by integrating over all angles. We note only that
the total intensity turns out to be proportional to the energy of the particle.
The analogous calculation for $v \ll c$ turns out to be more complicated, since
in this case it is necessary to take into account the phenomena of reflection
and refraction of electromagnetic waves at the interface.





Matter in the Plasma State 


§40. The general characteristics of plasma 


The problem of the passage of current through gases and the behaviour of 
gases conducting current in electric and magnetic fields is of great interest in 
present day physics. 

As is well known, gases are normally non-conductors. If, however, a suf- 
ficiently large ionization is produced in the gas, i.e. if a sufficiently large 
number of free electrons and ions (positive and negative) are produced, the 
gas will become conducting. The process of the passage of current through 
the gas is called gaseous discharge. Depending on the mechanism causing the 
ionization of the gas the gaseous discharge is called non-self-maintained or 
self-maintained. 

In the first case the basic ionization is produced by external sources (for 
example by y-radiation or by a high temperature maintained by an external 
source). 

In the self-maintained discharge the initial ionization is caused by electrons
emerging from a cold cathode (the Townsend discharge and glow discharge).

The density of the current which can flow through the gas depends, in the
first place, on the number of ions produced per cm³ of the gas. In particular,
if the entire gas is completely ionized, current densities can be very large.




The self-maintained discharge displays a great diversity of properties. How- 
ever, it possesses one outstanding feature: the space in which the gas dis- 
charge takes place can be divided into three regions — the cathode region, the 
anode region and the plasma region. 

The properties of the cathode region and anode region depend on the 
mechanism of the discharge. In the cathode region the ionization of the atoms 
of the gas by electrons emerging from the cathode occurs. The greatest charge 
is concentrated in the cathode and anode regions and the major fall of the 
potential difference applied to the electrodes also takes place there. The size 
of these regions is, as a rule, not large, and they occupy only a small part of 
the space between the electrodes. The largest part of the interelectrode space 
is occupied by an ionized gas which is on the average electrically neutral. This 
region of the discharge is called a plasma. In a plasma the number of positive 
ions is on the average equal to the number of electrons and negative ions per 
unit volume. Besides ions and electrons, a plasma can also contain a larger 
or smaller number of neutral atoms or molecules. 

The properties of a plasma in which we shall be interested in what follows 
do not depend on the detailed properties of the discharge nor on its character. 
The behaviour of a plasma plays an important part in the phenomena of 
gaseous discharge, a process which finds a wide variety of applications in 
modern engineering. A special interest in high-temperature plasmas has 
recently arisen in association with studies on controlled thermonuclear reac- 
tions as well as in connection with a number of astrophysical problems. 

As is well known, in order to obtain thermonuclear reactions it is
necessary to reach such high temperatures (higher than $10^8$ degrees) that the
energy of thermal motion of nuclear particles is sufficiently large to overcome 
the energy barriers which hinder the penetration of nuclei into each other. 
At such temperatures atoms are completely ionized and matter consists of a 
stripped plasma. Temperatures which are necessary for thermonuclear reac- 
tions to take place exist in the interiors of stars. 

Under laboratory conditions it has not up to now been possible to prepare 
a plasma of the necessary temperature. However, intense investigations on 
high-temperature plasmas, which have already yielded a number of important 
results, are being carried out. 

In astrophysical conditions matter is found in the plasma state not only in 
the interior of stars but also in stellar atmospheres and in clouds of inter- 
stellar matter. 


§41. Equilibrium plasma 


We naturally begin the study of the properties of a plasma with considera- 
tion of the theory of an equilibrium plasma. 

We assume for simplicity that the plasma contains charges of only two
kinds: positive ions with charge $p_1$ and electrons. In order to give a general
character to the relations to be obtained we shall also call the latter ions and
shall assign them the charge $p_2 = -1$. Then the condition of electrical neutrality
for the plasma can be written in the form

$$ \bar n_1 p_1 + \bar n_2 p_2 = 0, \tag{41.1} $$

where $\bar n_1$ and $\bar n_2$ are the mean numbers of ions and electrons respectively per
unit volume.

In considering the equilibrium properties of a plasma we shall confine our- 
selves to the approximation of an ideal gas. In this approximation the 
Coulomb interaction between charged particles can be assumed to be small in 
comparison with the thermal energy: 


$$ |p_1p_2|\,e^2/\bar r \ll kT, \tag{41.2} $$

where $\bar r$ is the mean distance between the ions. The latter is connected with
the number of ions per unit volume (the concentration of the plasma) $N$ by
the relation

$$ \bar r \sim N^{-1/3}, \tag{41.3} $$

so that the condition of ideality of a gaseous plasma can be written in the
form

$$ N \ll \left(\frac{kT}{|p_1p_2|\,e^2}\right)^3. \tag{41.4} $$

The concentration of the plasma $N$ is connected with the numbers $\bar n_1$ and $\bar n_2$
by the relation

$$ N = \bar n_1 + \bar n_2. \tag{41.5} $$
If the inequality (41.4) is satisfied, then the plasma, in the zeroth order 


approximation, can be considered as an ordinary gas characterized by a tem- 
perature T. 




The particles of the plasma will have a Maxwell velocity distribution and a
uniform distribution in space. The Coulomb interaction between charged
particles gives rise to a certain mean electric field in the volume of the plasma
characterized by a potential $\varphi$. In our approximation the change of the
properties of the gas caused by this field can be assumed to be small.

In order to find the value of $\varphi$ the following reasoning can be applied. We
mentally single out a certain arbitrary ion in the plasma located at a point O
which is chosen as the origin, and find the total mean potential of the electric
field $\varphi$ in the neighbourhood of the point O. The potential $\varphi$ is produced by
all ions (including the ion located at the point O). The averaging is carried out
over all times of observation, during which the ions assume all possible
positions in the plasma.

We consider a volume element $dV$ located at a distance $r$ from the origin
O. Let the potential of the electric field in this volume be equal to $\varphi(r)$. In
view of the isotropy of the field, the potential $\varphi$ depends only on the
absolute value of $r$ but not on the direction of the radius vector. For a low
concentration of plasma one can apply the Boltzmann distribution law to
the ions in the field, writing for the number of particles in the volume $dV$
the expressions

$$ n_1\,dV = A\exp[-p_1e\varphi/kT]\,dV, \tag{41.6} $$
$$ n_2\,dV = B\exp[-p_2e\varphi/kT]\,dV, \tag{41.7} $$

where $n_1$ and $n_2$ are the numbers of positive and negative ions in unit volume.

The constants $A$ and $B$ can be found in the following way. At an arbitrarily
high temperature $T \to \infty$ the field produced by the ions in the plasma cannot
affect their spatial distribution, because their potential energy will be
negligibly small. Hence as $T \to \infty$ both distributions must go over into a
uniform distribution of the particles in space, i.e.

$$ n_1\,dV = \bar n_1\,dV, \tag{41.8} $$
$$ n_2\,dV = \bar n_2\,dV, \tag{41.9} $$

where $\bar n_1$ and $\bar n_2$ are the mean numbers of positive and negative ions per unit
volume.


Comparing (41.8) and (41.9) with (41.6) and (41.7), we find

$$ n_1\,dV = \bar n_1\exp[-p_1e\varphi/kT]\,dV, \tag{41.10} $$
$$ n_2\,dV = \bar n_2\exp[-p_2e\varphi/kT]\,dV. \tag{41.11} $$

According to formulae (41.10) and (41.11), in the volume $dV$ near the point
O, where the ion which we have singled out is located, there is a charge

$$ de = \left(\bar n_1p_1e\exp[-p_1e\varphi/kT] + \bar n_2p_2e\exp[-p_2e\varphi/kT]\right)dV. \tag{41.12} $$


This charge is determined by the fact that the probability of finding an ion in 
dV of the same sign as the ion at point O is somewhat lowered, whereas that 
of finding an ion of the opposite sign is somewhat increased, in comparison 
with the probability where we do not take into account the inter-ion inter- 
action. In this sense it is said that a non-uniformly charged ion cloud arises 
around the ion located at point O. 

It goes without saying that in practice there is no cloud around each of 
the ions, because we have singled out the ion at point O only for convenience 
of the argument, and all the ions in the plasma are equivalent. There is only a
certain probabilistic correlation between the positions of any pair of ions in space.
This can be expressed in other words: it can be said that each ion produces 
around it an ion cloud and, at the same time, is a part of ion clouds of all 
the other ions in the plasma. 


By means of (41.12) one can obtain the mean charge density at the point $r$:

$$ \rho(r) = de/dV = e\left(\bar n_1p_1\exp[-p_1e\varphi/kT] + \bar n_2p_2\exp[-p_2e\varphi/kT]\right). \tag{41.13} $$

Since by our assumption the energy of the inter-ion interaction is small in
comparison with $kT$, one can expand the exponential expressions
$\exp[-p_1e\varphi/kT]$ and $\exp[-p_2e\varphi/kT]$ in a series, writing

$$ \rho(r) \approx -\frac{e^2(p_1^2\bar n_1+p_2^2\bar n_2)}{kT}\,\varphi(r). \tag{41.14} $$
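
The expansion leading to (41.14) can be written out step by step. The following is a sketch to first order in $e\varphi/kT$; the symbols $\bar n_1$, $\bar n_2$ for the mean ion numbers and $\varphi$ for the mean potential are notation assumed here for the quantities defined in (41.1) and (41.13).

```latex
\begin{align*}
\rho(r) &= e\left(\bar n_1 p_1\, e^{-p_1 e\varphi/kT} + \bar n_2 p_2\, e^{-p_2 e\varphi/kT}\right) \\
        &\approx e\left[\bar n_1 p_1\left(1 - \frac{p_1 e\varphi}{kT}\right)
                      + \bar n_2 p_2\left(1 - \frac{p_2 e\varphi}{kT}\right)\right] \\
        &= e\left(\bar n_1 p_1 + \bar n_2 p_2\right)
           - \frac{e^2\left(p_1^2\bar n_1 + p_2^2\bar n_2\right)}{kT}\,\varphi(r) \\
        &= -\frac{e^2\left(p_1^2\bar n_1 + p_2^2\bar n_2\right)}{kT}\,\varphi(r).
\end{align*}
```

The zeroth-order term drops out because of the neutrality condition (41.1), so the mean charge density is proportional to the mean potential itself.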


The mean potential of the field $\varphi$ at a given point of the plasma is
connected with the mean charge density $\rho$ at this point by the equation of
electrostatics

$$ \nabla^2\varphi = -4\pi\rho/\epsilon. \tag{41.15} $$

Hence for $\varphi$ we find the equation

$$ \nabla^2\varphi = \kappa^2\varphi, \tag{41.16} $$

where $\kappa^2$ denotes the essentially positive quantity

$$ \kappa^2 = \frac{4\pi e^2\left(p_1^2\bar n_1+p_2^2\bar n_2\right)}{\epsilon kT}. \tag{41.17} $$

Eq. (41.16) or eq. (41.15), in which $\rho$ is defined by (41.13), is called the
Poisson-Boltzmann equation and represents the basis of the theory of an
equilibrium plasma.

The solution of eq. (41.16) satisfying the requirement of isotropy in space
can easily be obtained in polar coordinates. In polar coordinates, taking into
account that $\varphi$ does not depend on the polar angles $\theta$ and $\psi$, eq. (41.16) has
the form

$$ \frac{1}{r}\frac{d^2(r\varphi)}{dr^2} = \kappa^2\varphi. $$

Introducing a new unknown function $f = r\varphi$ we obtain

$$ \frac{d^2f}{dr^2} = \kappa^2 f. $$

The solution of the last equation has the form

$$ f = C_1e^{-\kappa r} + C_2e^{\kappa r}, $$

whence it follows that

$$ \varphi = C_1r^{-1}e^{-\kappa r} + C_2r^{-1}e^{\kappa r}. \tag{41.18} $$

The constant $C_2 = 0$, because the exponentially increasing solution leading to
an infinitely large potential for $r \to \infty$ must be discarded. Hence

$$ \varphi = C_1r^{-1}e^{-\kappa r}. \tag{41.19} $$


The constant $C_1$ can be found from the requirement that near the charge
which is singled out the potential of the field should coincide with the
Coulomb field of the charge. Hence it follows that, as $r \to 0$,

$$ \varphi = C_1/r \to p_1e/\epsilon r, $$

so that

$$ C_1 = p_1e/\epsilon $$

and, finally,

$$ \varphi = \frac{p_1e}{\epsilon r}\,e^{-\kappa r}. \tag{41.20} $$

Formula (41.20) shows that the potential of the field near the ion de-
creases, basically, according to an exponential law. At a distance $r \gg \kappa^{-1}$ from
the ion the potential turns out to be small. The quantity $\kappa^{-1}$ characterizing
the rate of decrease of the potential is called the Debye length.

In order to elucidate the meaning of the solution obtained, we resolve
the potential into the Coulomb potential of the ion which is singled out and
the potential $\varphi'$ of the field produced by all the other ions:

$$ \varphi = \frac{p_1e}{\epsilon r} + \varphi'. $$

From (41.20) we find

$$ \varphi' = \frac{p_1e}{\epsilon r}\left(e^{-\kappa r}-1\right). \tag{41.21} $$

Let us find the charge density corresponding to the potential $\varphi'$. By virtue of
(41.15) we have

$$ \rho' = -\frac{\epsilon}{4\pi}\nabla^2\varphi' = -\frac{p_1e\kappa^2}{4\pi}\,\frac{e^{-\kappa r}}{r}. $$

This formula shows that near the ion with charge $p_1e$ an ion cloud having
the opposite sign of charge is formed. The charge density in the cloud de-
creases exponentially with the distance from the central ion. The total charge
of the cloud is equal to

$$ \int_0^\infty \rho'\,dV = -p_1e. $$


The meaning of this result is quite clear: around a given ion there are
grouped with a larger probability ions of the opposite sign. The total charge
of the ion cloud surrounding any given ion is exactly equal in magnitude, and
opposite in sign, to the charge of the given ion. The presence around an ion
of a cloud of ions of the opposite
sign leads to a weakening or, as is usually said, a screening of the field of the
ion. Hence the potential of the screened field near the ion decreases more
rapidly than the Coulomb potential. The quantity $\kappa^{-1}$ represents the mean
radius of the ion cloud surrounding a given ion.
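
The statement about the total charge of the cloud can be checked numerically. The sketch below assumes the cloud density $\rho'(r) = -(p_1e\kappa^2/4\pi)\,e^{-\kappa r}/r$ that follows from (41.21) and (41.15), and integrates it over all space with a simple midpoint rule. The function name, the dimensionless units ($e = 1$) and the cut-off radius are illustrative choices, not part of the original text.

```python
import math


def cloud_charge(p1=1.0, e=1.0, kappa=1.0, r_max_in_debye=40.0, steps=200_000):
    """Integrate rho'(r) * 4*pi*r^2 dr from 0 to r_max with the midpoint rule.

    rho'(r) = -(p1 * e * kappa^2 / (4*pi)) * exp(-kappa*r) / r is the assumed
    charge density of the screening cloud around an ion of charge p1*e.
    """
    r_max = r_max_in_debye / kappa
    h = r_max / steps
    total = 0.0
    for i in range(steps):
        r = (i + 0.5) * h  # midpoint of the i-th radial shell
        rho = -p1 * e * kappa**2 / (4.0 * math.pi) * math.exp(-kappa * r) / r
        total += rho * 4.0 * math.pi * r * r * h
    return total


print(cloud_charge())                    # close to -1, i.e. -p1*e
print(cloud_charge(p1=2.0, kappa=2.5))   # close to -2, independent of kappa
```

The result does not depend on $\kappa$: a larger $\kappa$ only makes the cloud more compact, while its total charge still compensates the charge of the central ion.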

Introducing the value of the Debye length, $\kappa^2 \sim e^2N/kT$, into the condition
of the applicability of the theory (41.4), we can rewrite it in the form

$$ N\kappa^{-3} \gg 1, $$


i.e. in the form of the requirement that the mean number of ions confined in 
the volume of a sphere with a Debye radius must be large enough compared 
with unity. 
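
To get a feeling for the orders of magnitude, the Debye length and the quantity $N\kappa^{-3}$ can be evaluated directly. The sketch below assumes expression (41.17) for $\kappa^2$ in Gaussian units; the sample plasma (singly charged ions and electrons at $10^{14}$ cm⁻³ each, $T = 10^6$ K) and all function names are assumptions chosen for illustration.

```python
import math

E_CHARGE = 4.803e-10  # elementary charge, esu (Gaussian units)
K_BOLTZ = 1.381e-16   # Boltzmann constant, erg/K


def kappa_squared(species, T, eps=1.0):
    """kappa^2 = 4*pi*e^2 * sum(p_i^2 n_i) / (eps*k*T).

    species is a list of (p_i, n_i): charge number and density in cm^-3.
    """
    s = sum(p * p * n for p, n in species)
    return 4.0 * math.pi * E_CHARGE**2 * s / (eps * K_BOLTZ * T)


def debye_length(species, T, eps=1.0):
    """Debye length 1/kappa in cm."""
    return 1.0 / math.sqrt(kappa_squared(species, T, eps))


def ions_in_debye_volume(species, T, eps=1.0):
    """N * kappa^-3, which must be large compared with unity."""
    N = sum(n for _, n in species)
    return N * debye_length(species, T, eps) ** 3


hydrogen = [(+1, 1.0e14), (-1, 1.0e14)]  # assumed sample plasma
T_sample = 1.0e6                          # assumed temperature, K
print("Debye length (cm):", debye_length(hydrogen, T_sample))
print("N * kappa^-3     :", ions_in_debye_volume(hydrogen, T_sample))
```

Since $\kappa^{-1} \propto (T/N)^{1/2}$, raising the temperature or lowering the density both improve the ideality condition.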

The phenomenon of screening is of very great importance in the behaviour
of a plasma. It is clear that any charge introduced into the plasma is screened
at a distance of $\kappa^{-1}$.

Let, for example, the plasma be confined in a container with solid walls. If
there is a surface charge on the walls then the field produced by it will be
screened and will penetrate into the plasma only up to a depth $\kappa^{-1}$. The
distance $\kappa^{-1}$ is thus the thickness of the shielding layer which is formed at the
boundary of an equilibrium plasma and which insulates it from external
influences.

The same result can be obtained much more quickly by means of the
method of correlation functions. That is, we make use of the smallness of the
concentration of the plasma to close eq. (48.4) of Part III at the binary
function. The latter contains the ternary function $\rho_{12j}$. For small concentra-
tions one can write approximately

$$ \rho_{12}(r) = 1 + \psi_{12}(r), \tag{41.22} $$

where $|\psi_{12}(r)| \ll 1$. Formula (41.22) means that the interaction of particles in
the plasma leads to a weak correlation. Further, if the probability of triple
collisions between particles in the plasma is disregarded, then the ternary
function $\rho_{12j}$ can be written in the form of the product

$$ \rho_{12j} = \rho_{12}\rho_{2j}\rho_{1j} \approx 1 + \psi_{12} + \psi_{2j} + \psi_{1j}. \tag{41.23} $$


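Substituting this into (48.4) of Part III, we find

$$ \frac{\partial\psi_{12}}{\partial\mathbf{r}_1} = -\frac{1}{kT}(1+\psi_{12})\frac{\partial U_{12}}{\partial\mathbf{r}_1} - \frac{1}{kT}\sum_j n_j\int\frac{\partial U_{1j}}{\partial\mathbf{r}_1}\left(1+\psi_{12}+\psi_{2j}+\psi_{1j}\right)d\tau_j. \tag{41.24} $$

We recall that the summation with respect to the index $j$ is carried out over
all (in our case two) kinds of particles.

We drop, as small, the term $\psi_{12}(\partial U_{12}/\partial\mathbf{r}_1)$. It is obvious that two inte-
grals on the right-hand side reduce to zero:

$$ \int\frac{\partial U_{1j}(|\mathbf{r}_1-\mathbf{r}_j|)}{\partial\mathbf{r}_1}\,d\tau_j = 0, $$

$$ \int\frac{\partial U_{1j}(|\mathbf{r}_1-\mathbf{r}_j|)}{\partial\mathbf{r}_1}\,\psi_{1j}(|\mathbf{r}_1-\mathbf{r}_j|)\,d\tau_j = 0. $$

Indeed, they involve integration with respect to the angles of the vector
$\partial U/\partial\mathbf{r}_1$, where $U$ is an isotropic function of the corresponding variables.
Hence, finally

$$ \frac{\partial\psi_{12}}{\partial\mathbf{r}_1} = -\frac{1}{kT}\frac{\partial U_{12}(|\mathbf{r}_1-\mathbf{r}_2|)}{\partial\mathbf{r}_1} - \frac{1}{kT}\sum_j n_j\int\psi_{2j}(|\mathbf{r}_2-\mathbf{r}_j|)\,\frac{\partial U_{1j}(|\mathbf{r}_1-\mathbf{r}_j|)}{\partial\mathbf{r}_1}\,d\tau_j. \tag{41.25} $$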





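We take the divergence of eq. (41.25) with respect to the coordinates $\mathbf{r}_1$ and
take into account that the interaction is the Coulomb interaction, so that

$$ \nabla_1^2\,U_{12}(|\mathbf{r}_1-\mathbf{r}_2|) = -4\pi p_1p_2e^2\,\delta(|\mathbf{r}_1-\mathbf{r}_2|). $$

Then we find, with $r = |\mathbf{r}_1-\mathbf{r}_2|$,

$$ \nabla^2\psi_{12} = \frac{4\pi p_1p_2e^2}{kT}\,\delta(r) + \frac{4\pi p_1e^2}{kT}\sum_j n_jp_j\psi_{2j}. $$

Setting

$$ \psi_{12} = p_1p_2\,\chi(r), $$

we find finally

$$ \nabla^2\chi - \kappa^2\chi = \frac{4\pi e^2}{kT}\,\delta(r), \tag{41.26} $$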




which is the same as the equation for the mean potential (41.16). The term
with the $\delta$-function allows one to take the boundary condition (41.19) into
account automatically.

It is easily seen that the correlation function satisfying (41.26) has the
form

$$ \psi_{12} = p_1p_2\left(-\frac{e^2}{kT}\,r^{-1}e^{-\kappa r}\right). \tag{41.27} $$


We shall now find the thermodynamic characteristics of an equilibrium
plasma. The presence of the Coulomb interaction between the ions and elec-
trons is responsible for the additional energy possessed by the plasma as
compared to a neutral gas at the same pressure. This energy is obviously
equal to $E' = \tfrac12 V\sum_i ep_in_i\varphi_i'$, where $n_i$ is the mean number of particles of the
$i$th kind per unit volume, $V$ is the total volume of the plasma, and $\varphi_i'$ is the
mean potential produced by all ions at the locus of the $i$th ion.

The mean value of the potential of the electric field at distances $r \ll \kappa^{-1}$ (for
such distances the formulae derived above have a quantitative meaning) can
be written in the form

$$ \varphi_i' = -ep_i\kappa. $$

Hence

$$ E' = -\tfrac12 Ve^2\kappa\sum_i n_ip_i^2 = -Ve^3\left(\frac{\pi}{kT}\right)^{1/2}\left(\sum_i n_ip_i^2\right)^{3/2}. $$

Consequently, the total energy of the plasma is equal to

$$ E = \tfrac32\sum_i n_ikTV - Ve^3\left(\frac{\pi}{kT}\right)^{1/2}\left(\sum_i n_ip_i^2\right)^{3/2}. $$

Making use of the Gibbs-Helmholtz formula (30.11) of Part III, we find the
free energy of the plasma

$$ F = F_{\text{id}} - \tfrac23 Ve^3\left(\frac{\pi}{kT}\right)^{1/2}\left(\sum_i n_ip_i^2\right)^{3/2}, $$

where $F_{\text{id}}$ denotes the free energy of the ideal gas. The pressure of the plasma is equal to

$$ p = \sum_i n_ikT - \frac{e^3}{3}\left(\frac{\pi}{kT}\right)^{1/2}\left(\sum_i n_ip_i^2\right)^{3/2}. $$




The pressure of the plasma turns out to be lower than that of an ideal gas of
the same density. This result has the following simple meaning: the attraction
between charges of opposite signs, which are distributed closer to each other,
predominates over the repulsion between charges of the same sign.
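
The size of the Coulomb correction to the pressure can be estimated numerically. The sketch below assumes the Debye-type expression $p = \sum_i n_ikT - \tfrac13 e^3(\pi/kT)^{1/2}(\sum_i n_ip_i^2)^{3/2}$ in Gaussian units with $\epsilon = 1$; the function names and the sample plasma parameters are illustrative assumptions.

```python
import math

E_CHARGE = 4.803e-10  # elementary charge, esu
K_BOLTZ = 1.381e-16   # Boltzmann constant, erg/K


def ideal_pressure(species, T):
    """Ideal-gas pressure sum(n_i) * k * T in dyn/cm^2."""
    return sum(n for _, n in species) * K_BOLTZ * T


def coulomb_pressure_correction(species, T):
    """Negative correction -(e^3/3) * sqrt(pi/kT) * (sum n_i p_i^2)^(3/2)."""
    s = sum(p * p * n for p, n in species)
    return -(E_CHARGE**3 / 3.0) * math.sqrt(math.pi / (K_BOLTZ * T)) * s**1.5


def plasma_pressure(species, T):
    """Total pressure of the weakly non-ideal plasma."""
    return ideal_pressure(species, T) + coulomb_pressure_correction(species, T)


sample = [(+1, 1.0e14), (-1, 1.0e14)]  # assumed densities, cm^-3
T_sample = 1.0e6                        # assumed temperature, K
p_id = ideal_pressure(sample, T_sample)
dp = coulomb_pressure_correction(sample, T_sample)
print("ideal pressure     :", p_id)
print("Coulomb correction :", dp)
print("relative correction:", dp / p_id)
```

The correction can equivalently be written as $-kT\kappa^3/24\pi$, so it is small precisely when the number of particles within a Debye length is large.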

In conclusion, we stress that although a plasma is a macroscopically homo-
geneous medium, on the scale $r < \kappa^{-1}$ it is inhomogeneous. This fact is of very
great importance for electromagnetic processes in a plasma.

Up to now we have considered the plasma to be a completely equilibrium 
plasma. However, very often one has to study a plasma which is in incomplete 
equilibrium *. Namely, since the mass of the heavy ions is very large in com- 
parison with the mass of the electrons, the energy exchange between them in 
elastic collisions proceeds very slowly. On the contrary, the energy exchange 
between electrons or between ions proceeds much more rapidly. 

If at a certain initial instant of time the plasma was in a non-equilibrium
state, then after the lapse of the relaxation time $\tau$ an equilibrium (Maxwell)
distribution will be established for electrons and ions separately. Each set of
particles can be characterized by the proper temperature $T_e$ and $T_i$ respec-
tively. However, for the equalization of the temperatures and the establish-
ment of a common temperature $T$ of the plasma corresponding to equilibrium
between the electrons and the ions, a relaxation time $\tau_1 \gg \tau$ is necessary.

The presence of incomplete equilibrium in the plasma, which is charac- 
terized by two temperatures, does not have a very strong effect on the 
property of screening which we have described above. 


§42. Plasma in static electric and magnetic fields 


If a plasma is placed in an external static electric field E, then an electric 
current will arise in it. 

In the absence of an external electric field the Maxwell velocity distribu- 
tion is established for the ions and electrons in the plasma. When an external 
electric field is applied, the electrons and ions begin to move preferentially in 
different directions. A current arises in the plasma in the direction of the 
applied electric field. The density of this current is

$$ \mathbf{j} = \sigma\mathbf{E}. \tag{42.1} $$

* For more details on incomplete equilibrium see, for example, B.G.Levich, Vvedenie
v statisticheskuyu fiziku (Introduction to statistical physics) (Gostekhizdat, Moscow,
1954).


Up to now we have not tried to elucidate the meaning of the electric
conductivity $\sigma$, assuming it to be a macroscopic characteristic of the medium.
Here, however, it is necessary, if only on the basis of a very rough model, to
estimate the value of $\sigma$. We proceed from the assumption that ions and elec-
trons form an ideal gas. Let the mean velocities, masses and mean free paths of
the ions and electrons be denoted by $v_i$, $m_i$, $\lambda_i$ respectively. In the presence
of an external electric field ions and electrons in the plasma undergo an
acceleration between collisions in contrast to a neutral gas.

The mean velocity acquired by a particle under the action of the field $\mathbf{E}$ is
equal, in order of magnitude, to

$$ u_i \sim \frac{e_iE}{m_i}\,\tau_i, $$

where the mean time of flight between two successive collisions $\tau_i \sim \lambda_i/v_i$.
The systematic motion with velocity $u_i$ leads to a transport of charge in the
direction of the field. The current density can be written in the form

$$ \mathbf{j} = \sum_i n_ie_i\mathbf{u}_i = \left(\sum_i\frac{n_ie_i^2\tau_i}{m_i}\right)\mathbf{E}. $$


Thus, to within a numerical factor,

$$ \sigma \sim \sum_i\frac{n_ie_i^2\tau_i}{m_i}. $$

The values of $\tau_i$ will be found in Part VI of this book.

We assume that the value of $\tau$ is calculated or known from measurements.
In weak fields and at relatively high pressures the electrical conductivity of a 
system consisting of electrons and randomly distributed ions has a constant 
value and does not depend on the applied field. The equilibrium velocity dis- 
tribution of the electrons is disturbed only to a small degree by the external 
field. 

For strong fields and low pressures the situation is different. Under the 
action of the applied field the electrons are accelerated (since the mean free 
path in a rarefied gas is sufficiently large) and acquire an energy which is 
considerably larger than the energy of thermal motion. 

On the other hand, in collisions with ions the electrons lose energy. A
calculation shows that a velocity distribution of the electrons can be estab-
lished, for which the increase in the energy of the electrons as they are
accelerated in the field is compensated for by energy losses in collisions.
Such a distribution is not an equilibrium distribution. Nevertheless, since
it does not change in time (i.e. is stationary), one can speak of an effective
temperature of the electrons which is defined by their mean energy. It turns
out to be of the order of magnitude of

$$ T_{el} \sim \frac{1}{k}\left(\frac{M_{ion}}{m_{el}}\right)^{1/2}\lambda_{el}\,eE \tag{42.3} $$

(where $M_{ion}$ and $m_{el}$ are the masses of the ion and electron) and very large in
comparison with the temperature of the ions and neutral molecules. The
electrical conductivity is

$$ \sigma \sim E^{-1/2}. \tag{42.4} $$
Let us now consider a plasma in constant electric and magnetic fields. As 
we shall see below, the magnetic field has a very important effect on the 
behaviour of the plasma. 
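The reasoning of §23 can be applied to a system of electrons moving with
a velocity $\mathbf{v}$. Namely, when electrons are moving with respect to the mag-
netic field an electric field of strength

$$ \mathbf{E}_{ind} = \frac{\mu}{c}(\mathbf{v}\times\mathbf{H}) \tag{42.5} $$

is induced, where $\mu$ is the magnetic permeability of the medium (the plasma).
In the presence of an electric field $\mathbf{E}$ a current

$$ \mathbf{j} = \sigma(\mathbf{E}+\mathbf{E}_{ind}) = \sigma\left(\mathbf{E}+\frac{\mu}{c}(\mathbf{v}\times\mathbf{H})\right) = \sigma\mathbf{E}+\frac{\sigma\mu}{c}(\mathbf{v}\times\mathbf{H}) \tag{42.6} $$

arises in the system of electrons. Expressing $\sigma$ in terms of the quantities
involved in it and noting that $\mathbf{j} = n_{el}e\mathbf{v}$, we can rewrite this last formula in
another form:

$$ \mathbf{j} = \sigma\mathbf{E}+\alpha(\mathbf{j}\times\mathbf{H}), \tag{42.7} $$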


where

$$ \alpha = \frac{e\mu\tau_{el}}{m_{el}c} = \frac{\omega_H\mu\tau_{el}}{H} \tag{42.8} $$

and $\omega_H = eH/m_{el}c$ is the cyclotron frequency for the electrons.

It should be noted that formula (42.7) also remains valid when one takes
into account the velocity distribution of the electrons since this has an effect
only on the numerical coefficient $\alpha$.

Let us consider two cases:

(1) the electric field parallel to the magnetic field, $\mathbf{E} \parallel \mathbf{H}$,
(2) the electric field perpendicular to the magnetic field, $\mathbf{E} \perp \mathbf{H}$.

In the first case the magnetic field does not directly affect the current
density $\mathbf{j} = \sigma\mathbf{E}$, so that $\mathbf{j}\times\mathbf{H} = 0$.

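The second case is of great interest. To find the vector $\mathbf{j}$ from (42.7) we
first form the scalar product of (42.7) and $\mathbf{H}$ and then the vector product of
(42.7) and $\mathbf{H}$. We then have

$$ \mathbf{j}\cdot\mathbf{H} = \sigma(\mathbf{E}\cdot\mathbf{H}) + \alpha[(\mathbf{j}\times\mathbf{H})\cdot\mathbf{H}] = \alpha[(\mathbf{j}\times\mathbf{H})\cdot\mathbf{H}], $$

or, since $(\mathbf{j}\times\mathbf{H})\cdot\mathbf{H} = \mathbf{j}\cdot(\mathbf{H}\times\mathbf{H}) = 0$, we obtain

$$ \mathbf{j}\cdot\mathbf{H} = 0. \tag{42.9} $$

Further,

$$ \mathbf{j}\times\mathbf{H} = \sigma(\mathbf{E}\times\mathbf{H}) + \alpha[(\mathbf{j}\times\mathbf{H})\times\mathbf{H}] = \sigma(\mathbf{E}\times\mathbf{H}) + \alpha\mathbf{H}(\mathbf{j}\cdot\mathbf{H}) - \alpha\mathbf{j}H^2 = \sigma(\mathbf{E}\times\mathbf{H}) - \alpha\mathbf{j}H^2. \tag{42.10} $$

Whence, substituting (42.10) into (42.7), we find

$$ \mathbf{j} = \sigma\mathbf{E} + \alpha\sigma(\mathbf{E}\times\mathbf{H}) - \alpha^2H^2\mathbf{j} $$

or

$$ \mathbf{j} = \frac{\sigma}{1+\alpha^2H^2}\,\mathbf{E} + \frac{\sigma\alpha}{1+\alpha^2H^2}\,(\mathbf{E}\times\mathbf{H}). \tag{42.11} $$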


Formula (42.11) shows that in the presence of the magnetic field the 
current is no longer parallel to the electric field. Ohm’s law in its ordinary 
form does not hold. 
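
The closed form of the current for $\mathbf{E} \perp \mathbf{H}$ can be checked by solving the vector relation $\mathbf{j} = \sigma\mathbf{E} + \alpha(\mathbf{j}\times\mathbf{H})$ directly as a 3×3 linear system. The following is a minimal sketch in arbitrary units; the function names and the sample values of $\sigma$, $\alpha$, $\mathbf{E}$ and $\mathbf{H}$ are assumptions for illustration.

```python
def cross(a, b):
    """Vector product a x b of two 3-component tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def det3(m):
    """Determinant of a 3x3 matrix given as a list of rows."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def solve_current(sigma, alpha, E, H):
    """Solve j - alpha*(j x H) = sigma*E for j by Cramer's rule."""
    Hx, Hy, Hz = H
    A = [[1.0, -alpha * Hz, alpha * Hy],
         [alpha * Hz, 1.0, -alpha * Hx],
         [-alpha * Hy, alpha * Hx, 1.0]]
    b = [sigma * c for c in E]
    d = det3(A)  # equals 1 + alpha^2 H^2, never zero
    j = []
    for col in range(3):
        Ac = [row[:] for row in A]
        for r in range(3):
            Ac[r][col] = b[r]
        j.append(det3(Ac) / d)
    return tuple(j)


def closed_form(sigma, alpha, E, H):
    """j = sigma/(1+alpha^2 H^2) * [E + alpha*(E x H)], valid for E perpendicular to H."""
    H2 = sum(h * h for h in H)
    ExH = cross(E, H)
    f = sigma / (1.0 + alpha * alpha * H2)
    return tuple(f * (E[i] + alpha * ExH[i]) for i in range(3))


E = (1.0, 0.0, 0.0)   # field along x (assumed)
H = (0.0, 0.0, 2.0)   # field along z, perpendicular to E (assumed)
print(solve_current(3.0, 0.5, E, H))
print(closed_form(3.0, 0.5, E, H))
```

With these sample values $\alpha H = 1$, so the component of the current along $\mathbf{E}$ and the component perpendicular to both fields come out equal in magnitude.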


From formula (42.11) it follows that in the direction of the electric field
the current density is

$$ j_\parallel = \frac{\sigma}{1+\alpha^2H^2}\,E = \sigma_\parallel E. \tag{42.12} $$

The electrical conductivity $\sigma_\parallel$ of the plasma in the case of $\mathbf{H} \perp \mathbf{E}$ turns out to
be smaller by a factor of $1+\alpha^2H^2$ than the electrical conductivity in the
absence of the magnetic field.

Besides the current in the direction of the electric field there arises a
current $j_\perp$ in the direction perpendicular to $\mathbf{E}$ and $\mathbf{H}$; this current is equal
in absolute value to

$$ j_\perp = \frac{\sigma\alpha H}{1+\alpha^2H^2}\,E = \sigma_\perp E. \tag{42.13} $$

The current $j_\perp$ is called the Hall current, and $\sigma_\perp = \alpha H\sigma_\parallel$ is called the Hall
conductivity. For $\alpha H = \mu\tau_{el}\omega_H \gg 1$ the Hall current can substantially exceed
the ordinary current. The condition $\tau_{el}\omega_H \gg 1$ can be written in the obvious
form

$$ \tau_{el}\omega_H = \frac{\lambda}{v}\,\frac{eH}{mc} = \frac{\lambda}{R} \gg 1, $$

where $R = mcv/eH$ is the radius of the circle described by an electron in the
magnetic field $H$. The condition $\tau_{el}\omega_H \gg 1$ is fulfilled in a strong magnetic
field and a rarefied plasma.

The effect of the magnetic field on the current in the plasma and, in
particular, the appearance of the current component $j_\perp$ has a simple meaning.
It is associated with the character of motion of a charge in crossed electric
and magnetic fields as described in §39 of Part I. In such fields particles
undergo a drift whose velocity and direction are determined by formula
(39.15) of Part I.

The motion of charges in the direction perpendicular to $\mathbf{E}$ and $\mathbf{H}$ leads to
the appearance of a current with the density $j_\perp$. When the reverse inequality
$\omega_H\tau_{el} \ll 1$ is fulfilled the effect of the magnetic field on the conductivity
becomes small.

We shall not dwell on the differences in the above expressions associated
with the motion of positive ions. Since the mass of the ions is large, the in-
equality $\omega_H\tau_{ion} \ll 1$ is usually fulfilled for them and the magnetic field has no
significant effect on the ion part of the electrical conductivity.





§43. Magnetic isolation and the pinch effect 


Electric and magnetic forces acting on a plasma give rise to important 
mechanical effects. Conversely, a spatial displacement of the plasma has a 
significant effect on its behaviour in electromagnetic fields. 

In order to consider the mutual effect of the field and the mechanical 
motion of the plasma we shall write the general equations which describe the 
properties of the plasma. 

The electromagnetic field in the plasma is described by Maxwell’s equa-
tions. The equations of motion of the plasma are the equations of hydro-
dynamics *. Disregarding the viscosity of the plasma, one can write the
equations of motion in the form

$$ \delta_0\frac{d\mathbf{v}}{dt} = \mathbf{F} - \nabla p, $$

where $\mathbf{F}$ is the ponderomotive force acting on unit volume of the plasma, $\delta_0$
is the density and $p$ is the pressure of the gas.
If a current $\mathbf{j}$ is flowing in the plasma, then in the magnetic field

$$ \mathbf{F} = \frac{\mu}{c}(\mathbf{j}\times\mathbf{H}). \tag{43.1} $$

Hence

$$ \delta_0\frac{d\mathbf{v}}{dt} = -\nabla p + \frac{\mu}{c}(\mathbf{j}\times\mathbf{H}). \tag{43.2} $$

In particular, for a motionless plasma ($\mathbf{v} = 0$) one can write the equation of
hydrostatics:

$$ \nabla p = \frac{\mu}{c}(\mathbf{j}\times\mathbf{H}). \tag{43.3} $$

We shall apply the equation of hydrostatics to the consideration of the 
important phenomenon of the magnetic isolation of a plasma. 


* See, for example, L.G.Loytsyanskii, Mekhanika zhidkostei i gazov (Mechanics of 
liquids and gases) (Gostekhizdat, Moscow, 1950); L.D.Landau and E.M.Lifshitz, 
Mechanics of continuous media (Pergamon Press, Oxford, 1960). 




Let us consider the plasma in a magnetic field H perpendicular to the 
current vector j. Then from eq. (43.3) it is clear that $\nabla p \neq 0$ and the pressure 
of the plasma varies from point to point. Eq. (43.3) can be integrated if the 
current density is eliminated from it by means of Maxwell’s equations. We 
have from (43.3) and (14.3) 


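$$\nabla p = \frac{\mu}{4\pi}\,[(\nabla\times\mathbf{H})\times\mathbf{H}] . \tag{43.4}$$

By means of the vector equality (1.48) we have

$$(\nabla\times\mathbf{H})\times\mathbf{H} = (\mathbf{H}\cdot\nabla)\mathbf{H} - \nabla\left(\tfrac{1}{2}H^2\right) .$$

Hence

$$\nabla p = -\frac{\mu}{4\pi}\,\nabla\left(\tfrac{1}{2}H^2\right) + \frac{\mu}{4\pi}\,(\mathbf{H}\cdot\nabla)\mathbf{H} . \tag{43.5}$$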


Let us choose the direction of the field as the x-axis and consider the 
particular case where the magnetic field strength does not vary in the 
longitudinal direction (i.e. $\partial H/\partial x = 0$) but can vary according to an arbitrary
law as a function of the coordinates y, z. In other words, we consider 
a field with strength H = H(y, z) which varies in space. In this case we have, 
obviously, 





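$$(\mathbf{H}\cdot\nabla)\mathbf{H} = H\,\frac{\partial\mathbf{H}(y,z)}{\partial x} = 0$$

and formula (43.5) gives

$$p + \frac{\mu H^2}{8\pi} = \text{const} . \tag{43.6}$$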


The quantity $\mu H^2/8\pi$ represents the magnetic pressure, i.e. the force acting on 
a unit area of an imaginary plane in the gas. 

Formula (43.6) shows that the total pressure of the plasma, made up of 
the magnetic pressure $\mu H^2/8\pi$ and the gas pressure p, must remain constant 
in space. 
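
As a rough numerical illustration of (43.6), the following Python sketch evaluates the magnetic pressure in Gaussian units and the gas pressure that it can balance. The function names, the field of 10^4 oersted and the assumption that the gas pressure vanishes in the outer region are illustrative choices and are not taken from the text.

```python
import math


def magnetic_pressure(H, mu=1.0):
    """Magnetic pressure mu*H^2/(8*pi) in dyn/cm^2 (H in oersted, Gaussian units)."""
    return mu * H**2 / (8.0 * math.pi)


def confined_gas_pressure(H_outside, H_inside, mu=1.0):
    """Gas pressure p inside the plasma from p + mu*H^2/(8*pi) = const, eq. (43.6).

    Assumption (not from the text): p = 0 in the outer region, so the constant
    equals the magnetic pressure of the outer field.
    """
    return magnetic_pressure(H_outside, mu) - magnetic_pressure(H_inside, mu)


if __name__ == "__main__":
    # Hypothetical numbers: 10 kOe outside, field completely expelled inside.
    p = confined_gas_pressure(1.0e4, 0.0)
    print(f"confined gas pressure = {p:.3e} dyn/cm^2")
```

Since the pressure grows as the square of the field, doubling H quadruples the gas pressure that can be held without material walls.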

As an example, let the plasma which is in a magnetic field not be bounded 
by impenetrable walls. Then the relation (43.6) shows that the total pressure 
cannot drop to zero at any point in the plasma. In the region of space which 
is not occupied by the plasma the gas pressure vanishes, so that the value of H 
is larger there than in the inner region occupied by the plasma. This means 
that the plasma cannot expand into a vacuum. The magnetic field isolates the 
plasma, replacing the impenetrable wall. 

Another important hydrostatic effect is the pinch effect or the pheno- 
menon of a plasma column. This phenomenon consists in a compression of 
the plasma by the magnetic field of the plasma current itself. 

Let the plasma be represented by a cylinder of radius R (we direct the 
axis of the cylinder along the z-direction) along which a current of density j 
is flowing. The magnetic field of the current produces a magnetic pressure 
which must be balanced by the pressure of the plasma. The pressure of the 
plasma is most simply found by assuming the current density j to be constant 
over its cross-section (i.e. assuming that $j = j_0$ for $r < R$, and $j = 0$ for $r > R$). 
An analogous result can be obtained in the general case by means of formula 
(43.5). Then in cylindrical coordinates (43.3) assumes the form 


$$\frac{dp}{d\rho} = -\frac{j H_\varphi}{c} ,$$

where the magnetic field $H_\varphi$ is expressed by formula (17.11). 
Integration gives 

$$p = p_0 - \pi j^2\rho^2/c^2 \qquad (\rho < R) , \tag{43.7}$$
$$p = 0 \qquad (\rho > R) . \tag{43.8}$$


Here $p_0 = n_0 kT$ is the pressure and $n_0$ is the density of the gas at the centre 
of the plasma cylinder. 

Formulae (43.7) and (43.8) show that the gas pressure and correspond- 
ingly the density of the gas at the centre are higher than at the periphery of 
the cylinder. The magnetic field of the plasma current compresses and 
maintains the plasma cylinder. The radius of the plasma cylinder has a con- 
stant value, and the release of Joule heat leads to its heating. 
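
The pressure distribution (43.7), (43.8) can be sketched numerically. The snippet below makes one assumption beyond the text: the gas pressure is taken to fall continuously to zero at the boundary of the column, which fixes the central pressure as $p_0 = \pi j^2 R^2/c^2$. Currents are in statampere (Gaussian units); the 100 kA discharge and the 1 cm radius are hypothetical values.

```python
import math

C_LIGHT = 2.99792458e10          # speed of light, cm/s
STATAMP_PER_AMP = 2.99792458e9   # one ampere expressed in statampere


def pinch_pressure(rho, j0, R, c=C_LIGHT):
    """Gas pressure at radius rho from (43.7)-(43.8), assuming p(R) = 0."""
    if rho >= R:
        return 0.0
    return math.pi * j0**2 * (R**2 - rho**2) / c**2


def central_pressure(total_current, R, c=C_LIGHT):
    """Pressure on the axis for a total current I = pi*R^2*j0 (statampere)."""
    return total_current**2 / (math.pi * R**2 * c**2)


if __name__ == "__main__":
    current = 1.0e5 * STATAMP_PER_AMP   # hypothetical 100 kA discharge
    radius = 1.0                        # hypothetical column radius, cm
    j0 = current / (math.pi * radius**2)
    print(f"p on the axis   : {central_pressure(current, radius):.3e} dyn/cm^2")
    print(f"p at half radius: {pinch_pressure(0.5 * radius, j0, radius):.3e} dyn/cm^2")
```

For a fixed total current the central pressure varies as 1/R^2, which is why constriction of the column raises the pressure so sharply.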

The phenomenon of the self-constriction of a plasma cylinder, which is 
called the “pinch effect”, leads to the detachment of the plasma from the 
walls of the container in which the gaseous discharge takes place and to the 
formation of a more or less slender plasma column. A simple spark or 
lightning discharge are examples of a plasma column. The formation and 
constriction of a plasma column is, naturally, of particularly great importance 
for large current densities. 




We have confined ourselves here to finding the pressure distribution only 
on the assumption that the behaviour of the plasma has a stationary charac- 
ter. In practice, the non-stationary motion of the plasma, which leads to 
oscillations of the plasma cylinder as a whole, and which can lead to the loss 
of stability and rupture, is of importance in bringing about the pinch effect. 

The study of the whole picture of non-stationary phenomena arising in the 
pinch effect is very complex *. 


§44. The magnetic field in a moving plasma 


In a number of important problems hydrodynamical effects associated 
with the macroscopic motion of the plasma play an essential role in the 
behaviour of the plasma. For the study of such effects it is necessary to 
formulate a system of equations for the electromagnetic field in a moving 
medium. 

On the basis of the results of § 23 one can write 


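$$\nabla\times\mathbf{E} = -\frac{1}{c}\frac{\partial\mathbf{B}}{\partial t} + \nabla\times\left(c^{-1}\,\mathbf{v}_0\times\mathbf{B}\right) . \tag{44.1}$$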


Moreover, if the fields are varying sufficiently slowly in time and if the dis- 
placement current can be disregarded, the magnetic field distribution is de- 
fined by eqs. (22.5). By means of Ohm’s law we write (44.1) in the form 


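$$\frac{\partial\mathbf{B}}{\partial t} = -c\,(\nabla\times\mathbf{E}) + \nabla\times(\mathbf{v}_0\times\mathbf{B}) = -c\,(\nabla\times\sigma^{-1}\mathbf{j}) + \nabla\times(\mathbf{v}_0\times\mathbf{B}) . \tag{44.2}$$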


Eliminating the current density from (44.2) and (22.5), we have 


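$$\frac{\partial\mathbf{B}}{\partial t} = -\frac{c^2}{4\pi\sigma\mu}\,\nabla\times(\nabla\times\mathbf{B}) + \nabla\times(\mathbf{v}_0\times\mathbf{B}) .$$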


* For problems concerning the plasma state of matter see L.Spitzer, Physics of fully 
ionized gases (Interscience Publ., New York, 1956 and 1962); T.Cowling, Magnetohydro- 
dynamics (Interscience Publ., New York, 1957); H.Alfvén, Cosmical electrodynamics 
(Clarendon Press, Oxford, 1953); collection of papers, Upravlyaemye termoyadernye 
reaktsii (Controlled thermonuclear reactions), (Atomizdat, Moscow, 1960); L.A.Artsi- 
movich, Upravlyaemye termoyadernye reaktsii (Controlled thermonuclear reactions), 
(Fizmatgiz, Moscow, 1961). 





According to (1.50) and taking into account (22.5), we find finally 


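$$\frac{\partial\mathbf{B}}{\partial t} = \frac{c^2}{4\pi\sigma\mu}\,\nabla^2\mathbf{B} + \nabla\times(\mathbf{v}_0\times\mathbf{B}) . \tag{44.3}$$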


Comparing (44.3) with (30.1) we see that, in contrast to the case of a 
motionless medium, eq. (44.3) contains the term $\nabla\times(\mathbf{v}_0\times\mathbf{B})$. In the absence 
of this term eq. (44.3) expresses the attenuation of the magnetic field in the 
conducting medium over the depth of the skin layer. 

We shall be more interested here in the case where the first term on the 
right-hand side of (44.3) can be neglected in comparison with the second 
term. For this it is necessary that the velocity of motion $v_0$ and the conductivity
$\sigma$ of the plasma be sufficiently large *. 
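
A rough way to see when the first term of (44.3) is negligible is to estimate both terms with a single length scale L. The sketch below forms their ratio, $4\pi\sigma\mu v_0 L/c^2$, a dimensionless number commonly called the magnetic Reynolds number. The single-scale estimate, the function names and all numerical values are assumptions made for illustration; the text itself refers to Spitzer and Cowling for the precise criterion.

```python
import math

C_LIGHT = 2.99792458e10  # speed of light, cm/s


def magnetic_reynolds(sigma, v0, length, mu=1.0, c=C_LIGHT):
    """Ratio of curl(v0 x B) to (c^2 / 4*pi*sigma*mu) * laplacian(B) in eq. (44.3).

    Both terms are estimated with one length scale `length` (cm);
    sigma is the conductivity in Gaussian units (1/s), v0 is in cm/s.
    """
    return 4.0 * math.pi * sigma * mu * v0 * length / c**2


def field_decay_time(sigma, length, mu=1.0, c=C_LIGHT):
    """Order-of-magnitude decay time (s) of the field in a medium at rest."""
    return 4.0 * math.pi * sigma * mu * length**2 / c**2


if __name__ == "__main__":
    # Hypothetical laboratory plasma: sigma = 1e14 1/s, v0 = 1e3 cm/s, L = 10 cm.
    print(f"laboratory scale: {magnetic_reynolds(1.0e14, 1.0e3, 10.0):.3g}")
    # Hypothetical cosmic-scale motion: same sigma, v0 = 1e5 cm/s, L = 1e10 cm.
    print(f"cosmic scale    : {magnetic_reynolds(1.0e14, 1.0e5, 1.0e10):.3g}")
```

The ratio grows linearly with the size of the system, so at equal conductivity and velocity a large-scale flow satisfies the condition far more easily than a small one.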

This situation is relatively difficult (although possible) to attain in labo- 
ratory conditions. However, it is just this case which is dealt with in studying 
phenomena occurring on a cosmic scale. Dropping the small term in eq. 
(44.3), one can rewrite it in the form 


$$\frac{\partial\mathbf{B}}{\partial t} = \nabla\times(\mathbf{v}_0\times\mathbf{B}) . \tag{44.4}$$


Relation (44.4) has an important meaning. According to (22.6), eq. (44.4) 
means that the flux of magnetic induction through a closed contour, every 
point of which is moving along with the fluid, is constant in time. One can 
picture this condition in an obvious way by means of the lines of force of the 
magnetic field. The equality (44.4) means that the field lines are moving 
together with the bulk of the plasma, as if frozen into it. 


Let us consider a closed “fluid contour”, i.e. a closed contour connecting 
particles of the fluid, each of which is moving along its own hydrodynamical 
flow line. From (44.4) it follows that the number of lines of the magnetic 
field which cross the fluid contour remains constant. The fluid particles 
move as if sliding along the field lines and do not intersect them in the trans- 
verse direction. 

* A more precise formulation can be found in the monographs of Spitzer or Cowling 
which we have already mentioned. 

Let us now consider the motion of the plasma perpendicular to the magnetic 
field. One can picture the peculiarities of such a motion most simply by 
considering a simple case. At the initial instant of time let the plasma be at 
rest and then set in motion with the velocity profile shown by arrows in 
fig. IV.19. The lines of magnetic field strength in the plasma at rest are shown 
in fig. IV.19a by dashed lines. The motion of the plasma “carrying along” the 
lines gives them the form which is shown in fig. IV.19b by dashed lines. 

Fig. IV.19 

In the plasma at rest the magnetic field has the strength $H^{(0)}$. In the 
moving plasma, besides the field component $H_x$, the component $H_y \neq 0$ arises. 
It is easily shown that as the lines of the magnetic field are deformed its 
strength increases. 

Let the equation of a field line deformed by the motion be y(x). We 
assume that the bending of the field line is small. Then we can write 

$$\frac{dy}{dx} = \frac{H_y}{H_x} ,$$

whence 

$$H_y \approx H^{(0)}\,\frac{dy}{dx} .$$


Let us find the change in the energy of the magnetic field under such a 
deformation of the field lines. In order to make the formulae to be obtained 
clearer we shall relate this energy to one line of the magnetic field. For this 
we note that, by definition, H lines of the magnetic field pass through unit 
area perpendicular to the field. If 

$$U = \frac{\mu}{8\pi}\iint (H^{(0)})^2\,dx\,dS = \frac{\mu}{8\pi}\,(H^{(0)})^2\,S\int dx$$

is the total initial energy of the field, then the energy per unit area is 
$U/S = (\mu/8\pi)(H^{(0)})^2\int dx$ and, correspondingly, the energy per line is equal to 

$$w_0 = \frac{\mu}{8\pi}\,H^{(0)}\int dx .$$

After the deformation this quantity can be written in the form 

$$w = \frac{\mu}{8\pi H^{(0)}}\int (H_x^2 + H_y^2)\,dx \approx \frac{\mu H^{(0)}}{8\pi}\int\left[1 + \left(\frac{dy}{dx}\right)^2\right]dx .$$

The increase in the energy of the magnetic field per line is equal to 

$$\Delta w = w - w_0 = \frac{\mu H^{(0)}}{8\pi}\int\left(\frac{dy}{dx}\right)^2 dx . \tag{44.5}$$


This increase in the energy is due to the work done by the moving fluid 
against the elastic force of resistance of the field line. 

It is interesting to compare the expression obtained with the potential 
energy of a deformed elastic string. If the tension of the string is denoted by 
a, then the latter quantity can be written in the form 


$$\Delta U = a\int\left\{\left[1 + (dy/dx)^2\right]^{1/2} - 1\right\}dx ,$$

where $[1 + (dy/dx)^2]^{1/2} - 1$ is the geometrical elongation of the string under 
deformation. Assuming the deflection to be small, we have * 

$$\Delta U \approx \tfrac{1}{2}\,a\int (dy/dx)^2\,dx . \tag{44.6}$$


The comparison of (44.6) with (44.5) shows that the line of magnetic 
field strength in the plasma behaves as a string with effective tension 

$$a_{\text{eff}} = \frac{\mu H}{4\pi} . \tag{44.7}$$

* See, for example, A.N.Tikhonov and A.A.Samarskii, Partial differential equations 
of mathematical physics (Holden Day, San Francisco, 1964). 


If the deformation of the lines of the field cannot be considered as small, 
the expression derived above for the increase in the energy of the field turns 
out to be inadequate. However, the general result remains valid: the deforma- 
tion and elongation of the lines of magnetic field strength which are carried 
along by the moving fluid correspond to an increase in the strength of the 
field. 

Thus, the motion of the conducting liquid can, in principle, lead to the 
generation and intensification of the magnetic field. On the other hand, if the 
liquid is placed in a sufficiently strong magnetic field, then this field can 
hinder the motion of the fluid, which behaves as if it were attached to the 
magnetic field. The magnetic field also hampers the transition from laminar motion 
of a conducting fluid to turbulent motion. 

It turns out that these results can be checked directly in laboratory 
experiments. 

Certain important consequences of the properties of the “frozen-in” magnetic 
field described above will be considered in the next section. 


§45. Magnetohydrodynamic waves 


The analogy between the properties of the elastic string and the lines of 
the magnetic field naturally suggests that the magnetic field can perform 
oscillations about a certain equilibrium configuration. 

Let us consider a plasma fluid in a magnetic field of strength $H_0$. We 
assume that the field $H_0$ is uniform and constant in time. Let an infinitesimally 
small perturbation in the form of a field with velocity v arise in the 
fluid. We assume that the conductivity of the plasma is infinitely large, so 
that the motion of the plasma fluid completely carries along the lines of 
magnetic field strength. 

Then in the velocity field the strength of the magnetic field can be written 
in the form 


$$\mathbf{H} = \mathbf{H}_0 + \mathbf{h} , \tag{45.1}$$

where h = h(r, t) is an infinitesimally small perturbation. Substituting (45.1) 
into eq. (44.4) and disregarding the product of the infinitesimally small 
quantities, we have 










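$$\frac{\partial\mathbf{h}}{\partial t} = \nabla\times(\mathbf{v}\times\mathbf{H}_0) = (\mathbf{H}_0\cdot\nabla)\,\mathbf{v} . \tag{45.2}$$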


Here we have made use of formula (1.45) and the condition of incompressi- 
bility of the fluid: 


$$\nabla\cdot\mathbf{v} = 0 . \tag{45.3}$$


The equation of motion of the fluid (43.2) can be simplified, noting that 
in our case of an infinitesimally small velocity one can write 


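$$\frac{d\mathbf{v}}{dt} = \frac{\partial\mathbf{v}}{\partial t} + (\mathbf{v}\cdot\nabla)\mathbf{v} \approx \frac{\partial\mathbf{v}}{\partial t}$$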


and that the ponderomotive force can be written, as in deriving (43.5), in 
the form 


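$$\mathbf{F} = \frac{\mu}{c}\,(\mathbf{j}\times\mathbf{H}) = \frac{\mu}{4\pi}\,[(\nabla\times\mathbf{H})\times\mathbf{H}] = \frac{\mu}{4\pi}\,[(\nabla\times\mathbf{h})\times\mathbf{H}_0] ,$$

with an accuracy to within the second order of small quantities. According to 
formula (1.47) we have 

$$(\nabla\times\mathbf{h})\times\mathbf{H}_0 = (\mathbf{H}_0\cdot\nabla)\,\mathbf{h} - \nabla(\mathbf{H}_0\cdot\mathbf{h}) ,$$

taking into account that $\mathbf{H}_0$ is a constant vector. Hence the equation of 
motion of the fluid (43.2) reads: 

$$\delta_0\frac{\partial\mathbf{v}}{\partial t} = -\nabla\left(p + \frac{\mu\,(\mathbf{H}_0\cdot\mathbf{h})}{4\pi}\right) + \frac{\mu}{4\pi}\,(\mathbf{H}_0\cdot\nabla)\,\mathbf{h} . \tag{45.4}$$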
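Without restricting the general character of our reasoning, we can choose 
the direction of the unperturbed field $\mathbf{H}_0$ as the x-axis. Then eqs. (45.2) 
and (45.4) can be rewritten in the form 

$$\frac{\partial\mathbf{h}}{\partial t} = H_0\frac{\partial\mathbf{v}}{\partial x} , \tag{45.5}$$

$$\delta_0\frac{\partial\mathbf{v}}{\partial t} = -\nabla\left(p + \frac{\mu H_0 h_x}{4\pi}\right) + \frac{\mu}{4\pi}\,H_0\frac{\partial\mathbf{h}}{\partial x} . \tag{45.6}$$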




It is easy to show that the solution of the system of eqs. (45.5) and (45.6) 
represents a system of plane waves propagating along the x-axis (in the 
direction of the unperturbed field). 

From conditions (45.3) and (45.4) it follows that such waves must be 
transverse. Choosing the direction of the vector h as the y-axis and taking components of 
eqs. (45.5) and (45.6) in the coordinate axes, we have 


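$$h_x = 0 , \tag{45.7}$$
$$\frac{\partial h_y}{\partial t} = H_0\frac{\partial v_y}{\partial x} , \tag{45.8}$$
$$h_z = 0 , \tag{45.9}$$
$$v_x = 0 , \tag{45.10}$$
$$\delta_0\frac{\partial v_y}{\partial t} = \frac{\mu H_0}{4\pi}\frac{\partial h_y}{\partial x} , \tag{45.11}$$
$$v_z = 0 . \tag{45.12}$$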


Here we have assumed that the pressure also depends only on the coordinate 
x and that the vector Vp has neither a y-component nor a z-component. 

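Differentiating (45.8) with respect to t and taking into account (45.11), 
we obtain 

$$\frac{\partial^2 h_y}{\partial x^2} - \frac{1}{c_m^2}\frac{\partial^2 h_y}{\partial t^2} = 0 . \tag{45.13}$$

Analogously, differentiating (45.11) with respect to t and taking into account 
(45.8), we have 

$$\frac{\partial^2 v_y}{\partial x^2} - \frac{1}{c_m^2}\frac{\partial^2 v_y}{\partial t^2} = 0 , \tag{45.14}$$

where 

$$c_m^2 = \frac{\mu H_0^2}{4\pi\delta_0} . \tag{45.15}$$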







Eqs. (45.13) and (45.14) show that the perturbation which arose in the 
plasma propagates in the form of plane waves along the magnetic field $H_0$ 
with a constant velocity $c_m$ determined by formula (45.15): 

$$v_y = a\,e^{i(kx-\omega t)} , \tag{45.16}$$
$$h_y = b\,e^{i(kx-\omega t)} . \tag{45.17}$$

These waves are called the Alfvén magnetohydrodynamic waves. 

We assume for brevity of notation that these waves are monochromatic 
and that they are propagated in the positive direction of the x-axis. An out- 
standing property of magnetohydrodynamic waves is the fact that they 
propagate in an incompressible fluid. 

As is well known, only sound waves, associated with a change in the den- 
sity of the medium, can propagate in a non-conducting fluid. 
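
The propagation velocity (45.15) is easily evaluated numerically. The Python sketch below does so in Gaussian units; the function name is an illustrative choice, and the second example (a tenuous plasma of density 10^-20 g/cm^3 in a field of 1 gauss) uses hypothetical values. The first example uses the field of 300 gauss and density of order unity quoted later in this section, for which the formula gives about 85 cm/sec.

```python
import math


def alfven_speed(H0, density, mu=1.0):
    """Propagation velocity c_m from eq. (45.15), Gaussian units.

    H0 is in gauss (oersted), density in g/cm^3; the result is in cm/s.
    """
    return H0 * math.sqrt(mu / (4.0 * math.pi * density))


if __name__ == "__main__":
    # Field and density quoted in the text for a conducting liquid.
    print(f"c_m = {alfven_speed(300.0, 1.0):.1f} cm/s")
    # Hypothetical tenuous plasma: 1 gauss, 1e-20 g/cm^3.
    print(f"c_m = {alfven_speed(1.0, 1.0e-20):.2e} cm/s")
```

The velocity is proportional to the field and inversely proportional to the square root of the density, so it becomes very large in rarefied media.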

The transverse character of magnetohydrodynamic waves is clear from 
the foregoing. The velocity of their propagation is determined by the properties 
of the medium and the strength of the constant magnetic field $H_0$. The 
relation between the amplitude of the magnetic field and the velocity is 
obtained by substituting (45.16) and (45.17) into the initial equations: 

$$|h_y| = \left(\frac{4\pi\delta_0}{\mu}\right)^{1/2} |v_y| .$$


It goes without saying that besides the magnetic field there is also an 
electric field in the magnetohydrodynamic wave, defined by the Maxwell 
equation (4.13): 


$$E_z = |h_y|\,\frac{H_0\mu}{c\,(4\pi\delta_0\mu)^{1/2}}\,e^{i(kx-\omega t)} .$$


It is useful to note that magnetohydrodynamic waves can be obtained from 
obvious considerations associated with the analogy between the lines of 
magnetic field and an elastic string. As is well known *, the equation of 
motion of the string has the form 

$$\frac{\partial^2\zeta}{\partial x^2} - \frac{1}{c^2}\frac{\partial^2\zeta}{\partial t^2} = 0 ,$$

where $\zeta$ is the transverse displacement, and c is the velocity of propagation of 
waves along the string, equal to $c = (a/\delta_0)^{1/2}$. If one substitutes for a its 
effective value (44.7), then c is the same as the velocity of propagation of 
magnetohydrodynamic waves. 

* See, for example, A.N.Tikhonov and A.A.Samarskii, Partial differential equations 
of mathematical physics (Holden Day, San Francisco, 1964). 

Magnetohydrodynamic waves are essentially a special case of electromag- 
netic waves in a conducting medium. In the presence of a sufficiently strong 
magnetic field in the conducting fluid a strong damping of electromagnetic 
waves occurs for propagation in all directions except in the direction of the 
magnetic field. Thus, the conducting fluid then has a strongly pronounced 
anisotropy for electromagnetic properties. 

If the velocity of propagation of magnetohydrodynamic waves is written 
in the form $c_m = c_{\text{eff}}/(\varepsilon\mu)^{1/2}$, where $c_{\text{eff}}$ is the velocity replacing the velocity of 
light in the analogous formula, then the dielectric constant of the conducting 
fluid turns out to be 


For $c_{\text{eff}} \approx 100$ cm/sec, which corresponds to the values of the field $H_0 = 300$ 
gauss and to $\delta_0 \approx 1$, $\varepsilon$ is of the order of magnitude of $3\times10^{18}$. 

Magnetohydrodynamic waves, as do all electromagnetic waves, transport 
energy. The energy flux transported propagates with a velocity $c_m$ which, for 
a sufficiently strong field $H_0$, is very large in comparison with the velocity 
of motion |v| of the matter in a plasma fluid. 

Thus, in contrast to an ordinary non-conducting fluid, for a plasma in a 
magnetic field $H_0$ there is always a mechanism ensuring a rapid transport of 
the energy of a perturbation arising in the plasma. 

The properties of the plasma state described above determine the behav- 
iour of matter at high temperatures, when atoms are to a considerable degree 
ionized. Hence the study of the plasma state is important, on the one hand, 
for astrophysics and, on the other hand, for investigations in the field of 
controlled thermonuclear reactions. 

We cannot in this book discuss the vast fields of investigation mentioned 
above, and for this we refer the reader to the specialized literature. 








§46. Plasma in a high-frequency electric field 


In considering the behaviour of plasma in stationary and quasistationary 
fields, we have disregarded displacement currents and have assumed a plasma 
to be a homogeneous conducting fluid. This approximation of magneto- 
hydrodynamics turns out, however, to be unsatisfactory in the realm of high- 
frequency processes. 

The substantial difference between the masses of electrons and heavy ions 
has an important effect in high-frequency fields. In this case the so-called 
two-fluid model often proves to be a sufficiently good approximation. 

In the two-fluid model approximation (it would be more appropriate to 
call it the gas-mixture model) the ions and electrons are assumed to be two 
ideal gases moving independently of each other under the action of corre- 
sponding forces. 

For electrons and ions separately one can write equations of motion 
— the equations of hydrodynamics: 


$$mn\frac{d\mathbf{v}}{dt} = -\nabla p + en\mathbf{E} . \tag{46.1}$$


All the quantities, the charge, the mass, and the number of particles n per 
cm$^3$, are referred to the electrons or the ions. For brevity in the notation we 
drop the index characterizing the type of particle. The continuity equation 
is also written separately for electrons and ions: 

$$\frac{\partial n}{\partial t} + \nabla\cdot(n\mathbf{v}) = 0 . \tag{46.2}$$


We assume the pressure in the plasma for each kind of particle to be equal to 
the gas pressure 


p=nkT. (46.3) 


Finally, we assume that the oscillations performed by the charges under the 
action of the field E are undamped. 

The problem of damping will be considered in detail in Part VI, where it 
will be shown that in practice dissipative processes, which play an important 
part in a number of phenomena, occur in plasma. However, here we can dis- 
regard dissipative processes and assume the oscillations to be adiabatic, so that 
the pressure and density are interrelated by the equation of an adiabatic curve. 


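Taking this into account, one can write for $\nabla p$ in (46.1) 

$$\nabla p = \left(\frac{\partial p}{\partial n}\right)_S \nabla n , \tag{46.4}$$

so that 

$$mn\frac{d\mathbf{v}}{dt} = -\left(\frac{\partial p}{\partial n}\right)_S \nabla n + en\mathbf{E} .$$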


We assume that the field E, the field of the electromagnetic wave in the 
plasma, varies according to the law $\mathbf{E} \sim \exp[i(\mathbf{k}\cdot\mathbf{r} - \omega t)]$. Also, we assume that 
the charges perform small oscillations under the action of the field. The 
changes in the density of the plasma which occur in this case are small, and 
one can write that 

$$n = n_0 + n' , \qquad n' \ll n_0 ,$$

where $n_0$ is the density of the plasma in the absence of an external field. 
We look for a solution of the above system of equations in the form 

$$\mathbf{v} \sim n' \sim p' \sim \exp[i(\mathbf{k}\cdot\mathbf{r} - \omega t)] , \tag{46.5}$$

disregarding the squares of these quantities. We then have 

$$\frac{\partial n'}{\partial t} + n_0\,(\nabla\cdot\mathbf{v}) = 0 , \tag{46.6}$$

$$mn_0\frac{\partial\mathbf{v}}{\partial t} = -\left(\frac{\partial p}{\partial n}\right)_S \nabla n' + en_0\mathbf{E} . \tag{46.7}$$


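Substituting the exponential functions into (46.6) and (46.7), we find 

$$n' = \frac{n_0}{\omega}\,(\mathbf{k}\cdot\mathbf{v}) , \tag{46.8}$$

$$\mathbf{v} = \frac{ie\mathbf{E}}{m\omega} + \left(\frac{\partial p}{\partial n}\right)_S \frac{\mathbf{k}\,(\mathbf{k}\cdot\mathbf{v})}{m\omega^2} . \tag{46.9}$$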

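If the direction of the vector k is chosen as the z-axis, then for an isotropic 
plasma the velocity vector can be written in the form $\mathbf{v} = (v_\parallel, \mathbf{v}_\perp)$, 
where $v_\parallel$ is the component of the vector in the direction of propagation of 
the wave, and $\mathbf{v}_\perp$ is the velocity vector in the xy-plane. From (46.9) we have 

$$\mathbf{v}_\perp = \frac{ie\mathbf{E}_\perp}{m\omega} , \qquad v_\parallel = \frac{ieE_\parallel}{m\omega} + \left(\frac{\partial p}{\partial n}\right)_S \frac{k^2 v_\parallel}{m\omega^2} , \tag{46.10}$$

or 

$$v_\parallel = \frac{ieE_\parallel}{m\omega}\,\frac{1}{1 - k^2/n_0\gamma_S m\omega^2} , \tag{46.11}$$

where $\gamma_S = n_0^{-1}(\partial n/\partial p)_S$, according to (31.10) of Part III, is the coefficient of 
adiabatic compressibility. 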

Knowing the velocity acquired by the charge in the field of the wave, we 
can write the mean current density in the form 


$$\mathbf{j} = \sum en\mathbf{v} .$$


This summation is carried out over all kinds of particles. Taking into account 
(46.10) and (46.11), we obtain 


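$$\mathbf{j}_\perp = \sum \frac{ie^2 n_0}{m\omega}\,\mathbf{E}_\perp , \tag{46.12}$$

$$j_\parallel = \sum \frac{i\omega e^2 n_0 E_\parallel}{m\omega^2\,(1 - k^2/n_0\gamma_S m\omega^2)} . \tag{46.13}$$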


Making use of the general formula (31.12), we find for the components of the 
dielectric permeability tensor 


$$\varepsilon_\perp = 1 - \sum \frac{4\pi e^2 n_0}{m\omega^2} , \tag{46.14}$$

$$\varepsilon_\parallel = 1 - \sum \frac{4\pi e^2 n_0}{m\omega^2\,(1 - k^2/n_0\gamma_S m\omega^2)} . \tag{46.15}$$


We see that the phenomenon of spatial dispersion takes place in an isotropic 
plasma: its dielectric properties are described by the tensor $\varepsilon_{ik}$ which depends 
on k and $\omega$. The values $\varepsilon_\parallel$ and $\varepsilon_\perp$ in the direction of propagation and in the 
perpendicular direction are different. 
We can now employ the above formulae to find a concrete law of dispersion 
in a plasma, making use of the dispersion equations (46.16) and (46.17). 
We introduce, first of all, the following important quantities: 


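$$(\omega_p^{el})^2 = \frac{4\pi e^2 n_0^{el}}{m} , \tag{46.16}$$

which is called the plasma frequency or the Langmuir frequency for electrons, 
and 

$$(\omega_p^{ion})^2 = \frac{4\pi e^2 n_0^{ion}}{m_{ion}} , \tag{46.17}$$

which represents the ion plasma frequency; its square is smaller than that of the 
electron plasma frequency by the ratio $m/m_{ion}$. For a typical concentration of 
electrons in the plasma ($n_0^{el} \sim 10^{15}$ cm$^{-3}$, $\omega_p^{el} \sim 6\times10^{11}$ sec$^{-1}$) the dielectric 
constant of the plasma always turns out to be smaller than unity. We then have 

$$\varepsilon_\perp = 1 - \frac{(\omega_p^{el})^2}{\omega^2} - \frac{(\omega_p^{ion})^2}{\omega^2} \approx 1 - \frac{(\omega_p^{el})^2}{\omega^2} , \tag{46.18}$$

$$\varepsilon_\parallel = 1 - \frac{(\omega_p^{el})^2}{\omega^2\,(1 - k^2/n_0^{el}\gamma_S m\omega^2)} - \frac{(\omega_p^{ion})^2}{\omega^2\,(1 - k^2/n_0^{ion}\gamma_S m_{ion}\omega^2)} . \tag{46.19}$$

Taking into account these values, (46.16) gives 

$$\frac{\omega^2}{c^2}\left(1 - \frac{(\omega_p^{el})^2}{\omega^2}\right) - k^2 = 0 ;$$

hence 

$$\omega^2 = (\omega_p^{el})^2 + c^2k^2 . \tag{46.20}$$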


Formula (46.20) defines the law of dispersion of transverse electromagnetic 
waves in a plasma. For $\omega > \omega_p^{el}$ two waves with different polarizations can 
propagate at each frequency. 

For $\omega < \omega_p^{el}$ the values of the wave number k turn out to be imaginary. 
This means that the waves undergo a damping similar to the ordinary skin 
effect. The damping coefficient is equal to 

$$\frac{\omega}{c}\left[\frac{(\omega_p^{el})^2}{\omega^2} - 1\right]^{1/2} .$$
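
The two regimes of the dispersion law (46.20) can be seen in a short numerical sketch. The function below returns the wave number as a complex number, so that a purely imaginary value signals the damped regime. The function name and the trial frequencies are illustrative assumptions; the plasma frequency of 6×10^11 sec^-1 is the order of magnitude quoted in the text.

```python
import cmath

C_LIGHT = 2.99792458e10  # speed of light, cm/s


def transverse_wave_number(omega, omega_p, c=C_LIGHT):
    """Wave number k from eq. (46.20): k^2 = (omega^2 - omega_p^2) / c^2.

    The result is real for omega > omega_p (propagation) and purely
    imaginary for omega < omega_p (damping over a depth 1/|k|).
    """
    return cmath.sqrt(complex(omega**2 - omega_p**2)) / c


if __name__ == "__main__":
    omega_p = 6.0e11  # electron plasma frequency, 1/s
    for omega in (0.5 * omega_p, 2.0 * omega_p):
        k = transverse_wave_number(omega, omega_p)
        kind = "propagating" if k.imag == 0.0 else "damped"
        print(f"omega = {omega:.2e} 1/s: k = {k:.3e} 1/cm ({kind})")
```

Well below the plasma frequency the modulus of k tends to $\omega_p^{el}/c$, so the penetration depth no longer depends on the frequency.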





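Let us now consider a longitudinal wave in the plasma. If the frequency is 
large in comparison with the ion plasma frequency, $\omega \gg \omega_p^{ion}$, then from 
(33.18) and (46.19) we find 

$$\varepsilon_\parallel \approx 1 - \frac{(\omega_p^{el})^2}{\omega^2\,(1 - k^2/\gamma_S n_0^{el} m\omega^2)} = 0 .$$

Hence 

$$\omega^2 = (\omega_p^{el})^2 + \frac{k^2}{\gamma_S n_0^{el} m} . \tag{46.21}$$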


From formula (46.21) it follows that the frequency of longitudinal waves is 
always close to the plasma frequency. They can propagate only if $\omega > \omega_p^{el}$. 
Longitudinal waves in a plasma have a simple meaning. In our approximation, 
where we have assumed that $\omega_p^{ion} \to 0$, i.e. that $m_{ion} \to \infty$, heavy ions are at 
rest. In equilibrium the electrons are distributed relative to the ions in the 
form of electron clouds which have been considered in §41. When the 
equilibrium is violated the electrons are displaced relative to the ions at 
rest and perform oscillations with the plasma frequency $\omega_p^{el}$. The second term 
in (46.21) is associated with the waves of adiabatic compression of the gaseous 
plasma. These waves are similar to sound waves, but, in contrast to the gas 
of neutral particles, the compression and rarefaction are accompanied by the 
separation of charges. 

The value of $\gamma_S$ can easily be found from the equation of the adiabatic 
curve $p/n^{\varkappa} = \text{const}$, if it is assumed that $\varkappa = 3$. This value of the exponent of 
the adiabatic curve corresponds to one degree of freedom, motion in the 
direction of propagation of the wave. Then $1/\gamma_S n_0 m = 3\overline{v^2}$, so that 

$$\omega^2 = (\omega_p^{el})^2 + 3\overline{v^2}k^2 , \tag{46.22}$$

where $(\overline{v^2})^{1/2}$ is the root-mean-square velocity of thermal motion of the particles. 
As the wavelength decreases the second term of (46.22) increases. However, 
for $\overline{v^2}k^2 \sim (\omega_p^{el})^2$, which, as is easily seen, corresponds to $\lambda \sim \kappa^{-1}$, the 
phase velocity of the waves turns out to be comparable with the thermal 
velocity of the electrons. In this case a strong damping of the longitudinal 
plasma waves occurs. For $\omega < \omega_p^{el}$, as is seen from (46.21), the values of k 
turn out to be imaginary, which corresponds to the damping of the waves. 
The penetration depth, as is seen from (46.21), turns out to be equal to the Debye 
length. Thus the plasma oscillations have frequencies which are close to the 
plasma frequency $\omega_p^{el}$. Waves with all other frequencies either do not 
penetrate into the plasma or are rapidly damped. 
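
The dispersion law (46.22) can be explored with a few lines of Python. The sketch evaluates the frequency and the phase velocity, and also a group velocity obtained by differentiating (46.22) with respect to k; this last quantity is not discussed in the text and is added here only as an illustration. The function names and the dimensionless trial values are assumptions.

```python
import math


def longitudinal_frequency(k, omega_p, v_rms):
    """Frequency of longitudinal plasma waves from eq. (46.22)."""
    return math.sqrt(omega_p**2 + 3.0 * v_rms**2 * k**2)


def phase_velocity(k, omega_p, v_rms):
    """omega / k for the wave described by eq. (46.22)."""
    return longitudinal_frequency(k, omega_p, v_rms) / k


def group_velocity(k, omega_p, v_rms):
    """d(omega)/dk, obtained by differentiating eq. (46.22)."""
    return 3.0 * v_rms**2 * k / longitudinal_frequency(k, omega_p, v_rms)


if __name__ == "__main__":
    omega_p, v_rms = 1.0, 0.1   # hypothetical units with omega_p = 1
    for k in (0.1, 1.0, 10.0):
        print(k,
              longitudinal_frequency(k, omega_p, v_rms),
              phase_velocity(k, omega_p, v_rms),
              group_velocity(k, omega_p, v_rms))
```

For long waves the frequency stays close to the plasma frequency and the phase velocity greatly exceeds the thermal velocity; as k grows the phase velocity falls towards the thermal velocity, which is the regime of strong damping mentioned above.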

We shall not dwell on the waves corresponding to the oscillations of ions *. 

Let us now consider phenomena occurring in a plasma placed in an exter- 
nal magnetic field. In order to avoid complicated calculations, we shall 
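restrict ourselves to the case of waves whose direction of propagation coincides 
with the direction of the external field $\mathbf{H}_0$. Setting $\mu \approx 1$, instead of 
(46.1) we can write 

$$mn_0\frac{d\mathbf{v}}{dt} = en_0\left[\mathbf{E} + c^{-1}(\mathbf{v}\times\mathbf{H}_0)\right] - \nabla p .$$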


We assume that the external magnetic field is large compared with the field of 
the electromagnetic wave. 

For what follows we shall need to take the polarization of the wave into 
account. We shall assume it to be circularly polarized, so that 


$$\mathbf{E} = A\,(\mathbf{e}_1 \pm i\mathbf{e}_2)\exp[i(kz - \omega t)] ,$$

where the unit vectors $\mathbf{e}_1$ and $\mathbf{e}_2$ are directed along the x-axis and y-axis 
respectively, and the direction of propagation is chosen as the z-axis. Reproducing 
all the previous calculations, we easily obtain, instead of (46.10), 


$$\mathbf{v}_\perp = \frac{ie\mathbf{E}}{m\,(\omega \pm \omega_c)} , \tag{46.23}$$

where $\omega_c$ is the cyclotron frequency $eH_0/mc$. Correspondingly, instead of 
(46.18) we find 

$$\varepsilon_\perp = 1 - \frac{(\omega_p^{el})^2}{\omega\,(\omega \pm \omega_c)} . \tag{46.24}$$

* See, for example, Voprosy teorii plazmy (Problems of the theory of plasmas) No. 3, 
(Moscow, 1963); V.D.Shafranov, Elektromagnitnye volny v plazme (Electromagnetic 
waves in plasma) V.L.Ginzburg, Propagation of electromagnetic waves in plasma (North- 
Holland Publ. Co., Amsterdam, 1961); in which the whole range of problems concerning 
electromagnetic processes in plasma is discussed in detail. 

We see that the values of the dielectric constant $\varepsilon_\perp$ or the refractive index 
$n = \varepsilon_\perp^{1/2}$ turn out to be dependent on the sense of polarization. To the right-handed 
(+ sign) and left-handed (− sign) polarizations there correspond 
different values of the refractive index. In other words, n has different values 
depending on the relation between the direction of rotation of the vector E 
in the wave and the motion of the electron in a circular orbit in the magnetic 
field. This phenomenon is called double refraction. The phenomenon of 
double refraction is characteristic of anisotropic media (crystals). We see that 
the plasma assumes anisotropic high-frequency properties in external magnetic 
fields. 

When $\omega \sim \omega_c$ the refractive index for a wave with left-handed polarization 
becomes very large, which corresponds to the reflection of the wave 
from the plasma, whereas a wave with right-handed polarization still penetrates 
into the plasma. 
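
The dependence of (46.24) on the sense of polarization is easy to tabulate. In the sketch below the argument `sign` is +1 for the right-handed and −1 for the left-handed wave, following the sign convention of the text; the function name and the dimensionless trial values (all frequencies measured in units of the cyclotron frequency) are illustrative assumptions.

```python
def epsilon_perp(omega, omega_p, omega_c, sign):
    """Dielectric constant from eq. (46.24) for a circularly polarized wave.

    sign = +1: right-handed polarization, sign = -1: left-handed polarization.
    """
    if sign not in (+1, -1):
        raise ValueError("sign must be +1 or -1")
    return 1.0 - omega_p**2 / (omega * (omega + sign * omega_c))


if __name__ == "__main__":
    omega_p, omega_c = 1.0, 1.0          # hypothetical values, units of omega_c
    for omega in (1.001, 1.5, 3.0):
        right = epsilon_perp(omega, omega_p, omega_c, +1)
        left = epsilon_perp(omega, omega_p, omega_c, -1)
        print(f"omega = {omega}: eps(+) = {right:.4f}, eps(-) = {left:.4f}")
```

Close to the cyclotron frequency the left-handed value becomes very large in magnitude while the right-handed value remains of order unity, and without an external field the two values coincide with (46.18).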





APPENDIX IV 


Important Integrals 


Stirling’s Formula 


Calculation of certain integrals 
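1) $I = \int_{-\infty}^{\infty} e^{-ax^2}\,dx$ (the Poisson integral), 

$$I = 2\int_0^\infty e^{-ax^2}\,dx = \frac{2}{\sqrt{a}}\int_0^\infty e^{-t^2}\,dt .$$

We have the identity 

$$I^2 = \frac{4}{a}\int_0^\infty e^{-t^2}\,dt \int_0^\infty e^{-u^2}\,du = \frac{4}{a}\int_0^\infty\!\!\int_0^\infty e^{-(t^2+u^2)}\,dt\,du .$$

Introducing polar coordinates 

$$r^2 = t^2 + u^2 , \qquad \varphi = \arctan\frac{u}{t} , \qquad dt\,du = r\,dr\,d\varphi ,$$

we have 

$$I^2 = \frac{4}{a}\int_0^{\pi/2}\!\!\int_0^\infty e^{-r^2}\,r\,dr\,d\varphi = \frac{\pi}{a} ;$$

hence 

$$I = \sqrt{\pi/a} .$$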
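2) $I_{2n} = \int_{-\infty}^{\infty} e^{-ax^2} x^{2n}\,dx$. 

Differentiating I with respect to the parameter a, we find 

$$I_2 = \int_{-\infty}^{\infty} e^{-ax^2} x^2\,dx = \tfrac{1}{2}\sqrt{\pi/a^3} ,$$

$$I_4 = \int_{-\infty}^{\infty} e^{-ax^2} x^4\,dx = \tfrac{3}{4}\sqrt{\pi/a^5} ,$$

$$I_{2n} = \int_{-\infty}^{\infty} e^{-ax^2} x^{2n}\,dx = \frac{(2n-1)(2n-3)\cdots 5\cdot 3\cdot 1}{2^n}\sqrt{\pi/a^{2n+1}} .$$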
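
The closed form for $I_{2n}$ can be checked against a direct numerical quadrature. The sketch below compares the formula with a trapezoidal sum over a finite interval; the function names, the cut-off of the interval and the number of steps are illustrative choices.

```python
import math


def gaussian_moment(n, a):
    """Closed form I_2n = (2n-1)(2n-3)...3*1 / 2^n * sqrt(pi / a^(2n+1))."""
    odd_product = 1
    for odd in range(1, 2 * n, 2):
        odd_product *= odd
    return odd_product / 2**n * math.sqrt(math.pi / a ** (2 * n + 1))


def gaussian_moment_numeric(n, a, half_width=12.0, steps=24000):
    """Trapezoidal estimate of the same integral on [-half_width, half_width]."""
    h = 2.0 * half_width / steps
    total = 0.0
    for i in range(steps + 1):
        x = -half_width + i * h
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * math.exp(-a * x * x) * x ** (2 * n)
    return total * h


if __name__ == "__main__":
    for n in range(4):
        print(n, gaussian_moment(n, 1.0), gaussian_moment_numeric(n, 1.0))
```

For n = 0 the product of odd numbers is empty and the formula reduces to the Poisson integral of item 1).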


3) $I_{2n+1} = \int_0^\infty e^{-ax^2} x^{2n+1}\,dx$, 

$$I_1 = \int_0^\infty e^{-ax^2} x\,dx = \frac{1}{2a} .$$

Differentiating $I_1$ with respect to the parameter a, we obtain 

$$I_{2n+1} = \int_0^\infty e^{-ax^2} x^{2n+1}\,dx = \frac{n!}{2a^{n+1}} .$$








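4) $$\int_0^\infty \ln(1 - e^{-x})\,dx = x\ln(1 - e^{-x})\Big|_0^\infty - \int_0^\infty \frac{x\,dx}{e^x - 1} = -\int_0^\infty \frac{x\,dx}{e^x - 1} .$$

Analogously, 

$$\int_0^\infty x^2\ln(1 - e^{-x})\,dx = \tfrac{1}{3}x^3\ln(1 - e^{-x})\Big|_0^\infty - \frac{1}{3}\int_0^\infty \frac{x^3\,dx}{e^x - 1} = -\frac{1}{3}\int_0^\infty \frac{x^3\,dx}{e^x - 1} .$$

The calculation of the last integrals is carried out by means of the expansion 
of the integrand in a power series: 

$$\frac{1}{e^x - 1} = e^{-x}\sum_{n=0}^\infty e^{-nx} = \sum_{n=0}^\infty e^{-(n+1)x} .$$

Hence it follows that 

$$\int_0^\infty \frac{x\,dx}{e^x - 1} = \sum_{n=0}^\infty \int_0^\infty x\,e^{-(n+1)x}\,dx = \sum_{n=0}^\infty \frac{1}{(n+1)^2} = \frac{\pi^2}{6} ,$$

$$\int_0^\infty \frac{x^3\,dx}{e^x - 1} = \sum_{n=0}^\infty \int_0^\infty x^3 e^{-(n+1)x}\,dx = 6\sum_{n=0}^\infty \frac{1}{(n+1)^4} = \frac{6\pi^4}{90} = \frac{\pi^4}{15} .$$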
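
The two series can be summed numerically as a quick check of the quoted values. The helper below is a plain partial sum; its name and the number of terms are illustrative choices.

```python
import math


def inverse_power_sum(power, terms):
    """Partial sum of 1/(n+1)^power for n = 0, 1, ..., terms - 1."""
    return sum(1.0 / (n + 1) ** power for n in range(terms))


if __name__ == "__main__":
    terms = 200000
    first = inverse_power_sum(2, terms)          # integral of x/(e^x - 1)
    second = 6.0 * inverse_power_sum(4, terms)   # integral of x^3/(e^x - 1)
    print(first, math.pi**2 / 6.0)
    print(second, math.pi**4 / 15.0)
```

The first series converges slowly (the remainder after N terms is about 1/N), whereas the second is accurate to many digits after a few hundred terms.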







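5) $$I = \int_{-\infty}^{\infty} \frac{x^2 e^x\,dx}{(e^x + 1)^2} = 2\int_0^\infty \frac{x^2 e^{-x}\,dx}{(e^{-x} + 1)^2} .$$

Expanding the integrand in a power series, we have 

$$I = 2\int_0^\infty x^2\,(e^{-x} - 2e^{-2x} + 3e^{-3x} - \ldots)\,dx = 4\left(1 - \frac{1}{2^2} + \frac{1}{3^2} - \ldots\right) = 4\cdot\frac{\pi^2}{12} = \frac{\pi^2}{3} .$$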





Stirling’s formula 

Stirling’s formula 

$$N! = N^N e^{-N}\,(2\pi N)^{1/2}$$

holds for large values of the number N. It is obtained from a simple calculation 
to an accuracy within the factor $\sqrt{2\pi}$. Namely, 

$$\ln N! = \sum_{n=1}^{N} \ln n ,$$

and further, according to the Euler-Maclaurin formula, 

$$\sum_{n=1}^{N} \ln n \approx \int_1^N \ln x\,dx + \tfrac{1}{2}\ln x\,\Big|_1^N + C = N\ln N - N + \tfrac{1}{2}\ln N + C .$$

A more accurate calculation leads to the value $C = \ln\sqrt{2\pi}$. 
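
The accuracy of Stirling's formula is easily examined numerically. The sketch below compares the logarithm of the formula with the exact value of ln N!, obtained from the standard-library function `math.lgamma`; the function names and the sample values of N are illustrative choices.

```python
import math


def ln_factorial_stirling(N):
    """ln of Stirling's approximation N^N e^(-N) (2*pi*N)^(1/2)."""
    return N * math.log(N) - N + 0.5 * math.log(2.0 * math.pi * N)


def ln_factorial_exact(N):
    """ln N! evaluated through the log-gamma function, ln Gamma(N + 1)."""
    return math.lgamma(N + 1)


if __name__ == "__main__":
    for N in (5, 10, 50, 1000):
        diff = ln_factorial_exact(N) - ln_factorial_stirling(N)
        print(f"N = {N:5d}: ln N! - Stirling = {diff:.3e}")
```

The difference is positive and decreases roughly as 1/(12N), so even for N = 10 the formula reproduces N! to better than one per cent.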


SUBJECT INDEX 


Absorption, 538, 543 

— coefficient, 535,551 

Absorptive capacity, 358 

Acoustic branch, 255 

Activity, 298 

—,grand partition function as a function 

of, 317 

Addition of probabilities, 23 

Additivity, 100 

— of the entropy, 111 

Adiabatic compressibility, 135, 136 

Adsorbed particles, chemical potential 
of, 332 

— —, free energy of, 332 

Adsorbent, 331 

Adsorption, gas, 331 

— isotherm, 333 

Alfvén magnetohydrodynamic waves, 
620 

Alternating current, 495 

— — machine, 521 

— —,Ohm’s law for, 500 

Angular momentum, intrinsic, 470 

Anisotropic media, 417 

Anode region, 596 

Anomalous dispersion, region of, 554 

Antiferromagnet, 417, 466 

Area of quantum state in phase space, 
12 

Arithmetical mean value, 25 

Asymmetry of second law and asym- 
metry of initial condition, 157 

Attenuation of correlation at infinity, 36 

Attraction, internal pressure due to, 228 

Autocorrelation function of the random 
e.m.f., 526 


Babinet’s principle, 564 




Band, energy, 18 

Basic potential, 129 

— thermodynamic equality, 109 

— — inequality, 117 

Binary distribution function, 231 

— — —,determination of, 240 

— — —,energy in terms of, 238 

— — —, equation of state in terms of, 
237 

Binomial law, 32 

Biot—Savart law, 462 

Black-body, absolute, 359 

— — radiation, classical theory of, 360 

— — —, free energy of, 370 

Bohr quantum condition, 7, 11 

Boltzmann constant, 118 

— distribution in a plasma, 598 

— formula, 113 

Bose—Einstein distribution, 350 

Boundary conditions, 413 

— — at the interface of two dielectrics, 424

— —, Dirichlet’s, 428

— —, Neumann’s, 428

— — on current, 454

— — on electric field, 414

— — on magnetic field, 415

Bound charges, current density of, 408 

Bound energy, 126 

Box, three-dimensional, particle in, 12 

Bragg’s conditions, 569 

Bremsstrahlung, 585 

Brownian force, 278 

— motion, 276 

— —,experiments on, 281 

— —, quantitative theory of, 278 

— particle, mean square displacement of, 279







Brownian particles, number of, in the 
viewing field, 282 


Canonical distribution, 81 

— —, properties of, 84 

Capacitance, 426 

—, electrostatic, 458 

Carnot cycle, 121 

Cartesian coordinates, Laplace’s equa- 
tion in, 428 

Cathode region, 596 

Causality, 540 

Cells, method of, in phase space, 346 

Central limit theorem, 31 

Cerenkov counter, 592 

— loss, 589, 590 

— Vavilov radiation, 585, 592 

Characteristic function of probability 
distribution, 34 

— temperature for rotation, 209 

— — for vibration, 206 

— — of polyatomic molecules, 216 

Charge conservation law, 398 

— density in a plasma, 599, 601 

— —, mean, 403 

—, free, 406 

Chemical compounds, thermodynamic 
potentials of, 144 

— equilibrium, 335 

— — condition, 337 

— potential, 301, 302 

— — and phase transition, 309 

— — of a crystal, 304

— — of adsorbed particles, 332 

— — of an ideal gas, 303 

— — of photon gas, 368 

— — of the electron gas, 390 

Clapeyron—Clausius equation, 309 

Classical approximation, Gibbs distribu- 
tion in, 87 

— mechanics, transition from quantum mechanics to, 9

— physics, magnetic moment in, 468 

— statistics, 86 

— —, applicability of, 353 

Closed system, 111 

— —, entropy for irreversible process in, 122


— —, entropy for reversible process in, 
122 

— —, fluctuations in, 271 

Coefficient of mutual inductance, 501 

— — — — for non-linear conductors, 510

— — — — of linear conductors, 503 

— — — — of two solenoids, 504 

— — self-inductance, 501, 507

— — — for non-linear conductors, 510 

— — — of toroidal solenoid, 508 

— — — of two coaxial cylinders, 508 

Collision, energy conservation law for, 45

—, number of, 62 

— of molecules, 59 

—, particle, 44 

Complementary screen, 564 

Compressibility, adiabatic, 135, 136 

—, —, coefficient of, 624 

—, isothermal, 135 

Condensation, 321 

Conditions of equilibrium between two phases, 305

Conduction current density, 405 

Conductivity, electrical, 405, 452 

—, —, tensor, 549 

—, Hall, 609 

— in a plasma, 606

Conductor, 400 

—, electromagnetic energy of, 435 

—, electrostatic field in, 424 

—, ideal, 577 

—, interaction between, 519 

—, linear, 455 

—, moving, 512, 517 

—, —, force on, 517 

—, —, energy of system of, 512 

—, surface charge density for, 425 

—, surface potential of, 426 

Configuration integral, 222 

—, probability of, 230 

— space, 5 

Constitutive equations, 410 

Container wall, 48 

Coordinates, normal, 254 

Correlation function, 35, 524 

— — at infinity, attenuation of, 36 


— — in a plasma, 602

— — method, 230

Critical angle for total reflection, 575

— magnetic field of superconductor, 489

— point, 321

— —, conditions for, 322

— —, position of, 288

— —, properties at, 324

Cross section, 61

Crystal, atom distance in, 242

—, chemical potential of, 304

—, entropy of, 264

—, equation of state of, 265

—, free energy of, 264

—, heat capacity in, quantum-mechanical, 244

—, — — of, 243, 263, 267

—, mean energy of, 243

—, one-dimensional, dispersion law in, 247

—, —, kinetic energy of, 252

—, —, thermal motion in, 245

—, —, total energy of, 253

—, partition function of, 260

—, thermodynamic functions of, 263

—, three-dimensional, energy of, 256

— with different masses, 254

Curie point, 478

— —, entropy at, 484

— —, heat capacity at, 484

— —, magnetic properties near, 483

— Weiss law, 478

Current, alternating, 495

—, boundary conditions on, 454

—, correlation function for, 525

— density of bound charges, 408

— —, conduction, 405

— —, mean, 403, 409

—, direct, magnetic fields of, 460

—, energy of, 457

—, fluctuation, 523

— loops, moving, 496

—, polarization, 407

—, random, 524

— sources, 452

—, surface, in superconductor, 488

Cyclic process, 98

Cylinder in electrostatic field, 437

—, two coaxial, coefficient of self-inductance of, 508


Damping coefficient for plasma waves, 626

Debye characteristic temperature, 261, 266

— cut-off, 259

— length, 601

Degeneracy, 13 

Degenerate gas, 353 

— —, condition for, 355 

— quantum state, 13 

— state, 190 

Density, fluctuation of, 286 

—, spectral, 37 

Determination of binary distribution 
function, 240 

Deviation, mean square, 28 

—, — —, in normal distribution, 33

Diamagnet, 412, 465, 467 

Diamagnetic susceptibility, 469 

Diatomic gas, 179 

— —,heat capacity for, 187 

— —,thermodynamic functions of, 200 

— molecule, 195 

— —, basic quantities of, 198 

— —, energy levels of, 195 

— —, Gibbs distribution for, 181 

— —,rotational motion of, 199 

— —,vibrational motion of, 197 

Dielectric, 400 

— constant, 411, 449

— —, longitudinal, 546 

— —, perpendicular, 546 

—, free energy of, 443 

— in electrostatic field, 437 

— permeability tensor, 624 

— permittivity, 411, 541 

— —,complex, 535 

— susceptibility, 407, 445 

—, thermodynamic potential of, 444 

—, two, boundary conditions at the interface of, 424

Diffracted radiation, intensity of, 567 

Diffraction, 560 

—, Fraunhofer, 560 

—, X-ray, 568 







Diffuseness, zone of, 386 

—, — —, electrons in, 392
Diffusion coefficient, 279 

Dipole moment, 445 

— —, induced, 450 

— — of a gas, 447

— — of a sphere, 440 

— — of a dielectric, 442 

— — of molecules, 449 

Dirichlet’s boundary conditions, 428 
Discrete energy levels, 8 

Disorder, molecular, 42 

Dispersion, 249, 543 

—, anomalous, region of, 554 

—, frequency, 419 

— law in conducting medium, 536 
— — in one-dimensional crystal, 247 
— — in quantum mechanics, 552 

— — in rarefied gas, 551 

— of light, 549 

— relation, 540, 543, 547 

—, spatial, 419, 544 

—, time, 419 

Displacement law, 365 

—, mean square, of Brownian particle, 279

Dissociation, degree of, 341 

—, law of mass action for, 340 

— of atoms, thermal, 340 

Distance between atoms in crystals, 242 
— between molecules, mean, 222 
Distribution, Bose—Einstein, 350 

—, canonical, 81 

—, —, properties of, 84 

—, Fermi—Dirac, 351, 382, 387 

— function, binary, 231 

— —, —, determination of, 240

— —, —, energy in terms of, 238

— —, —, equation of state in terms of, 237

— —, differential equation for, 232 
— — in photon gas, 368 

— — of mth order, 231 

— —, ordinary, 231 

—, Gibbs, 75, 81, 232 

—, —, for diatomic molecules, 181 

—, —, for ideal gas, 167 

—, —, for one molecule, 89 


—,—, in the classical approximation, 87 

—, —, maximum in, 85, 93 

—, —, of monatomic gas, 91 

—,—, properties of, 84 

—, —, sharpness of maximum in, 94 

—,grand canonical, 297 

—,microcanonical, 75 

— modulus, 82 

—, statistical, 68

—, —, of system with variable number of 
particles, 295 

Double refraction, 628 

Dulong—Petit law, 243 

Dynamical law, 6 


Easy magnetization, direction of, 478 

Effective electrons, 392 

Efficiency, 121 

—, maximum, 122 

Eikonal, 556 

— equation, 557 

Electric displacement, 410 

— field, boundary conditions on, 414 

— — of a superconductor, 488 

— —, static, plasma in, 605

— susceptibility of a polar gas, 448 

Electro-caloric effect, 444 

Electrodynamics, macroscopic, 397 

Electromagnetic field, penetration dis- 
tance of, 575 

— potentials, 412 

— —, equations for, 412 

— radiation, energy of, 369 

— waves, group velocity of, 553 

— — in a medium, 533

Electromotive force (see also e.m.f.) 

— —, impressed, 453, 456 

Electron, energy of, at absolute zero, 
383 

— gas, 355 

— — at absolute zero, 381 

— — at low temperatures, 385 

— —, chemical potential of, 390 

— —, energy of, 387 

— —,ground state of, 383 

— —, heat capacity of, 384, 390 

— —, interaction energy in, 385 

— —, pressure of, 384 




— —, thermal excitation at low tempera- 
ture in, 386 

— in the zone of diffuseness, 392 

Electrostatic capacitance, 458 

— field, energy relations in, 435 

— — in a medium, 422

— — in conductors, 424 

— induction, 401 

Electrostatics, direct problem of, 427 

Electrostriction, 444 

Elementary excitations, 372 

— — energy spectrum, 374 

— —, number of, 377 

Elementary quantum states, equal probability of, 72
E.m.f. (see also electromotive force) 
—, autocorrelation function of the random, 526

—, random fluctuating, 523 

Emitted energy, 357 

Energy change in quasi-static process, 105

— conservation law, 420 

— — — for collision, 45 

— density of radiation, 357, 359, 364 

— distribution, 53 

—, electromagnetic, of conductor, 435 

—, equipartition of, 57 

— in terms of binary distribution function, 238

—, internal, 98 

—, kinetic, mean, 56 

—,—,of one-dimensional crystal, 252 

— level, 6 

— —, distribution of, in a solid body, 
265 

— — of diatomic molecule, 195 

—,mean, of a crystal, 243 

—,—, of a polyatomic molecule, 186 

—,—,of rotation, 211 

—, —, of triatomic molecules, 185

— of current, 457 

— — electron gas, 387 

— — electrons at absolute zero, 383 

— — ideal gas, 168 

— — magnetic field, 505 

— — plasma, 604 

— — quasi-independent systems, 67 


— — subsystem, 68

— — surface, 326 

— — system, 71 

— — system in a magnetic field, 470 

— — system of moving conductors, 512 

— — three-dimensional crystal, 256 

—, pressure versus, 58 

— relations in the electrostatic field, 435 

—, total, of one-dimensional crystal, 253 

—, vibrational, mean, 203 

Ensemble, statistical, 22 

Enthalpy, 129 

—, determination of, 142 

Entropy, 108 

—, additivity of, 111 

— at Curie point, 484 

— at low temperature, 147 

— change, 114 

— — in open system, 117 

—, determination of, 143 

— for irreversible process in closed system, 123

— for reversible process in closed system, 
122 

— in phenomenological thermodynamics, 122

— of crystal, 264 

— — an ideal gas, 169 

— — a quasi-closed system, 114

— — HCl, 150

— — liquid helium II, 378 

— — mixing, 171 

— — non-equilibrium system, 112 

— — normal state, 490 

— — radiation, 370 

— — superconducting state, 490 

— — surface, 326 

—, properties of, 110 

—, rotational, 212 

—, vibrational, 207 

Equal probability of elementary quantum states, 72

Equation of continuity, 406 

— — state, 50, 225 

— — — in terms of the binary distribution function, 237
— — — of a crystal, 265
— — — of an ideal gas, 169







Equilibrium, approach to, 114 
— between two phases, conditions of, 
305 

—, chemical, 335 

—, —, condition, 337 

— curve, equation of, 313 

— —,phase, 307 

— —, —, differential equation of, 308 

Equilibrium, phase, between drops and 
vapour, 329 

— plasma, 597 

— radiation, 356 

—, state of, 1 

— state of a reacting system, 336 

Equipartition, law of, 186 

— of energy, 57 

Equiphase surface, 558 

Ergodic hypothesis, 22, 72 

— systems, 72 

Euler—Maclaurin formula, 632 

Exclusion principle, Pauli, 17 

Expansion, thermal, at zero temperature, 
151 

—, —, coefficient of, 135 

Experimental determination of the mean 
free path, 64 

— verification of the Maxwell distribution, 54
External parameters, 101 


Faraday’s law of induction, 498 
Fermi—Dirac distribution, 351, 382, 387 
— energy, 384 

Ferroelectrics, 417 

Ferromagnet, 417, 465, 476 

—, magnetic susceptibility of, 478 

—, metastable states in, 481 

—, thermodynamic potential of, 479 
Fluctuation, 25, 154 

— and thermodynamics, 278 

—, change in potential energy in, 274 
— currents, 523 

— dissipation theorem, 526 

— in a closed system, 271

— in a quasi-closed system, 272 

—, intensity of, 276 

— of density, 286 

— — number of particles, 287 


— — temperature, 288 

— — volume at a constant temperature, 
284 

—, perpetual motion machine and, 161 

—, probability of, 274 

—, relative, 28 

Forbidden states, 73 

Force, generalized, 101 

Fraunhofer diffraction, 560 

Free charges, 406 

— energy, 126, 137 

— — as criterion of reversibility, 127 

— —,electric, 421 

— — of a crystal, 264 

— — — adsorbed particles, 332 

— — — an ideal gas, 168 

— — — black-body radiation, 370 

— — — dielectric, 443 

— — — liquid helium II, 376 

— — — magnetic field, 465 

— — — plasma, 604 

— — per unit surface, 326 

— —,rotational, 212 

— —, vibrational, 207

— path, mean, 62 

— —, —, experimental determination of, 64

— —, probability distribution of, 62 

Frequency dispersion, 419 

Fresnel formulae, 574 


Gas, density of, in gravitational field, 
176 

—, weight of, 177 

Gaseous discharge, 595 

— —,non-self-maintained, 595 

— —,self-maintained, 595 

Gaussian distribution, 31, 275 

Geometrical optics, 555 

— —,applicability of, 559 

Gibbs distribution, 75, 81, 232 

— — for diatomic molecules, 181 

— — for ideal gas, 167

— — for one molecule, 89 

— — in the classical approximation, 87 

— —,maximum in, 85, 93 

— — of monatomic gas, 91 

— —, properties of, 84 




— —, sharpness of maximum in, 94 

— Helmholtz equations, 132 

— thermodynamic potential, 109, 126, 
138, 299 

— — — of magnetic field, 465 

Glow discharge, 595 

Grand canonical distribution, 297 

— partition function, 298 

— — — and phase transition, 317 

— — — as a function of the activity, 317

— — —,zeros of, 318 

Gravitational field, density of gas in, 176 

— —,heat capacity of gas in, 177 

Ground state of electron gas, 383 

Group velocity in a wave guide, 583 

— — of electromagnetic waves, 553 


Hagen—Rubens formula, 576 

Hall conductivity, 609 

— current, 609 

Hamilton’s equations, 3 

Harmonic oscillator in quantum mechanics, 12

— —, linear, 4 

Heat capacity, 130, 136, 187 

— — at Curie point, 484 

— — at low temperature, 147, 245 

— — due to internal motion, 268 

— — in crystal, quantum-mechanical, 244

— — of crystal, 243, 263, 267 

— — of diatomic gas, 187 

— — of electron gas, 384, 390 

— — of gas in gravitational field, 177 

— — of ideal gas, 168 

— — of lattice, 391 

— — of liquid helium II, 378 

— — of monatomic gas, 170 

— — of normal state, 491 

— — of polyatomic gas, 189 

— — of superconducting state, 491 

— — of system with two levels, 192 

— — of unit surface, 327 

— —, rotational, 211

— —, vibrational, 203

— engine, 118

— — of the first kind, 119

— — of the second kind, 124

—, reduced, 122

Height distribution of molecules possessing different masses, 179

Helium II, liquid, entropy of, 378 

— —, —, free energy of, 376 

— —,—, heat capacity of, 378 

— —, —, properties of, 372 

— —, —, thermal conductivity of, 379 

— —, —, two-fluid model for, 381 

— —, —, viscosity of, 379

Histogram, 54 

Hysteresis cycle, 477 


Ideal conductor, 577 

— gas, 39, 164 

— —,chemical potential of, 303 

— —, energy of, 168

— —,entropy of, 169 

— —,equation of state of, 169 

— —, free energy of, 168 

— —, Gibbs distribution for, 167 

— —,heat capacity of, 168 

— — in uniform field, 174

— — of particles with spin, 75 

— — plasma, 597 

— —, thermodynamic potential of, 170 

— monatomic gas, partition function of, 
164 

Identity of particles, 16, 166, 344 

Images, method of, in electrostatics, 430 

Impedance, 516 

Impressed electromotive force, 453, 456 

Induced dipole moment, 450 

— magnetic moment, 467, 469 

Inductance, mutual, coefficient of, 501 


—, —, — —, for non-linear conductors, 510
—, —, — —, of linear conductors, 503

—, —, — —, of two solenoids, 504

—, self-, coefficient of, 501, 507

—, —, — —, for non-linear conductors, 510

—, —, — —, of toroidal solenoid, 508

—, —, — —, of two coaxial cylinders, 508

Induction, electrostatic, 401 

—, Faraday’s law of, 498 

Inertia, moment of, 181 


Initial condition, 40, 71 







Initial condition, asymmetry of second 
law and asymmetry of, 157 

Intensity of fluctuations, 276 

Interacting molecules, partition function 
for, 225 

Interaction between conductors, 519 

— — quasi-independent systems, 67 

— — subsystem and reservoir, 77 

— — subsystems, 68 

— energy in electron gas, 385 

—, interatomic, 194 

—, intermolecular, 39, 220 

—, pair-, 223 

—, potential energy of, 220 

—,radius of, 221 

Interface between two phases, 305 

—, curvature of, 328 

Internal energy and mean energy, 99 

— motion, heat capacity due to, 268 

— —,partition function for, 201 

Inversion, method of, in electrostatics, 434 

— point, 146 

Irreversibility, criterion of, 115, 159 

—, macroscopic, microscopic reversibility versus, 155

Irreversible process in closed system, 
entropy for, 123 

Isothermal compressibility, 135 

Isotope effect, 491 


Joule heat, 420 
— Thomson coefficient, 145
— — process, 144 


Kirchhoff theorem, 357 
Kramers—Kronig formulae, 540, 543 


Langevin formula, 448 

Langmuir frequency, 625 

Laplace equations in Cartesian coordi- 
nates, 428 

— formula, 328 

— pressure, 328 

Latent heat in phase transition, 307 

— — of phase transformation, 312 

— — of superconducting transition, 490 

Lattice, heat capacity of, 391 

Laue formula, 569 


Law of mass action, 338 

— — — — for dissociation, 340 

— — — —, statistical meaning of, 338
Light, dispersion of, 549 

— ray direction, 558 

— wave, monochromatic, 556 

Linear conductor, 500 

— —,magnetic field of current in, 462 
Liquids, statistical theory of, 239 
Longitudinal dielectric constant, 546 
— wave, 257, 548

— — in wave guide, 580 

— —,number of, 258 

Lorentz condition, 412 


Macroscopic electrodynamics, 397 

— system, 1 

Magnetic domains, 482 

— field, 410 

— —,boundary conditions on, 415 

— —,energy of, 505 

— —,energy of system in, 470 

— —, free energy of, 465 

— —, Gibbs thermodynamical potential 
of, 465 

— — in moving plasma, 613 

— — of current in linear conductor, 462 

— — of direct currents, 460 

— — of superconductor, critical, 489 

— —, plasma in, 607 

— induction in superconductor, 487 

— isolation, 610 

— moment, 464, 467 

— —,induced, 467, 469 

— — in classical physics, 468

— —,mean, 409, 468, 473 

— permeability, 411 

— pressure, 611 

— properties near Curie point, 483 

— susceptibility, 411, 465 

— — of ferromagnet, 478 

Magnetization, easy, directions of, 478 

—, residual, 477 

—, spontaneous, 477 

—,work of, 520 

Magnetohydrodynamic waves, Alfvén, 620
— —, velocity of propagation of, 621 




Mass action, law of, 338 

— —,— —,for dissociation, 340 

— —,— —, statistical meaning of, 338 

—,reduced, 60 

Maximum in Gibbs distribution, 85, 93 

— — — —, sharpness of, 94 

— term, method of, 86 

Maxwell—Boltzmann distribution, 175 

— distribution, 44, 47, 175, 345 

— —,experimental verification of, 54 

— equations, 410 

— —,applicability of, 416 

— — for quasistationary fields, 499 

— —,integral form of, 412 

— Lorentz equations, 398 

— particles, statistical weight for, 346 

— relations, 131 

Mean absolute velocity, 56 

— energy, internal energy and, 99 

— free path, 62 

— — —,experimental determination of, 
64 

— kinetic energy, 56

— square deviation, 28

— — — in normal distribution, 33

— value, 25

— —, arithmetical, 25

— — for a physical quantity, 82

— —, statistical, 26

Measuring devices, sensitivity of, 290 

Metamagnetics, 466 

Metastable states in ferromagnet, 481 

Melting point, 324 

Microcanonical distribution, 75 

Microparticles, 1 

Microscopic reversibility, 74 

Mirror, sensitivity of suspended small, 291

Mixing, entropy of, 171 

Mobility, 280 

Molar fractions, 339 

Molecular disorder, 42 

Molecules, collisions of, 59 

Moment of inertia, 181 

— of nth order, 34 

Momentum distribution, 53 

— of particle, generalized, 466 

— space, 5 




Monatomic gas, Gibbs distribution of, 91 
Monochromatic light wave, 556 

— plane waves, 534 

Multiplication of probabilities, 24 


Nernst heat theorem, 148 

Neumann’s boundary conditions, 428 

Noise, 523 

Non-cyclic process, maximum work in, 124

Non-degenerate gas, 353 

Non-equilibrium properties, 2 

— system at zero temperature, 151 

— —,entropy of, 112 

Non-local connection, 544 

Normal coordinates, 254 

— distribution, 31 

— —,mean square deviation in, 33 

— liquid, 381 

— oscillation of a crystal, 254 

— state, entropy of, 490 

— —, heat capacity of, 491

— —, thermodynamic potential of, 490 

Normalization of probability, 25 

N-particle system, relative fluctuation in, 28

Number density of states, 15 

— of particles, fluctuation of, 287 

— — —, variable, subsystems with, 294

— — —, —, statistical distribution of system with, 295

— — states, 14 

— — —, Taylor series for, 79 

— — — with given energy, 15 

Nyquist formulae, 527 


Ohm’s law, 452 

— — for an alternating current, 500 
— — in differential form, 405 

— — in generalized form, 453 
Open system, entropy change in, 117 
Optical branch, 256 

— density, 573 

— path length, 556 

Optics, geometrical, 555 

—, —, applicability of, 559 

Ordinary distribution function, 231 
Orthohydrogen, 213 







Pair-interaction, 223 

Parahydrogen, 213 

Paramagnet, 411, 465, 468 

—, partition function of, 472 

Paramagnetic susceptibility, 471, 473 

— —, verification of the theory of, 475 

Partition function, 81, 86, 137 

— — for interacting molecules, 225 

— — for the internal motion, 201 

— — for the translated motion, 201

— —, grand, 298

— — in terms of the pressure, 140 

— — of crystal, 260 

— — of ideal monatomic gas, 164 

— — of polyatomic molecule, 213 

— — of system in magnetic field, 467 

— —,rotational, 208 

— —, —, of polyatomic molecule, 215

— —, vibrational, 202

— —, —, of polyatomic molecule, 216

Pauli exclusion principle, 17 

Penetration distance of electromagnetic 
field, 575 

Periodicity, conditions of, 249 

Permeability, 544 

—, magnetic, 411 

—, —, of superconductor, 489 

— of plasma, 607 

— tensor, dielectric, 624 

Permittivity, dielectric, 411, 541 

—, —, complex, 535 

Perpendicular dielectric constant, 546 

Perpetual motion machine and fluctuations, 161

— — — of the second kind, 119, 155 

Pinch effect, 612 

Phase equilibrium between drops and 
vapour, 329 

— — curve, 307 

— — —, differential equation of, 308 

— of substance, 305 

— space, 3 

— —,area of quantum state in, 12 

— —, volume element in, 5 

— — volume of quantum state, 13 

— trajectory, 4 

— transformation, latent heat of, 312 

— transition, chemical potentials and, 309 





— —,grand partition function and, 317 

— —, latent heat in, 307 

— — of the first kind, 316 

— — of the second kind, 316, 320, 372 

— —, superconducting, 489 

—, two, conditions of equilibrium between, 305

—, —, interface between, 305 

— velocity in a wave-guide, 583 

— volume, 14 

Phenomenological thermodynamics, entropy in, 122

Phonons, 373 

Photon, angular momentum of, 366 

—, energy of, 366 

— gas, 355, 366 

— —, chemical potential of, 368 

— —,distribution function in, 368 

—,momentum of, 366 

Physical quantity, mean value for, 82 

Planck constant, 7 

— formula, 363, 369 

Plasma, 596 

—, Boltzmann distribution in, 598 

—, charge density in, 599, 601 

—, conductivity in, 606 

—, correlation functions in, 602 

—, energy of, 604 

—, equilibrium, 597 

—, —, thermodynamics of, 604 

—, free energy of, 604 

— frequency, 625 

—, ideal gas, 597 

— in magnetic field, 607 

— in static electric field, 605 

— -like media, 545 

—, motion of, equations of, 610 

—, moving, magnetic field in, 613 

—, permeability of, 607 

—, pressure in, 622 

—, pressure of, 604 

—, screening in, 602 

— two-fluid model, 622 

Poisson—Boltzmann equation, 600 

— formula, 32 

— integral, 629 

Polar gas, electric susceptibility of, 448 

— liquids, 450 




— molecule, 446 

Polarization, 401, 442 

— coefficient, 407 

— current, 407 

—,dielectric, of solid, 450 

— energy loss, 585 

— loss, 589, 590 

— vector, 400 

— waves, 548 

Polyatomic gas, heat capacity for, 189 

— molecule, characteristic temperatures 
of, 216 

— —,mean energy of, 186 

— —, partition function of, 213 

— —,rotational motion of, 214 

— —, rotational partition function of, 215
— —,vibrational motion of, 215 
— —, vibrational partition function of, 
216 

Potential, electromagnetic, 412 

— in a plasma, 599

—, thermodynamic, 109, 127 

—, —, Gibbs, 109, 126, 138, 299 

Poynting vector, 420 

Pressure, 48, 102 

— in plasma, 622 

—, internal, due to attraction, 228 

—,magnetic, 611 

— of electron gas, 384 

— of plasma, 604 

—, radiation, 370 

—, thermal coefficient of, 135 

— versus energy, 58 

Primitive lattice, 568 

Probabilities, addition of, 23 

—, multiplication of, 24 

Probability density, 22 

— — for the velocity, 52 

— distribution, characteristic function 
of, 34 

— — of free paths, 62 

—, equal a priori, 73 

— for the subsystem in the state with 
energy e, 80 

— of a fluctuation, 274

— of the ith state, 20 

—, normalization of, 25 


— with respect to the ensemble, 22 
Process, isothermal-isobaric, 127 
—, isothermal-isovolumic, 127 


Quantization condition, 10 

Quantized quantity, 6 

Quantum mechanics, harmonic oscillator 
in, 12 

— —, transition to classical mechanics from, 9

— number, 7 

— particles, statistical weight for, 349 

— state, area of, in phase space, 12 

— —,degenerate, 13 

— —, phase space volume of, 13 

Quasi-classical approximation, 9 

Quasi-closed system, entropy of, 114 

— —, fluctuations in, 272 

Quasi-independent systems, 66 

— —,energy of, 67 

— —, interaction between, 67 

Quasi-static process, 103 

— —,demands on, 106 

— —,energy change in, 105 

Quasistationary field, 494 

— —,equations of, 495 

— —,Maxwell equations for, 499 

Quasistationarity, first condition of, 493 

—, second condition of, 494 

—, third condition of, 494 


Radiation, black-body, free energy of, 
370 

— by fast particles, 585 

—,energy density of, 357, 359, 364 

—,entropy of, 370 

— pressure, 370 

—, thermodynamic potential of, 370 

—, transitional, 593 

Radiative capacity, 358 

— — of an absolute black body, 359 

Radius of interaction, 221 

Raman scattering, 555 

Random function, 35 

Ray direction, light, 558 

Rayleigh—Jeans law, 361 

Reacting system, equilibrium states of, 
336 




Recovery time, 160 
Reduced heat, 122 
— mass, 60 
Reflected wave, 571 
— —,amplitude of, 574 
Reflection, angle of, 572 
—, internal, total, 576 
—, total, critical angle for, 575 
Reflectivity, 574 
Refracted wave, 571 
— —,amplitude of, 574 
Refraction, angle of, 572 
Refractive index, 535, 537, 551, 572 
— — in quantum mechanics, 552 
Relative motion, mean velocity of, 61 
— velocity distribution, 60 
Representative point, 3 
Reservoir, 70 
—, interaction between subsystem and, 
77 
—, subsystem in, 77 
Resistance, 456, 526 
Retardation, 493 
Reversibility, criterion of, 115, 159 
—, free energy as criterion of, 127 
—, microscopic, 74 
—,—, versus macroscopic irreversibility, 
155 
—, thermodynamic potential as criterion 
for, 128 
Reversible process, 103
— — in a closed system, entropy for, 122
RLC-series circuit, 515 
Root-mean-square velocity, 59 
Rotational constant, 199 
— entropy, 212 
— free energy, 212 
— heat capacity, 211 
— motion of diatomic molecule, 199 
— — of polyatomic molecule, 214 
— partition function, 208 
— — — of polyatomic molecule, 215 
Rotation, characteristic temperature for, 
209 
—, energy of, 180, 181 
—,mean energy of, 211 
Rotor, 521 





Screening in a plasma, 602 

Second law, asymmetry of, and asym- 
metry of initial condition, 157 

Selection rules, 72 

Sensitivity of gas thermometer, 292 

— of spring-balance, 292 

— of suspended small mirror, 291 

— of measuring devices, 290 

Sites, 331 

Skewness, 34 

Skin depth, 530 

— effect, 528 

Solenoid, toroidal, coefficient of self- 
inductance of, 508 

—, two, coefficient of mutual inductance of, 504

Spatial dispersion, 419, 544 

Spectral density, 37 

— function, 259 

Spectrum, 6 

Sphere, dipole moment of, 440 

— in electrostatic field, 438 

Spin, 16, 470 

—, orientation of, 75 

—, particles with, ideal gas of, 75 

Spring-balance, sensitivity of, 292 

Stability, condition of, 285, 289 

State of a system, 70 

—, degenerate, 13, 190 

— with energy e, probability for subsystem in, 80
Statistical distribution, 68 
— — of system with variable number of 
particles, 295 

— ensemble, 22 

— independence, 24 

— integral, 88 

— law, 2,40 

— meaning of law of mass action, 338 

— mean value, 26 

— sum, 81 

— temperature, 81, 82, 117 

— theory of liquids, 239 

— thermodynamics, 99 

— weight, 13, 190 

— — for Maxwell particles, 346

— — for quantum particles, 349 

Statistics, classical, 86 




Stator, 521 

Stefan—Boltzmann law, 370 
Stirling’s formula, 632 

Stochastic process, 35 

Stokes’ formula, 280 

Sublimation curve, equation of, 313 
Subsystem, 68 

— and reservoir, interaction between, 77
Sum over states, 81 
Superconducting phase transition, 489 
— state, entropy of, 490 

— —,heat capacity of, 491 

— —, thermodynamic potential of, 490 
— transition, latent heat of, 490 

— — temperature, 486 
Superconductivity, 486 
Superconductor, 417 

—, critical magnetic field of, 489 

—, electric field of, 488 

—, magnetic induction in, 487 

—, magnetic permeability of, 489 
—, surface current in, 488 

—, thermodynamic potential of, 489 
Superfluidity, 379 

Superfluid liquid, 381 
Superposition approximation, 241 
Surface charge, 402, 403 

— — density for conductors, 425 
—, energy of, 326 

—, entropy of, 326 

—, free energy per unit, 326 

—, heat capacity of unit, 327 

— potential of conductor, 426 

— tension, 326 

Susceptibility, diamagnetic, 469 

—, dielectric, 407, 445 

—, magnetic, 411, 465 

—,—, of a ferromagnet, 478 

—, paramagnetic, 471, 473 

—, —, verification of theory of, 475 
System with two levels, 190 

— — — —, heat capacity of, 192 


Taylor series for number of states, 79 

Temperature, absolute, 50 

—, —, scale of, 141 

— coefficient of equilibrium pressure, 
310 


—, different, systems with, 83, 115 

—, fluctuations of, 288 

—, statistical, 81, 82, 117

—,zero, non-equilibrium systems at, 151 

TE-wave, 581 

Thermal coefficient of pressure, 135 

— conductivity of liquid helium II, 379

— death, 162 

— excitation at low temperature in electron gas, 386

— expansion at zero temperature, 151 

— —, coefficient of, 135 

— motion, 9 

— — in a one-dimensional crystal, 245

Thermally isolated systems, 116 

Thermodynamical law, 2

Thermodynamic equality, basic, 109 

— — for systems with variable number 
of particles, 301 

— functions of a crystal, 263 

— — of diatomic gases, 200 

— inequality, basic, 117 

— potential as criterion for reversibility, 

128 

— —, Gibbs, 109, 126, 138, 299

— —, —, of magnetic field, 465

— — of a ferromagnet, 479

— — of an ideal gas, 170 

— — of a superconductor, 489 

— — of chemical compounds, 144 

— — of dielectric, 444 

— — of radiation, 370 

— — of the normal state, 490 

— — of the superconducting state, 490 

— —, physical meaning of, 299 

— quantities, determination of, 137, 141 

— —, transformation of, 132 

Thermodynamics, 99 

—, first law of, 98, 104 

—, fluctuations and, 278 

— of equilibrium plasma, 604 

—, second law of, 98 

—, statistical, 99

—, third law of, 147, 151 

Thermomechanical effect, 381 

Thermometer, gas, sensitivity of, 292 

Third law of thermodynamics, 147, 151 

Three-body problem, 40 










Throttling, 144 

Time dispersion, 419 

TM-wave, 581 

Townsend discharge, 595 

Trajectory of individual molecule, 41 

Transformation of thermodynamic 
quantities, 132 

Transition from quantum mechanics to 
classical mechanics, 9 

Transitional radiation, 593 

Transition temperature, superconducting, 486

Transcritical state, 322 

Translational motion, partition function
for, 201

Translation, energy of, 180 

Transmissivity, 574 

Transparent medium, 537 

Transverse electric wave, 581 

— magnetic wave, 581 

— wave, 257, 548 

— — in wave guide, 580 

— —,number of, 258 

Triatomic molecules, 185 

— —, mean energy of, 185 

Two-fluid model for liquid helium II, 
381 

— — for plasma, 622 


Ultraviolet catastrophe, 362 
Uniform density, 42 
— field, ideal gas in, 174 


Vapour pressure, 314 

— — constant, 313 

Velocity, absolute, mean, 56 
— distribution function, 43
—, uniform, 42 

—, mean, 55

—, most probable, 58
— of a wave, 248


— of relative motion, mean, 61 

—, probability density for, 52 

—, root-mean-square, 59 

Vibration, characteristic temperature 
for, 206 

—, energy of, 180, 183 

Vibrational energy, mean, 203 

— entropy, 207 

— free energy, 207 

— heat capacity, 203 

— motion of diatomic molecule, 197 

— — of polyatomic molecule, 215 

— partition function, 202 

— — — of polyatomic molecule, 216 

Viewing field, number of Brownian particles in, 282

Viscosity of liquid helium II, 379 

Volume, fluctuations of, at constant 
temperature, 284 


Waals, van der — equation, 225, 226, 
229 

Wave, electromagnetic, in a medium, 
533 

— guide, 577 

— —,group velocity in, 583 

— —, phase velocity in, 583 

—, plane, monochromatic, 534 

—,velocity of, 248 

Weight of a gas, 177 

Wiener—Khinchin theorem, 38, 526 

Wien’s law, 364 

Work, 102 

—,maximum, in non-cyclic process, 124 

—, useful, 124 


X-ray diffraction, 568 
Zero, absolute, unattainability of, 149 


— point energy, 204 
Zeros of grand partition function, 318 





Translated from the Russian by 
S. Subotić, Belgrade 


Translation edited by 


Schneps, Tufts University, Medford, Mass.

A. J. Manuel, Leeds University


Theoretical Physics 


An Advanced Text 


Volume 3: 


QUANTUM MECHANICS 





Benjamin G. Levich, V.A. Myamlin and Yu. A. Vdovin 





Institute of Electrochemistry
Academy of Sciences of the USSR, Moscow


1973
North-Holland Publishing Company - Amsterdam · London


Wiley Interscience Division 
John Wiley & Sons, Inc. — New York 




© NORTH-HOLLAND PUBLISHING COMPANY, 1973 


All rights reserved. No part of this book may be reproduced, stored in a retrieval 
system, or transmitted, in any form or by any means, electronic, mechanical, photo- 
copying, recording or otherwise without the prior permission of the Copyright owner. 


Library of Congress Catalog Card Number: 68 54501 


ISBN North-Holland, complete set: 0 7204 0176 3 
Vol. 3: 0 7204 0179 8 


Printed in The Netherlands 


Title of the Russian edition: 
KURS TEORETICHESKOJ FIZIKI 


Russian edition published by: 
IZDATELSTVO ‘NAUKA’, GLAVNAJA REDAKCIJA, 


FIZIKO-MATEMATICESKOJ LITERATURY (MOSKVA, 1971) 


Publishers: 
NORTH-HOLLAND PUBLISHING COMPANY - AMSTERDAM 


Sole Distributors for the Western Hemisphere: 

WILEY INTERSCIENCE DIVISION 

JOHN WILEY & SONS, INC. - NEW YORK 
ISBN Wiley Interscience, Vol. 3: 0-471-53115-4 


FOREWORD 


The first Russian edition of ‘Theoretical Physics’, which appeared in 1962, 
has been widely used as a textbook. 

Numerous comments from colleagues, lecturers and students have been 
taken into account in preparing this new edition, which is the first one in 
English and which will also appear as the second Russian edition. 

The material has now been divided into 4 volumes covering the following 
subjects 


Volume 1 
Part I Theory of the Electromagnetic Field
Part II Theory of Relativity 


Volume 2 
Part III Statistical Physics 
Part IV Electromagnetic Processes in Matter 


Volume 3 
Part V Quantum Mechanics 


Volume 4 
Part VI Quantum Statistics and Physical Kinetics 


The rapid development of physics and the present wide interest in 
non-equilibrium and non-stationary processes has compelled us to expand the 
section on physical kinetics. It has also been transferred to the end of 
Volume 4 as it is practically impossible to expound this topic without using 
quantum mechanics. 

Part IV — ‘Electromagnetic Processes in Matter’ — has been substantially 
revised. Interest in this field has increased recently, mainly in connection with 
the study of plasmas and plasma-like media, which now have sections devoted 
to them. 







The methods of calculating electrostatic and direct-current fields, and 
other problems of classical electrodynamics in a medium, are covered very 
briefly as we have assumed that students will be able to consult the many 
monographs and handbooks on general physics, electrical- and radio- 
technology, and the equations of mathematical physics. 

As for other modifications and additions, we should draw attention to the 
introduction of tensor notation, to new ideas in the theories of relativity and 
electromagnetic fields, the broadening of the introduction to the theory of 
probability, a brief presentation of the method of correlation functions in 
statistical physics, the exposition of the thermodynamic theory of ferro- 
magnetism and the theory of propagation of electromagnetic waves in plasma. 
A number of paragraphs have been rewritten. We have tried to bring the 
content of the book even closer to the interests of present-day theoretical 
physics. 

The general level of the book has been preserved and it is still intended to 
form an introduction to theoretical physics. Problems requiring the use of 
cumbersome or special mathematical apparatus are still excluded, and the 
most difficult sections are marked by an asterisk. These may be skipped at 
will, since there is no reference to them in the main text. 

In conclusion we would like to express our gratitude to all those who 
helped us in preparing this book, in particular to A.M. Brodsky, A.M. 
Golovin, B.M. Grafov, R.R. Dogonadze, V.S. Krylov and especially V.S. 
Markin and V.V. Tolmachev. I.V. Savelyev discovered a number of misprints 
which have now been corrected. 

L.D. Konkina helped us in editing the manuscript. 

We are grateful to the readers and students who used the first Russian 
edition of the book for sending us their valuable comments which have been
taken into account in this edition.
August 1970 


FOREWORD TO VOLUME 3 


This volume was written by Benjamin G. Levich in collaboration with Victor A.
Myamlin and Yurii A. Vdovin. Dr. Anatol I. Naumov is the author of
Chapter 15, Fundamentals of the theory of elementary particles. The authors
express their deep gratitude to Dr. Naumov for this valuable assistance. The
complete volume was written under the supervision of B.G. Levich.


October 1971 


FOREWORD TO THE FIRST RUSSIAN EDITION 


The continuous development of theoretical physics and the regular 
expansion of its areas of application create increasing demand for textbooks 
and manuals. 

The rapid development and the complexity of the most recent experi- 
mental methods of physical investigation, and the corresponding development 
and extension of the mathematical apparatus of theoretical physics, have 
meant that one man usually cannot combine the two methods of investiga- 
tion. The end of the 19th century and particularly the 20th century therefore 
saw physicists divided into ‘experimentalists’ and ‘theoreticians’, the latter 
studying physical laws by means of the mathematical methods of theoretical 
physics. 

Obviously, a background in theoretical physics is essential in the education 
of experimental as well as theoretical physicists. 

The experimental and theoretical methods of physical investigation have 
penetrated into a number of branches of science related to physics (physical 
chemistry, biophysics, geophysics, astrophysics, and so on) and into technolo- 
gy (metal physics and metallurgical science, thermophysics, electrical technol- 
ogy, radiotechnology, computation, the instrument-making industry etc.). 
Workers in these branches of science and technology also need a certain 
minimum knowledge of theoretical physics. 

The compilation of a modern textbook on theoretical physics is inevitably 
associated with certain logical and methodological difficulties. It is impossible 
at present to divide theoretical physics into classical and quantum parts so 
that it is also impossible to divide it into separate chapters and sections. For 
example, the exposition of statistical physics without taking into account the 
quantum properties of atomic systems is impossible, for it would mean that 
the general theory remained without practical application. In the theory of 
electromagnetic processes in matter one has of necessity to make use of the 
ideas of statistical physics, and so on. It may be that the maximum 
consistency of composition would be obtained if the book were founded on
quantum mechanics but this is completely inadmissible in a book intended as
an introductory treatise. Quantum mechanics requires a certain preparedness 
and the student must be convinced of the necessity of renouncing obvious 
classical representations. Compromise solutions, which have justified them- 
selves during many years of teaching theoretical physics at the Moscow 
Engineering-Physical Institute and Moscow State University, are therefore 
inevitable. 

The following general principles have been applied. 

(1) The book is written as an introduction to theoretical physics so that
aspects requiring the use of cumbersome or special mathematical apparatus 
have not been included. 

(2) As it is to be used for a systematic study of the subject the course is a 
unique whole and all material necessary for understanding the later sections is 
contained in the earlier ones. 

(3) It would not be feasible to elucidate experimental facts in addition to 
problems concerning purely theoretical physics. However, physics is a single 
science, and an attempt to expound the theoretical aspects without taking 
experiment into account would be quite wrong. The reader is assumed to 
have some basic experimental knowledge from university courses in general 
and atomic physics so that we have confined ourselves to references and, in a 
few instances, to a schematic description of basic experiments. 

(4) The acquaintance assumed with general courses in general and atomic 
physics has allowed us to rely on a certain (very restricted) knowledge of 
quantum mechanics in our treatment of statistical physics. 

(5) Classical mechanics usually forms a separate course so that this topic 
has been omitted although detailed reference has been made to handbooks of 
mechanics. 

(6) The book similarly does not cover hydrodynamics, aerodynamics, the 
theory of heat transfer, or problems related to electrical- and radio- 
technology. 

(7) Detailed reference is made to mathematical manuals. The mathematical 
apparatus utilized, except in the sections marked by an asterisk, is covered by 
the usual courses in analysis. In the case of quantum mechanics, however, the 
mathematical apparatus has been included, since it is of a specific character 
and is not taught in traditional mathematical courses. 

(8) As the book is intended as a systematic course in theoretical physics no 
attempt has been made to achieve the same level of accessibility in all 
sections. It is a well-known fact that a student’s comprehension and 
assimilation of difficult material increases as a course progresses, and that this 
is also true for the associated mathematical apparatus. Moreover, experimental
physicists will constantly encounter new problems in quantum
mechanics which can only be handled using advanced methods of treatment. 
The section on quantum mechanics (Part V) therefore deals with some topics 
having a more advanced character than those in other sections. The analysis 
of applications of the kinetic equations is similarly treated rather extensively. 


The uniqueness of the book’s objectives has affected the content of individual 
sections, so that some topics in modern physics have been included at the 
expense of more traditional material. 

Part I contains the foundations of the theory of the electromagnetic field
in a vacuum, based on the system of Maxwell-Lorentz equations. A basic 
knowledge of electromagnetism is assumed. The focus of attention is the 
theory of radiation and the motion of charged particles in external fields. 

In Part II, devoted to the theory of relativity, a four-dimensional form of
representation is adopted which not only corresponds to the spirit of the 
theory but also predominates in contemporary literature. The problems of 
dynamics in the theory of relativity are treated in some detail. A number of 
the most recent applications of the theory of relativity, particularly those 
related to nuclear physics, are covered here for the first time in a textbook. 

Part III is a revised version of Levich’s ‘Introduction to Statistical Physics’ 
and treats statistical physics and the fundamentals of statistical thermo- 
dynamics. Classical thermodynamics would require too much space, and did 
not seem indispensable. 

Part IV contains the theory of electromagnetic processes in matter. 
Relatively little attention is paid to problems in theoretical electrical- and 
radio-technology. The phenomenological theory of electric and magnetic 
properties of matter is analyzed in some detail, and the notion of the physics 
of the plasma state of matter is given. 

In Part V the basic ideas of present-day relativistic quantum mechanics are 
included as well as the traditional problems of non-relativistic quantum 
mechanics. Applications to solid-state theory are considered at length. 

Part VI contains the essential concepts of physical kinetics, which are not 
usually presented in a general course on theoretical physics. 


The experience of teaching theoretical physics shows that the greatest 
difficulties are often encountered not in understanding new physical ideas but 
in the actual mathematical treatments. All mathematical operations have 
therefore been performed in sufficient detail. 

For convenience we have presented a brief derivation of those formulae of
vector analysis which are encountered throughout, as well as the necessary
data on Fourier integrals and δ-function theory.

The numbering of formulae and sections starts afresh in each Part and 
references to appendices have been given Roman numerals. 

The author hopes that the readers, after making themselves familiar with 
the foundations of theoretical physics expounded in this book, will be able to 
proceed to a more profound study using the many-volume treatise of Landau 
and Lifshitz. The scientific and educational ideas of their work were of great 
influence on the author, who is a disciple of Landau. 

Parts I—IV and Part VI were written by B.G. Levich. Part V was written by 
Y.A. Vdovin and V.A. Myamlin under the general scientific guidance of B.G. 
Levich. Chapter XV of Part V was written by A.I. Naumov. 

The author expresses his gratitude to the colleagues who read the book 
and the manuscripts, and made a number of valuable remarks: B.M. Grafov, 
R.R. Dogonadze, V.A. Kiryanov, V.S. Krylov, V.S. Markin, V.P. Smilga, Y.A. 
Chizmadzhev and Y.I. Yalamov.

The creation of a textbook on theoretical physics sufficiently comprehen- 
sive in content and clear in presentation is a very complex task. The author is 
therefore conscious of the fact that shortcomings and errors will be discover- 
ed and would be grateful to receive an account of them which can be taken 
into consideration in the next edition of the book. 


1962 


Theoretical Physics:
Outline of Vols. 1–4


Volume 1

Part I Theory of the Electromagnetic Field

Chapter 1 General theory of the electromagnetic field
Chapter 2 The electrostatic field
Chapter 3 The quasistationary magnetic field
Chapter 4 The electromagnetic field of arbitrarily moving charges
Chapter 5 Radiation theory
Chapter 6 Electromagnetic field in a vacuum and electromagnetic wave scattering
Chapter 7 The motion of particles in electromagnetic fields

Part II Theory of Relativity

Chapter 1 General principles of the theory of relativity
Chapter 2 Relativistic mechanics
Chapter 3 Relativistic electrodynamics

Appendix I, II and III

Subject index


Volume 2

Part III Statistical Physics

Chapter 1 The basic concepts of the theory of probability
Chapter 2 The kinetic theory of gases
Chapter 3 Statistical distribution
Chapter 4 Statistical and phenomenological thermodynamics
Chapter 5 Ideal gases
Chapter 6 Systems of interacting particles
Chapter 7 Crystals
Chapter 8 The theory of fluctuations
Chapter 9 Systems with a variable number of particles
Chapter 10 Statistical distributions in quantum statistics and some of their applications

Part IV Electromagnetic Processes in Matter

Chapter 1 Electromagnetic fields in matter
Chapter 2 Electrostatics
Chapter 3 Direct electric current and the magnetic properties of matter
Chapter 4 Quasistationary electromagnetic fields
Chapter 5 High-frequency fields
Chapter 6 Matter in the plasma state

Appendix IV

Subject index


Volume 3 (for details see the Contents of Volume 3)

Part V Quantum Mechanics

Chapter 1 The basic concepts of quantum mechanics
Chapter 2 The Schrödinger equation
Chapter 3 The mathematical apparatus of quantum mechanics
Chapter 4 Motion in a centrally symmetric field
Chapter 5 The quasi-classical approximation
Chapter 6 The matrix form of quantum mechanics
Chapter 7 Perturbation theory
Chapter 8 Spin and identity of particles
Chapter 9 Applications of quantum mechanics to atomic and nuclear systems
Chapter 10 The theory of diatomic molecules
Chapter 11 Scattering theory
Chapter 12 The method of second quantization and radiation theory
Chapter 13 Relativistic quantum mechanics
Chapter 14 Some problems of quantum electrodynamics
Chapter 15 Fundamentals of the theory of elementary particles

Subject index


Volume 4

Part VI Quantum Statistics and Physical Kinetics

Chapter 1 Quantum statistics
Chapter 2 Physical kinetics
Chapter 3 The kinetic theory of gases and gas-like systems
Chapter 4 The time correlation function method and Onsager’s theory
Chapter 5 Solid-state theory
Chapter 6 The kinetic properties of solids
Chapter 7 Interaction of radiation with a free-electron gas

Subject index





Contents of Volume 3


Part V Quantum mechanics


Chapter 1 The basic concepts of quantum mechanics

§1 The physical basis of quantum mechanics
§2 The wave function
§3 The principle of superposition. Expansion in plane waves
§4 Uncertainty relations and the relationship between quantum mechanics and classical mechanics
§5 The principle of causality in quantum mechanics


Chapter 2 The Schrödinger equation

§6 The Schrödinger wave equation
§7 The probability current density
§8 A particle in a one-dimensional rectangular potential well
§9 A particle in a three-dimensional rectangular potential well
§10 The quantum-mechanical oscillator
§11 The three-dimensional oscillator
§12 Reflection from and penetration through a potential barrier
§13 One-dimensional motion
§14 The Schrödinger equation for a system of particles


Chapter 3 The mathematical apparatus of quantum mechanics

§15 Linear operators
§16 Eigenvalues and eigenfunctions of operators
§17 Hermitian operators
§18 The orthogonality and normalization of the eigenfunctions of Hermitian operators
§19 Expansion in terms of eigenfunctions
§20 Quantum-mechanical variables and operators
§21 The wave function and the probability of the results of measurements
§22 Mean values
§23 Commutation of operators
§24 Heisenberg inequalities
§25 Poisson brackets
§26 Coordinate and momentum operators and eigenfunctions
§27 The Hamiltonian operator
§28 Stationary states
§29 Integral form of the Schrödinger equation
§30 The eigenvalues and eigenfunctions of the angular momentum operator and of the operator of the square of the angular momentum
§31 Differentiation of operators with respect to time
§32 Constants of the motion
§33 Parity
§34 The uncertainty relation for time and energy


Chapter 4 Motion in a centrally symmetric field

§35 The Schrödinger equation
§36 The free motion of a particle with given angular momentum
§37 The spherical well
§38 Motion in a Coulomb field


Chapter 5 The quasi-classical approximation

§39 The limiting transition to classical mechanics
§40 The solution of the Schrödinger equation near a turning point
§41 Motion in a potential well in the quasi-classical approximation
§42 Potential barrier penetration
§43 Quasi-classical motion in a centrally symmetric field


Chapter 6 The matrix form of quantum mechanics

§44 Operators and matrices
§45 The fundamentals of matrix calculus
§46 Geometric interpretation of the wave function. Canonical transformations
§47 The eigenfunctions and eigenvalues of an operator given in matrix form
§48 Continuous matrices. The Dirac notation
§49 The Schrödinger representation, the Heisenberg representation and the interaction representation
§50 The linear harmonic oscillator
§51 The matrix elements of the angular momentum operator
§52 The addition of angular momenta


Chapter 7 Perturbation theory

§53 The theory of time-independent perturbations
§54 Perturbation theory in the presence of degeneracy
§55 The theory of time-dependent perturbations
§56 The transition of a system into new states under the action of perturbations
§57 The adiabatic theory of perturbations
§58 Perturbation theory in integral form


Chapter 8 Spin and identity of particles

§59 The spin of elementary particles
§60 Spin operators
§61 The eigenfunctions of the operators of the components of the spin of a particle. The rotation matrix
§62 The total angular momentum
§63 The Pauli equation. The probability current density vector
§64 The identity of particles. The principle of identity of particles. Symmetric and antisymmetric states
§65 Wave functions for systems of fermions and bosons. The Pauli principle
§66 The wave function of a system of two identical particles with spin 1/2
§67 Exchange interaction and the concept of the chemical and strong nuclear interactions


Chapter 9 Applications of quantum mechanics to atomic and nuclear systems

§68 The helium atom
§69 The variational principle
§70 The self-consistent field method (Hartree–Fock method)
§71 The statistical model of the atom
§72 The quantum numbers characterizing the states of electrons in atoms
§73 The periodic system of the elements
§74 The Zeeman effect
§75 The Paschen–Back effect and the diamagnetism of atoms
§76 Deuteron theory
§77 Nuclear-shell theory


Chapter 10 The theory of diatomic molecules

§78 The adiabatic approximation and the classification of electron terms
§79 The hydrogen molecule. Ideas of the theory of chemical binding
§80 The interaction of atoms at large distances
§81 The comparison of molecular terms with atomic terms
§82 Rotation and vibration of diatomic molecules


Chapter 11 Scattering theory

§83 Scattering amplitude and cross section
§84 The Born approximation
§85 The scattering of fast charged particles by atoms
§86 Partial wave scattering theory
§87 Scattering by a spherical potential well (the concept of resonance scattering)
§88 The elastic scattering of identical particles
§89 The effect of polarization in scattering processes
§90 The transition to the classical limit in the quantum scattering formulae
§91 The general theory of inelastic scattering and the absorption of particles
§92 The diffraction of fast neutrons by nuclei
§93 The scattering of slow particles. The threshold approximation
§94 The Breit–Wigner formula
§95 The scattering matrix (S-matrix)
§96 S-matrix and perturbation theory
§97 Analytic properties of the S-matrix
§98 Time reversal and the principle of detailed balance


Chapter 12 The method of second quantization and radiation theory

§99 Second quantization for systems of bosons and fermions
§100 The quantum mechanics of the photon
§101 The quantization of the radiation field
§102 The interaction of an electron with radiation
§103 The absorption and emission of light
§104 Dipole transitions in atomic systems
§105 Quadrupole and magnetic dipole radiation
§106 Selection rules
§107 The photoelectric effect
§108 The scattering of light by atoms
§109 The theory of the natural linewidth


Chapter 13 Relativistic quantum mechanics

§110 The relativistic wave equation for a particle of zero spin
§111 The charge density and probability current for particles of zero spin
§112 The concept of the nuclear force field
§113 The Dirac equation
§114 The probability density and probability current in Dirac’s theory
§115 The solution of the Dirac equation for a free particle
§116 The concept of the positron
§117 The spin of particles described by the Dirac equation
§118 The transition from the Dirac equation to the Pauli equation. The magnetic moment of a particle
§119 The hydrogen atom in Dirac’s theory
§120 The invariance of the Dirac equation with respect to reflection, rotation and Lorentz transformation of coordinates
§121 The laws of transformation of bilinear combinations made up of wave functions
§122 The concept of weak interactions. Parity non-conservation
§123 Two-component neutrino theory. The universal four-fermion interaction


Chapter 14 Some problems of quantum electrodynamics

§124 The Green’s function of the Dirac equation
§125 Green’s function for a system of two particles
§126 Feynman diagrams
§127 The Compton effect
§128 The shift of the terms of the hydrogen atom under the action of the vacuum field (the Lamb shift)


Chapter 15 Fundamentals of the theory of elementary particles

§129 The classification and properties of elementary particles
§130 The types of interactions of elementary particles
§131 Symmetry groups in quantum mechanics
§132 The isogroup SU(2) and its representations
§133 Isomultiplets of elementary particles
§134 The wave functions of a system of nucleons and π-mesons
§135 Isotopically invariant interaction
§136 The scattering of nucleons and π-mesons
§137 The unitary group SU(3) and its representations
§138 The eightfold way formalism and unitary multiplets
§139 Some consequences of strict unitary symmetry
§140 Some aspects of violated unitary symmetry
§141 Composite models in the unitary symmetry scheme. Quarks
§142 General appraisal of unitary symmetry


Subject index


PART V 


QUANTUM MECHANICS 











The Basic Concepts 
of Quantum Mechanics 


§1. The physical basis of quantum mechanics 


Quantum mechanics, like all physical theories, arose in close connection 
with the development of a new field of experimental investigations. These 
investigations, which began with the study of black-body radiation at the turn 
of the century, were soon extended to the phenomena of the photoelectric 
effect, and subsequently to atomic systems. In this book we cannot give a 
chronological account of the history of the development of new concepts 
about the character of atomic processes which resulted in the creation of 
contemporary quantum mechanics. We can only point out that this was an 
agonizing search which required a great effort by some of the most outstand- 
ing physicists of our century. The creation of quantum mechanics has 
undoubtedly been the greatest triumph of contemporary science. The diffi- 
culties which the development of quantum mechanics encountered were asso- 
ciated with the fact that the properties of the particles constituting atomic 
systems are fundamentally different from the properties of macroscopic 
bodies. The laws of classical mechanics and classical electrodynamics turned 
out to be inadequate for the description of the behaviour of individual mole- 
cules and atoms, as well as elementary particles — electrons, protons, neutrons 
and so on. Henceforth we shall designate elementary particles, and sometimes
individual atoms and molecules, by the term microparticles. As we shall see in
what follows, an outstanding feature of microparticles is the fact that their 
motion does not obey the laws of classical mechanics. To avoid confusion, we 
shall call particles whose motion does obey the laws of classical mechanics, 
corpuscles. We have earlier, in particular in the theory of the electromagnetic 
field and statistical physics, acquainted ourselves with a number of facts 
which indicate the inadequacy of classical concepts in the realm of atomic 
processes. Thus in statistical physics we have seen that the energy of individual
atoms and molecules, i.e. the basic quantity characterizing their state,
assumes discrete values.

A direct proof of the discrete character of the states of atomic systems
was provided by the experiments of Franck and Hertz (1913). As is known,
in these experiments a beam of electrons of a given energy entered a container 
filled with a gas. As a function of the accelerating potential, the electron 
current through the gas displayed a number of sharp minima. The position of 
these minima is determined by the properties of the atoms of the gas. This 
dependence of the current on the potential can be interpreted in the follow- 
ing way. In colliding with the atoms, the electrons transfer energy to the 
former when the energy of the electrons has a value equal to the difference 
between the energies of two possible states of the atom. The atom then 
makes a transition to a state with a higher energy, an excited state, and a minimum
appears in the electron current. If the energies of the electrons have
other values, they undergo only elastic collisions. Thus the atom, as a whole, 
can only obtain definite amounts of energy from outside. This means that the 
internal energy of the atom has only discrete values or, in other words, the 
atom possesses a discrete energy spectrum. The discrete character of the ener- 
gy states is related to the discrete character of atomic transitions. When the 
atom makes a transition from an excited state to a lower energy state a light 
quantum is emitted with an energy equal to the difference between the ener- 
gies of the two states. 
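
As a rough numerical illustration of the last statement, the wavelength of the emitted quantum follows from the energy difference between the two states. The sketch below assumes an excitation energy of 4.9 eV, a value commonly quoted for mercury vapour in experiments of the Franck and Hertz type; this number and the function name are illustrative assumptions and are not taken from the text.

```python
# Wavelength of the light quantum emitted in a transition between two
# discrete atomic states: E2 - E1 = h*c / wavelength.
H_C_EV_NM = 1239.841984  # product h*c expressed in eV*nm

def emitted_wavelength_nm(delta_e_ev):
    """Return the wavelength (nm) of a quantum carrying energy delta_e_ev (eV)."""
    if delta_e_ev <= 0:
        raise ValueError("energy difference must be positive")
    return H_C_EV_NM / delta_e_ev

# Assumed excitation energy (illustrative value, not from the text): 4.9 eV
wavelength = emitted_wavelength_nm(4.9)
print(f"{wavelength:.1f} nm")  # lies in the ultraviolet
```

A larger energy difference between the two states gives a proportionally shorter wavelength.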

The energy of the atom is not the only quantity which can assume only 
discrete or, as it is said, quantum values. In the experiments of Stern and 
Gerlach it was shown that the angular momentum of an atom also possesses a 
discrete spectrum of values. In these experiments a beam of atoms was passed 
through a magnetic field $H$ which is nonuniform but constant in direction.
Choosing this direction as the z-axis, one can write the expression $\mu_z(\partial H_z/\partial z)$
for the force acting on the atom, where $\mu_z$ is the projection of the magnetic
moment of the atom onto the direction of the field. If it is assumed that the
theorem of the proportionality between the magnetic moment and the angular 
momentum (see Part I, §22) is valid for atoms, then it follows that the mean 


§1 THE PHYSICAL BASIS 5 


force is proportional to the value of $L_z$, where $L_z$ is the projection of the
angular momentum of the atom onto the direction of the field (see however 
ch. 8). 

The experiment of Stern and Gerlach showed that the beam of atoms was 
deflected in the magnetic field, splitting into a number of separate beams. This 
means that the projection of the angular momentum of the atom onto the 
direction of the field can assume only discrete values. To every allowed value 
of $L_z$ there corresponds a definite value of the force and a corresponding
magnitude of the deflection in the nonuniform magnetic field. Thus each of 
the beams produced contains atoms with a given value of the quantity $L_z$.

The discrete character of the allowed values of the basic quantities which 
characterize the states of atomic systems profoundly contradicts all the 
concepts of classical mechanics. It follows from the general propositions of 
classical mechanics that an infinitesimal force causes an infinitesimal change 
in the state of a system. Hence all mechanical quantities depending on the 
state of that system, such as the energy, the momentum and so on, are continuous
functions of the state. The discreteness of the states and the discontinuous
changes of the states of microparticles directly contradict this general
principle. 

The difficulty in understanding the properties of microparticles is increased 
by the fact that, in addition to the discreteness of certain quantities which 
characterize the state of particles, in a number of experiments the continuous 
nature of the same quantities was clearly shown. Thus, for example, the 
bremsstrahlung of electrons in the nuclear field has a continuous spectrum, 
which is indicative of a continuous change in the energy of the emitting 
particles. 

It turned out that microparticles combine in a striking way the properties 
of ordinary particles (corpuscles) and the properties of waves. This fundamen- 
tal property of microparticles is called the wave—particle duality. 

The basic feature of corpuscles as studied in classical mechanics is that 
they have a definite spatial extent. The idealization of the corpuscle is a 
material point having no size and moving in a definite trajectory. 

The properties of wave motion in classical physics are to a certain extent 
the opposite of the properties of corpuscular objects. A monochromatic 
wave, first of all, has an infinite extent in space. Hence it makes no sense to 
state that ‘the monochromatic wave is located at a given point of space’. Also 
it is meaningless to speak of the trajectory of a monochromatic wave. The 
localization of a wave in space is inevitably associated with the production of 
a wave packet (see Part I, §35). The size of the wave packet is smaller, the 
larger the number of waves of different frequencies which form it. This property
of wave motion does not depend at all on the physical nature of the waves; it is
valid for elastic, electromagnetic and other types of wave. Thus in classical
physics localized corpuscles and waves delocalized in space are in a sense 
opposites. 

It turns out that a combination of corpuscular and wave properties, inexplicable
from the point of view of the ordinary ideas of classical physics,
occurs in microparticles. More precisely, under certain conditions, microparticles
behave as corpuscles, while under other conditions the same microparticles
display purely wave properties. Finally, in certain experiments both
corpuscular and wave properties manifest themselves simultaneously.

The wave–particle duality of the properties of microparticles was first
discovered in experiments with light quanta. The wave properties of the
electromagnetic field are sufficiently well established. We note only that
Newton’s corpuscular theory was able to compete successfully with the wave
theory in explaining phenomena such as the rectilinear propagation and 
refraction of light. However, this theory was completely abandoned after the 
discovery of interference, diffraction and birefringence. 

As to the corpuscular properties of the electromagnetic field, they are 
manifested in a particularly obvious way in the Compton effect (Part II, § 17; 
see also ch. 14). Indeed, this effect allows only the corpuscular interpretation. 
No considerations based on wave concepts can explain the appearance of 
recoil electrons: the incident electromagnetic wave cannot cause the motion 
of one of the atomic electrons without perturbing the motion of the remaining
electrons. However, as we have seen in §17 of Part II, the theory based on
the concept of the collision of two particles, the incident photon and the 
atomic electron, describes the process correctly. 

The corpuscular nature of light shows up in the same obvious way in the 
photoelectric effect, in the phenomenon of recoil when atoms are emitting 
radiation, etc. Thus the wave theory of light, which was successfully applied in 
considering a wide range of electromagnetic phenomena, turned out to be 
completely unsuitable for the explanation of a number of processes in which 
the corpuscular nature of light was manifested. 

The situation was characterized briefly by stating that there is a dualism in 
the properties of the electromagnetic field. Sometimes light manifests its wave 
nature, and sometimes it behaves as a flux of photons. 

The totality of experimental data showed that one must ascribe to every 
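photon an energy $E$ and a momentum $p$ which are respectively equal to

$$E = \hbar\omega, \tag{1.1}$$

$$p = \hbar k = \frac{\hbar}{\bar{\lambda}}, \tag{1.2}$$

where $\omega$ is the angular frequency and $k = 2\pi/\lambda$ the wave number of the light wave,
$\hbar$ is the Planck constant $h$ divided by $2\pi$ and equal to $1.054 \times 10^{-27}$
erg·sec, and $\bar{\lambda} = \lambda/2\pi$. Further, it turns out that the wave–particle duality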
appears not only in the case of photons, but for all microparticles. 
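
The relations (1.1) and (1.2) are easy to evaluate numerically. The following is a minimal sketch in CGS units; the wavelength of 5×10⁻⁵ cm (green light) is an assumed example value and the function name is hypothetical. It also illustrates that for a photon $E = pc$.

```python
import math

HBAR = 1.054e-27   # erg*sec, the value quoted in the text
C = 2.998e10       # cm/sec, velocity of light

def photon_energy_momentum(wavelength_cm):
    """Return (E, p) of a photon from E = hbar*omega and p = hbar*k."""
    k = 2.0 * math.pi / wavelength_cm   # wave number, 1/cm
    omega = C * k                       # angular frequency, 1/sec
    return HBAR * omega, HBAR * k

# Assumed example (not from the text): wavelength 5e-5 cm, i.e. 500 nm
energy, momentum = photon_energy_momentum(5.0e-5)
print(f"E = {energy:.3e} erg, p = {momentum:.3e} g*cm/sec")
```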

The corpuscular properties of microparticles were discovered a relatively 
long time ago. They show up particularly clearly in observations with cloud 
chambers. As is known, microparticles in passing through a cloud chamber 
filled with a saturated vapour produce ionization along their path. The ions 
produced by the microparticles become the centres of condensation which 
can be observed directly in the form of tracks. Similarly, particles in moving 
through a thick layer of photographic emulsion leave a photographic image, 
i.e. a track. All this led to the idea that microparticles move in well defined 
trajectories and are similar in their properties to ordinary corpuscles. How- 
ever, the experiments to be described below made it possible to establish that 
this was not so and that the wave—particle duality is a basic feature of all 
microparticles. But it should be stressed that the discovery of the wave prop- 
erties of electrons, protons and other microparticles was preceded by the 
development of the concepts of quantum mechanics, in which the existence 
of the wave properties of microparticles was predicted theoretically. 

Let us consider the following experiment. Individual electrons which have 
passed through a fixed accelerating field are let successively through a small 
opening in an impenetrable screen. After passing through the opening the 
electrons fall onto a photographic plate giving rise to blackening at the points 
of impact. If the electrons moved as corpuscles obeying the laws of classical 
mechanics and did not interact with the edge of the screen, then all of them 
would fall at the centre of the photographic plate, forming a black spot. In 
fact, the electrons must interact with the atoms of the screen. Since the latter 
are in thermal motion, this interaction has a random character. Hence it 
would be natural to expect the electrons to give rise to a blackening of the 
photographic plate similar to that caused by a beam of molecules coming out 
of a small opening. Namely, the number of electrons which are deflected from 
their rectilinear path and do not fall at the centre of the screen should depend 
on the magnitude of the deflection according to the normal law of errors. The 
intensity of the blackening, which is proportional to the number of electrons 
falling at a given point, should be expressed by the Gaussian distribution. 

In fact, nothing of the kind is observed in experiment. If a large number of 
electrons are successively let through an opening, then the following is
observed: 

(1) There are zones on the photographic plate, the ‘forbidden’ zones, in which
electrons never arrive. These zones have the character of concentric rings of
definite width;







(2) the zones in which electrons do arrive form a system of concentric rings 
alternating with the ‘forbidden’ rings. 

By carrying out the experiment for a sufficiently long time, i.e. letting 
through a sufficient number of electrons, one can obtain blackened rings 
which are identical with those which arise when light is diffracted from a 
circular opening. Such a diffraction pattern is shown on the right of fig. V.1. 
In the drawing, white rings correspond to the blackened rings of the photo- 
graphic plate. The curve of the intensity of electrons as a function of the 
angle of diffraction $\vartheta$ is shown on the left. The same result is also obtained in
another arrangement of the experiment. Instead of letting electrons through 
one by one, a beam of electrons can be directed through the opening of the 
screen. The beam must be of a sufficiently low intensity, so that the interac- 
tion between electrons will be of no importance. When the beam of electrons 
passes through the opening of the screen, the distribution of the intensity of 
blackening of the photographic plate immediately appears in the form of a 
diffraction pattern*. Thus the motion of each individual electron differs 
fundamentally from that of a classical particle passing through the opening of 
the screen. 








Fig. V.1 


At first sight it may seem that the results of the observations described 
could be interpreted in the following way: for some unknown reason, not all 
possible trajectories of the motion of the electrons, but only certain allowed 
ones, can be realized in nature. The sum of these allowed trajectories determines
the loci of incidence of electrons on the photographic plate. However,
other experiments show this interpretation to be incorrect.

* From the experimental point of view the second method is simpler. Such an experiment
was carried out by Davisson and Germer in 1927, after the creation of the theory
of quantum mechanics. The experiment with individual electrons was not carried out
until 1948 by V.A. Fabrikant, L.M. Biberman and N. Sushkin.

Let us consider an impenetrable screen with two openings. (The experi- 
ment discussed below is a schematization of a real experiment in which, 
instead of diffraction from a screen with two openings, the diffraction of 
electrons from a crystal lattice was observed.) If in turn each of the openings 
is covered, while individual electrons are successively let through the other, 
then after the passage of a large number of electrons the two diffraction 
patterns described above, with a central spot opposite to each of the open- 
ings, will arise on the photographic plate. We now uncover both openings and 
let electrons through. We assume that each of the electrons moves in a well 
defined allowed trajectory. Passing through one of the openings, the electron 
causes blackening at a definite point on the photographic plate. The final 
diffraction pattern produced by a large number of electrons should be a 
simple superposition of the intensities of the blackenings arising when electrons
are let through each opening separately. In other words, we should obtain the same
diffraction pattern as in the case of the alternate passage of electrons first
through one opening and then through the other. In fact, however, the distri- 
bution of intensities of blackening is of a completely different character. The 
blackening of the photographic plate corresponds exactly to the pattern of 
diffraction from two openings. This means that there are no possible or 
allowed electron trajectories whatever. The electron, like a wave, possesses 
interference properties, and it would make no sense to try to establish 
through which one of the two uncovered openings a given electron ‘in reality’
passed. 
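
The difference between the two predictions can be made concrete with a small numerical sketch. It treats each opening as a point source of a spherical wave, which is a simplifying assumption, and works in arbitrary units; the separation, wavelength and distance to the plate are made-up illustrative values. With both openings uncovered the amplitudes are added before squaring, whereas for definite trajectories the two single-opening intensities would simply be added.

```python
import cmath
import math

def amplitude(x, opening_y, wavelength, distance):
    """Spherical-wave amplitude at plate position x from an opening at opening_y."""
    r = math.hypot(distance, x - opening_y)
    return cmath.exp(2j * math.pi * r / wavelength) / r

def intensities(x, separation=10.0, wavelength=1.0, distance=1000.0):
    """Return (both openings uncovered, sum of the two single-opening patterns)."""
    a1 = amplitude(x, +separation / 2, wavelength, distance)
    a2 = amplitude(x, -separation / 2, wavelength, distance)
    both_open = abs(a1 + a2) ** 2                 # amplitudes add, then square
    one_at_a_time = abs(a1) ** 2 + abs(a2) ** 2   # intensities add
    return both_open, one_at_a_time

for x in (0.0, 25.0, 50.0, 75.0, 100.0):
    both, separate = intensities(x)
    print(f"x = {x:6.1f}  both open: {both:.3e}  separate: {separate:.3e}")
```

In this sketch the path difference at x = 50 is close to half a wavelength, so the pattern with both openings uncovered nearly vanishes there, while the sum of the single-opening intensities varies only slowly across the plate.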

We see that a certain wave motion is associated with the electron; the 
electron possesses wave properties. It is because of these wave properties that 
an individual electron passing through an opening can arrive at some regions 
of the photographic plate but not at others. In the passage through two 
openings the wave properties of an individual electron become apparent in the 
fact that its motion is affected by both openings. The allowed and forbidden 
regions of the photographic plate correspond to the dark and light zones of
the diffraction pattern from two openings. 

However, it would be incorrect, on the basis of the aforesaid, to try to 
identify the electron with a particular wave. If this were possible, then the 
darkening of the photographic plate on which the diffracted wave (the elec- 
tron) falls would be a pale copy of the darkening produced by many elec- 
trons. An individual electron would immediately give the entire diffraction 
pattern. 


We have stressed that experiment shows an individual electron to be incident
at a definite point of the plate, like an ordinary corpuscle. The difference
between an individual electron and a corpuscle lies in the fact that the
loci of incidence on the photographic plate are determined by laws which are 
completely different from those determining the loci of incidence of a corpus- 
cle. Thus, as is shown by the diffraction experiment, wave properties are 
inherent in every individual electron, but they manifest themselves in an 
obvious way only in the repetition of a large number of identical experiments 
(i.e. the successive passage of a large number of electrons). 

We note that, although we have spoken above only of the electron, the 
same is also valid for other microparticles. Diffraction experiments have been 
carried out with neutrons, protons and other microparticles. 

The quantum-mechanical treatment of the diffraction experiments des- 
cribed will be given in the next section. Here we stress once more that the 
wave—particle duality which had already been established for light quanta 
also becomes apparent in diffraction experiments with electrons. 

Diffraction experiments make it possible to give an answer to the question: 
‘what is an electron — a wave or a corpuscle?’ Here we use the terms ‘wave’ 
and ‘corpuscle’ in their usual classical meaning. The answer which follows 
directly from the experiments described is that the electron is neither a wave 
(otherwise a single electron would give the entire diffraction pattern), nor a 
corpuscle moving in a definite trajectory (which contradicts the experiment 
with two openings). The electron is a microparticle possessing specific properties.


§2. The wave function 


The fact that the electron possesses wave properties shows that the elec- 
tron is to be compared with a certain wave field. We shall call the amplitude 
of this wave field, which depends on the spatial coordinates and time, the 
wave function $\psi(x, y, z, t)$. For brevity it is sometimes also called the $\psi$-function.

The physical interpretation of the wave function (which was first given by
M. Born) is the following: the quantity $|\psi(x, y, z, t)|^2\,dV$ is proportional to
the probability that the electron will be found at the instant of time $t$ in a
volume element $dV$ in the neighbourhood of the point $x, y, z$.

Denoting this probability by dW, we have 


$$dW \sim |\psi(x, y, z, t)|^2\,dV. \tag{2.1}$$


This interpretation is based on the following reasoning. In the experiments 


§2 THE WAVE FUNCTION 11


with the passage of individual electrons through one or two openings we have 
seen that the locus of incidence of the electron on the photographic plate is 
to a certain degree random. The electrons will fall completely randomly at 
one or other point of the diffraction ring to be formed. Hence the behaviour 
of an electron must be characterized by a certain probability function. The 
intensity of blackening of the photographic plate at a given spot is propor- 
tional to the number of electrons which fall on this spot. On the other hand, 
it is clear that this probability function must be connected with the properties
of the wave field. Only in this case can the probabilistic character of the
blackening of the photographic plate at a given spot be matched with the 
strict spatial distribution of the bands of blackening. That is, the random 
character of the incidence of the electron at a given point can be matched 
with its wave properties only by assuming that the probability of finding the 
electron at the given point is proportional to the intensity of the wave field 
$|\psi|^2$. This relation is just that given by formula (2.1).

The physical interpretation of the wave function given by formula (2.1) 
clearly shows that the wave field $\psi(x, y, z, t)$ is fundamentally different from
other wave fields known in classical physics. This is particularly apparent
from the fact that only the quantity $|\psi|^2$ has a direct physical meaning. The
wave function itself, in general, can be a complex quantity. Furthermore, the
wave functions $\psi$ and $A\psi$, where $A$ is an arbitrary nonzero constant, correspond to
one and the same physical state of the particle, since by virtue of the definition
(2.1) the two wave functions lead to one and the same space–time
distribution of the probability of finding the particle. 

By virtue of the theorem of addition of probabilities (see §2 of Part III) 
the definition (2.1) can be supplemented by the following normalization 
condition: 


$$\int |\psi(x, y, z, t)|^2\,dV = 1, \tag{2.2}$$


where the integral on the left, taken over all space, is the probability of finding
the particle at time $t$ at any point of space. This probability is naturally
equal to unity. Wave functions $\psi$ satisfying the normalization condition are
said to be normalized. For normalized wave functions the relation (2.1) can 
be rewritten in the form 


$$dW = |\psi(x, y, z, t)|^2\,dV = \rho(x, y, z, t)\,dV, \tag{2.3}$$


where $\rho(x, y, z, t)$ is the probability density. The probability $W(V, t)$ of finding
the particle in a given finite volume $V$ at an instant of time $t$ is, according
to the theorem of addition of probabilities,

$$W(V, t) = \int_V dW = \int_V |\psi(x, y, z, t)|^2\,dV. \tag{2.4}$$
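
The normalization condition (2.2) and the probability (2.4) can be illustrated by a minimal numerical sketch in one dimension. The Gaussian form of the wave function, the grid and the helper names are assumptions chosen only for the example; the integrals are replaced by simple sums over the grid.

```python
import math

def normalize(psi_values, dx):
    """Scale sampled wave-function values so that sum |psi|^2 dx = 1, cf. (2.2)."""
    norm = math.sqrt(sum(abs(v) ** 2 for v in psi_values) * dx)
    return [v / norm for v in psi_values]

def probability(psi_values, xs, dx, a, b):
    """Probability of finding the particle in a <= x <= b, cf. (2.4)."""
    return sum(abs(v) ** 2 for v, x in zip(psi_values, xs) if a <= x <= b) * dx

dx = 0.01
xs = [(i - 1000) * dx for i in range(2001)]   # grid from -10 to 10

# Assumed unnormalized wave function: a Gaussian multiplied by a phase factor
raw = [3.0 * math.exp(-x * x / 2.0) * complex(math.cos(2 * x), math.sin(2 * x))
       for x in xs]
psi = normalize(raw, dx)

total = probability(psi, xs, dx, -10.0, 10.0)
inside = probability(psi, xs, dx, -1.0, 1.0)
print(f"total = {total:.6f}, P(-1 <= x <= 1) = {inside:.4f}")
```

Multiplying `raw` by any constant factor leaves the normalized probabilities unchanged, since only $|\psi|^2$ enters and the constant is removed by the normalization.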


The condition (2.2) cannot be satisfied in the case where the integral $\int |\psi|^2\,dV$
is divergent. This can occur, in particular, if the square of the modulus of the
wave function, $|\psi|^2$, does not tend to zero at infinity. Physically this means
that there is a finite probability of finding the particle at every point of space. 
In §18 it will be shown how the normalization of the wave function is to be 
carried out in this case. 

We note that the wave function normalized by the condition (2.2) is 
defined with an accuracy to within the factor $e^{i\alpha}$, where $\alpha$ is any real number,
in view of the equality $|e^{i\alpha}|^2 = 1$.

In addition to the wave function of a single microparticle it is also necessary
to introduce the idea of the wave function of a system of microparticles.
Let there be a system of N particles interacting with each other according to 
an arbitrary law. This system of particles can be represented by the wave
function

$$\psi(x_1, y_1, z_1, x_2, y_2, z_2, \ldots, x_i, y_i, z_i, \ldots, x_N, y_N, z_N, t),$$

where $i$ is the index of the particle.
In the further construction of quantum mechanics we shall proceed from
the assumption that there is no difference of principle between the description
of an individual microparticle and a system of microparticles and that the
interpretation of $\psi(x, y, z, t)$ and $\psi(x_1, y_1, z_1, x_2, y_2, z_2, \ldots, x_N, y_N, z_N, t)$
must be one and the same. In other words, the physical meaning of the wave 
function of a system of N particles lies in the fact that the quantity 


$$dW \sim |\psi(\mathbf{r}_1, \mathbf{r}_2, \ldots, \mathbf{r}_N, t)|^2\,dV_1\,dV_2 \ldots dV_N \tag{2.5}$$


gives the probability that at a certain instant of time $t$ the first particle will be
found in the volume element $dV_1$ surrounding the point $\mathbf{r}_1$, the second particle
in the volume element $dV_2$ surrounding the point $\mathbf{r}_2$, and so on. For brevity,
$\mathbf{r}_i$ here denotes the totality of coordinates $(x_i, y_i, z_i)$. We note that on the
basis of the theorem of addition of probabilities the quantity 


$$dW_1 \sim dV_1 \int |\psi(\mathbf{r}_1, \mathbf{r}_2, \ldots, \mathbf{r}_N, t)|^2\,dV_2 \ldots dV_N \tag{2.6}$$


represents the probability of finding the first particle in the volume element 
$dV_1$ for any distribution of the remaining particles of the system (the integration
being carried out with respect to the coordinates of the latter particles).
It is obvious that the probability $dW_1$ given by formula (2.6) must be identical
with the definition (2.3). Analogous relations can also be written for other
particles of the system. Thus dW gives the probability of finding a given 
configuration of the system in space. 

The normalization condition for the wave function of a system of N parti- 
cles has the form 


$$\int |\psi(\mathbf{r}_1, \mathbf{r}_2, \ldots, \mathbf{r}_N, t)|^2\,dV_1\,dV_2 \ldots dV_N = 1. \tag{2.7}$$


It is clear that the wave function of a system of $N$ particles is not normalized
in the real three-dimensional space but in the $3N$-dimensional configuration
space.

In view of the similarity of principle between the wave function of one 
particle and that of a system of particles, we shall always denote the wave 
function by the symbol $\psi$. For the sake of brevity the whole set of coordinates
is sometimes denoted by $x$.

It follows from the aforesaid that the quantity $|\psi|^2$ must be interpreted as
a probability not in real space but in configuration space. At the same time, 
the introduction of the wave function of a system of particles confirms in a 
particularly obvious way the impossibility of interpreting the wave function 
as a quantity which describes a wave motion similar to an electromagnetic or 
an acoustic wave propagating in real space. Indeed, every wave motion in real 
space is characterized by the set of three variable coordinates and time. 
However, the wave function of a system of $N$ particles depends on $3N$ coordinates
and time. Hence in interpreting the $\psi$-function as an ordinary wave one
would have either to renounce the assumption of the unique meaning of the 
wave function of one microparticle and of a system of microparticles, or to 
introduce the hypothesis of the existence of a real multi-dimensional space. 
Both are in flagrant contradiction with all experimental data. 

Let us consider the important particular case of a system of noninteracting 
particles. Then finding the $i$th particle in the volume element $dV_i$, the $k$th
particle in the volume element $dV_k$ and so on, must be independent events.
On the basis of the theorem of multiplication of probabilities, formula (2.5) 
can be written in this case as follows: 


$$dW = dW_1\,dW_2 \ldots dW_N = |\psi_1(\mathbf{r}_1, t)|^2\,dV_1\,|\psi_2(\mathbf{r}_2, t)|^2\,dV_2 \ldots |\psi_N(\mathbf{r}_N, t)|^2\,dV_N.$$


This means that the wave function of a system of noninteracting particles is 
equal to 


$$\psi(\mathbf{r}_1, \mathbf{r}_2, \ldots, \mathbf{r}_N, t) = \psi_1(\mathbf{r}_1, t)\,\psi_2(\mathbf{r}_2, t) \ldots \psi_N(\mathbf{r}_N, t). \tag{2.8}$$
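
For two noninteracting particles in one dimension, the product form (2.8), the normalization in configuration space (2.7) and the integration over the second particle (2.6) can be checked numerically. The sketch below is only an illustration: the Gaussian one-particle functions, their centres and widths, and the grid are assumed values.

```python
import math

dx = 0.05
xs = [(i - 200) * dx for i in range(401)]   # grid from -10 to 10

def gaussian(x, center, width):
    """Normalized real one-particle wave function (assumed Gaussian form)."""
    return (math.exp(-((x - center) ** 2) / (2 * width ** 2))
            / (math.pi ** 0.25 * math.sqrt(width)))

psi1 = [gaussian(x, -1.0, 1.0) for x in xs]
psi2 = [gaussian(x, 2.0, 0.5) for x in xs]

# Wave function of the pair, cf. (2.8): psi(x1, x2) = psi1(x1) * psi2(x2)
psi = [[a * b for b in psi2] for a in psi1]

# Normalization in the two-dimensional configuration space, cf. (2.7)
total = sum(abs(v) ** 2 for row in psi for v in row) * dx * dx

# Density for particle 1 after integrating over particle 2, cf. (2.6)
marginal1 = [sum(abs(v) ** 2 for v in row) * dx for row in psi]
max_diff = max(abs(m - a * a) for m, a in zip(marginal1, psi1))
print(f"total = {total:.6f}, max |marginal - |psi1|^2| = {max_diff:.2e}")
```

The marginal density of the first particle coincides with $|\psi_1|^2$, which is the one-dimensional counterpart of the statement that (2.6) must agree with (2.3).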










The case where the system consists of identical microparticles (for exam- 
ple, of electrons or protons, and so on) will, in particular, be considered later 
(see §64). 

Before trying to construct the wave function for the simple case of the 
motion of a microparticle, it is necessary to make the following very impor- 
tant remark. At first sight it might be assumed that it is necessary to intro- 
duce into physics new notions in order to describe the states of microparti- 
cles, which represent completely new objects as regards their physical nature. 
It turns out, however, that this is not so. The state and the nature of the 
motion of microparticles can to a certain degree be characterized by the 
quantities and terms of classical physics. This is pointed out by the wave— 
particle duality of microparticles, which consists in the fact that in certain 
experiments microparticles manifest themselves as objects with a wave nature, 
while in other experiments they behave as ordinary corpuscles. 

When we introduced the statistical interpretation of the wave function we 
already assumed, to a certain degree, that the concepts of classical mechanics 
are applicable to microparticles. Indeed, the statement that ‘a microparticle 
can be found in a volume element $dV$’ already implies the assumption that
the classical approach is possible for the description of its state by defining its 
position in space. If the microparticle were in all respects similar to a wave, 
then the statement of the problem, ‘where can the microparticle be found’, 
would make no sense. On the other hand, the presence of the diffraction pat- 
tern makes it possible, under certain conditions, to associate the microparticle 
with another classical notion, a definite wavelength $\lambda$, and to speak of the
wavelength corresponding to the wave function of the particle. 

Classical ideas, such as the position of the particle, the wavelength and so 
on, can only be applied to microparticles within certain limits. We shall dwell 
in detail on this in §4. The most important thing is not the fact that in 
describing microparticles the notions of classical physics are of limited appli- 
cability, but the fact that they can and must be used in describing new objects 
which are so unlike ordinary macroscopic bodies or waves. 

We shall assume that the state of an electron moving freely in space can be 
characterized by an energy $E$ and a momentum $\mathbf{p}$. Then the relation between
the energy and momentum is given by the classical formula 


$$E = \frac{|\mathbf{p}|^2}{2m}. \tag{2.9}$$

We assume that a beam of electrons which has passed through a strictly 
defined accelerating potential difference and has acquired a definite energy 
enters a diffraction arrangement (in practice this arrangement is usually a 




crystal lattice). Formula (2.9) allows us to speak of a definite momentum of 
the electron. 

On the other hand, knowing the diffraction pattern one can (see §36 of 
Part IV) find the wavelength $\lambda$ corresponding to the electron. It turns out that
the following relation exists between the quantities $\lambda$ and $p$:

$$p = \frac{2\pi\hbar}{\lambda} = \hbar k. \tag{2.10}$$


The relation (2.10), which was first proposed in 1924 by de Broglie on the 
basis of theoretical considerations, is called de Broglie’s formula. The wave 
associated with the motion of a microparticle is called the de Broglie wave. 
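
To give an idea of the orders of magnitude, the following sketch combines (2.9) and (2.10) for an electron accelerated through a given potential difference. It works in CGS units; the accelerating potential of 54 V is an assumed example value of the order used in electron diffraction from crystals, and the function name is hypothetical.

```python
import math

H = 6.626e-27          # Planck constant h = 2*pi*hbar, erg*sec
M_E = 9.109e-28        # electron mass, g
E_CHARGE = 4.803e-10   # electron charge, CGS electrostatic units
VOLT_TO_STATVOLT = 1.0 / 299.79

def de_broglie_wavelength_cm(volts):
    """Nonrelativistic wavelength of an electron accelerated through `volts`."""
    energy = E_CHARGE * volts * VOLT_TO_STATVOLT   # kinetic energy, erg
    momentum = math.sqrt(2.0 * M_E * energy)       # from E = p^2 / 2m, (2.9)
    return H / momentum                            # p = 2*pi*hbar / lambda, (2.10)

# Assumed example value (not from the text): 54 V
wavelength = de_broglie_wavelength_cm(54.0)
print(f"{wavelength * 1e8:.2f} angstrom")
```

The result is of the order of interatomic distances in crystals, which is why a crystal lattice can serve as the diffraction arrangement.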

We see that de Broglie’s formula is the same as formula (1.2) for light 
quanta. The frequencies corresponding to the de Broglie waves cannot be 
determined directly in an experimental way. However, it is natural to assume 
that the relation between the energy and frequency which holds for light 
quanta is also applicable to the de Broglie waves, i.e. 


$$E = \hbar\omega. \tag{2.11}$$


Based on the relations (2.10) and (2.11), we write the wave function of a 
free particle in the form of a plane monochromatic wave 


$$\psi_{\mathbf{p}}(\mathbf{r}, t) = A e^{i(\mathbf{k}\cdot\mathbf{r} - \omega t)} = A e^{(i/\hbar)(\mathbf{p}\cdot\mathbf{r} - Et)}. \tag{2.12}$$

Later it will be explained why $\psi(\mathbf{r}, t)$ should be written in the form of an
exponential function, and not in the form of a sine or cosine (see §6). The
constant $A$ is determined by the normalization condition (see §26).
By definition $\mathbf{k}$ is the wave vector

$$\mathbf{k} = \frac{1}{\hbar}\mathbf{p}. \tag{2.13}$$


By means of formulae (2.9), (2.10) and (2.11) one can find the dispersion 
law of de Broglie waves

$$\omega = \frac{E}{\hbar} = \frac{p^2}{2m\hbar} = \frac{\hbar k^2}{2m}. \tag{2.14}$$


The corresponding phase velocity and group velocity are equal to 


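$$v_{\mathrm{ph}} = \frac{\omega}{k} = \frac{\hbar k}{2m}, \tag{2.15}$$

$$v_{\mathrm{gr}} = \frac{d\omega}{dk} = \frac{\hbar k}{m} = \frac{1}{m}|\mathbf{p}|. \tag{2.16}$$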




Formula (2.16) shows that the group velocity of de Broglie waves is the 
same as the ordinary velocity of macroscopic particles. If we took the quantity
$v_{\mathrm{gr}}$ as the initial expression for the velocity of the particle, then the relation
between the energy and frequency (2.11) could be obtained as a consequence
of this definition. The phase velocity $v_{\mathrm{ph}}$ of de Broglie waves has no
direct physical meaning. This becomes particularly clear if one makes use of
the relativistic expression for the relation between the energy and the momentum of
the particle


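$$E = (p^2c^2 + m^2c^4)^{1/2} = (\hbar^2c^2k^2 + m^2c^4)^{1/2} = \hbar\omega.$$

Then

$$v_{\mathrm{ph}} = \frac{\omega}{k} = \frac{(\hbar^2c^2k^2 + m^2c^4)^{1/2}}{\hbar k} = c\left[1 + \left(\frac{mc}{\hbar k}\right)^2\right]^{1/2} > c,$$

i.e. $v_{\mathrm{ph}}$ is larger than the velocity of light.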
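
As a supplementary check, which is not part of the original argument, one may verify that in the relativistic case the group velocity still coincides with the velocity $v$ of the particle, so that only the phase velocity exceeds $c$. The sketch assumes the standard relations $p = mv/\sqrt{1 - v^2/c^2}$ and $E = mc^2/\sqrt{1 - v^2/c^2}$ from the theory of relativity, together with $E = \hbar\omega$ and $p = \hbar k$:

```latex
\begin{align*}
v_{\mathrm{gr}} &= \frac{d\omega}{dk} = \frac{dE}{dp}
  = \frac{pc^2}{(p^2c^2 + m^2c^4)^{1/2}} = \frac{pc^2}{E} = v, \\
v_{\mathrm{ph}} &= \frac{\omega}{k} = \frac{E}{p} = \frac{c^2}{v}, \\
v_{\mathrm{ph}}\, v_{\mathrm{gr}} &= c^2 .
\end{align*}
```

Since $v < c$ for a particle with nonzero mass, the last line again gives $v_{\mathrm{ph}} > c$, while the velocity with which the particle itself moves remains $v_{\mathrm{gr}} = v$.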


§3. The principle of superposition. Expansion in plane waves 


We have considered above the phenomenon of the diffraction of electrons 
incident on a screen with a limited number of apertures. In fact, however, 
what is observed is the diffraction of electrons from a crystal lattice. This 
case is not only of practical interest, but is also very important from the 
theoretical standpoint. 

We have seen in §36 of Part IV that the selective reflection of X-rays takes 
place when the Bragg–Wulff condition is fulfilled. If we dispense with unessential
details, it turns out that the diffraction of electrons from a crystal
lattice is analogous to the diffraction of X-rays. As a result of the scattering 
of electrons from a single crystal a number of selective reflections occur.
Each selective reflection corresponds to a definite momentum or, by virtue of 
(2.10), to a definite wavelength. Thus the crystal lattice is a device which 
resolves the initial polychromatic beam of electrons into a number of beams 
each of which corresponds to electrons of a definite wavelength. 

We have already pointed out that the experiment with an electron beam is 
equivalent to the whole set of successive measurements with a large number 
of electrons which are in identical external conditions. Hence the diffraction 
grating plays the role of a device which analyzes the initial state of the micro- 
particle, resolving it into a set of individual states with definite values of the 
momentum. Since to each state with a definite momentum there corresponds 
a plane wave of the form (2.12), then consequently, the wave function 





§3 PRINCIPLE OF SUPERPOSITION 17 


describing the initial state of the electron incident on the grating can in gener- 
al be written in the form of a superposition of plane waves, i.e. 


$$\psi(x, y, z, t) = \int_{-\infty}^{\infty} c(p_x, p_y, p_z)\, \psi_p(x, y, z, t)\, dp_x\, dp_y\, dp_z. \qquad (3.1)$$


Physically this means that the wave function of an electron in an arbitrary 
state can be considered as a superposition of the wave functions correspond- 
ing to states with a definite value of the momentum. 

Hence it is not surprising that one electron (or other microparticle) can be 
in a definite state without having a definite value of the momentum. Although
the concept of momentum, applied to the microparticle from classical mecha- 
nics, can be used in quantum mechanics, the state of a microparticle is not 
defined by the same laws as the state of a particle in classical mechanics. In 
later sections we shall come back to the discussion of the problem of charac- 
terizing the states of microparticles. 

Choosing the coefficient A of formula (2.12) in the form (see § 26) 


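$$A = \frac{1}{(2\pi\hbar)^{3/2}}, \qquad (3.2)$$

we have

$$\psi(\mathbf{r}, t) = \frac{1}{(2\pi\hbar)^{3/2}} \int c(\mathbf{p})\, e^{(i/\hbar)(\mathbf{p}\cdot\mathbf{r} - Et)}\, d\mathbf{p}. \qquad (3.3)$$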


From the mathematical point of view, formula (3.3) represents the expansion
of the function $\psi(\mathbf{r}, t)$ in a Fourier integral. The amplitude $c(\mathbf{p})$ shows the
weight with which the state $\psi_p$ is involved in the state described by the wave
function $\psi(\mathbf{r}, t)$. According to the Parseval equality (see, in Volume 1, Appendix II,
eq. II.9), we have

$$\int |\psi|^2\, dV = \int |c(\mathbf{p})|^2\, d\mathbf{p}. \qquad (3.4)$$


When choosing the coefficient A in the form of (3.2) the equality (3.4)
contains no numerical coefficients. It is natural to assume that $|c(\mathbf{p})|^2$ can be
related to the density $\rho(\mathbf{p})$ of the probability that the value of the momentum
of the particle in the state $\psi(\mathbf{r}, t)$ be equal to $\mathbf{p}$. Namely, it is natural to
assume that

$$\rho(\mathbf{p}) = |c(\mathbf{p})|^2. \qquad (3.5)$$


We write the equality sign and not the proportionality sign ~, because the
function $\psi$ is normalized by the condition (2.2). In this case the following
equality holds (see (3.4)):

$$\int |c(\mathbf{p})|^2\, d\mathbf{p} = 1. \qquad (3.6)$$
We shall return to the discussion of formula (3.5) in §21. 
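The Parseval equality (3.4) and the normalization (3.6) can be checked numerically. The following is a minimal sketch in one dimension, assuming units with ħ = 1 and an illustrative Gaussian wave function; the names `psi`, `momentum_amplitude`, `norms` and the value of `P0` are hypothetical choices, not taken from the text.

```python
import cmath
import math

HBAR = 1.0  # assumption: units in which hbar = 1
P0 = 2.0    # assumption: illustrative mean momentum of the packet


def psi(x):
    """Normalized Gaussian wave function (an illustrative choice)."""
    return math.pi ** -0.25 * math.exp(-x * x / 2.0) * cmath.exp(1j * P0 * x / HBAR)


def momentum_amplitude(p, x_max=10.0, n_x=401):
    """c(p) = (2*pi*hbar)^(-1/2) * integral psi(x) exp(-i p x / hbar) dx, as a Riemann sum."""
    dx = 2.0 * x_max / (n_x - 1)
    total = 0j
    for i in range(n_x):
        x = -x_max + i * dx
        total += psi(x) * cmath.exp(-1j * p * x / HBAR) * dx
    return total / math.sqrt(2.0 * math.pi * HBAR)


def norms(x_max=10.0, n_x=401, p_span=8.0, n_p=161):
    """Return (integral |psi|^2 dx, integral |c(p)|^2 dp) on finite grids."""
    dx = 2.0 * x_max / (n_x - 1)
    norm_x = sum(abs(psi(-x_max + i * dx)) ** 2 for i in range(n_x)) * dx
    dp = 2.0 * p_span / (n_p - 1)
    norm_p = sum(abs(momentum_amplitude(P0 - p_span + j * dp)) ** 2
                 for j in range(n_p)) * dp
    return norm_x, norm_p


norm_x, norm_p = norms()
print(norm_x, norm_p)  # both integrals should agree, as (3.4) requires
```

The one-dimensional prefactor $(2\pi\hbar)^{-1/2}$ is used in place of the three-dimensional one in (3.2); with this choice no numerical coefficient appears between the two integrals.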

Equation (3.3) is a particular case of one of the most important proposi- 
tions of quantum mechanics — the principle of superposition. This principle 
amounts to the following: if the quantum system can be in states described
by functions $\psi_1, \psi_2, \ldots, \psi_n$, then the linear combination (superposition) of
the wave functions $\psi_n$

$$\psi = \sum_n c_n \psi_n, \qquad (3.7)$$

where $c_n$ are arbitrary constants, is also a wave function describing one of the
possible states of the system. The importance of the principle of superposition,
in particular, lies in the fact that it restricts the possible equations for
the determination of $\psi$ to linear equations (see §6). If the index $n$ characterizing
the state runs over a continuous sequence of values, then the summation
in formula (3.7) must be replaced by integration. Later we shall come
back to the discussion of the notion ‘the state of a quantum system’ and of
the meaning of the coefficients $c_n$ (see §21 and §23).

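As an example of the application of the principle of superposition, let us
consider a free particle whose momentum has no strictly defined value but
can lie in a small interval $\Delta p$ about the value $p_0$, namely $p_0 - \Delta p \le p \le p_0 + \Delta p$.
For simplicity we consider the case of one-dimensional motion. According to (3.3)
the wave function of the electron can be written in the form

$$\psi(x, t) = \frac{1}{(2\pi\hbar)^{1/2}} \int_{p_0 - \Delta p}^{p_0 + \Delta p} c(p)\, e^{(i/\hbar)(px - Et)}\, dp = \left(\frac{\hbar}{2\pi}\right)^{1/2} \int_{k_0 - \Delta k}^{k_0 + \Delta k} c'(k)\, e^{i(kx - \omega t)}\, dk, \qquad (3.8)$$

where $k = p/\hbar$, $\omega = E/\hbar$ and $c'(k) = c(\hbar k)$.

For one-dimensional motion the coefficient A is equal to $(2\pi\hbar)^{-1/2}$ (see §26). In
accordance with the results of §35 of Part I, the expansion of (3.8) is
expressed by a formula which is the same as (35.1) of Part I:

$$\psi(x, t) = \left(\frac{\hbar}{2\pi}\right)^{1/2} 2c'(k_0)\, \frac{\sin\left\{\left[x - \left(\dfrac{d\omega}{dk}\right)_0 t\right]\Delta k\right\}}{x - \left(\dfrac{d\omega}{dk}\right)_0 t}\, e^{i(k_0 x - \omega_0 t)}. \qquad (3.9)$$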




Formula (3.9) shows that the superposition of wave functions which cor- 
respond to nearby values of the momenta (or the wave numbers) leads to the 
formation of a wave packet propagating with the group velocity 


$$v_{gr} = \left(\frac{d\omega}{dk}\right)_0 = \frac{\hbar k_0}{m} = \frac{p_0}{m}.$$


It is clear from the form of the wave function (3.9) that the probability of
finding the microparticle at a point x at an instant of time t, which is proportional
to $|\psi(x, t)|^2$, has a sharp maximum moving with the velocity $v_{gr}$.

It should be stressed that eq. (3.9) is of an approximate character. Taking
account of the subsequent terms of the expansion of the function $\omega(k)$ would
lead to an expression for the wave packet whose width would increase with
time. Such a wave packet is said to be spreading. The spread of the packet
follows directly from the fact that each wave forming the packet moves with
its phase velocity $v_{ph} = \omega/k = \hbar k/2m$.
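The motion and spreading of such a packet can be illustrated numerically. The sketch below superposes plane waves with the full free-particle dispersion $\omega = \hbar k^2/2m$ rather than its linear expansion, so the spreading is retained. It assumes units with ħ = m = 1 and a Gaussian weight c(k) in place of the sharp interval used in (3.8); the function names and parameter values are hypothetical.

```python
import cmath
import math

HBAR = 1.0  # assumption: units in which hbar = m = 1
MASS = 1.0


def packet(x, t, k0=5.0, sigma_k=0.5, n_k=201, span=6.0):
    """Superposition of plane waves exp(i(kx - w t)) with Gaussian weights c(k)."""
    dk = 2.0 * span * sigma_k / (n_k - 1)
    total = 0j
    for j in range(n_k):
        k = k0 - span * sigma_k + j * dk
        weight = math.exp(-((k - k0) ** 2) / (4.0 * sigma_k ** 2))
        omega = HBAR * k * k / (2.0 * MASS)  # exact dispersion, not linearized
        total += weight * cmath.exp(1j * (k * x - omega * t)) * dk
    return total


def mean_and_width(t, x_min=-20.0, x_max=60.0, n_x=401):
    """Mean position and r.m.s. width of |psi(x, t)|^2 on a finite grid."""
    dx = (x_max - x_min) / (n_x - 1)
    xs = [x_min + i * dx for i in range(n_x)]
    rho = [abs(packet(x, t)) ** 2 for x in xs]
    norm = sum(rho)
    mean = sum(x * r for x, r in zip(xs, rho)) / norm
    var = sum((x - mean) ** 2 * r for x, r in zip(xs, rho)) / norm
    return mean, math.sqrt(var)


for t in (0.0, 4.0):
    mean, width = mean_and_width(t)
    print(f"t = {t}: mean position = {mean:.3f}, width = {width:.3f}")
```

With these parameters the maximum moves with the group velocity $\hbar k_0/m = 5$, while the width grows from about 1 to about $\sqrt{5}$, the value expected analytically for a Gaussian packet.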

When quantum mechanics first appeared attempts were made to identify 
the electron with a wave packet made up of de Broglie waves. However, the 
spread of the wave packet is indicative of the unsoundness of such a treat- 
ment. Furthermore, if the electron represented a wave packet, then, as in the 
case of a single wave, it would be impossible to account for the experiment on 
the diffraction of individual electrons. 


§4. Uncertainty relations and the relationship between quantum mechanics 
and classical mechanics 


We shall use the representation of the wave function in the form of a wave 
packet for the discussion of a fundamental problem. The question is to what 
extent and with what degree of accuracy use can be made of the concepts of 
classical mechanics in their application to microparticles. Here we restrict 
ourselves to the consideration of the concepts of the momentum and position 
of a particle in space. In §24 this problem will be studied in full.

We have seen in §35 of Part I that the wave packet possesses a spatial 
extent given by formula (35.7) of Part I. Applying this to the wave packet 


(3.9), in which we are interested here, this formula can be written in the 
form 


$$\Delta p_x\, \Delta x \sim \hbar.$$

Since a spread of the packet takes place, which was not taken into account
in deriving this formula, it can be written more correctly as follows:

$$\Delta p_x\, \Delta x \gtrsim \hbar. \qquad (4.1)$$

The numerical factor in formula (4.1) will be defined more precisely in §24.
In a similar way one can write the relations for the remaining two coordinate
and momentum components:

$$\Delta p_y\, \Delta y \gtrsim \hbar, \qquad (4.2)$$

$$\Delta p_z\, \Delta z \gtrsim \hbar. \qquad (4.3)$$


Formulae (4.1)-(4.3) are called Heisenberg’s uncertainty relations. We
discuss the meaning of these inequalities by proceeding from the probabilistic
interpretation of the wave function. If the width of the wave packet is
equal to $\Delta x$, then according to what was said in the preceding section the
measurements of the coordinates of the electron will show that with a very
high probability it will be found in the region of space $\Delta x$. In this sense it can
be said that the coordinate of the electron is determined to within an accuracy
of $\Delta x$. However, the electron found in the region $\Delta x$ is not described by a
plane wave and has no definite value of the momentum. To form a wave
packet of width $\Delta x$ it was necessary to form a superposition of plane waves
with momenta in the interval $p_0 - \Delta p_x \le p_x \le p_0 + \Delta p_x$, where $\Delta p_x$ is determined
by formula (4.1). This means that the measurements of the momentum
of an electron localized in the region $\Delta x$ will lead to values of the momentum
which lie in the interval mentioned. In other words, the uncertainty $\Delta x$ in the
value of the coordinate of an electron localized in the region $\Delta x$ and the
uncertainty $\Delta p_x$ in the value of its momentum are connected by the relation
(4.1). The smaller the width of the packet $\Delta x$, the larger $\Delta p_x$. Conversely, if
the momentum interval $\Delta p_x$ is defined, then formula (4.1) shows that the
particle will be found with a very high probability in a region of space
$\Delta x \gtrsim \hbar/\Delta p_x$.

It follows from the inequality (4.1) that the values of $\Delta x$ and $\Delta p_x$ cannot
simultaneously be equal to zero. This means that the x-coordinate and the
momentum $p_x$ associated with it cannot simultaneously have sharp values.
Thus the classical concepts of the spatial position and the value of the 
momentum are applicable to microparticles only within definite limits given 
by Heisenberg’s relations. Any attempt to apply simultaneously the concepts 
of the momentum and the coordinate to a microparticle with an accuracy 
higher than that given by the uncertainty relations makes no sense. This fact 
is associated with the very nature of microparticles, with their wave— 
corpuscle properties. 




In this connection the reader should be warned against the erroneous 
assumption of certain authors that Heisenberg’s uncertainty relations give 
that degree of accuracy with which the coordinates and momentum of micro- 
particles can be determined within the framework of quantum mechanics. 
In their opinion, for a more accurate simultaneous determination of the coor- 
dinates and momenta a further development of theory is necessary. 

In reality this is not so. Microparticles are completely new objects, by no 
means classical, with their characteristic properties and laws of motion. As we 
have already pointed out, a distinctive feature of microparticles is the dualism 
of the corpuscular and wave properties which they manifest. It follows from 
diffraction experiments that particles have no definite trajectory. Hence it is 
impossible to describe the motion of a particle by giving an accurate value of 
the coordinate and momentum at every instant of time, as is done in classical 
mechanics. However, one can indicate, with a certain degree of accuracy, the 
magnitude of that region of space in which the particle will be found with a 
very high probability, and the interval of those values of the momentum 
which it possesses at that time. The value of these quantities is given by 
Heisenberg’s uncertainty relations. 

It should be noted that when the particle has a definite value of the
momentum, $\Delta p_x = 0$, then according to (4.1) its position is completely indefinite,
i.e. $\Delta x \to \infty$. Indeed, a state with definite momentum is described by a
plane de Broglie wave. For such a wave the square of the modulus $|\psi_p|^2$ is
constant, i.e. the particle can be found with the same probability at any
point of space.

On the other hand, if a definite position of the particle at a given instant 
of time is given, then its momentum is completely indefinite. It may seem 
that the relation obtained is in contradiction with the existence of distinct 
tracks of particles in a cloud chamber or on a photographic plate. However, 
this contradiction is only apparent. Indeed, the track of an electron in a cloud 
chamber represents liquid drops formed on the ions produced by the electron. 
The size of the drops gives the degree of accuracy with which the coordinate 
of the particle can be fixed. Since the size of the drops is of the order
of $10^{-4}$ cm, the uncertainty in the coordinate of the electron is also of the
order of $10^{-4}$ cm. Consequently, the uncertainty in the corresponding
momentum component is $\Delta p_x \sim \hbar/\Delta x \sim 10^{-23}$ g cm sec$^{-1}$. Since the mass
of the electron is equal to ~$10^{-27}$ g, the uncertainty in the velocity
component perpendicular to the track will be equal to $\Delta v_x = m^{-1}\Delta p_x \approx 10^{4}$
cm sec$^{-1}$.

But tracks in a cloud chamber are produced only by fast electrons having a
velocity $v \gtrsim 10^{9}$ cm/sec. Hence we see that under these conditions
$\Delta v_x \ll v$ and one can speak approximately of the motion of the particle
along a trajectory in the cloud chamber.
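The arithmetic of the cloud-chamber estimate can be reproduced in a few lines. This is a minimal sketch in CGS units, assuming the rounded values ħ ≈ 10⁻²⁷ erg sec and m ≈ 10⁻²⁷ g used in the order-of-magnitude argument; the function name `velocity_uncertainty` is hypothetical.

```python
HBAR = 1.0e-27  # erg*sec, rounded for an order-of-magnitude estimate (assumption)


def velocity_uncertainty(mass_g, delta_x_cm, hbar=HBAR):
    """Return (delta_p, delta_v) from delta_p ~ hbar/delta_x and delta_v = delta_p/m."""
    delta_p = hbar / delta_x_cm
    return delta_p, delta_p / mass_g


# Electron (m ~ 1e-27 g) localized to the size of a drop (~1e-4 cm).
delta_p, delta_v = velocity_uncertainty(mass_g=1.0e-27, delta_x_cm=1.0e-4)
print(f"delta_p ~ {delta_p:.0e} g cm/sec, delta_v ~ {delta_v:.0e} cm/sec")
```

The resulting velocity uncertainty is to be compared with the velocity of the electrons that actually leave tracks.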

Heisenberg’s uncertainty relation written in the form

$$\Delta v_x\, \Delta x \gtrsim \frac{\hbar}{m} \qquad (4.4)$$

shows that the concepts of classical physics turn out to be applicable with a
degree of accuracy which is higher, the larger the mass of the particle. In view
of the smallness of the quantum constant $\hbar$, the uncertainties in the values of
the coordinate and velocity become negligibly small for particles of a macroscopically
small but still not atomic size.

Let us, for example, have a body of the size of about 1 micron and with a
mass of only $10^{-10}$ g. Then (4.4) gives $\Delta v_x\, \Delta x \sim 10^{-17}$ cm$^2$ sec$^{-1}$. If, for
example, the position of the body is determined with an accuracy of $10^{-6}$ cm
(1/100 of its size), then $\Delta v_x \sim 10^{-11}$ cm sec$^{-1}$. The velocity of the Brownian
motion of a particle of a mass of $10^{-10}$ g amounts to ~$10^{-4}$ cm sec$^{-1}$. We
see that the error in the velocity, which is associated with the uncertainty relation,
is already negligibly small for such a small body. Even more so is it of
no importance for macroscopic bodies.
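The dependence of the estimate (4.4) on the mass can be tabulated directly. The sketch below assumes ħ ≈ 10⁻²⁷ erg sec and takes the mass of the micron-sized body as 10⁻¹⁰ g, the value consistent with the product of about 10⁻¹⁷ cm² sec⁻¹ quoted for it; the 1 g entry and the function name are illustrative additions.

```python
HBAR = 1.0e-27  # erg*sec, rounded (assumption)


def min_velocity_spread(mass_g, delta_x_cm, hbar=HBAR):
    """Lower estimate of delta_v from relation (4.4): delta_v ~ hbar / (m * delta_x)."""
    return hbar / (mass_g * delta_x_cm)


DELTA_X = 1.0e-6  # cm, accuracy of the position determination used in the example

# (label, mass in grams); the 1 g body is a hypothetical macroscopic comparison.
bodies = [("electron", 1.0e-27), ("micron-sized body", 1.0e-10), ("1 g body", 1.0)]
for label, mass in bodies:
    print(f"{label:>18}: delta_v ~ {min_velocity_spread(mass, DELTA_X):.0e} cm/sec")
```

For the micron-sized body the result lies many orders of magnitude below the Brownian velocity quoted above, while for the electron at the same accuracy it is far from negligible.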

The estimates given illustrate a general important proposition of quantum 
mechanics which is called the correspondence principle: in passing to the 
limit $\hbar \to 0$, i.e. in assuming that the effects proportional to the quantum
constant can be disregarded, the laws and relations of quantum mechanics go
over into the corresponding laws and relations of classical mechanics. (For
more detail on the transition to classical mechanics see ch. 5.) In particular,
for particles of large mass the ratio $\hbar/m$ is so small that in practice the coordinate
and velocity have definite values. Such a particle has a trajectory along
which it moves in accordance with the laws of classical mechanics. The importance
of the correspondence principle lies in the fact that it serves as a method
of finding the quantum-mechanical analogues of classical quantities. Quantum
mechanics implies classical mechanics as a certain limiting case corresponding
to $\hbar \to 0$ (for other conditions of this transition see ch. 5). From the correspondence
principle it is possible to establish the relation between certain
quantum-mechanical quantities and the concepts of classical mechanics. 

Besides the reasoning presented above, the uncertainty relations are often 
obtained from a discussion of the possible degree of accuracy of the determi- 
nation of the coordinate and momentum of a microparticle in different exper- 
iments which are in principle feasible. We shall not dwell on the analysis of 
these examples, because a strict derivation of the uncertainty relations will be 
given in §24. 




It should be noted that if the region of possible motion of a microparticle
is given, for example the size $l$ of the atom or the nucleus, then the uncertainty
relations make it possible to qualitatively estimate the values of its momentum
and energy. Indeed, the absolute value of the momentum is of the same
order of magnitude as its uncertainty $\Delta p \sim \hbar/l$. Consequently, $p \gtrsim \hbar/l$, and
the energy of the particle is

$$E = \frac{p^2}{2m} \gtrsim \frac{\hbar^2}{2ml^2}. \qquad (4.5)$$

We see that the energy increases with decreasing region of localization. For
example, for an electron in the atom $l$ is of the order of the size of the atom,
i.e. of the order of $10^{-8}$ cm. Substituting this value into (4.5), we find the
energy of the electron in the atom to be $E \gtrsim 10$ eV. This is of the correct
order of magnitude.

Further, let us consider a proton or a neutron in a nucleus. The size of the
nucleus is of the order of $10^{-12}$ cm. Setting $l \sim 10^{-12}$ cm and taking into
account that the mass of the nucleon is $m \sim 10^{-24}$ g, we estimate the energy
$E$ to be $E \gtrsim 1$ MeV. This estimate is also in agreement with experimental data.
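The two estimates can be evaluated from (4.5) as follows. This is a sketch in CGS units, assuming ħ ≈ 1.05 × 10⁻²⁷ erg sec, 1 eV ≈ 1.602 × 10⁻¹² erg and the rounded masses 10⁻²⁷ g and 10⁻²⁴ g; the function name is hypothetical.

```python
HBAR = 1.05e-27        # erg*sec (assumed value)
ERG_PER_EV = 1.602e-12  # erg per electron-volt (assumed value)


def localization_energy_ev(mass_g, size_cm):
    """Estimate E ~ hbar^2 / (2 m l^2) from (4.5), converted from erg to eV."""
    return HBAR ** 2 / (2.0 * mass_g * size_cm ** 2) / ERG_PER_EV


electron_ev = localization_energy_ev(mass_g=1.0e-27, size_cm=1.0e-8)
nucleon_ev = localization_energy_ev(mass_g=1.0e-24, size_cm=1.0e-12)
print(f"electron in atom:   ~{electron_ev:.1f} eV")
print(f"nucleon in nucleus: ~{nucleon_ev / 1.0e6:.2f} MeV")
```

With these inputs the formula gives a few eV for the electron and a few tenths of an MeV for the nucleon. Since numerical factors of order unity are not controlled in such an estimate, this agrees with the rounded figures in the text to within an order of magnitude.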


§5. The principle of causality in quantum mechanics 


We have seen in the preceding section that the concepts of classical physics 
are applicable to microparticles only within certain limits. The question natu- 
rally arises: why can and must we, as a matter of fact, describe the motion of 
microparticles in terms of classical physics? The necessity of introducing 
classical concepts into quantum mechanics is associated with the following 
important fact: the explanation of the properties and laws of motion of 
micro-objects is possible only by setting them in interaction with macros- 
copic bodies. A macroscopic body interacting with microparticles is called an 
apparatus. The process of interaction between the apparatus and a microparti- 
cle is called measurement. 

Of course, the apparatus in this sense of the word is not necessarily an 
artificial device for registering the properties of microparticles. The apparatus 
is any body which can change its state as a result of an interaction with 
micro-objects and which is described with a sufficient degree of accuracy by 
the laws of classical physics. The process of interaction of the apparatus with 
a microparticle (a measurement) is an objective process taking place in space 
and time. However, since any scientific information can only be based on the 
fact and character of the interaction mentioned, all characteristics of micro- 







particles must be directly connected with the properties of their interaction 
with macroscopic bodies. This just means that the description of microparti- 
cles must necessarily imply, if only partially, the concepts of classical physics. 
Of course, there may also exist characteristics and properties of micropar- 
ticles which manifest themselves in interactions with apparatus but have 
no classical analogue. We shall see, for example in ch. 8, that such a character- 
istic is the spin of microparticles. 

The interaction between microparticles and macroscopic bodies differs
essentially, of course, from the interaction of macroscopic bodies with each
other. Namely, for the interaction between two macroscopic bodies, one of 
which is playing the role of the apparatus, one can always assume that the 
reaction of the apparatus on the body is as small as one wishes or, if one likes, 
one can take it accurately into account. Hence it is said that the effect of the 
apparatus does not change the state of the macroscopic object. 

The situation is different in the case of the interaction between physical 
objects of different natures, i.e. a microparticle and a macroscopic body (the 
apparatus). Here it is, in principle, impossible to assume that the effect of the 
apparatus on the microparticle is small and unimportant. Let us consider a 
simple example. We assume that electrons are let successively through a slit in 
a screen. The screen with the slit is a macroscopic body (the apparatus) which 
measures the y-coordinate of an electron with an accuracy $\Delta y$, where $\Delta y$ is
the width of the slit.

The state of all the electrons before the interaction was the same. Let, for
example, electrons with a definite direction of the momentum $\mathbf{p}$ (e.g. along
the x-axis) be incident on the apparatus. Here $p_y = 0$. The state of the apparatus
before the interaction is also defined, but in a macroscopic way. In the
process of interaction of the apparatus with the electron the latter is localized
in the region $\Delta y$ defined by the size of the opening in the screen. Then the
state of the electron essentially changes. The electron passes from a state with
the definite momentum component $p_y = 0$ to a state in which the momentum
component $p_y$ has a value lying in the interval $\Delta p_y \sim \hbar/\Delta y$. Indeed, as we
know, diffraction occurs when electrons pass through a slit and the electrons
get a momentum component along the y-axis. If we let electrons successively
through the slit and measure the values of their momentum component $p_y$,
then we shall obtain all possible values of $p_y$ which lie in the interval $\Delta p_y$.

Thus we see that the effect of the apparatus on the electron changes the
state of the latter and in principle cannot be made small. Although before
the measurement the micro-object and the apparatus were in a definite state,
the result of the interaction with the apparatus is not single-valued: we obtain
a state with an indefinite value of the momentum component $p_y$. We can only
find the probability of any one value of this quantity.




As a result of carrying out a successive series of measurements, no matter 
how large, we would not obtain a more accurate value of $p_y$, but only a more
accurate expression for the distribution of the probabilities of different values 
of this quantity. If the micro-object were in given external conditions, then 
it would nevertheless be impossible to predict accurately the result of the 
measurement. One can speak only of the probability distribution of the 
results of the measurements. This is not associated with any shortcomings of 
the theory but with the very nature of microparticles. Hence it follows that 
the principle of mechanical determinism does not characterize the properties 
of microparticles.

A given initial value of a certain quantity and a definite law of interaction 
do not unambiguously determine the measured value of this quantity for the 
microparticle at subsequent instants of time. Thus the behaviour of an indivi- 
dual microparticle, and not just the behaviour of a set of microparticles, is 
determined by laws of a statistical type. The law of causality for a microparticle
takes the following character. Let the state of a particle be known at the
initial instant of time $t = 0$. This means that its wave function $\psi(\mathbf{r}, 0)$ is
known. If all interactions which the microparticle undergoes are known, then,
as we shall see below (see §6), its wave function can be determined unambiguously
at subsequent instants of time $t > 0$. It follows from the meaning of
the wave function that we can thereby predict the probabilities (see §21) that
the quantities characterizing the particle (the coordinate, momentum, energy
and others) will have particular values at any instant of time $t > 0$.

The principle of causality formulated in quantum mechanics in this way is 
of considerably more general character than the dynamical regularity (the 
Laplace determinism) of classical mechanics*. 


* The reader may find in the work of V.A. Fok a more detailed consideration of the 
problems which are touched upon in this section. The interpretation of quantum mecha- 
nics may be found in the collection of papers Filozoficheskie voprosy sovremenoi fiziki 
(Philosophic problems of contemporary physics) published by the Academy of Sciences 


of the USSR in 1959. See also N. Bohr, Atomic physics and human knowledge (Wiley, 
New York, 1958). 











The Schrödinger Equation


§6. The Schrödinger wave equation


In §2 we established the form of the wave function describing the motion 
of a free particle with a given value of the momentum. This wave function 
had the form of a plane de Broglie wave. We now turn to the consideration of 
the motion of particles in external fields of force. For this it is necessary to 
find the wave function describing the motion of a particle in a given field of 
force. It turns out that it is possible to establish the form of the differential 
equation satisfied by the wave function. One can find the wave function itself 
from the solution of this equation. It should be noted, first of all, that the 
equation for the wave function must be linear. Indeed, functions satisfying 
non-linear equations obviously do not meet the requirements of the principle 
of superposition. Further, it is clear that the wave function we already know, 
which describes the motion of a free particle, must be the solution of the 
required differential equation in the particular case where the field is absent. 
Finding the linear differential equation satisfied by a plane de Broglie wave 


$$\psi(x, t) = A e^{(i/\hbar)(\mathbf{p}\cdot\mathbf{r} - Et)} \qquad (6.1)$$




presents no difficulty. For this we note that 





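$$\frac{\partial\psi}{\partial t} = -\frac{i}{\hbar} E\psi.$$

Furthermore,

$$\frac{\partial^2\psi}{\partial x^2} + \frac{\partial^2\psi}{\partial y^2} + \frac{\partial^2\psi}{\partial z^2} = -\frac{1}{\hbar^2}\left(p_x^2 + p_y^2 + p_z^2\right)\psi.$$

Taking into account that for a free particle

$$E = \frac{p_x^2 + p_y^2 + p_z^2}{2m}, \qquad (6.2)$$

we find

$$\frac{\partial\psi}{\partial t} = \frac{i\hbar}{2m}\left(\frac{\partial^2\psi}{\partial x^2} + \frac{\partial^2\psi}{\partial y^2} + \frac{\partial^2\psi}{\partial z^2}\right).$$

This equation is usually written in the form

$$i\hbar\frac{\partial\psi}{\partial t} = -\frac{\hbar^2}{2m}\nabla^2\psi. \qquad (6.3)$$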


This linear partial differential equation is called the Schrödinger
equation. It does not contain any characteristics of the state of the particle,
for example the value of its momentum or energy. It involves only the
mass of the particle as well as the universal constant $\hbar$. Equation (6.3) is evidently
satisfied not only by a wave function of the form (6.1), which represents
the wave function of a particle with a given value of the momentum, but
also by any superposition of such wave functions.
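That a superposition of plane waves satisfies eq. (6.3) can be checked numerically. The sketch below is restricted to one dimension, assumes units with ħ = m = 1, and uses central finite differences for the derivatives; the amplitudes and wave numbers in `COMPONENTS` are arbitrary illustrative values.

```python
import cmath

HBAR = 1.0  # assumption: units in which hbar = m = 1
MASS = 1.0

# (complex amplitude c, wave number k) for each plane wave; illustrative values.
COMPONENTS = ((1.0, 0.7), (0.5 - 0.3j, -1.9), (0.2j, 3.1))


def superposition(x, t):
    """Sum of one-dimensional plane waves c * exp(i(kx - w t)), w = hbar k^2 / (2 m)."""
    total = 0j
    for c, k in COMPONENTS:
        omega = HBAR * k * k / (2.0 * MASS)
        total += c * cmath.exp(1j * (k * x - omega * t))
    return total


def schrodinger_residual(x, t, h=1.0e-3):
    """|i hbar dpsi/dt + (hbar^2 / 2m) d2psi/dx2| evaluated by central differences."""
    dpsi_dt = (superposition(x, t + h) - superposition(x, t - h)) / (2.0 * h)
    d2psi_dx2 = (superposition(x + h, t) - 2.0 * superposition(x, t)
                 + superposition(x - h, t)) / (h * h)
    return abs(1j * HBAR * dpsi_dt + HBAR ** 2 / (2.0 * MASS) * d2psi_dx2)


print(schrodinger_residual(0.3, 0.8))  # small: limited only by the finite-difference error
```

The residual vanishes only because each component obeys $\omega = \hbar k^2/2m$; changing the dispersion in `superposition` leaves a residual of order unity.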

The Schrodinger equation possesses the feature that it is an equation of the 
first order in time and contains the factor i. The latter means that the wave 
function must be complex. 

We note that it seems that a function expressed by a real relation, for
example in the form of a travelling wave $\psi = A\cos[\hbar^{-1}(\mathbf{p}\cdot\mathbf{r} - Et)]$, could be
chosen as the wave function of a free particle. However, we could not then
construct any equation of the first order in time whose solution would be an
arbitrary superposition of such functions. The fact that the Schrödinger equation
contains only the first derivative of the wave function with respect to
time is closely associated with the expression of the principle of causality in
quantum mechanics (see §5). Indeed, if the Schrödinger equation contained,
for example, the second derivative of the wave function with respect to time, 
then the knowledge of the wave function at the initial instant of time would 







be insufficient for determining the wave function at an arbitrary instant of
time $t$. Namely, it would also be necessary to give the value of the first derivative
of the wave function with respect to time for the initial instant of time.

Among the solutions of eq. (6.3) there are solutions depending harmonic- 
ally on the time 


$$\psi(x, t) = \psi(x)\, e^{-(i/\hbar)Et}. \qquad (6.4)$$


Substituting (6.4) into (6.3), we obtain an equation for a function which 
depends only on the coordinates of the particle 





$$\nabla^2\psi(x) + \frac{2m}{\hbar^2} E\,\psi(x) = 0. \qquad (6.5)$$

This equation defines the function $\psi(x)$ for a free particle. Let us generalize
eq. (6.5) to the case of a particle moving in a field of force. This generalization
is based on the following assumption: the energy $E$ involved in eq. (6.5)
represents the kinetic energy of the particle. Indeed, for free motion the kinetic
energy is the same as the total energy. If in the required generalization the
energy $E$ occurring in eq. (6.5) is assumed to be the total energy, then the
wave function describing the motion of electrons in a field of force will not
depend on the forces acting on the particle. However, this would make no
sense. Thus we arrive at the conclusion that $E$ in eq. (6.5) must be understood
to be the kinetic energy of the particle. Denoting the potential energy of the
particle by $U(x)$, and the total energy by $E$, we get

$$\nabla^2\psi(x) + \frac{2m}{\hbar^2}\,[E - U]\,\psi(x) = 0. \qquad (6.6)$$

Equation (6.6) represents the required generalization of the Schrödinger
equation to the case of a particle moving in an arbitrary potential field which
does not depend on time. This equation only determines the dependence of
the wave function on the coordinates, while the dependence on time is determined
as before by the relation (6.4).

Equation (6.6) is called the Schrödinger equation for stationary states.
Indeed, the probability density of the measurement of the coordinates of a
particle in a state (6.4) does not depend on the time

$$|\psi(x, t)|^2 = |\psi(x, 0)|^2. \qquad (6.7)$$


In §28 it will be shown that the probabilities of the measurement of other 
physical quantities in a state (6.4) also do not depend on the time. 
Substituting the time-derivative term $i\hbar\,\partial\psi/\partial t$ for the quantity $E\psi$
by means of (6.4), we arrive at the general Schrödinger wave equation

$$i\hbar\frac{\partial\psi}{\partial t} = -\frac{\hbar^2}{2m}\nabla^2\psi + U\psi, \qquad (6.8)$$

where the wave function $\psi$ depends on the coordinates x, y, z and the time t.

Equation (6.8) is the fundamental equation of quantum mechanics. It 
plays in quantum mechanics the same role as Newton’s equations in classical 
mechanics, and could be called the equation of motion of a quantum particle. 
To define the law of motion of a particle in quantum mechanics means to 
define the value of the $\psi$-function at every instant of time and at every point
of space. 

It should be noted that the above reasoning is not a derivation of the
Schrödinger equation in the strict sense of the word. Like Newton’s and
Maxwell’s equations, the Schrödinger equation appeared, on the one hand, as
the generalization of known experimental data and, on the other hand, as a
great scientific prediction.

We shall see later how the discreteness of energy levels follows from the
Schrödinger equation. Also it will become clear that the Schrödinger equation
satisfies the correspondence principle. The validity of the Schrödinger equation
and the interpretation of the meaning of the wave function involved in it
are confirmed by a vast amount of experimental data from contemporary
atomic and nuclear physics. In order to obtain the law of motion of a particle,
i.e. the wave function $\psi(x, t)$, the initial and boundary conditions must be given
in addition to the Schrödinger equation. Since the Schrödinger equation is an
equation of the first order in time, it is necessary to know the initial value of
the wave function $\psi(x, 0)$.

The set of boundary conditions in general amounts to the requirement of
single-valuedness and continuity of the wave function and its first derivatives,
and to the fulfillment of certain normalization conditions. The latter is usually
a boundedness condition on the modulus of the wave function. The whole set
of the initial condition and the conditions of single-valuedness, continuity
and finiteness of the wave function and of its first derivatives makes it possible,
in principle, to find a unique solution of the Schrödinger equation: the
wave function $\psi(x, t)$. In other words, if the initial value of the wave function
is given, then from the solution of the Schrödinger equation one can determine
unambiguously the state of the quantum system for any subsequent
instant of time $t > 0$. Namely, for $t > 0$ one can find the wave function of the
system $\psi(x, t)$.

We shall see in §23 that by defining $\psi(x, t)$ the quantum particle is characterized
as fully as, for example, the particle in classical mechanics is by defining
its trajectory. We note in addition that in certain problems of quantum
mechanics it is convenient to approximate the potential energy by a discon- 
tinuous function. At the point of discontinuity of the potential energy the 
wave function and its first derivatives must remain continuous. The derivative 
of the wave function undergoes a jump only at the surface of an infinitely 
large discontinuity of the potential energy. 

The Schrödinger equation, as well as the equations of motion of classical 
mechanics, allows ‘reversal in time’. 

Indeed, eq. (6.8) does not change when the transformation $t \to -t$ and the
transition to the complex conjugate function $\psi^*$ are made. Consequently, a
process reversed in time is described by the wave function $\psi_{rev}(x, t)$

$$\psi_{rev}(x, t) = \psi^*(x, -t). \qquad (6.9)$$


We note that for motion in a magnetic field reversal in time takes place only 
when the direction of the magnetic field is also reversed (see §27). We shall 
consider the problem of time reversal in more detail in §98. 


§7. The probability-current density 


Generally speaking the wave function describing the motion of a particle 
changes in space and time. However, this.change cannot be arbitrary. 

That is, a conservation law holds. To formulate this law, we consider the 
integral fylY|?dV, which represents the probability of finding the particle in 
the volume V. Proceeding in the same way as in deriving the law of charge 
conservation (see §5 of Part I), we find the derivative of the above integral 
with respect to time. To calculate aW/dt and dW*/dr we make use of the 
Schrodinger equation (6.8) and its conjugate equation. Then we obtain 


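$$\frac{\partial}{\partial t}\int_V |\psi|^2\, dV
= \int_V \left(\psi^*\frac{\partial\psi}{\partial t} + \psi\frac{\partial\psi^*}{\partial t}\right) dV
= \frac{\hbar}{2mi}\int_V \left(\psi\nabla^2\psi^* - \psi^*\nabla^2\psi\right) dV
= \frac{\hbar}{2mi}\int_V \nabla\cdot\left(\psi\nabla\psi^* - \psi^*\nabla\psi\right) dV. \tag{7.1}$$

Making use of the Gauss-Ostrogradsky theorem, we have

$$\int_V \nabla\cdot\left(\psi\nabla\psi^* - \psi^*\nabla\psi\right) dV
= \oint_S \left(\psi\nabla\psi^* - \psi^*\nabla\psi\right)\cdot d\mathbf{S},$$

where the surface $S$ bounds the volume $V$. Hence

$$\frac{\partial}{\partial t}\int_V |\psi|^2\, dV
= \frac{\hbar}{2mi}\oint_S \left(\psi\nabla\psi^* - \psi^*\nabla\psi\right)\cdot d\mathbf{S}. \tag{7.2}$$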


We introduce the vector $\mathbf{j}$ defined by the relation

$$\mathbf{j} = \frac{\hbar}{2mi}\left(\psi^*\nabla\psi - \psi\nabla\psi^*\right). \tag{7.3}$$


Then (7.2) is rewritten in the form

$$\frac{\partial}{\partial t}\int_V |\psi|^2\, dV = -\oint_S \mathbf{j}\cdot d\mathbf{S}. \tag{7.4}$$


Formula (7.4) shows that the probability density satisfies a conservation law, 
and the vector j we introduced has the meaning of the probability current 
density. Relation (7.4) can be rewritten in differential form as the continuity
equation

$$\frac{\partial |\psi|^2}{\partial t} + \nabla\cdot\mathbf{j} = 0. \tag{7.5}$$


The integral of the normal component of the vector j with respect to a parti- 
cular surface represents the probability that the particle will cross the surface 
mentioned in unit time. 

Let us consider, in particular, free motion. We take the wave function in
the form of the plane wave $\psi = A e^{(i/\hbar)(\mathbf{p}\cdot\mathbf{r} - Et)}$.
Making use of the relation (7.3), we obtain

$$\mathbf{j} = \frac{\mathbf{p}}{m}\,|A|^2. \tag{7.6}$$

We now apply the relation (7.4) to all space, i.e. we assume the surface $S$
to be infinitely distant. If $\psi$ is a quadratically integrable function, then the
integrand in the integral with respect to the surface decreases more rapidly
than $r^{-4}$, and the surface of integration increases in proportion to $r^2$. As a
result the integral over the surface in (7.4) reduces to zero. But if $\psi$ does not
tend to zero in the way mentioned when $r \to \infty$, as, for example, in the case of
a plane wave, then there is a current of particles at infinity. If this current is
stationary, then the wave function can be normalized in such a way that the
vector $\mathbf{j}$ is the particle-current density vector.

Finally, we note from formula (7.3) that the current density $\mathbf{j}$ obviously
reduces to zero if the state of the system is described by a real wave
function $\psi$.

The relation (7.5) written in the form

$$\frac{\partial\rho}{\partial t} + \nabla\cdot\mathbf{j} = 0, \qquad \rho = |\psi|^2, \tag{7.7}$$

can be interpreted as the law of conservation of the number of particles (see
§5 of Part I).
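The two statements above, eq. (7.6) for a plane wave and the vanishing of the current for a real wave function, can be checked numerically. The following is only an illustrative sketch: it works in one dimension, uses units with $\hbar = m = 1$, and approximates the derivative in (7.3) by a central difference. The function names are hypothetical and not part of the text.

```python
import cmath
import math

HBAR = 1.0  # assumption: units in which hbar = 1
MASS = 1.0  # assumption: unit particle mass


def current_1d(psi, x, h=1e-5):
    """One-dimensional form of eq. (7.3):
    j = (hbar / 2mi) (psi* dpsi/dx - psi dpsi*/dx)."""
    dpsi = (psi(x + h) - psi(x - h)) / (2 * h)  # central difference
    value = psi(x).conjugate() * dpsi - psi(x) * dpsi.conjugate()
    return (HBAR / (2j * MASS) * value).real


def plane_wave(p, amplitude):
    """psi = A exp(i p x / hbar), taken at a fixed instant of time."""
    return lambda x: amplitude * cmath.exp(1j * p * x / HBAR)


def real_wave(p, amplitude):
    """A real (standing) wave, psi = A cos(p x / hbar)."""
    return lambda x: complex(amplitude * math.cos(p * x / HBAR))


j_plane = current_1d(plane_wave(2.0, 0.5), 0.3)  # expected close to (p/m)|A|^2
j_real = current_1d(real_wave(2.0, 0.5), 0.3)    # expected close to zero
print(j_plane, j_real)
```

For the plane wave the result should agree with $(p/m)|A|^2$ up to the discretization error of the finite difference.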


§8. A particle in a one-dimensional rectangular potential well 


Before turning to the consideration of real atomic systems we shall discuss 
the general properties of the solutions of the Schrödinger equation by the use 
of some simple models. Let us consider first of all the one-dimensional motion 
of a particle in a potential field defined as follows: 


$$U(x) = \begin{cases} 0 & \text{for } 0 < x < l, \\ \infty & \text{for } x < 0 \text{ and } x > l. \end{cases}$$

We shall call such a potential field an infinitely deep potential well. It is clear
that in such a well a particle can move only in the region of space $0 < x < l$.

At the boundary of the well the particle is acted upon by arbitrarily large 
forces which prevent it from getting out, so that the particle behaves as if 
confined to a region of space bounded by perfectly reflecting walls. It turns 
out that by such a simple example one can establish a number of properties 
of quantum-mechanical systems. It is important that these properties are not 
associated with the model but are of a general character. Furthermore, inter- 
est in this problem is also due to the fact that a model of a potential well is 
often successfully used for a rough description of a number of systems, for 
example electrons in a metal or nucleons in a nucleus. 

The solution of the Schrödinger equation must be written for two regions:
outside the potential well and inside it. Since the particle cannot be outside
the potential well, its wave function is equal to zero outside the interval
$0 < x < l$. From the condition of continuity it follows that the wave function
is also equal to zero at the points $x = 0$ and $x = l$, i.e. that

$$\psi(0) = \psi(l) = 0. \tag{8.1}$$

The requirement (8.1) is the boundary condition for the solution of the
Schrödinger equation inside the potential well. In the region $0 < x < l$ the
Schrödinger equation for stationary states (6.6) is of the form

$$-\frac{\hbar^2}{2m}\frac{d^2\psi}{dx^2} = E\psi. \tag{8.2}$$


The solution of this equation can evidently be written as

$$\psi = A\sin(kx + \alpha), \tag{8.3}$$

where $k = (2mE/\hbar^2)^{1/2}$. We now make use of the boundary conditions (8.1).
From the relation $\psi = 0$ for $x = 0$ it follows that $\alpha = 0$. The condition
$\psi(l) = 0$ gives

$$kl = n\pi, \tag{8.4}$$


where $n$ is any integer larger than zero. For $n = 0$ we would have $\psi = 0$, which
would mean the absence of the particle in all space. The condition (8.4)
makes it possible to find the possible values of the energy of the particle:

$$E_n = \frac{\pi^2\hbar^2}{2ml^2}\,n^2. \tag{8.5}$$
We see that the Schrödinger equation has solutions satisfying the boundary
conditions only for discrete values of the quantum number $n$. Thus the energy
of the particle in the infinitely deep potential well turns out to be quantized.
The discreteness of the energy arose in a natural way, without any subsidiary
assumptions. In the case given it turned out to follow directly from the
boundary conditions imposed upon the wave function at the limits of the
integration range. The state of the particle which has the lowest possible
energy will henceforth be called the normal or ground state, while all other
states will be called excited states. The energy of the ground state of a particle
in an infinitely deep potential well is obtained from formula (8.5) for $n = 1$:

$$E_1 = \frac{\pi^2\hbar^2}{2ml^2}. \tag{8.6}$$
We note that the minimum energy value of the particle is consistent with the
uncertainty principle. Indeed, the uncertainty in the coordinate of the
particle is $\Delta x \sim l$. The uncertainty in the momentum, $\Delta p$, is of the
order of $\hbar/l$. Since $p \gtrsim \Delta p$, the minimum energy of the particle turns
out to be of the order of

$$\frac{p^2}{2m} \sim \frac{\hbar^2}{2ml^2},$$

which is in order of magnitude the same as (8.6).
Let us now determine the spacing between neighbouring energy levels
($\Delta n = 1$):

$$\Delta E_n = E_{n+1} - E_n = \frac{\pi^2\hbar^2}{2ml^2}\,(2n + 1).$$


The spacing between levels increases with decreasing mass of the particle and
size of the region of its motion $l$. Thus, for example, for an electron
($m \sim 10^{-27}$ g) confined in a region $l \sim 5\times10^{-8}$ cm we find
$\Delta E \sim 1$ eV. On the contrary, in the case of a molecule with
$m \sim 10^{-23}$ g moving, for example, in a region $l \sim 10$ cm, the spacing
between levels amounts to $\Delta E \sim 10^{-20}$ eV. This spacing is so small, for
example in comparison with $kT = 0.025$ eV, that in practice the energy of a
molecule can be considered to be a continuously varying quantity.
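The orders of magnitude quoted above can be reproduced directly from the expression for $\Delta E_n$. The sketch below assumes CGS units with rounded constants ($\hbar \approx 1.05\times10^{-27}$ erg s, 1 eV $\approx 1.6\times10^{-12}$ erg) and takes $n = 1$; the function name is hypothetical.

```python
import math

HBAR = 1.05e-27       # erg s (rounded CGS value, an assumption of this sketch)
ERG_PER_EV = 1.6e-12  # erg per electron-volt (rounded)


def level_spacing_ev(n, mass_g, length_cm):
    """Delta E_n = pi^2 hbar^2 (2n + 1) / (2 m l^2), converted to eV."""
    e1_erg = math.pi ** 2 * HBAR ** 2 / (2 * mass_g * length_cm ** 2)
    return e1_erg * (2 * n + 1) / ERG_PER_EV


electron = level_spacing_ev(1, 1e-27, 5e-8)   # electron in an atomic-size region
molecule = level_spacing_ev(1, 1e-23, 10.0)   # molecule in a macroscopic vessel
print(electron, molecule, molecule / 0.025)   # last value: spacing relative to kT
```

The electron spacing comes out at a few eV, the molecular one some twenty orders of magnitude smaller, in line with the estimates in the text.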

Let us find the ratio $\Delta E_n/E_n$, i.e. the relative spacing between energy
levels. We see that $\Delta E_n/E_n \sim n^{-1}$ and tends to zero for very large $n$. The
discreteness of quantum states is no longer significant for large quantum
numbers and in fact a transition to a quasi-continuous variation of the energy
takes place.

Let us consider in somewhat more detail the properties of the wave function
of a particle in a potential well. The wave function corresponding to the
$n$th energy level is of the form

$$\psi_n = A_n \sin\frac{n\pi}{l}x. \tag{8.7}$$


We define the constant $A_n$ by the normalization condition

$$\int_0^l |\psi_n|^2\, dx = 1.$$

Then

$$|A_n|^2 \int_0^l \sin^2\frac{n\pi}{l}x\, dx
= |A_n|^2 \int_0^l \frac{1}{2}\left(1 - \cos\frac{2n\pi}{l}x\right) dx
= |A_n|^2\,\frac{l}{2} = 1.$$

Hence

$$A_n = (2/l)^{1/2}. \tag{8.8}$$

Thus the value of the constant does not depend on the quantum number $n$.

The probability density $|\psi|^2$ of finding the particle at different points
inside the well is illustrated in fig. V.2. In classical mechanics a particle
moving in a potential well can be found with equal probability at any point
inside the well (straight line in fig. V.2).

Fig. V.2

Indeed, the probability $dW_{\mathrm{class}}$ of observing the particle in an interval
$dx$ is proportional to the time $dt$ of the particle's being in this interval:

$$dW_{\mathrm{class}} \sim dt = \frac{1}{v}\,dx.$$

Since a particle inside a well is not acted upon by any forces it moves with
constant velocity $v$ and, consequently, $dW_{\mathrm{class}}$ does not depend on $x$. As the
quantum number $n$ (the energy of the particle) increases the maxima of the
probability distribution tend to approach each other. In the limit $n \to \infty$ the
probability distribution obtained from the quantum-mechanical calculation
leads to the same results as the classical distribution. This follows from the
fact that the function $\sin^2(n\pi x/l)$ rapidly oscillates as $x$ changes, and in
integration over any finite interval can be replaced by $\frac{1}{2}$. Thus consideration of
the simplest quantum-mechanical system leads us to the following conclusions
which, as we shall see later, are of a general character:

(1) the energy of a microparticle moving in a potential well runs over a dis- 
crete sequence of values; 

(2) even for $E = E_1$ (ground state) the particle is not in a state of complete
rest with kinetic energy equal to zero;

(3) the discrete character of the energy levels manifests itself when the mass 
of the particle and the size of the region in which the motion takes place are 
small; 

(4) for large values of the quantum numbers the quantum-mechanical rela- 
tions go over into the formulae of classical physics. This statement is a parti- 
cular case of the correspondence principle which we shall frequently 
encounter. 
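Conclusion (4) can be made concrete for the infinitely deep well. Integrating $|\psi_n|^2 = (2/l)\sin^2(n\pi x/l)$ over an interval $[a, b]$ inside the well gives a classical part $(b - a)/l$ plus an oscillating correction that falls off as $1/n$. The sketch below evaluates this closed form; the function name and the choice $l = 1$ are assumptions made for illustration.

```python
import math


def probability_in_interval(n, a, b, l=1.0):
    """Integral of (2/l) sin^2(n pi x / l) over [a, b], using (8.7) and (8.8).

    The first term is the classical (uniform) result; the second is the
    quantum correction, bounded in magnitude by 1 / (n pi).
    """
    oscillating = (math.sin(2 * n * math.pi * b / l)
                   - math.sin(2 * n * math.pi * a / l)) / (2 * n * math.pi)
    return (b - a) / l - oscillating


for n in (1, 10, 1000):
    # Probability of finding the particle in the first 30% of the well.
    print(n, probability_in_interval(n, 0.0, 0.3))
```

As $n$ grows the printed values approach the classical value 0.3.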


Later, in considering the quantum oscillator or atomic systems, we shall
see that quantization of states can take place even in systems which are not
confined by impenetrable walls. At the same time we shall see that the
presence of discrete energy states is not a necessity for quantum-mechanical
systems. In certain cases quantum-mechanical systems have a continuous energy
spectrum.


§9. A particle in a three-dimensional rectangular potential well 


Let us now consider the more complex case of the motion of a particle in
a three-dimensional infinitely deep potential well. We shall assume that the
region of space in which the particle moves is defined by the inequalities
$0 < x < l_1$, $0 < y < l_2$ and $0 < z < l_3$. In this case the wave equation can be
written in the form

$$-\frac{\hbar^2}{2m}\left(\frac{\partial^2\psi}{\partial x^2} + \frac{\partial^2\psi}{\partial y^2} + \frac{\partial^2\psi}{\partial z^2}\right) = E\psi. \tag{9.1}$$

The boundary conditions are analogous to (8.1) and have the form

$$\psi(0, y, z) = \psi(x, 0, z) = \psi(x, y, 0) = \psi(l_1, y, z) = \psi(x, l_2, z) = \psi(x, y, l_3) = 0. \tag{9.2}$$
We write the solution of eq. (9.1) as follows:

$$\psi = B \sin k_1 x\, \sin k_2 y\, \sin k_3 z. \tag{9.3}$$

Substituting $\psi$ into the equation, we obtain the relation

$$\frac{\hbar^2}{2m}\left(k_1^2 + k_2^2 + k_3^2\right) = E. \tag{9.4}$$

From the boundary conditions (9.2) it follows that

$$k_1 l_1 = n_1\pi, \qquad k_2 l_2 = n_2\pi, \qquad k_3 l_3 = n_3\pi, \tag{9.5}$$

where $n_1$, $n_2$ and $n_3$ are integers.
Substituting the values of $k_1$, $k_2$ and $k_3$ into (9.4) and (9.3), we obtain the
expressions for the energy and for the wave function:

$$E_{n_1 n_2 n_3} = \frac{\pi^2\hbar^2}{2m}\left(\frac{n_1^2}{l_1^2} + \frac{n_2^2}{l_2^2} + \frac{n_3^2}{l_3^2}\right), \tag{9.6}$$

$$\psi_{n_1 n_2 n_3} = B \sin\frac{\pi n_1 x}{l_1}\, \sin\frac{\pi n_2 y}{l_2}\, \sin\frac{\pi n_3 z}{l_3}. \tag{9.7}$$

The constant $B$ is again defined by the normalization condition

$$\int_V |\psi_{n_1 n_2 n_3}|^2\, dx\, dy\, dz = 1$$

and is equal to

$$B = \left(\frac{8}{l_1 l_2 l_3}\right)^{1/2}. \tag{9.8}$$


Let us consider, in particular, a particle moving in a potential well of cubic
form, i.e. the case $l_1 = l_2 = l_3 = l$. The energy of the particle is equal to

$$E_{n_1 n_2 n_3} = \frac{\pi^2\hbar^2}{2ml^2}\left(n_1^2 + n_2^2 + n_3^2\right). \tag{9.9}$$

From formula (9.9) it is easily seen that one and the same energy value can be
realized by means of different combinations of the numbers $n_1$, $n_2$ and $n_3$.
This means that several different quantum states with different wave functions
correspond to one and the same energy value. Such energy levels are
said to be degenerate, and the number of different states corresponding to a
given energy level is called the multiplicity of degeneracy.

Let us consider, for example, the energy level

$$E = \frac{6\pi^2\hbar^2}{2ml^2},$$

where $n_1^2 + n_2^2 + n_3^2 = 6$. Since each of the $n$'s is an integer larger than zero,
this equality can be satisfied by the three different combinations of the
numbers $n_1$, $n_2$, $n_3$:

(1) $n_1 = 2$, $n_2 = 1$, $n_3 = 1$,
(2) $n_1 = 1$, $n_2 = 2$, $n_3 = 1$,
(3) $n_1 = 1$, $n_2 = 1$, $n_3 = 2$.

Thus to the given energy level there correspond 3 different states $\psi_{211}$, $\psi_{121}$
and $\psi_{112}$. Consequently, the multiplicity of degeneracy is equal to three. In
considering more complex systems, for example atoms, we shall frequently
encounter the phenomenon of degeneracy.
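The counting in this example is easy to automate. The sketch below enumerates all triples of positive integers with a prescribed value of $n_1^2 + n_2^2 + n_3^2$, which by (9.9) fixes the energy level of the cubic well. The function name is hypothetical; the brute-force search is chosen for clarity, not efficiency.

```python
from itertools import product


def states_with_sum_of_squares(total):
    """All (n1, n2, n3) with positive integer entries and
    n1^2 + n2^2 + n3^2 == total, i.e. the states of one level of (9.9)."""
    n_max = int(total ** 0.5) + 1  # no quantum number can exceed sqrt(total)
    return [combo
            for combo in product(range(1, n_max + 1), repeat=3)
            if sum(n * n for n in combo) == total]


for total in (3, 6, 14):
    states = states_with_sum_of_squares(total)
    print(total, len(states), states)
```

For the level considered in the text (sum of squares equal to 6) the list contains exactly the three states $(2,1,1)$, $(1,2,1)$ and $(1,1,2)$, while the ground level (sum equal to 3) is non-degenerate.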


§ 10. The quantum-mechanical oscillator 


Turning to more complex quantum-mechanical systems, we shall consider
the theory of the linear harmonic oscillator. Such an oscillator represents the
quantum analogue of a particle performing small linear oscillations about an
equilibrium position. An example of small oscillations in atomic systems is
the small oscillations of atoms in a molecule (see §41 of Part III).

Another no less important example is the thermal motion of a crystal, 
which amounts to a set of linear harmonic oscillators. We shall also deal with 
the problem of the harmonic oscillator in quantum electrodynamics, where 
an arbitrary electromagnetic field is represented in the form of the superposi- 
tion of independent quantum oscillators (see §101). 

The above examples show that the theory of the linear harmonic oscillator 
is one of the important problems of quantum mechanics. 

The potential energy of a linear harmonic oscillator is given by the
well-known formula $U = \frac{1}{2}m\omega^2 x^2$. Hence the Schrödinger equation (6.6)
for the linear harmonic oscillator has the form

$$-\frac{\hbar^2}{2m}\frac{d^2\psi}{dx^2} + \frac{m\omega^2 x^2}{2}\,\psi = E\psi. \tag{10.1}$$

In solving it, it is convenient to go over to dimensionless variables

$$\xi = \left(\frac{m\omega}{\hbar}\right)^{1/2} x, \qquad \lambda = \frac{2E}{\hbar\omega}. \tag{10.2}$$

In the new notation the Schrödinger equation takes the form

$$-\frac{d^2\psi}{d\xi^2} + \xi^2\psi = \lambda\psi. \tag{10.3}$$

An important difference of the oscillator from the examples considered 
before is the fact that in this case the motion of the particle is not restricted 
by an impenetrable wall. Hence the oscillator has no boundary conditions 
similar to the conditions (8.1). The only requirement imposed upon the wave 
function of the oscillator is the requirement that it should be quadratically 
integrable. We shall see that the Schrödinger equation for the oscillator has a 
solution satisfying this requirement only for certain definite values of the 
parameter $\lambda$. These values are called the eigenvalues of eq. (10.3).

In order to explain the general character of the solutions of this equation,
let us consider the asymptotic behaviour of $\psi(\xi)$ for very large values of the
argument, $\xi^2 \gg \lambda$.

For $\xi^2 \gg \lambda$ in eq. (10.3) one can drop $\lambda\psi$ as compared with $\xi^2\psi$. We then
have, obviously,

$$\frac{d^2\psi}{d\xi^2} - \xi^2\psi = 0. \tag{10.4}$$


The solution of this equation satisfying the requirement of finiteness for
large $\xi$ is the function

$$\psi = A\,\xi^n e^{-\xi^2/2}, \tag{10.5}$$

where $A$ is a constant, and $n$ is an arbitrary finite number.

The second independent solution of eq. (10.4), $\psi \sim e^{+\xi^2/2}$, increases
indefinitely for $\xi \to \infty$ and must be dropped.

We try to find a solution of eq. (10.3) in the form

$$\psi = e^{-\xi^2/2} f(\xi), \tag{10.6}$$

where $f(\xi)$ is a new unknown function which for $\xi \to \infty$ behaves as $\xi^n$.
Substituting (10.6) into (10.3), we arrive at the following equation for the function $f$:

$$\frac{d^2 f}{d\xi^2} - 2\xi\frac{df}{d\xi} + (\lambda - 1) f = 0. \tag{10.7}$$

Since the point $\xi = 0$ is not a singular point of eq. (10.7), we seek a solution
of this equation in the form of a power series

$$f(\xi) = \sum_{k=0}^{\infty} a_k \xi^k. \tag{10.8}$$


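The derivatives $df/d\xi$ and $d^2f/d\xi^2$ have the form

$$\frac{df}{d\xi} = \sum_{k} k\,a_k \xi^{k-1}, \qquad \frac{d^2 f}{d\xi^2} = \sum_{k} k(k-1)\,a_k \xi^{k-2}. \tag{10.9}$$

Substituting the series (10.9) into eq. (10.7), we obtain

$$\sum_{k} k(k-1)\,a_k \xi^{k-2} - 2\xi\sum_{k} k\,a_k \xi^{k-1} + (\lambda - 1)\sum_{k} a_k \xi^k = 0. \tag{10.10}$$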


In order that a power series of the form $\sum_n c_n \xi^n$ be identically equal to zero,
it is necessary that all coefficients $c_n$ reduce to zero. Assuming the coefficient
of $\xi^k$ to be equal to zero, we obtain the recurrence formula

$$a_{k+2} = \frac{2k + 1 - \lambda}{(k+2)(k+1)}\,a_k. \tag{10.11}$$


It is easily seen that for $\xi \to \infty$ such a series behaves as $e^{\xi^2}$, since in this case
large $k$ are essential and (10.11) gives $a_{k+2} \approx (2/k)\,a_k$. In this case the
function $\psi$ of (10.6) increases indefinitely. Such a solution must be eliminated.

We obtain a solution satisfying the necessary conditions of finiteness and
behaving for $\xi \to \infty$ as (10.5) only in the case where the series (10.8) reduces
to a polynomial, i.e. is cut off at a certain term. Thus suppose that $a_n \neq 0$,
$a_{n+2} = 0$. Then all subsequent coefficients also vanish, and the function $f$
reduces to a polynomial of the $n$th degree.

It follows from (10.11) that in this case the condition

$$2n + 1 - \lambda = 0 \tag{10.12}$$

is necessary, where $n$ is an integer, $n \geq 0$, since $n$ is the ordinal number of the
term at which the series ends.

Substituting this value of $\lambda$ into (10.2), we obtain

$$E_n = \hbar\omega\left(n + \tfrac{1}{2}\right). \tag{10.13}$$


Hence it is seen that the energy of the oscillator can only take on discrete
values, and the energy levels are equally spaced at intervals of $\hbar\omega$.
We write the wave function corresponding to the $n$th excited energy level
in the form

$$\psi_n(\xi) = A_n e^{-\xi^2/2} f_n(\xi), \tag{10.14}$$

where $f_n(\xi)$ is a polynomial of the $n$th degree with coefficients which are
defined by the relation (10.11), and $A_n$ is a factor determined by the
normalization condition. The polynomials $f_n(\xi)$ are called the Chebyshev-Hermite
polynomials and are denoted by $H_n(\xi)$. The Chebyshev-Hermite polynomials
are often written in the form

$$H_n(\xi) = (-1)^n e^{\xi^2}\frac{d^n e^{-\xi^2}}{d\xi^n}. \tag{10.15}$$
They satisfy the differential equation

$$\frac{d^2 H_n}{d\xi^2} - 2\xi\frac{dH_n}{d\xi} + 2n H_n = 0, \tag{10.16}$$

which is obtained from (10.7) taking into account condition (10.12).


We give the first five Chebyshev-Hermite polynomials:

$$H_0(\xi) = 1, \quad H_1(\xi) = 2\xi, \quad H_2(\xi) = 4\xi^2 - 2, \quad
H_3(\xi) = 8\xi^3 - 12\xi, \quad H_4(\xi) = 16\xi^4 - 48\xi^2 + 12. \tag{10.17}$$
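The polynomials (10.17) can be regenerated from the recurrence formula (10.11) with $\lambda = 2n + 1$. The sketch below starts the series from $a_0$ (even $n$) or $a_1$ (odd $n$), lets it terminate at $a_n$, and then rescales so that the leading coefficient equals $2^n$, the normalization implied by (10.15). The function name is hypothetical, and the final rounding relies on the assumption that the rescaled coefficients are integers, as they are in (10.17).

```python
def hermite_coefficients(n):
    """Coefficients [a_0, ..., a_n] of H_n, built from recurrence (10.11)
    with lambda = 2n + 1 and scaled so that the leading coefficient is 2^n."""
    lam = 2 * n + 1
    coeffs = [0.0] * (n + 1)
    start = n % 2            # even n: start from a_0; odd n: start from a_1
    coeffs[start] = 1.0      # arbitrary seed, fixed by the rescaling below
    for k in range(start, n - 1, 2):
        coeffs[k + 2] = (2 * k + 1 - lam) / ((k + 2) * (k + 1)) * coeffs[k]
    scale = 2 ** n / coeffs[n]
    return [round(c * scale) for c in coeffs]


for n in range(5):
    print(n, hermite_coefficients(n))
```

Each printed list gives the coefficients in order of increasing power of $\xi$, so that, for instance, the entry for $n = 4$ can be compared term by term with $H_4$ in (10.17).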


Knowing the general form of the Chebyshev-Hermite polynomials, one can
calculate the normalization integral. One then obtains* for $A_n$

$$A_n = \left(\frac{m\omega}{\pi\hbar}\right)^{1/4}\frac{1}{(2^n n!)^{1/2}}. \tag{10.18}$$


The form of the wave functions for different quantum numbers $n$ is shown in
fig. V.3. We note that the wave function corresponding to the ground state
$n = 0$ of the oscillator is nowhere equal to zero. The wave function
corresponding to the level $n = 1$ is equal to zero once, at $x = 0$, while $\psi_2(x)$ ($n = 2$)
is equal to zero twice, and so on. The points at which the wave function is
equal to zero are called the nodes of the wave function. It is easily seen that
the number of nodes of the wave function is equal to the quantum number $n$.














Fig. V.3 


This statement is not specific to the oscillator. It can be stated** that in
general in the one-dimensional case the number of nodes of the wave function
is determined by the quantum number $n$. The probability of finding the
particle at the point $x$ in the interval $dx$ is equal to

$$W_n(x)\,dx = |\psi_n(x)|^2\,dx.$$


These probabilities for different $n$ are shown in fig. V.4. Let us compare the
expressions obtained with the probability of finding the particle at a given
point as calculated by means of classical mechanics. The latter is defined as
the ratio of the time $dt$ spent in the neighbourhood of the given point to the
period of the motion. The classical probability turns out to be highest in the
neighbourhood of the turning points $x = \pm x_0$, at which the velocity of motion
reduces to zero. On the other hand, in the neighbourhood of the point $x = 0$
the particle has its largest velocity and the probability of finding it is a
minimum.

Fig. V.4a Fig. V.4b

* See, for example, L.D. Landau and E.M. Lifshitz, Quantum mechanics,
Non-relativistic theory (Pergamon Press, Oxford, 1965).
** R. Courant and D. Hilbert, Methods of mathematical physics, Vol. 1 (Interscience,
New York, 1953).

It is seen from the curves of fig. V.4 that the probability of finding a 
quantum particle differs from zero even in the region outside the turning 
points, which is unattainable in classical mechanics. For large quantum 
numbers (fig. V.5) the quantum probability distribution approaches the classi- 
cal one, in agreement with the correspondence principle. 











In conclusion we note that the lowest possible energy value of the
oscillator, equal to $\frac{1}{2}\hbar\omega$, is different from zero. This means that the quantum
oscillator can never be in a state of absolute rest. This fact in its turn is
associated with the uncertainty principle. In order of magnitude the energy of the
oscillator is

$$E \approx \frac{\Delta p^2}{2m} + \frac{m\omega^2}{2}\Delta x^2 \gtrsim \frac{\Delta p^2}{2m} + \frac{m\omega^2}{2}\left(\frac{\hbar}{\Delta p}\right)^2.$$

Considering this quantity as a function of $\Delta p$, it is easily established that it
has a minimum for $\Delta p \sim (m\omega\hbar)^{1/2}$ and is of the order of magnitude of $\hbar\omega$.
Experimentally the zero-point energy $E_0$ is observed in the scattering of light
by a crystal at a temperature close to absolute zero. At absolute zero the
crystal is in the ground (lowest energy) state. Nevertheless, atoms perform
zero-point oscillations which cause scattering of the light.


§11. The three-dimensional oscillator 


Let us now consider the motion of a spatial three-dimensional oscillator.
For generality we shall assume that the natural frequencies are different in
three mutually perpendicular directions and equal respectively to $\omega_1$, $\omega_2$ and
$\omega_3$. Then the potential energy is expressed by the formula

$$U = \frac{m\omega_1^2}{2}x^2 + \frac{m\omega_2^2}{2}y^2 + \frac{m\omega_3^2}{2}z^2. \tag{11.1}$$


The Schrödinger equation correspondingly has the form

$$-\frac{\hbar^2}{2m}\nabla^2\psi + \frac{m}{2}\left(\omega_1^2 x^2 + \omega_2^2 y^2 + \omega_3^2 z^2\right)\psi = E\psi. \tag{11.2}$$


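We try to find the solution of eq. (11.2) in the form of a product of functions
each of which depends on only one coordinate:

$$\psi(x, y, z) = \psi_1(x)\,\psi_2(y)\,\psi_3(z). \tag{11.3}$$

Substituting (11.3) into (11.2) and separating the variables we obtain

$$-\frac{\hbar^2}{2m}\frac{d^2\psi_i}{dx_i^2} + \frac{m\omega_i^2 x_i^2}{2}\,\psi_i = E_i\psi_i \qquad (i = 1, 2, 3), \tag{11.4}$$

where $x_1 = x$, $x_2 = y$, $x_3 = z$; $E_1 + E_2 + E_3 = E$.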
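Thus the problem is reduced to the one-dimensional case. In correspondence
with this, making use of (10.13), (10.14) and (10.18) we can write

$$\psi_{n_1 n_2 n_3}(x_1, x_2, x_3) = \left(\frac{m^3\omega_1\omega_2\omega_3}{\pi^3\hbar^3}\right)^{1/4}
\left[2^{n_1+n_2+n_3}\, n_1!\, n_2!\, n_3!\right]^{-1/2}
\exp\left[-\tfrac{1}{2}\left(\xi_1^2 + \xi_2^2 + \xi_3^2\right)\right] H_{n_1}(\xi_1)\, H_{n_2}(\xi_2)\, H_{n_3}(\xi_3), \tag{11.5}$$

where $\xi_i = (m\omega_i/\hbar)^{1/2} x_i$ ($i = 1, 2, 3$).
The total energy of the oscillator is equal to

$$E = \hbar\omega_1\left(n_1 + \tfrac{1}{2}\right) + \hbar\omega_2\left(n_2 + \tfrac{1}{2}\right) + \hbar\omega_3\left(n_3 + \tfrac{1}{2}\right). \tag{11.6}$$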


In particular, for an isotropic oscillator, which has $\omega_1 = \omega_2 = \omega_3 = \omega$, the
total energy is of the form

$$E_n = \hbar\omega\left(n + \tfrac{3}{2}\right), \quad \text{where } n = n_1 + n_2 + n_3, \tag{11.7}$$

i.e. $E_n$ depends on the sum of the quantum numbers $n_1$, $n_2$ and $n_3$. This
means that a given energy value (given $n$) can be obtained from different
combinations of $n_1$, $n_2$ and $n_3$. Hence it follows that all the energy levels,
except the ground level $n = 0$, are degenerate. The multiplicity of degeneracy
is easily calculated. For this we fix, in addition to $n$, the quantum number $n_1$.
Then the number of possible combinations of $n_1$, $n_2$ and $n_3$ will be equal to
the number of possible values of $n_2$, i.e. equal to $n - n_1 + 1$, since $n_2$ can vary
from zero up to $n - n_1$. Summing the expression obtained over all possible
values of the number $n_1$, we find the total number of combinations of the
three quantum numbers $n_1$, $n_2$ and $n_3$ which add up to the given number $n$,
i.e. the multiplicity of degeneracy of the $n$th energy level:

$$\sum_{n_1=0}^{n} (n - n_1 + 1) = \tfrac{1}{2}(n + 1)(n + 2). \tag{11.8}$$
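The counting argument leading to (11.8) can be verified by direct enumeration. The sketch below fixes $n_1$ and $n_2$ and lets $n_3 = n - n_1 - n_2$ follow, exactly as in the derivation above; the function name is hypothetical.

```python
def oscillator_degeneracy(n):
    """Number of triples (n1, n2, n3) of non-negative integers with
    n1 + n2 + n3 == n. Once n1 and n2 are chosen, n3 is fixed."""
    return sum(1
               for n1 in range(n + 1)
               for n2 in range(n - n1 + 1))


for n in range(6):
    # Compare the brute-force count with the closed form (11.8).
    print(n, oscillator_degeneracy(n), (n + 1) * (n + 2) // 2)
```

The two columns agree for every $n$, and the ground level $n = 0$ is the only non-degenerate one.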


§12. Reflection from and penetration through a potential barrier 


Among the other relatively simple problems of quantum mechanics we
shall consider the motion of particles in a field of force which has the form of
a potential barrier. This means that the forces act on the particle in a certain
limited region of space. Outside this region the particle moves as a free
particle. We shall see that the study of the motion of particles in a field having the
form of a barrier of the simplest shape will allow a number of important and,
in principle, new properties of quantum particles to be exhibited. We shall
begin our consideration with the simple rectangular one-dimensional barrier
of infinite extent shown in fig. V.6. In classical mechanics any particle moving
from left to right with an energy smaller than the barrier height $U_0$ is
completely reflected from the potential wall. The region $x > 0$ is inaccessible to it,
since in this region the total energy of the particle would be less than the
potential energy. This would mean that the kinetic energy must be negative,
which is evidently impossible. If, on the contrary, $E$ is larger than $U_0$, then
according to the laws of classical mechanics the particle passes freely above
the barrier, moving in the region $x > 0$ with a lower kinetic energy equal to
$E - U_0$.

Fig. V.6

Let us now consider the motion of a particle in the same situation
according to the laws of quantum mechanics. For this we write the Schrödinger
equation for the stationary states of the particle in the field of a barrier of
infinite extent

$$-\frac{\hbar^2}{2m}\frac{d^2\psi}{dx^2} + U(x)\,\psi = E\psi, \tag{12.1}$$





where $U$ is the potential energy whose graph is shown in fig. V.6. The
solutions of eq. (12.1) are conveniently considered in two different regions.
Region I ranges from $x = -\infty$ to $x = 0$, and region II from $x = 0$ to $x = \infty$.

We write the Schrödinger equation for each of the regions mentioned:

$$\frac{d^2\psi}{dx^2} + k^2\psi = 0, \quad x < 0; \qquad
\frac{d^2\psi}{dx^2} + k'^2\psi = 0, \quad x > 0, \tag{12.2}$$

where the following notation is introduced:

$$k^2 = \frac{2mE}{\hbar^2}, \qquad k'^2 = \frac{2m}{\hbar^2}(E - U_0). \tag{12.3}$$

The solutions of these equations are respectively written in the form

$$\psi_1(x) = A_1 e^{ikx} + B_1 e^{-ikx}, \quad x < 0; \qquad
\psi_2(x) = A_2 e^{ik'x} + B_2 e^{-ik'x}, \quad x > 0. \tag{12.4}$$


In these formulae a term of the form $e^{ikx}$ represents a plane wave propagating
in the positive direction of the $x$-axis, and $e^{-ikx}$ represents a plane wave
propagating in the opposite direction. The amplitudes $A_1$, $B_1$, $A_2$ and $B_2$ are
integration constants. We define a flux of particles incident on the barrier. Let $j_0$
be the incident particle flux density. Then, according to (7.6),

$$j_0 = \frac{\hbar k}{m}\,|A_1|^2.$$

For simplicity we choose the flux such that we can set $A_1 = 1$.

In order to define the other constants we consider the behaviour of the
wave function at the boundary of regions I and II at the point $x = 0$. By
virtue of the general conditions imposed upon the wave function and its
derivative (see §6), they must remain continuous even at the point of
discontinuity of the potential energy. Hence for $x = 0$ the following equalities must
hold:

$$\psi(+0) = \psi(-0), \tag{12.5}$$

$$\psi'(+0) = \psi'(-0). \tag{12.6}$$


From relations (12.5) and (12.6) one can define only two integration
constants, $A_2$ and $B_1$. As to the constant $B_2$, we must set $B_2 = 0$. As a matter
of fact, we have specified only a particle flux incident in the positive direction
of the $x$-axis. For $E > U_0$ (i.e. for real $k'$) the term of the wave function
proportional to $e^{-ik'x}$ represents a plane wave propagating in the opposite
direction. The reflected wave propagates in region I in the negative direction
of the $x$-axis. Obviously, the reflected wave is not present in region II and,
consequently, the wave propagating from the right to the left is absent in this
region. Hence we have to set the amplitude $B_2$ of this wave equal to zero. But
if $E < U_0$ ($k'$ is a purely imaginary quantity), then the function $e^{-ik'x}$
increases exponentially as $x \to \infty$, which contradicts the condition of
finiteness of the wave function. By virtue of this the coefficient $B_2$ must also be
equal to zero for imaginary values of $k'$, i.e. for $E < U_0$.


Let us consider the case where the total energy of the particle is larger
than the potential barrier height, $E > U_0$, in more detail.
Taking into account (12.4), from relations (12.5) and (12.6) we have

$$1 + B_1 = A_2, \qquad k(1 - B_1) = k' A_2.$$

From these equations we find the amplitudes $A_2$ and $B_1$:

$$A_2 = \frac{2k}{k + k'}, \qquad B_1 = \frac{k - k'}{k + k'}. \tag{12.7}$$


We see that $B_1$, the amplitude of the reflected wave, is different from zero,
although $E > U_0$. This fact is due to the wave properties of particles. The
wave is in part reflected, and in part passes into region II. The ratio of the
flux density of reflected particles $j_R$ to the flux density of incident particles
$j_0$ will be called the reflection coefficient $R$. Correspondingly, the ratio of the
flux density of transmitted particles $j_D$ to the flux density of incident
particles will be called the transmission coefficient $D$. Taking into account (12.4),
we find

$$j_R = \frac{\hbar k}{m}\,|B_1|^2, \qquad j_D = \frac{\hbar k'}{m}\,|A_2|^2.$$

Since $j_0 = \hbar k/m$, we obtain

$$R = \left(\frac{k - k'}{k + k'}\right)^2, \qquad D = \frac{4kk'}{(k + k')^2}. \tag{12.8}$$

We see that the following relation is automatically fulfilled:

$$R + D = 1. \tag{12.9}$$


This relation expresses the law of conservation of the number of particles. 
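Formulae (12.8) and (12.9) are easily evaluated for concrete energies. The sketch below assumes units in which $\hbar^2/2m = 1$, so that $k = \sqrt{E}$ and $k' = \sqrt{E - U_0}$ by (12.3); the function name is hypothetical and only the case $E > U_0$ is handled.

```python
import math


def step_coefficients(energy, barrier):
    """Reflection and transmission coefficients (12.8) for E > U0.

    Assumption of this sketch: units with hbar^2 / 2m = 1, so that
    k = sqrt(E) and k' = sqrt(E - U0).
    """
    if energy <= barrier:
        raise ValueError("this sketch covers only E > U0")
    k = math.sqrt(energy)
    k_prime = math.sqrt(energy - barrier)
    reflection = ((k - k_prime) / (k + k_prime)) ** 2
    transmission = 4 * k * k_prime / (k + k_prime) ** 2
    return reflection, transmission


for energy in (1.01, 2.0, 10.0, 100.0):
    r, d = step_coefficients(energy, 1.0)
    print(energy, r, d, r + d)
```

The reflection coefficient is close to unity just above the barrier and falls off as the energy grows, while the sum $R + D$ stays equal to one.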

We note that the expressions (12.8) turn out to be symmetric with respect
to $k$ and $k'$, i.e. for particles of a given energy $E$ the reflection coefficient (as
well as the transmission coefficient) turns out to be independent of the
direction of motion of the particles. Particles moving from the left to the right, i.e.
against the action of the force at the point $x = 0$, have the same probability of
being reflected at this point as particles of the same energy moving from the
right to the left, in the direction of the action of the force at the point $x = 0$.
This fact is also due to the wave character of the process and has a
corresponding optical analogy.

Let us now consider the case $E < U_0$. Then $k'$ is a purely imaginary
quantity which is conveniently written in the form $k' = i\kappa$, where

$$\kappa = \frac{1}{\hbar}\left[2m(U_0 - E)\right]^{1/2}. \tag{12.10}$$


The amplitude of the reflected wave $B_1$ turns out to be a complex quantity,
and the reflection coefficient $R$ is equal to

$$R = |B_1|^2 = \left|\frac{k - i\kappa}{k + i\kappa}\right|^2 = 1. \tag{12.11}$$


The reflected wave is written in the form

$$\psi_R = \frac{k - i\kappa}{k + i\kappa}\,e^{-ikx} = e^{-i(kx + \delta)}, \tag{12.12}$$

i.e. the reflection leads to a shift of the phase of the wave. From (12.12) it
follows that this shift is equal to

$$\delta = \arctan\frac{2k\kappa}{k^2 - \kappa^2}. \tag{12.13}$$
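Relations (12.11) to (12.13) can be checked with complex arithmetic. The sketch below computes $B_1 = (k - i\kappa)/(k + i\kappa)$, confirms that its modulus is unity, and extracts the phase $\delta$ from $B_1 = e^{-i\delta}$. The function names are hypothetical. Note that `cmath.phase` returns the phase in $(-\pi, \pi]$, which coincides with the principal value of the arctangent in (12.13) only for $k > \kappa$; the comparison below is therefore made through $\tan\delta$.

```python
import cmath
import math


def reflection_amplitude(k, kappa):
    """B_1 for E < U0, obtained from (12.7) with k' = i kappa."""
    return (k - 1j * kappa) / (k + 1j * kappa)


def phase_shift(k, kappa):
    """delta defined by B_1 = exp(-i delta), cf. (12.12)."""
    return -cmath.phase(reflection_amplitude(k, kappa))


k, kappa = 2.0, 1.0
b1 = reflection_amplitude(k, kappa)
delta = phase_shift(k, kappa)
print(abs(b1) ** 2)                                          # reflection coefficient R
print(math.tan(delta), 2 * k * kappa / (k ** 2 - kappa ** 2))  # both sides of (12.13)
```

The first printed value is the reflection coefficient of (12.11); the last two numbers are the left-hand and right-hand sides of the tangent form of (12.13).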


Although the reflection is complete, nevertheless the wave function in region
II is different from zero and has the form

$$\psi_2(x) = A_2 e^{-\kappa x} = \frac{2k}{k + i\kappa}\,e^{-\kappa x} \qquad (x > 0). \tag{12.14}$$

Correspondingly the probability density for finding the particle at point $x$ in
the region $x > 0$ is equal to

$$|\psi_2(x)|^2 = \frac{4k^2}{k^2 + \kappa^2}\,e^{-2\kappa x}. \tag{12.15}$$
We see that the behaviour of quantum particles differs essentially from
that of classical particles. For a particle moving according to the laws of
classical mechanics the region $x > 0$ for $E < U_0$ was forbidden. On the contrary, a
particle moving according to the laws of quantum mechanics can, with a
certain probability, penetrate this region. The penetration of particles into
the region of forbidden energies represents a specific quantum effect which is
called the tunnel effect. As is seen from the formula, the effective depth of
penetration into region II, i.e. the distance $\delta x$ from the boundary of region II
at which the probability of finding the particle is still considerably different
from zero, is of the order of magnitude of $\delta x \sim \kappa^{-1}$. For $x \gg \delta x$ the
probability density (12.15) turns out to be exponentially small.

Let us estimate the effective depth of penetration for an electron, assuming
that $U_0 - E \sim 1$ eV $= 1.6\times10^{-12}$ erg. For $\delta x$ we evidently have

$$\delta x \approx \frac{\hbar}{\left[2m(U_0 - E)\right]^{1/2}}
= \frac{10^{-27}}{\left[2\times10^{-27}\times1.6\times10^{-12}\right]^{1/2}} \sim 10^{-8}\ \text{cm}.$$

This estimate shows that the effect can be significant only in the realm of 
microscopic dimensions. Thus, as was to be expected, the tunnel effect can- 
not be observed in the motion of macroscopic bodies for which the laws of 
classical mechanics are valid. 
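The estimate of the penetration depth, and its contrast with a macroscopic body, can be reproduced numerically. The sketch below uses the same rounded CGS values as the text ($\hbar \approx 10^{-27}$ erg s, $m \approx 10^{-27}$ g for the electron). The macroscopic example (a body of mass 1 g lacking 1 erg of energy) is a hypothetical illustration that does not appear in the text.

```python
import math

HBAR = 1.0e-27  # erg s, rounded as in the estimate above


def penetration_depth_cm(mass_g, deficit_erg):
    """delta x ~ 1/kappa = hbar / [2 m (U0 - E)]^(1/2), cf. (12.10)."""
    return HBAR / math.sqrt(2 * mass_g * deficit_erg)


electron = penetration_depth_cm(1e-27, 1.6e-12)  # U0 - E of about 1 eV
macroscopic = penetration_depth_cm(1.0, 1.0)     # hypothetical 1 g body, 1 erg deficit
print(electron, macroscopic)
```

For the electron the depth is of atomic size, whereas for the macroscopic body it is smaller by some twenty orders of magnitude, which is why the effect is unobservable there.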

In order to actually observe the particle in region II, we have to localize it there in a certain small interval $\Delta x < \delta x$. By localizing the particle we change its state (its energy), since by virtue of the uncertainty relation $\Delta p > \hbar/\Delta x > \hbar\kappa$. The particle which we shall detect somewhere in region II will no longer possess the initial energy $E$. The uncertainty in the momentum is related to the uncertainty in the kinetic energy of the particle:

$$\Delta T \approx \frac{(\Delta p)^2}{2m} > \frac{\hbar^2\kappa^2}{2m}.$$

Substituting here the expression (12.10) for $\kappa$, we obtain

$$\Delta T > U_0 - E.$$


Thus the uncertainty in the energy of a particle localized in the region behind 
the barrier is larger than the energy that it lacks to reach the barrier height. 





Fig. V.7 


Let us consider briefly a barrier of finite extent, as shown in fig. V.7. Particles are incident on the barrier, moving in the positive direction of the x-axis. We can immediately write the wave function for the three different regions:

$$\psi(x) = e^{ikx} + B_1 e^{-ikx}, \qquad x < 0, \quad U = 0 \qquad \text{(region I)}, \tag{12.16}$$
$$\psi(x) = A_2 e^{ik'x} + B_2 e^{-ik'x}, \qquad 0 < x < a, \quad U = U_0 \qquad \text{(region II)}, \tag{12.17}$$
$$\psi(x) = A_3 e^{ikx}, \qquad x > a, \quad U = 0 \qquad \text{(region III)}. \tag{12.18}$$


In this case (as in the case of a barrier of infinite extent) we again set the 
amplitude of the incident wave equal to unity. Since in region III there is no 
reflected wave, only the wave propagating in the positive direction of the 
x-axis is taken into account. We write the conditions of continuity of the wave function and of its first derivative at the boundaries of the regions, analogous to (12.5) and (12.6):

$$\begin{aligned} \psi(+0) &= \psi(-0), & \psi'(+0) &= \psi'(-0), \\ \psi(a+0) &= \psi(a-0), & \psi'(a+0) &= \psi'(a-0). \end{aligned} \tag{12.19}$$

Substituting (12.16), (12.17) and (12.18) into these relations, we obtain a system of equations with respect to $B_1$, $A_2$, $B_2$ and $A_3$:

$$\begin{aligned} 1 + B_1 &= A_2 + B_2, & A_2 e^{ik'a} + B_2 e^{-ik'a} &= A_3 e^{ika}, \\ k(1 - B_1) &= k'(A_2 - B_2), & k'\left(A_2 e^{ik'a} - B_2 e^{-ik'a}\right) &= k A_3 e^{ika}. \end{aligned} \tag{12.20}$$


Let us consider immediately the most interesting case, $E < U_0$. If the motion of particles proceeded according to the laws of classical mechanics, then the barrier would be completely impenetrable for them, and at the point $x = 0$ the particles would undergo total reflection from the barrier. The situation is different in the case of microparticles, whose motion is described by the laws of quantum mechanics. Solving the system (12.20) with respect to $A_3$ and taking into account that $k' = i\kappa$, where $\kappa$ is defined by (12.10), we obtain

$$A_3 = \frac{4ik\kappa\, e^{-ika}}{(k + i\kappa)^2 e^{\kappa a} - (k - i\kappa)^2 e^{-\kappa a}}. \tag{12.21}$$

The amplitude of the plane wave turns out to be different from zero in the region behind the barrier, although the energy of the particle is smaller than the barrier height, $E < U_0$. This means that a microparticle can, with a certain probability, pass through the potential barrier by means of the tunnel effect.

The tunnel transmission of particles, which earlier seemed to be paradoxical, is at present not only observed experimentally, but plays a fundamental role in a number of fields of physics, in particular in nuclear physics. It suffices to say that tunnel passage through a barrier is associated with the $\alpha$-decay of radioactive nuclei and with the phenomenon of spontaneous fission of uranium nuclei. The tunnel effect is also related to the phenomenon of emission of electrons from cold metals in a strong electric field and to a number of other processes.

Let us define the coefficient of transmission $D$ of microparticles through a barrier:

$$D = \frac{j}{j_0} = \frac{4k^2\kappa^2}{(k^2 + \kappa^2)^2 \sinh^2\kappa a + 4k^2\kappa^2}. \tag{12.22}$$

If $\kappa a \gg 1$, then $\sinh\kappa a \approx \tfrac12 e^{\kappa a}$ and the expression (12.22) is simplified:

$$D \approx \frac{16k^2\kappa^2}{(k^2 + \kappa^2)^2}\, e^{-2\kappa a}. \tag{12.23}$$

The basic dependence of the transmission coefficient on the width and height of the barrier is defined by the exponential function $e^{-2\kappa a}$. Denoting the factor multiplying this by $D_0$, we have

$$D = D_0 \exp\left\{-\frac{2}{\hbar}\,[2m(U_0 - E)]^{1/2}\, a\right\}. \tag{12.24}$$

We see that the probability of passing through a barrier is not too low if

$$\frac{2}{\hbar}\,[2m(U_0 - E)]^{1/2}\, a \lesssim 1. \tag{12.25}$$


The condition (12.25) can evidently be fulfilled only in the realm of microphenomena. If we substitute into (12.25) values on a nuclear scale, $a \sim 10^{-13}$ cm, $m \sim 10^{-24}$ g (nucleon mass), $U_0 - E \sim 10$ MeV ($10^{-5}$ erg), then carrying out the estimation we find that $D \sim e^{-1}$. Thus the particle can with considerable probability pass through a barrier whose height exceeds its energy by 5 to 10 MeV. A completely different result is obtained for the same particle and the same barrier height if the spatial extent of the barrier amounts to $a \sim 1$ cm. Then the exponent in (12.24) is of the order of $10^{13}$, i.e. $D \sim e^{-10^{13}}$. This means that in the realm of macroscopic phenomena the effect of tunnel transmission is virtually absent. The probability of tunnel passage through a barrier of arbitrary form will be considered in §42.
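
As a numerical illustration, the sketch below evaluates the exact transmission coefficient (12.22), its thick-barrier limit (12.23) and the exponent of (12.24) for the nuclear and macroscopic estimates quoted above. The function names are introduced here for illustration, and $\hbar$ is a rounded CGS value.

```python
import math

HBAR = 1.05e-27  # erg*s (rounded)

def transmission_exact(k, kappa, a):
    """Transmission coefficient (12.22) for a rectangular barrier, E < U0."""
    s = math.sinh(kappa * a)
    return 4 * k**2 * kappa**2 / ((k**2 + kappa**2) ** 2 * s**2 + 4 * k**2 * kappa**2)

def transmission_thick(k, kappa, a):
    """Approximation (12.23), valid when kappa * a >> 1."""
    return 16 * k**2 * kappa**2 / (k**2 + kappa**2) ** 2 * math.exp(-2 * kappa * a)

def tunnel_exponent(mass, energy_deficit, a):
    """The quantity (2/hbar) * sqrt(2 m (U0 - E)) * a entering (12.24) and (12.25)."""
    return 2 / HBAR * math.sqrt(2 * mass * energy_deficit) * a

exact = transmission_exact(1.0, 1.0, 5.0)      # dimensionless illustrative values
approx = transmission_thick(1.0, 1.0, 5.0)
nuclear = tunnel_exponent(1e-24, 1e-5, 1e-13)  # nucleon, 10 MeV deficit, a ~ 1e-13 cm
macroscopic = tunnel_exponent(1e-24, 1e-5, 1.0)  # same particle, a ~ 1 cm

print(exact, approx)
print(nuclear, macroscopic)
```

The nuclear exponent comes out of order unity, whereas for $a \sim 1$ cm it is of order $10^{13}$, so that the factor $e^{-\text{exponent}}$ is far below anything representable in floating-point arithmetic.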










§ 13. One-dimensional motion* 


In this chapter we have considered a number of simple problems of quan- 
tum mechanics regarding one-dimensional motion. The three-dimensional 
problems considered also reduce to one-dimensional problems, because the 
Schrödinger equation with the potential $U(x, y, z) = U_1(x) + U_2(y) + U_3(z)$
reduces to one-dimensional equations with the potentials $U_1$, $U_2$ and $U_3$
respectively. The results obtained allow one to draw some general conclusions 
on the properties of the one-dimensional motion of particles. 

For one-dimensional motion the energy levels can belong to a discrete spectrum (see §§8-11) as well as to a continuous spectrum (§12). As we have seen, there correspond to the states of the discrete spectrum quadratically integrable wave functions, i.e. wave functions for which the normalization condition can be written in the form $\int|\psi_n(x)|^2\,dx = 1$. This condition indicates that the motion is finite, i.e. that the probability of observing the particle at arbitrarily large distances is negligible.

On the contrary, if the particle can go off to arbitrarily large distances, i.e. if the motion is infinite, its wave function is not quadratically integrable. It can be shown that in this case the energy has a continuous spectrum. Suppose that the potential energy $U(x)$ of the particle somehow varies from the value $U(\infty)$ for $x \to \infty$, which we shall choose as the zero of energy, to the value $U(-\infty) = U_0$ for $x \to -\infty$. We shall assume for definiteness that $U_0$ is positive, $U_0 > 0$. The function $U(x)$ may vary quite arbitrarily. We suppose only that it has a minimum $U_{\min} < 0$. Then for an energy $E$ such that $U_{\min} < E < 0$ the particle cannot go off to infinity. For these energy values the motion is finite, and the spectrum is discrete. The energy levels of a discrete spectrum are not degenerate. This statement is easily proved by assuming the contrary. Indeed, if it is assumed that $\psi_1$ and $\psi_2$ are two solutions of the Schrödinger equation corresponding to one and the same energy value $E$, then they satisfy the relation

$$\frac{1}{\psi_1}\frac{d^2\psi_1}{dx^2} = \frac{2m}{\hbar^2}(U - E) = \frac{1}{\psi_2}\frac{d^2\psi_2}{dx^2},$$

i.e.

$$\psi_2\frac{d^2\psi_1}{dx^2} - \psi_1\frac{d^2\psi_2}{dx^2} = 0.$$

* For a detailed treatment see the book by L. D. Landau and E. M. Lifshitz, Quantum
mechanics (Pergamon Press, Oxford, 1965).




Integrating this relation with respect to $x$, we obtain

$$\psi_2\frac{d\psi_1}{dx} - \psi_1\frac{d\psi_2}{dx} = \text{const}. \tag{13.1}$$

But at infinity $\psi_1 = \psi_2 = 0$. Hence the constant on the right-hand side of relation (13.1) is equal to zero and, consequently,

$$\frac{1}{\psi_1}\frac{d\psi_1}{dx} = \frac{1}{\psi_2}\frac{d\psi_2}{dx}.$$

Integrating once more with respect to $x$, we obtain $\psi_1 = \text{const}\cdot\psi_2$. This means that the two functions describe one and the same state, i.e. degeneracy is absent.
In the range $0 < E < U_0$ the particle can move arbitrarily far in the direction of positive $x$. Therefore the motion is infinite, and the energy spectrum is continuous. The energy levels in this case are also not degenerate. In fact, the preceding proof is again valid, since the wave functions reduce to zero when $x \to -\infty$. The asymptotic expressions for the wave functions for $x \to \pm\infty$ are easily obtained from the Schrödinger equation

$$\frac{d^2\psi}{dx^2} + \frac{2m}{\hbar^2}\,[E - U(x)]\,\psi = 0$$

if one substitutes here $U = 0$ for $x \to \infty$ and $U = U_0$ for $x \to -\infty$. Correspondingly we obtain

$$\psi(x) = A\sin(kx + \alpha) \qquad \text{for } x \to \infty, \tag{13.2}$$

$$\psi(x) = B e^{\kappa x} \qquad \text{for } x \to -\infty, \tag{13.3}$$

where

$$k = \frac{1}{\hbar}(2mE)^{1/2} \qquad \text{and} \qquad \kappa = \frac{1}{\hbar}[2m(U_0 - E)]^{1/2},$$

i.e. the solution has the form of a standing plane wave when $x \to \infty$, and is exponentially damped when $x \to -\infty$.

In the energy range $E > U_0$ the motion is infinite in both directions. The energy spectrum is continuous. Since the Schrödinger equation is of second order, it has two linearly independent solutions. In this energy range both solutions satisfy the necessary requirements. Therefore the energy levels are two-fold degenerate. The asymptotic expression for the wave function is of the form

$$\psi = A_1 e^{ikx} + A_2 e^{-ikx}, \tag{13.4}$$

where one term corresponds to the particle moving in the positive direction of the x-axis, and the other corresponds to the particle moving in the negative direction of the x-axis.

We now assume that the field increases indefinitely, i.e. that $|U(x)| \to \infty$ as $x \to \pm\infty$. As a simple and at the same time important example we shall consider the problem of the motion of a particle in a uniform external field

$$U(x) = -fx.$$

We choose the x-axis to be in the direction of the field; $f$ denotes the force acting on the particle: $f = -dU/dx$. The potential energy is measured from its value at $x = 0$, hence $U(0) = 0$. The Schrödinger equation for the motion in such a field is of the form

$$\frac{d^2\psi}{dx^2} + \frac{2m}{\hbar^2}(E + fx)\,\psi = 0. \tag{13.5}$$

We now introduce in place of $x$ the new variable

$$\eta = \left(x + \frac{E}{f}\right)\left(\frac{2mf}{\hbar^2}\right)^{1/3}.$$

Correspondingly eq. (13.5) will have the form

$$\frac{d^2\psi}{d\eta^2} + \eta\,\psi = 0. \tag{13.6}$$

The solution of eq. (13.6) can be expressed in terms of Bessel functions, but the expression of the solution of (13.6) in terms of the so-called Airy function is more convenient. Namely, the solution of eq. (13.6), finite for all values of $\eta$, has the form

$$\psi(\eta) = C\,\Phi(-\eta), \tag{13.7}$$

where $\Phi(\eta)$ denotes the Airy function

$$\Phi(\eta) = \frac{1}{\sqrt{\pi}}\int_0^\infty \cos\left(\tfrac13 u^3 + u\eta\right) du, \tag{13.8}$$

and $C$ is the normalization constant.

Thus the Schrödinger equation (13.5) has a solution satisfying the necessary requirements for any energy value $E$. Consequently, for motion in a uniform field the energy spectrum of the particle is continuous, which corresponds to infinite motion. In the given case $U \to -\infty$ as $x \to \infty$, i.e. the motion in the positive direction of the x-axis is infinite.

The wave function (13.7) has a rather simple form for $\eta \to \pm\infty$. Making use of the known asymptotic expressions for the Airy function, we have

$$\psi(\eta) \approx \frac{C}{2|\eta|^{1/4}}\exp\left(-\tfrac23|\eta|^{3/2}\right) \qquad \text{for } \eta \to -\infty \tag{13.9}$$

and

$$\psi(\eta) \approx \frac{C}{\eta^{1/4}}\sin\left(\tfrac23\eta^{3/2} + \tfrac14\pi\right) \qquad \text{for } \eta \to \infty. \tag{13.10}$$


The constant C is defined by the normalization condition. Of course, the 
integral of the square of the modulus of the wave function (13.7) over all 
space diverges, which corresponds to infinite motion. Normalization rules for 
such cases will be discussed in §18. 
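
The asymptotic form (13.10) can be compared with a direct numerical integration of eq. (13.6). The sketch below is an illustration under stated assumptions: it takes $C = 1$, uses the fact that the function $\Phi$ of (13.8) equals $\sqrt{\pi}$ times the standard Airy function Ai, starts from the known values of Ai and its derivative at zero, and integrates with a plain fourth-order Runge-Kutta scheme. All function names are introduced here.

```python
import math

def phi_of_minus_eta(eta_max, steps=20000):
    """Integrate psi'' + eta*psi = 0 from eta = 0 to eta_max for psi = Phi(-eta)."""
    ai0 = 1 / (3 ** (2 / 3) * math.gamma(2 / 3))     # Ai(0)
    aip0 = -1 / (3 ** (1 / 3) * math.gamma(1 / 3))   # Ai'(0)
    psi = math.sqrt(math.pi) * ai0                   # Phi(0)
    dpsi = -math.sqrt(math.pi) * aip0                # derivative of Phi(-eta) at 0
    h = eta_max / steps
    eta = 0.0

    def deriv(e, y, v):
        return v, -e * y                             # (psi', psi'')

    for _ in range(steps):
        k1y, k1v = deriv(eta, psi, dpsi)
        k2y, k2v = deriv(eta + h / 2, psi + h / 2 * k1y, dpsi + h / 2 * k1v)
        k3y, k3v = deriv(eta + h / 2, psi + h / 2 * k2y, dpsi + h / 2 * k2v)
        k4y, k4v = deriv(eta + h, psi + h * k3y, dpsi + h * k3v)
        psi += h / 6 * (k1y + 2 * k2y + 2 * k3y + k4y)
        dpsi += h / 6 * (k1v + 2 * k2v + 2 * k3v + k4v)
        eta += h
    return psi

def asymptotic_oscillating(eta):
    """Right-hand side of (13.10) with C = 1."""
    return eta ** -0.25 * math.sin(2 / 3 * eta ** 1.5 + math.pi / 4)

numeric = phi_of_minus_eta(10.0)
estimate = asymptotic_oscillating(10.0)
print(numeric, estimate)
```

Already at $\eta = 10$ the two values agree to a few parts in a thousand of the amplitude, the difference being the next term of the asymptotic expansion.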


§14. The Schrödinger equation for a system of particles 


In the preceding sections we have considered the laws of motion of a 
single particle in an external field. But this greatly restricted the range of 
problems we could consider. As a matter of fact, even the simplest system, 
the hydrogen atom, represents, strictly speaking, a system of two particles. 
This holds all the more for systems such as many-electron atoms, molecules, 
atomic nuclei, matter in the solid state and so on. Generalizing the results 
obtained in §6, we formulate the fundamental equation of quantum mecha- 
nics: the Schrödinger equation for a system of N particles. It has the form 


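$$i\hbar\frac{\partial\Psi}{\partial t} = \sum_{i=1}^{N}\left(-\frac{\hbar^2}{2m_i}\nabla_i^2\Psi\right) + \sum_{i=1}^{N}U_i(\mathbf{r}_i)\Psi + U_{\rm int}(\mathbf{r}_1, \dots, \mathbf{r}_N)\Psi. \tag{14.1}$$

Here the Laplacian $\nabla_i^2$,

$$\nabla_i^2 = \frac{\partial^2}{\partial x_i^2} + \frac{\partial^2}{\partial y_i^2} + \frac{\partial^2}{\partial z_i^2},$$

acts on the coordinates of the $i$th particle. $U_i(\mathbf{r}_i)$ is the potential energy of the $i$th particle in the external field, $U_{\rm int}$ is the potential energy of the interaction of the particles with each other, and $m_i$ is the mass of the $i$th particle. The summation is carried out over all particles of the system. The wave function describing a system of particles, in accordance with §2, depends on the coordinates of all the particles and the time, $\Psi(\mathbf{r}_1, \mathbf{r}_2, \dots, \mathbf{r}_N, t)$.

The Schrödinger equation for stationary states has the form

$$\sum_{i=1}^{N}\left(-\frac{\hbar^2}{2m_i}\nabla_i^2\psi\right) + \sum_{i=1}^{N}U_i(\mathbf{r}_i)\psi + U_{\rm int}(\mathbf{r}_1, \dots, \mathbf{r}_N)\psi = E\psi. \tag{14.2}$$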


As the simplest example of the integration of eq. (14.2) let us consider a system of particles which do not interact with each other, i.e. let us assume that the energy of interaction is equal to zero, $U_{\rm int} = 0$. In this case the Schrödinger equation can be rewritten in the form

$$\sum_{i=1}^{N}\left(-\frac{\hbar^2}{2m_i}\nabla_i^2 + U_i(\mathbf{r}_i)\right)\psi = E\psi, \tag{14.3}$$

where the terms in each bracket depend only on the coordinates of the corresponding particle. We seek a wave function $\psi$ in the form of a product of functions which depend on the coordinates of individual particles:

$$\psi = \psi_1(\mathbf{r}_1)\,\psi_2(\mathbf{r}_2)\cdots\psi_N(\mathbf{r}_N). \tag{14.4}$$

On substituting into the Schrödinger equation, we obtain

$$\sum_{i=1}^{N}\psi_1(\mathbf{r}_1)\cdots\psi_{i-1}(\mathbf{r}_{i-1})\,\psi_{i+1}(\mathbf{r}_{i+1})\cdots\psi_N(\mathbf{r}_N)\left(-\frac{\hbar^2}{2m_i}\nabla_i^2 + U_i(\mathbf{r}_i)\right)\psi_i(\mathbf{r}_i) = E\psi.$$

Dividing the right and left sides of the equation by $\psi$, we find

$$\sum_{i=1}^{N}\frac{1}{\psi_i(\mathbf{r}_i)}\left(-\frac{\hbar^2}{2m_i}\nabla_i^2 + U_i(\mathbf{r}_i)\right)\psi_i(\mathbf{r}_i) = E.$$

There is a constant quantity on the right-hand side of the equation. The left-hand side is made up of a sum of terms each of which is a function of its own independent variable. In order that the equation hold for all values of the independent variables, the following conditions must be fulfilled:

$$\left(-\frac{\hbar^2}{2m_i}\nabla_i^2 + U_i(\mathbf{r}_i)\right)\psi_i(\mathbf{r}_i) = E_i\,\psi_i(\mathbf{r}_i), \qquad \sum_{i=1}^{N}E_i = E,$$

where the $E_i$ are constants, which, as is easily seen, represent the energies of individual particles.

Thus, if the left-hand side of the Schrödinger equation can be written in 
the form of the sum (14.3), then the wave function of the system resolves 
into a product of wave functions, while the energy of the system appears as 
the sum of the energies of the individual particles. 

These results have a simple physical meaning. We have assumed that the 
energy of interaction between the particles is equal to zero. It is therefore 
natural that the total energy of the entire system is made up of the sum of 
the energies of the individual particles, since the motion of each of the parti- 
cles is independent of the motion of the other particles. The probability of 
observing a given set of coordinates of the particles is written in the form 


$$dW(\mathbf{r}_1, \dots, \mathbf{r}_N) = |\psi_1(\mathbf{r}_1)|^2\cdots|\psi_N(\mathbf{r}_N)|^2\, dV_1\cdots dV_N.$$


The above result is in complete agreement with the theorem of multiplication 
of probabilities of independent events. 
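
The separation of variables for non-interacting particles can be illustrated on a discretized model. The sketch below is an assumption-laden toy example, not part of the text: two particles move in the same one-dimensional box with zero boundary values, units are chosen so that $\hbar^2/2m = 1$, and the second derivative is replaced by the usual second difference on a grid. The product of two single-particle modes is then an eigenvector of the two-particle operator with the sum of the two energies.

```python
import math

def well_mode(n, points):
    """Eigenvector of the second-difference operator with zero boundary values."""
    return [math.sin(n * math.pi * j / (points + 1)) for j in range(1, points + 1)]

def mode_energy(n, points, h):
    """Eigenvalue of -d^2/dx^2 (discretized) belonging to well_mode(n, points)."""
    return (2 - 2 * math.cos(n * math.pi / (points + 1))) / h**2

def apply_two_particle_h(psi, h):
    """Apply H1 + H2 (no interaction term) to the grid function psi[j1][j2]."""
    size = len(psi)

    def at(a, b):
        inside = 0 <= a < size and 0 <= b < size
        return psi[a][b] if inside else 0.0   # wave function vanishes at the walls

    return [[(4 * at(a, b) - at(a - 1, b) - at(a + 1, b)
              - at(a, b - 1) - at(a, b + 1)) / h**2
             for b in range(size)] for a in range(size)]

points = 20
h = 1.0 / (points + 1)
n1, n2 = 1, 3
psi1, psi2 = well_mode(n1, points), well_mode(n2, points)
product = [[u * v for v in psi2] for u in psi1]          # analogue of (14.4)
total_energy = mode_energy(n1, points, h) + mode_energy(n2, points, h)
h_psi = apply_two_particle_h(product, h)
residual = max(abs(h_psi[a][b] - total_energy * product[a][b])
               for a in range(points) for b in range(points))
print(total_energy, residual)
```

The residual is at the level of rounding error, i.e. the product function satisfies the discretized equation with $E = E_1 + E_2$.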

Further, let us consider in more detail a system of two particles with masses $m_1$ and $m_2$. We assume that the potential energy of interaction depends only on the distance between the particles, and that there is no external field. In this case the Schrödinger equation for stationary states has the form

$$-\frac{\hbar^2}{2m_1}\nabla_1^2\psi(\mathbf{r}_1, \mathbf{r}_2) - \frac{\hbar^2}{2m_2}\nabla_2^2\psi(\mathbf{r}_1, \mathbf{r}_2) + U(|\mathbf{r}_1 - \mathbf{r}_2|)\,\psi(\mathbf{r}_1, \mathbf{r}_2) = E\psi(\mathbf{r}_1, \mathbf{r}_2). \tag{14.5}$$

We transform this equation by introducing new coordinates $\mathbf{R}$ and $\mathbf{r}$ which are defined by the relations

$$\mathbf{R} = \frac{m_1\mathbf{r}_1 + m_2\mathbf{r}_2}{m_1 + m_2}, \qquad \mathbf{r} = \mathbf{r}_1 - \mathbf{r}_2. \tag{14.6}$$

We note that the new variables are completely analogous to the coordinates of the centre of mass and those of relative motion in classical mechanics. As a result of somewhat lengthy but not complicated transformations the Schrödinger equation takes the form

$$-\frac{\hbar^2}{2M}\nabla_R^2\psi - \frac{\hbar^2}{2\mu}\nabla_r^2\psi + U(r)\,\psi = E\psi. \tag{14.7}$$





Here $M$ and $\mu$ are the total and reduced masses of the system,

$$M = m_1 + m_2, \qquad \mu = \frac{m_1 m_2}{m_1 + m_2}. \tag{14.8}$$

We see that the left-hand side of the Schrödinger equation is resolved into a sum of two terms and has a form similar to eq. (14.3). In this case the solution of the Schrödinger equation can be sought in the form

$$\psi(\mathbf{r}_1, \mathbf{r}_2) = \varphi(\mathbf{R})\,\psi_0(\mathbf{r}). \tag{14.9}$$

Substituting (14.9) into (14.7) and repeating the transformations which were carried out before, we obtain

$$-\frac{\hbar^2}{2M}\nabla_R^2\varphi = E_R\,\varphi, \tag{14.10}$$

$$-\frac{\hbar^2}{2\mu}\nabla_r^2\psi_0 + U(r)\,\psi_0 = E_r\,\psi_0, \tag{14.11}$$

$$E_R + E_r = E. \tag{14.12}$$


Equation (14.10) is the Schrödinger equation for a free particle with mass $M$. Its solution is the function

$$\varphi(\mathbf{R}) = A e^{(i/\hbar)(\mathbf{P}\cdot\mathbf{R})}. \tag{14.13}$$

The quantity $E_R = |\mathbf{P}|^2/2M$ represents the kinetic energy of motion of the system as a whole.

Thus corresponding to (14.9) and (14.13) the solution of the Schrödinger equation can be written in the form

$$\psi(\mathbf{r}_1, \mathbf{r}_2) = A e^{(i/\hbar)(\mathbf{P}\cdot\mathbf{R})}\,\psi_0(\mathbf{r}). \tag{14.14}$$


From the formulae obtained it is seen that the centre of mass of the system moves in space as a free particle, while the relative motion of the particles proceeds independently of the motion of the centre of mass and is described by the function $\psi_0$ satisfying eq. (14.11). The total energy of the system is made up of the energies of the relative motion and the motion of the centre of mass. Hence we see that in quantum mechanics, as in classical physics, the problem of the motion of two particles whose potential energy of interaction $U$ depends only on the distance between them, $U(|\mathbf{r}_1 - \mathbf{r}_2|)$, reduces to the problem of the motion of a single particle with reduced mass $\mu$ in an external field $U$.
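
A short numerical sketch of the quantities in (14.8): it evaluates the reduced mass of the electron-proton system with rounded CGS masses, and checks, in a one-dimensional classical analogue, that the kinetic energy splits into a centre-of-mass part and a relative part in the same way as the operators in (14.7). The function names and the test momenta are illustrative assumptions.

```python
M_ELECTRON = 9.109e-28   # g (rounded)
M_PROTON = 1.673e-24     # g (rounded)

def total_and_reduced_mass(m1, m2):
    """Total mass M and reduced mass mu of eq. (14.8)."""
    return m1 + m2, m1 * m2 / (m1 + m2)

def kinetic_energy_split(m1, p1, m2, p2):
    """Return (lab-frame kinetic energy, centre-of-mass + relative kinetic energy)."""
    total, reduced = total_and_reduced_mass(m1, m2)
    p_total = p1 + p2                        # momentum conjugate to R
    p_rel = (m2 * p1 - m1 * p2) / total      # momentum conjugate to r = r1 - r2
    lab = p1**2 / (2 * m1) + p2**2 / (2 * m2)
    split = p_total**2 / (2 * total) + p_rel**2 / (2 * reduced)
    return lab, split

M_hydrogen, mu_hydrogen = total_and_reduced_mass(M_ELECTRON, M_PROTON)
ratio = mu_hydrogen / M_ELECTRON
lab, split = kinetic_energy_split(1.0, 2.0, 3.0, -1.0)
print(ratio)        # slightly below 1: the proton is not infinitely heavy
print(lab, split)   # the two expressions for the kinetic energy coincide
```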





The Mathematical Apparatus 
of Quantum Mechanics 


§ 15. Linear operators 


We have seen above that in solving the Schrödinger equation one can find
the wave functions and the energy of a system. The latter in certain cases 
(e.g. a particle in a potential well) has a discrete sequence of values, while in 
other cases (e.g. a free particle; a particle passing through a barrier) it has a 
continuous sequence of values. 

Knowing the wave function $\psi$, we could obtain the probability of finding
the particle at a given point of space, as well as the mean values of the quanti- 
ties depending on the coordinates. Then, as was shown in §4, the coordinates 
and the corresponding momentum components of the particle have no defi- 
nite simultaneous values. However, the mathematical apparatus we used was 
inadequate to solve a number of important problems. As examples we put 
these questions: Which quantities cannot simultaneously have definite values? 
How are the mean values of the quantities which are not functions of the 
coordinates to be found? What characteristics of a quantum-mechanical 
system must be given in order that its state is completely defined? 

The peculiarity of the problems of quantum mechanics required the 
development and application of a special mathematical technique. 

The mathematical technique of quantum mechanics must correspond to
the physical statement of its problems. It turned out that the corresponding
mathematical technique — the theory of linear operators — had already been 
worked out in mathematics. We shall first consider the basic concepts of this 
theory, and in what follows we shall show how the theory of linear operators 
can be associated with the problems of quantum mechanics. 

We shall understand by an operator a rule by which a function $\varphi(x_1, x_2, x_3, \dots)$ of the variables $x_1, x_2, x_3, \dots$ is related to another function $\chi(x_1, x_2, x_3, \dots)$ of the same variables.

In what follows we shall denote operators by means of letters with the sign $\hat{\ }$, for example $\hat{F}$. By means of the symbol $\hat{F}$ the rule of the transition from the function $\varphi$ to the function $\chi$ can be written in the form

$$\chi = \hat{F}\varphi. \tag{15.1}$$
Let us consider some simple operators. 


The operator $\hat{F}$ can, for example, denote differentiation with respect to any variable,

$$\chi(x_1, x_2, \dots) = \frac{\partial}{\partial x_i}\varphi(x_1, x_2, \dots).$$

This operator is written symbolically as follows:

$$\hat{F} = \frac{\partial}{\partial x_i}.$$

The differential operators will be encountered particularly often. The operator $\hat{F}$ can also denote multiplication by any quantity, raising to a power, and so on.

We define the operator of the independent variable $x$ as the multiplication by this variable: $\hat{x}\varphi = x\varphi$.
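An integral relation between the functions $\varphi$ and $\chi$ can also be represented in operator form:

$$\chi(x) = \int K(x, \xi)\,\varphi(\xi)\,d\xi = \hat{F}\varphi. \tag{15.2}$$

The function $K(x, \xi)$ is called the kernel of the integral operator $\hat{F}$.

We note that earlier we have also used differential operators: the operator $\nabla = \mathbf{i}(\partial/\partial x) + \mathbf{j}(\partial/\partial y) + \mathbf{k}(\partial/\partial z)$, the Laplacian $\nabla^2 = \partial^2/\partial x^2 + \partial^2/\partial y^2 + \partial^2/\partial z^2$ and others.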

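We now define a linear operator $\hat{F}$ as an operator for which the equalities

$$\hat{F}(\varphi_1 + \varphi_2) = \hat{F}\varphi_1 + \hat{F}\varphi_2, \tag{15.3}$$

$$\hat{F}C\varphi = C\hat{F}\varphi \tag{15.4}$$

are fulfilled, where $C$ is an arbitrary constant. Hence it follows that

$$\hat{F}(C_1\varphi_1 + C_2\varphi_2) = C_1\hat{F}\varphi_1 + C_2\hat{F}\varphi_2, \tag{15.5}$$

where $C_1$, $C_2$ are arbitrary constants. It is obvious that the operators of differentiation, multiplication and integration mentioned above are linear operators, whereas raising to a power is not.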

For reasons which will be clear from what follows we shall, in quantum 
mechanics, deal only with linear operators. 

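By combining two given operators $\hat{F}$ and $\hat{R}$ one can define their sum and product. We shall understand the sum of the operators $\hat{F}$ and $\hat{R}$ to be the operator $\hat{G}$ defined by the relation

$$\hat{G} = \hat{F} + \hat{R}, \qquad \hat{G}\varphi = \hat{F}\varphi + \hat{R}\varphi. \tag{15.6}$$

We shall understand the product of two operators $\hat{F}$ and $\hat{R}$ to be the operator $\hat{L} = \hat{F}\hat{R}$ consisting of the consecutive application of the operators $\hat{R}$ and $\hat{F}$,

$$\hat{L}\varphi = \hat{F}(\hat{R}\varphi). \tag{15.7}$$

If one first applies the operator $\hat{F}$ and then the operator $\hat{R}$, their product will be the operator $\hat{L}' = \hat{R}\hat{F}$,

$$\hat{L}'\varphi = \hat{R}(\hat{F}\varphi). \tag{15.8}$$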


We note that the operators $\hat{L}$ and $\hat{L}'$, generally speaking, are not identical to each other, i.e. the product of the operators depends critically on the order of the factors. Corresponding to this, the algebra of operators is an algebra of non-commuting quantities. Two operators are said to commute with each other if the product of the operators does not depend on the order of the factors; otherwise the operators are said to be non-commuting. As an example let us find the product of the operator of differentiation with respect to $x$ and the operator of multiplication by $x$ for both orders of the factors, i.e. let us assume that $\hat{F} = x$, $\hat{R} = \partial/\partial x$. In correspondence with (15.7) the operator of the product $\hat{L}$ will be $\hat{L} = x(\partial/\partial x)$. We now find the operator $\hat{L}' = \hat{R}\hat{F}$:

$$\hat{L}'\varphi = \frac{\partial}{\partial x}(x\varphi) = \left(1 + x\frac{\partial}{\partial x}\right)\varphi.$$

We see that in this case the operator $\hat{L}'$ is equal to

$$\hat{L}' = 1 + x\frac{\partial}{\partial x}$$

and is not identical to the operator $\hat{L}$. Thus the operators $\hat{F}$ and $\hat{R}$ do not commute. Making use of the expressions obtained for $\hat{L}$ and $\hat{L}'$, we can write that

$$\frac{\partial}{\partial x}x - x\frac{\partial}{\partial x} = 1.$$

It is natural to call the right-hand side of this operator relation the unit operator. If we took the operator of multiplication by any other independent variable, say $y$, as the operator $\hat{F}$, then it would turn out that the operators $\partial/\partial x$ and $y$ commute:

$$\frac{\partial}{\partial x}y - y\frac{\partial}{\partial x} = 0. \tag{15.9}$$
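
The commutation relation between $\partial/\partial x$ and $x$ can be verified mechanically on polynomials. In the sketch below a polynomial is stored as a list of coefficients (`coeffs[n]` multiplies $x^n$); this representation and the helper names are choices made here for illustration.

```python
def ddx(coeffs):
    """Differentiate the polynomial sum(coeffs[n] * x**n)."""
    return [n * c for n, c in enumerate(coeffs)][1:] or [0]

def times_x(coeffs):
    """Multiply the polynomial by x."""
    return [0] + list(coeffs)

def subtract(p, q):
    """Coefficient-wise difference p - q, padding the shorter list with zeros."""
    size = max(len(p), len(q))
    p = list(p) + [0] * (size - len(p))
    q = list(q) + [0] * (size - len(q))
    return [a - b for a, b in zip(p, q)]

def commutator_ddx_x(coeffs):
    """Apply (d/dx x - x d/dx) to the polynomial."""
    return subtract(ddx(times_x(coeffs)), times_x(ddx(coeffs)))

poly = [5, -2, 0, 7]                 # 5 - 2x + 7x^3
result = commutator_ddx_x(poly)
print(result)                        # the polynomial itself: the commutator acts as 1
```

Applying the two orderings separately shows that neither $\hat{R}\hat{F}\varphi$ nor $\hat{F}\hat{R}\varphi$ equals $\varphi$; only their difference does.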


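For certain operators it turns out that the following relation holds:

$$\hat{F}\hat{R} = -\hat{R}\hat{F}. \tag{15.10}$$

In this case the operators $\hat{F}$ and $\hat{R}$ are said to anticommute. We shall call the operator $\hat{F}\hat{R} - \hat{R}\hat{F}$ the commutator of the operators $\hat{F}$ and $\hat{R}$, and shall denote it by brackets, i.e.

$$\hat{F}\hat{R} - \hat{R}\hat{F} = \{\hat{F}\hat{R}\}. \tag{15.11}$$

To an operator $\hat{F}$ there can correspond an inverse operator $\hat{F}^{-1}$. The inverse operator is defined by the relations

$$\hat{F}\psi = \varphi, \qquad \hat{F}^{-1}\varphi = \psi,$$

or

$$\hat{F}\hat{F}^{-1} = \hat{F}^{-1}\hat{F} = 1. \tag{15.12}$$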


If $\hat{F}$ is a differential operator, then the inverse operator $\hat{F}^{-1}$ has the form of an integral operator. Indeed, suppose that the relation

$$\hat{F}\psi(x) = \varphi(x) \tag{15.13}$$

holds. Then, acting on the right-hand and left-hand sides of this equality with the operator $\hat{F}^{-1}$, we obtain

$$\psi(x) = \hat{F}^{-1}\varphi(x). \tag{15.14}$$

On the other hand, the relation (15.14) can be written in the form

$$\psi(x) = \int G(x, x')\,\varphi(x')\,dx', \tag{15.15}$$

where the function $G(x, x')$, called the Green's function of eq. (15.13), satisfies the relation

$$\hat{F}G(x, x') = \delta(x - x'). \tag{15.16}$$


Indeed, if we act with the operator $\hat{F}$ on the right-hand and left-hand sides of the equality (15.15), then under the condition (15.16) we again arrive at the relation (15.13). Comparing (15.14) and (15.15), we see that the Green's function $G(x, x')$ is the kernel of the integral operator $\hat{F}^{-1}$. The Green's function $G(x, x')$ is not determined unambiguously from eq. (15.16). For a single-valued determination it is necessary to give in addition certain conditions of the nature of boundary conditions.

The relation (15.13) can be considered as an equation relating the function $\psi(x)$ to a given function $\varphi(x)$, and whose solution is given by formula (15.15). It should only be borne in mind that in order to obtain a general solution we have to add the general solution $\psi_0(x)$ of the homogeneous equation $\hat{F}\psi_0(x) = 0$ to (15.15). We then have

$$\psi(x) = \psi_0(x) + \int G(x, x')\,\varphi(x')\,dx'. \tag{15.17}$$


We shall need the above relation in what follows. 
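
A concrete case may make (15.15) and (15.16) more tangible. The example below is chosen here and does not come from the text: for $\hat{F} = -d^2/dx^2$ on the interval $(0, 1)$ with $\psi(0) = \psi(1) = 0$ the Green's function is $G(x, x') = x_<(1 - x_>)$, where $x_<$ and $x_>$ are the smaller and larger of $x$, $x'$. For $\varphi = 1$ formula (15.15) must then reproduce the solution $\psi = x(1 - x)/2$ of $-\psi'' = 1$.

```python
def green(x, xp):
    """Green's function of -d^2/dx^2 on (0, 1) with zero boundary values."""
    lo, hi = min(x, xp), max(x, xp)
    return lo * (1.0 - hi)

def solve_with_green(phi, x, steps=2000):
    """Evaluate psi(x) = integral of G(x, x') phi(x') dx' by the midpoint rule."""
    h = 1.0 / steps
    total = 0.0
    for j in range(steps):
        xp = (j + 0.5) * h
        total += green(x, xp) * phi(xp)
    return total * h

x0 = 0.3
psi_numeric = solve_with_green(lambda xp: 1.0, x0)
psi_exact = x0 * (1.0 - x0) / 2.0
print(psi_numeric, psi_exact)
```

The boundary conditions are exactly the additional data needed to fix $G$ uniquely; with other conditions the same operator has a different Green's function.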


§16. Eigenvalues and eigenfunctions of operators 


Let us consider the operator relation

$$\hat{F}\psi = F\psi. \tag{16.1}$$

Relation (16.1) means that if the operator $\hat{F}$ is applied to the function $\psi$, then one again obtains the function $\psi$ multiplied by a certain constant $F$. It is obvious that for a given form of the operator $\hat{F}$ the relation (16.1) cannot be satisfied by every function $\psi$. In other words, relation (16.1) is an equation. The form of the function $\psi$ can be obtained by solving eq. (16.1). If the operator $\hat{F}$ is a linear differential operator, then eq. (16.1) will be a differential equation. Since from the form of the equation it is immediately clear that $\psi = 0$ is its trivial solution, (16.1) represents a linear homogeneous differential equation. The study of such linear homogeneous equations is the most important problem of the theory of operators.

In what follows we shall not be interested in arbitrary operators $\hat{F}$ and arbitrary functions $\psi$, but only in functions which satisfy certain definite conditions:

(1) the function $\psi$ must exist over the entire range of the independent variables. For example, in the case of Cartesian coordinates, in the range $-\infty < x < \infty$, $-\infty < y < \infty$, $-\infty < z < \infty$;

(2) in the region of existence the function $\psi$ must be finite and continuous, with the exception, in some cases, of singular points;

(3) the function $\psi$ must be single-valued.

We shall call the set of conditions (1)-(3) the standard conditions. It turns out, generally speaking, that eq. (16.1) has solutions which differ from the trivial one, and which satisfy the standard conditions, not for all values of the parameter $F$ but only for certain selected values of it. The selected values of $F$ for which the non-trivial solutions of eq. (16.1) exist are called the eigenvalues of the operator $\hat{F}$, and the corresponding solutions of eq. (16.1) are called the eigenfunctions of the operator $\hat{F}$.

We shall first of all present the problems on eigenfunctions and eigenvalues 
with which we are already acquainted. 

(1) In considering the problem of the motion of a particle in a potential well we have solved eq. (16.1) with the differential operator $\hat{F} = -d^2/dx^2$. The boundary conditions led to the eigenvalues (8.5) and to the eigenfunctions (8.7) of the operator $\hat{F}$.

(2) If, for the same form of the operator, we do not require $\psi$ to reduce to zero at the boundaries of the interval $(0, l)$, then the solutions of (8.2) will have the form

$$\psi = Ae^{ikx} + Be^{-ikx}.$$

If $k^2 > 0$, then for all values of $x$ the function $\psi$ is finite, so that the solution satisfies the standard conditions. For negative $k^2$,

$$\psi = Ae^{-\kappa x} + Be^{\kappa x}, \qquad \text{where } k = i\kappa,$$

there are no solutions which satisfy the standard conditions.


(3) In the problem of the oscillator we have considered the solution of eq. (16.1) for the operator (see (10.3))

$$\hat{F} = -\frac{d^2}{d\xi^2} + \xi^2.$$

The problem has the solution $F = 2E/\hbar\omega = 2n + 1$.

It is clear from the examples given that the whole set of eigenvalues of an operator, which we shall call its spectrum, can be discrete (examples 1 and 3) as well as continuous (example 2). It can be proved that the eigenfunctions which correspond to the discrete spectrum of the eigenvalues are quadratically integrable, i.e. the integral $\int|\psi|^2\,dV$ converges. The eigenfunctions corresponding to the continuous spectrum of the eigenvalues are not quadratically integrable. If to each eigenvalue of the operator there belongs one and only one eigenfunction $\psi$, the spectrum is said to be non-degenerate. On the contrary, if to one eigenvalue $F$ there correspond several, for example $s$, different eigenfunctions, then the given eigenvalue is said to be degenerate with degeneracy $s$.

The examples given above are important because they illuminate our inter- 
est in the theory of operators. The problem of finding the solutions of the 
Schrödinger equation is a particular case of the problem of the eigenfunctions 
of operators of a particular form. 

Before passing over from this heuristic reasoning to the establishment of a 
more complete relationship between the concepts of quantum mechanics and 
the theory of linear operators, it is necessary in addition to consider important
properties of a particular class of operators.


§ 17. Hermitian operators 


The eigenvalues $F$ in the operator equation (16.1) can, generally speaking, be complex. However, we shall be solely interested in equations which lead only to real eigenvalues. It turns out that there is a class of operators which can possess only real eigenvalues. Such operators are called Hermitian or self-adjoint. To each linear operator $\hat{F}$ there corresponds a certain operator $\hat{F}^\dagger$ which we shall call the adjoint operator or the Hermitian conjugate. The adjoint operator is defined by the condition

$$\int\psi_1^*\hat{F}\psi_2\,dV = \int\psi_2(\hat{F}^\dagger\psi_1)^*\,dV. \tag{17.1}$$


Here, as always, the asterisk denotes complex-conjugate quantities. The inte- 
gration in (17.1) is carried out over the entire region of variation of the 
independent variables. We have denoted by dV a volume element of this 
region. 

The functions $\psi_1$, $\psi_2$ must satisfy the necessary requirements for the convergence of the integrals in (17.1). Furthermore, they must satisfy certain boundary conditions which usually amount to the requirement that the functions $\psi_1$ and $\psi_2$ reduce to zero at infinity. But in other respects the functions $\psi_1$ and $\psi_2$ are rather arbitrary. If the operator $\hat{F}$ coincides with its adjoint operator, $\hat{F}^\dagger = \hat{F}$, then such an operator is said to be Hermitian or self-adjoint. In this case relation (17.1) has the form

$$\int\psi_1^*\hat{F}\psi_2\,dV = \int\psi_2\hat{F}^*\psi_1^*\,dV. \tag{17.2}$$





Here we have denoted by $\hat{F}^*$ the operator defined by the relation $\hat{F}^*\psi^* = (\hat{F}\psi)^*$.

As an example let us find the adjoint operator of the differential operator $\hat{F} = d/dx$. Assuming that the functions $\psi_1$, $\psi_2$ reduce to zero at infinity and carrying out the integration by parts in (17.1), we obtain

$$\int\psi_1^*\frac{d\psi_2}{dx}\,dx = -\int\psi_2\frac{d\psi_1^*}{dx}\,dx.$$


Comparing with (17.1), we find the operator $\hat{F}^\dagger$: $\hat{F}^\dagger = -d/dx$. We see that the operator $\hat{F}^\dagger$ in the given case does not coincide with the operator $\hat{F}$, i.e. the differential operator is not self-adjoint. If, however, the operator $i(d/dx)$ is taken as the operator $\hat{F}$, then it is easily seen that such an operator is Hermitian. Indeed, in this case we have, on integrating by parts,

$$i\int_{-\infty}^{\infty}\psi_1^*\frac{d\psi_2}{dx}\,dx = -i\int_{-\infty}^{\infty}\psi_2\frac{d\psi_1^*}{dx}\,dx = \int_{-\infty}^{\infty}\psi_2\left(i\frac{d\psi_1}{dx}\right)^*dx,$$

and relation (17.2) is now valid. Hence it follows that the operator $\hat{F} = i(d/dx)$ is Hermitian.
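
The difference between $d/dx$ and $i(d/dx)$ can be seen numerically by evaluating both sides of (17.2) for two rapidly decaying test functions. The test functions, the integration interval and the helper names below are arbitrary choices made for this illustration; derivatives are taken by central differences and integrals by the midpoint rule.

```python
import cmath
import math

def psi1(x):
    return math.exp(-x * x) * (1 + 1j * x)

def psi2(x):
    return math.exp(-0.5 * x * x) * cmath.exp(2j * x)

def derivative(f, x, h=1e-4):
    """Central-difference approximation to df/dx."""
    return (f(x + h) - f(x - h)) / (2 * h)

def integrate(g, a=-8.0, b=8.0, n=4000):
    """Midpoint rule; the test functions are negligible outside (a, b)."""
    h = (b - a) / n
    return sum(g(a + (j + 0.5) * h) for j in range(n)) * h

def both_sides(factor):
    """Left and right sides of (17.2) for the operator F = factor * d/dx."""
    lhs = integrate(lambda x: psi1(x).conjugate() * factor * derivative(psi2, x))
    rhs = integrate(lambda x: psi2(x) * (factor * derivative(psi1, x)).conjugate())
    return lhs, rhs

lhs_d, rhs_d = both_sides(1)     # F = d/dx
lhs_i, rhs_i = both_sides(1j)    # F = i d/dx
print(lhs_d, rhs_d)   # equal in magnitude, opposite in sign
print(lhs_i, rhs_i)   # equal: relation (17.2) holds
```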
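We also define the operator $\tilde{F}$, which is called the transpose of the operator $\hat{F}$:

$$\int\psi_1^*\hat{F}\psi_2\,dV = \int\psi_2\tilde{F}\psi_1^*\,dV. \tag{17.3}$$

Comparing (17.3) with (17.1), we obtain $\hat{F}^\dagger = \tilde{F}^*$.

Further we find the operator $\hat{L}^\dagger$ adjoint to the operator $\hat{L}$ which is the product of two operators, $\hat{L} = \hat{F}\hat{R}$. From the definition (17.1) we have

$$\int\psi_1^*\hat{F}\hat{R}\psi_2\,dV = \int(\hat{R}\psi_2)(\hat{F}^\dagger\psi_1)^*\,dV.$$

We exchange the functions $(\hat{R}\psi_2)$ and $(\hat{F}^\dagger\psi_1)^*$. Then we get

$$\int\psi_1^*\hat{F}\hat{R}\psi_2\,dV = \int(\hat{F}^\dagger\psi_1)^*\hat{R}\psi_2\,dV.$$

Further, we again use the relation (17.1):

$$\int\psi_1^*\hat{F}\hat{R}\psi_2\,dV = \int\psi_2(\hat{R}^\dagger\hat{F}^\dagger\psi_1)^*\,dV.$$

From this expression we obtain the operator $\hat{L}^\dagger = (\hat{F}\hat{R})^\dagger$:

$$(\hat{F}\hat{R})^\dagger = \hat{R}^\dagger\hat{F}^\dagger. \tag{17.4}$$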
We see that the adjoint operator of the product is equal to the product of the
adjoint operators taken, however, in the reverse order. Thus if the operators
$\hat{F}$ and $\hat{R}$ are self-adjoint, i.e. $\hat{F}^\dagger = \hat{F}$, $\hat{R}^\dagger = \hat{R}$, then their product will be a
self-adjoint operator only in the case where they commute. Indeed, under
these conditions we have

$$(\hat{F}\hat{R})^\dagger = \hat{R}\hat{F} = \hat{F}\hat{R}. \tag{17.5}$$


Since each operator undoubtedly commutes with itself, it follows from (17.5)
that if the operator $\hat{F}$ is Hermitian, then so also will be the operator $\hat{F}^2 = \hat{F}\hat{F}$
as well as, in general, the operator $\hat{F}^n = \hat{F}\cdot\hat{F}\cdots\hat{F}$, where $n$ is a positive
integer.
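Relations (17.4) and (17.5) can be checked on a finite-dimensional stand-in. The following is a minimal sketch, assuming that small complex matrices may play the role of the operators; the particular matrices `F`, `R`, `A`, `B`, `SX`, `SZ` and the helper names are illustrative choices, not part of the text.

```python
# Illustrative sketch only: 2x2 complex matrices stand in for the operators.

def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def dagger(a):
    """Adjoint (conjugate transpose) of a square matrix."""
    n = len(a)
    return [[complex(a[j][i]).conjugate() for j in range(n)] for i in range(n)]

def is_hermitian(a):
    """True if the matrix coincides with its adjoint."""
    return dagger(a) == [[complex(x) for x in row] for row in a]

# Arbitrary non-Hermitian matrices with integer entries (exact arithmetic).
F = [[1 + 2j, 3j], [2, 1 - 1j]]
R = [[0, 1 - 1j], [4j, 2]]

# Relation (17.4): the adjoint of a product reverses the order of the factors.
product_rule_holds = dagger(matmul(F, R)) == matmul(dagger(R), dagger(F))

# Two commuting Hermitian matrices: their product is Hermitian, cf. (17.5).
A = [[2, 1], [1, 2]]
B = [[0, 1], [1, 0]]
commuting_product_hermitian = is_hermitian(matmul(A, B))

# Two non-commuting Hermitian matrices: their product is not Hermitian.
SX = [[0, 1], [1, 0]]
SZ = [[1, 0], [0, -1]]
noncommuting_product_hermitian = is_hermitian(matmul(SX, SZ))

print(product_rule_holds, commuting_product_hermitian,
      noncommuting_product_hermitian)
```

Integer entries are used so that the comparisons are exact and no numerical tolerance is needed.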

We now pass on to the proof of the basic theorem on the reality of the 
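eigenvalues of Hermitian operators. For this we once again write eq. (16.1),
assuming for concreteness that the operator $\hat{F}$ possesses a discrete spectrum:
$\hat{F}\psi_n = F_n\psi_n$. Multiplying the equation from the left by $\psi_n^*$ and integrating,
we get

$$F_n = \frac{\int \psi_n^* \hat{F} \psi_n \, dV}{\int |\psi_n|^2 \, dV}.$$

If the operator $\hat{F}$ is Hermitian, then it is easily seen that the eigenvalues $F_n$,
determined in (16.1), are real. Indeed, taking into account the Hermitian
property (17.2), we find

$$F_n^* = \frac{\int \psi_n \hat{F}^* \psi_n^* \, dV}{\int |\psi_n|^2 \, dV} = \frac{\int \psi_n^* \hat{F} \psi_n \, dV}{\int |\psi_n|^2 \, dV} = F_n.$$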


Thus we have proved that Hermitian (self-adjoint) operators have real eigen- 
values only. 




§18. The orthogonality and normalization of the eigenfunctions of Hermitian 
operators 


The eigenfunctions of a linear Hermitian operator $\hat{F}$, which correspond to
different eigenvalues $F_n$ and $F_m$, are mutually orthogonal, i.e. satisfy the
relation

$$\int \psi_m^* \psi_n \, dV = 0 \quad (\text{for } m \neq n). \tag{18.1}$$

The functions $\psi_n$ and $\psi_m^*$ satisfy eq. (16.1),

$$\hat{F}\psi_n = F_n\psi_n, \qquad \hat{F}^*\psi_m^* = F_m\psi_m^*. \tag{18.2}$$





Since the operator $\hat{F}$ is Hermitian, we have:

$$\int \psi_m^* \hat{F} \psi_n \, dV = \int \psi_n \hat{F}^* \psi_m^* \, dV. \tag{18.3}$$

Making use of eqs. (18.2), we rewrite eq. (18.3) in the form

$$F_n \int \psi_m^* \psi_n \, dV = F_m \int \psi_m^* \psi_n \, dV.$$

Hence it follows that

$$(F_m - F_n) \int \psi_m^* \psi_n \, dV = 0. \tag{18.4}$$

Since by assumption $F_m \neq F_n$, then

$$\int \psi_m^* \psi_n \, dV = 0, \tag{18.5}$$

which proves our statement.
Because the eigenfunctions satisfy a homogeneous linear equation, they
are determined to within an arbitrary constant factor.
Keeping in mind what is to follow, we shall normalize the eigenfunctions
of a discrete spectrum by the condition

$$\int \psi_n^* \psi_n \, dV = 1. \tag{18.6}$$


The eigenfunctions satisfying relation (18.6) are said to be normalized to 
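unity. We combine formulae (18.1) and (18.6) into one formula

$$\int \psi_m^* \psi_n \, dV = \delta_{mn}, \tag{18.7}$$

where $\delta_{mn}$ is the Kronecker symbol:

$$\delta_{mn} = \begin{cases} 1 & \text{if } n = m, \\ 0 & \text{if } n \neq m. \end{cases}$$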


We now consider the case of degenerate states, where several eigenfunctions
$\psi_{n1}, \psi_{n2}, \ldots, \psi_{ns}$ belong to the same eigenvalue $F_n$.

One can take as the solution of eq. (18.2) corresponding to the eigenvalue
$F_n$ arbitrary linear combinations of these functions

$$\psi'_{nk} = \sum_{r=1}^{s} a_{kr}\psi_{nr}. \tag{18.8}$$

By appropriate choice of the coefficients $a_{kr}$ one can obtain mutual orthogonality
of the eigenfunctions $\psi'_{nk}$ which belong to one and the same eigenvalue
$F_n$. Imposing also the normalization condition, we obtain

$$\int \psi'^*_{nk} \psi'_{nl} \, dV = \delta_{kl}. \tag{18.9}$$


The condition (18.9) still does not completely determine the values of the
coefficients $a_{kr}$. Indeed, if the functions $\psi_{nr}$ are already orthogonal to each
other and if we have carried out transformation (18.8), then the orthogonality
will be preserved if

$$\sum_{r=1}^{s} a_{kr} a_{lr}^* = \delta_{kl}. \tag{18.10}$$

Thus there is still a certain arbitrariness in the choice of the coefficients $a_{kr}$.
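One concrete way of choosing the coefficients $a_{kr}$ in (18.8) is the Gram-Schmidt procedure. The sketch below is only an illustration under stated assumptions: a real symmetric 3×3 matrix `M` (chosen here, not taken from the text) plays the role of the operator, and its eigenvalue 2 is doubly degenerate with the two non-orthogonal eigenvectors listed in `degenerate`.

```python
import math

def inner(u, v):
    """Scalar product sum_i u_i^* v_i, the discrete analogue of the integral."""
    return sum(complex(a).conjugate() * b for a, b in zip(u, v))

def gram_schmidt(vectors):
    """Orthonormalize linearly independent vectors one after another."""
    basis = []
    for v in vectors:
        w = [complex(x) for x in v]
        for u in basis:
            c = inner(u, w)                      # component of w along u
            w = [wi - c * ui for wi, ui in zip(w, u)]
        norm = math.sqrt(inner(w, w).real)
        basis.append([wi / norm for wi in w])
    return basis

def apply(matrix, v):
    """Action of a matrix on a vector."""
    return [sum(m * x for m, x in zip(row, v)) for row in matrix]

# Hypothetical operator: eigenvalue 2 is doubly degenerate, eigenvalue 5 is simple.
M = [[3, 1, 1], [1, 3, 1], [1, 1, 3]]
degenerate = [[1, -1, 0], [1, 0, -1]]            # both satisfy M v = 2 v

overlap_before = inner(degenerate[0], degenerate[1])
u1, u2 = gram_schmidt(degenerate)
overlap_after = inner(u1, u2)

# The new combinations are still eigenvectors with the same eigenvalue 2.
residual = max(abs(a - 2 * b)
               for u in (u1, u2)
               for a, b in zip(apply(M, u), u))

print(abs(overlap_before), abs(overlap_after), residual)
```

Processing the vectors in a different order gives a different but equally valid orthonormal pair, which is the arbitrariness mentioned in connection with (18.10).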
Finally, we consider the wave functions of a continuous spectrum. For the
wave functions of the continuous spectrum $\psi_F(x)$ the condition of orthogonality
is proved analogously to (18.3)-(18.5):

$$\int \psi_F^*(x) \psi_{F'}(x) \, dV = 0 \quad (F \neq F'). \tag{18.11}$$


On the other hand, the condition of normalization can no longer be written 
in the form of (18.6), because the wave functions of the continuous spectrum 
are not quadratically integrable. For these functions the integral $\int |\psi_F|^2 \, dV$
diverges. This divergence is associated with the fact that the eigenfunctions of
the continuous spectrum do not reduce to zero at infinity. The eigenfunctions
of the continuous spectrum are conveniently normalized to the Dirac $\delta$-function
(see Vol. 1, Appendix III), since the conditions of orthogonality and
normalization can then be expressed analogously to (18.7):

$$\int \psi_F^*(x) \psi_{F'}(x) \, dV = \delta(F - F'). \tag{18.12}$$

The normalization to the $\delta$-function, of course, is not the only possible one.
Later we shall encounter other methods of normalizing the eigenfunctions of 
the continuous spectrum (see, for example, §26). 


§ 19. Expansion in terms of eigenfunctions 


In the preceding section we have proved that the system of eigenfunctions 
of an arbitrary linear self-adjoint operator is a system of orthogonal functions. 
It turns out that such a system of functions is complete. An arbitrary contin- 
uous function, determined in the same region of variation of the independent 
variables and satisfying a wide class of conditions, can be expanded in this set 
of eigenfunctions*. 

* V. I. Smirnov, A course of higher mathematics (Pergamon Press, Oxford, 1964).

We shall first give here the conditions for the completeness of a system of
eigenfunctions for the case of an operator $\hat{F}$ which possesses a discrete
spectrum. We write the expansion of a function $\psi$ in a series in terms of the
eigenfunctions $\psi_n$, assuming the latter to be normalized to unity, in the form

$$\psi(x) = \sum_n c_n \psi_n(x). \tag{19.1}$$


The amplitudes $c_n$ can be determined by making use of the orthogonality of
the eigenfunctions. Multiplying (19.1) by $\psi_m^*(x)$ and integrating over the whole
region of variation of the independent variables, we obtain

$$\int \psi_m^*(x) \psi(x) \, dV = \sum_n c_n \int \psi_m^*(x) \psi_n(x) \, dV.$$

Here we have changed the order of summation and integration. By virtue of
the orthogonality of the eigenfunctions (18.7), of all terms of the sum on the
right-hand side of the equation only the term with $n = m$ is different from
zero. Consequently we have

$$c_m = \int \psi_m^* \psi \, dV. \tag{19.2}$$
Substituting this expression into (19.1) and again changing the order of
summation and integration, we obtain

$$\psi(x) = \int \psi(x') \Big( \sum_n \psi_n^*(x') \psi_n(x) \Big) dV'. \tag{19.3}$$

For this expression to hold for an arbitrary continuous function $\psi(x)$ it is
necessary that the equality

$$\sum_n \psi_n^*(x') \psi_n(x) = \delta(x - x') \tag{19.4}$$

be fulfilled. The relation (19.4) expresses the condition for the completeness
of a system of eigenfunctions $\psi_n(x)$. If the operator $\hat{F}$ possesses a continuous
spectrum, then the expansion of the function $\psi(x)$ in terms of its eigenfunctions
will be no longer a sum but an integral:

$$\psi(x) = \int c(F) \psi_F(x) \, dF. \tag{19.5}$$


The amplitudes $c(F)$ are found in the same way as in the case of a discrete
spectrum. Multiplying the left-hand and right-hand sides of eq. (19.5) by the
function $\psi_{F'}^*(x)$ and integrating over the entire region of variation of the
independent variables, we find

$$\int \psi_{F'}^*(x) \psi(x) \, dV = \int c(F) \, dF \int \psi_{F'}^*(x) \psi_F(x) \, dV.$$

Assuming that the eigenfunctions $\psi_F(x)$ are normalized to the $\delta$-function,
we obtain finally

$$c(F) = \int \psi_F^*(x) \psi(x) \, dV. \tag{19.6}$$

We have already encountered in §3 a particular case of such an expansion
(expansion in terms of plane waves). The condition for completeness in the
case of a continuous spectrum is written analogously to (19.4):

$$\int \psi_F^*(x') \psi_F(x) \, dF = \delta(x - x'). \tag{19.7}$$
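The expansion (19.1) with amplitudes (19.2) can be tried out numerically. The following sketch rests on assumptions that are not in the text: the orthonormal set is taken to be $\psi_n(x) = \sqrt{2}\sin n\pi x$ on the interval $0 \le x \le 1$, the expanded function is $\psi(x) = \sqrt{30}\,x(1-x)$, and the integrals are approximated by the trapezoidal rule.

```python
import math

N_GRID = 2000      # number of integration intervals (illustrative choice)
N_TERMS = 15       # number of eigenfunctions kept in the truncated series

def integrate(f, a=0.0, b=1.0, n=N_GRID):
    """Composite trapezoidal rule on [a, b]."""
    h = (b - a) / n
    total = 0.5 * (f(a) + f(b))
    for k in range(1, n):
        total += f(a + k * h)
    return total * h

def psi_n(n, x):
    """Assumed orthonormal eigenfunctions on [0, 1]."""
    return math.sqrt(2.0) * math.sin(n * math.pi * x)

def psi(x):
    """Function to be expanded; it is normalized to unity on [0, 1]."""
    return math.sqrt(30.0) * x * (1.0 - x)

# Amplitudes according to (19.2); the functions here are real.
c = [integrate(lambda x, n=n: psi_n(n, x) * psi(x))
     for n in range(1, N_TERMS + 1)]

# Sum of squared amplitudes: close to 1 for a complete set.
norm_sum = sum(cn * cn for cn in c)

# Truncated series (19.1) evaluated at one point and compared with psi.
x0 = 0.5
partial_sum = sum(cn * psi_n(n, x0) for n, cn in enumerate(c, start=1))

print(norm_sum, partial_sum, psi(x0))
```

With only a few terms the truncated series already reproduces the function closely, because the amplitudes of this particular function fall off as $1/n^3$.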


§20. Quantum-mechanical variables and operators 


We can now turn to the discussion of the basic postulate of quantum 
mechanics which establishes the connection between real physical quantities 
which characterize the properties of quantum-mechanical systems and the 
mathematical apparatus of quantum mechanics.

In classical mechanics the state of a system is determined by the whole set 
of coordinates and momenta (or variables expressed in terms of the latter) 
involved in the equations of motion. All variables characterizing the state of a 
system are called mechanical variables. In quantum mechanics, variables which 
play an analogous role will be called quantum-mechanical variables. They are 
also often said to be physical or dynamical variables. 

In the examples considered above certain properties of quantum systems 
have been elucidated. These are, in the first place: 

(1) the existence of an uncertainty relation between the values of canonically 
conjugate physical variables (such as, for example, the coordinate and momen- 
tum); 

(2) the existence of a discrete spectrum and a continuous spectrum of values 
of physical variables (for example, the energy of a quantum oscillator and of 
a free particle); 

(3) the existence of a superposition of quantum states (for example, the 
superposition of states of a free particle); 

(4) the continuous transition from the concepts of quantum mechanics to 
those of classical mechanics in passing to systems in which the Planck 
constant can be assumed to be an infinitesimal quantity, while quantum
numbers can be assumed to be infinitely large (the correspondence principle). 

The first and second of these properties just correspond to the properties 
of linear operators — their non-commutativity and the existence of a spectrum 
of eigenvalues. Hence it is natural to make the following basic assumption: 
‘To each quantum-mechanical variable $F$ there corresponds a certain linear
Hermitian operator $\hat{F}$. The spectrum of eigenvalues of the operator $\hat{F}$ represents
the spectrum of the possible (measured) values of this variable’.

The eigenfunction $\psi_F(x)$ of the operator $\hat{F}$ represents the wave function
of the system in the state in which the variable represented by the operator $\hat{F}$
has the given definite value $F$.

The requirement of hermiticity of the operator is connected, obviously, 
with the reality of the values of real physical quantities, whereas the require- 
ment of linearity is associated with the principle of superposition. It is clear 
that this statement will assume a concrete meaning only after being supple- 
mented by the indication of how the operator corresponding to a given 
quantum-mechanical quantity can be found. If such a recipe were known, 
then the postulate formulated would make it possible to determine the 
spectrum of the possible values of this quantity. The validity of the basic 
postulate can be established only by the agreement between the inferences 
of quantum mechanics and experiment. 

For the determination of the form of the linear operators which correspond
to definite quantum-mechanical variables (quantum-mechanical operators)
it is necessary to make use of the correspondence principle. Namely,
it is natural to assume that between quantum-mechanical operators describing
the motion of particles in quantum mechanics there are the same relations as
between their ‘originals’, the variables of classical mechanics. Thus, for example,
the total energy operator $\hat{H}$ is connected with the kinetic energy operator
$\hat{T}$ and the potential energy operator $\hat{U}$ by the relation

$$\hat{H} = \hat{T} + \hat{U}. \tag{20.1}$$

In its turn the operator $\hat{T}$ is equal to

$$\hat{T} = \frac{\hat{p}^2}{2m}, \tag{20.2}$$

where $\hat{p}$ is the momentum operator, and so on.
We have, in essence, already made use of these relations in the preceding
chapter in obtaining the Schrödinger equation. If quantum-mechanical operators
are connected with each other by the ordinary relations of classical
mechanics, then it is sufficient to obtain the expression for one operator in
order to construct subsequently the total system of operators of quantum
mechanics. The limiting transition to classical mechanics as $\hbar \to 0$ will automatically
be ensured, provided the initial operator is correctly chosen, taking
into account this condition. Such an approach appears to be quite reasonable,
although not strict. In what follows another, more consistent method for the
construction of operators will be presented.

One can choose as initial operators the coordinate operator and the
momentum operator.

The coordinate operator $\hat{\mathbf{r}}$ amounts to multiplication by this variable, as
does every operator corresponding to an independent variable, i.e.

$$\hat{x} = x, \qquad \hat{y} = y, \qquad \hat{z} = z. \tag{20.3}$$


To establish the form of the momentum operator $\hat{p}$, use can be made of the
fact that the free particle is described by the Schrödinger equation (6.5),

$$-\frac{\hbar^2}{2m}\nabla^2\psi = E\psi.$$

On the other hand, by virtue of what was said above, this equation can be
written as

$$\frac{1}{2m}\left(\hat{p}_x^2 + \hat{p}_y^2 + \hat{p}_z^2\right)\psi = E\psi.$$

Hence it follows that the operators $\hat{p}_x$, $\hat{p}_y$, $\hat{p}_z$ can be chosen in the form

$$\hat{p}_x = \frac{\hbar}{i}\frac{\partial}{\partial x}, \qquad \hat{p}_y = \frac{\hbar}{i}\frac{\partial}{\partial y}, \qquad \hat{p}_z = \frac{\hbar}{i}\frac{\partial}{\partial z}. \tag{20.4}$$

Thus the momentum component operator amounts to differentiation with
respect to the corresponding coordinate. The factor $i$ ensures the hermiticity
of the operator $\hat{p}$. Before considering the construction of operators which
correspond to quantum-mechanical variables by a more consistent method
we shall consider two questions of principle: the question of the meaning of
the eigenfunctions of operators, and the question of the possibility of simultaneous
measurement of two quantum-mechanical quantities.
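The role of the factor $i$ can be seen in a small numerical experiment. The sketch below is an illustration under assumptions of our own: units with $\hbar = 1$, two arbitrarily chosen trial functions `psi1` and `psi2` that vanish at infinity, a central-difference derivative and a simple sum on a finite grid in place of the exact integral.

```python
import math

def psi1(x):
    """First trial function (hypothetical choice), vanishing at infinity."""
    return math.exp(-x * x / 2)

def psi2(x):
    """Second trial function (hypothetical choice), complex-valued."""
    return (1 + 0.5j) * x * math.exp(-x * x / 2)

def derivative(f, x, h=1e-5):
    """Central-difference approximation to df/dx."""
    return (f(x + h) - f(x - h)) / (2 * h)

def momentum(g, x):
    """Action of p = -i d/dx on g at the point x (units with hbar = 1)."""
    return -1j * derivative(g, x)

def ddx(g, x):
    """Action of the bare differential operator d/dx on g at the point x."""
    return derivative(g, x)

def matrix_element(f, op, g, a=-10.0, b=10.0, n=4000):
    """Approximate the integral of f^* (op g) over [a, b] by a grid sum."""
    dx = (b - a) / n
    total = 0j
    for k in range(n + 1):
        x = a + k * dx
        total += complex(f(x)).conjugate() * op(g, x)
    return total * dx

p12 = matrix_element(psi1, momentum, psi2)
p21 = matrix_element(psi2, momentum, psi1)
d12 = matrix_element(psi1, ddx, psi2)
d21 = matrix_element(psi2, ddx, psi1)

# Hermitian operator: the element (1,2) equals the complex conjugate of (2,1).
momentum_is_hermitian = abs(p12 - p21.conjugate()) < 1e-6
# d/dx alone changes sign instead, i.e. its adjoint is -d/dx.
ddx_is_antihermitian = abs(d12 + d21.conjugate()) < 1e-6

print(momentum_is_hermitian, ddx_is_antihermitian)
```

For these trial functions the integrals can also be done by hand, giving $(0.5 - i)\sqrt{\pi}/2$ for `p12`, which the grid sum reproduces to high accuracy.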


§21. The wave function and the probability of the results of measurements


Let $\hat{F}$ represent a certain quantum-mechanical operator for which one can
write

$$\hat{F}\psi_n = F_n\psi_n.$$

For definiteness we assume that the operator $\hat{F}$ has a discrete spectrum of
eigenvalues $F_n$ and that to each of these there corresponds an eigenfunction
$\psi_n$ (the spectrum is non-degenerate). Since the eigenfunctions $\psi_n$ form a
complete system of functions, the wave function $\psi$ can be expanded in a
series

$$\psi = \sum_n c_n\psi_n. \tag{21.1}$$


On the basis of the principle of superposition we can conclude that the state
of the system described by the wave function $\psi$ can be written in the form of
a superposition of the states with definite values $F_n$ of the physical
quantity $F$.

The amplitude $c_m$ in the expansion (21.1) shows the weight with which
the state $\psi_m$ is represented in the state $\psi$. In other words, the amplitude $c_m$
characterizes the probability that a value equal to $F_m$ will be found when
measurements of the quantity $F$ are carried out on the system in the state
with wave function $\psi$. In quantum mechanics it is assumed that this probability
is equal to the square of the modulus of the amplitude in the expansion,
$|c_m|^2$. Thus, if we want to find the probability of finding the value $F_n$ for the
physical quantity $F$ when measurements are carried out on the system in the
state $\psi$, the wave function must be expanded in terms of the eigenfunctions
of the operator $\hat{F}$. The square of the modulus of the corresponding amplitude
in the expansion, $|c_n|^2$, gives the probability sought. If the quantity $F$
changes continuously (continuous spectrum), then one can speak of the probability
that in a measurement one will obtain a value of $F$ lying in the interval
between $F$ and $F + dF$. The corresponding probability is given by the expression

$$dW = |c(F)|^2 \, dF. \tag{21.2}$$


Thus, in expanding $\psi$ in terms of plane waves (see §3) the square of the
modulus of the corresponding amplitude in the expansion gives the probability
that in a measurement a certain given value of the momentum will be
obtained.

The probabilities of the measurements of given values of the quantity $F$,
which are defined in the way shown above, satisfy the relations

$$\sum_n |c_n|^2 = 1, \qquad \int |c(F)|^2 \, dF = 1 \tag{21.3}$$

(under the condition that the wave function $\psi$ is quadratically integrable,
while the eigenfunctions of the operator $\hat{F}$ are normalized by the condition
(18.7) or (18.12)).
Let us, as an example, prove the last of these relations. Making use of
(19.5) and (19.6), we obtain

$$\int c^*(F) c(F) \, dF = \int c(F) \, dF \int \psi^*(x) \psi_F(x) \, dV = \int \psi^*(x) \, dV \int c(F) \psi_F(x) \, dF = \int \psi^*(x) \psi(x) \, dV = 1. \tag{21.4}$$


The whole set of amplitudes $c_n$ (or $c(F)$ in the case of a continuous
spectrum) determines the wave function $\psi$ completely. Hence the definition
of the amplitudes in the expansion of the wave function in terms of the
eigenfunctions of an arbitrary operator is equivalent to the definition of the
wave function itself.
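The rule "probability equals $|c_n|^2$" and the first of the relations (21.3) can be illustrated on the smallest possible example. The sketch below assumes a two-component state vector in place of the wave function and takes, purely for illustration, an operator with eigenvalues $+1$ and $-1$ whose normalized eigenvectors are listed in `eigenstates`; none of these choices comes from the text.

```python
import math

def inner(u, v):
    """Scalar product sum_i u_i^* v_i, the discrete analogue of (19.2)."""
    return sum(complex(a).conjugate() * b for a, b in zip(u, v))

s = 1 / math.sqrt(2)

# Normalized eigenvectors of the matrix [[0, 1], [1, 0]], keyed by eigenvalue.
eigenstates = {+1: [s, s], -1: [s, -s]}

# State of the system, normalized to unity (0.6^2 + 0.8^2 = 1).
psi = [0.6, 0.8]

# Expansion amplitudes c_n and the probabilities |c_n|^2.
amplitudes = {F_n: inner(psi_n, psi) for F_n, psi_n in eigenstates.items()}
probabilities = {F_n: abs(c_n) ** 2 for F_n, c_n in amplitudes.items()}

# First relation of (21.3): the probabilities add up to unity.
total = sum(probabilities.values())

print(probabilities, total)
```

Here the value $+1$ is found with probability 0.98 and the value $-1$ with probability 0.02.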

In this connection the following terminology is often used. The wave
function $\psi(x)$ is said to be a wave function given in the coordinate representation
($x$-representation); the whole set of all amplitudes $c(F)$ is called the
wave function in the $F$-representation. In this sense the relations (19.5) and
(19.6) must be considered as completely symmetric. The relation (19.5)
expresses the expansion of the wave function $\psi$, taken in the coordinate
representation, in terms of the eigenfunctions $\psi_F(x)$ of the operator $\hat{F}$ which
is also taken in the $x$-representation. The amplitudes of the expansion $c(F)$
represent the wave function in the $F$-representation. On the other hand, the
relation (19.6) expresses the expansion of the wave function $c(F)$, taken in
the $F$-representation, in terms of the functions $\psi_F^*(x)$ which have the meaning
of the eigenfunctions of the coordinate operator taken in the $F$-representation
(see (48.19)). The amplitudes of the expansion $\psi(x)$ represent the wave function
in the $x$-representation. We shall say also that a certain operator $\hat{D}$ is
given in the $F$-representation if it acts on a function given in the $F$-representation,
for example, $\hat{D}c(F) = b(F)$. From this point of view the statement
formulated above that $|c(F)|^2 \, dF$ is equal to the probability of observing the
system in the state with a given value of $F$ becomes almost obvious. Indeed,
$|\psi(x)|^2 \, dx$ is the probability that the coordinate of the particle lies in the
interval $dx$. In view of the equivalence of the $x$-representation and $F$-representation,
$|c(F)|^2 \, dF$ is naturally interpreted as the probability that a
measurement of $F$ will lead to a value which lies in the interval between $F$ and
$F + dF$.


§22. Mean values 


Let us assume that the state of a system is described by a wave function
$\psi(x)$ which is not an eigenfunction of the operator $\hat{F}$ corresponding to the
quantum-mechanical quantity $F$. As we have already explained above, this
means that in the given state the quantity $F$ has no definite value. In measurements
carried out on the system one can obtain, with a certain probability,
any eigenvalue $F_n$. In this connection it is natural to try to find the mean
value of the quantity $F$ in the given state. We understand the mean, as always,
to be the mathematical expectation (the arithmetic mean) of the given quantity.

Let us consider an ensemble, i.e. a large number of completely identical
samples of a system. Each of these systems is described by one and the same
wave function $\psi$. We carry out the measurement of the quantity $F$ in each of
the systems. The mean value obtained from the entirety of these measurements
will be called the mean value of the quantity $F$. According to the general
formulae of the theory of probability (see ch. 1 of Part III) we can write

$$\bar{F} = \sum_n W_n F_n, \tag{22.1}$$

where $W_n$ is the probability of obtaining the eigenvalue $F_n$ in measuring the
quantity $F$. Making use of the expressions for the probabilities $W_n$, which we
have obtained in the preceding section, we have for the case of the discrete
spectrum

$$\bar{F} = \sum_n |c_n|^2 F_n \tag{22.2}$$

or, if the operator $\hat{F}$ possesses a continuous spectrum,

$$\bar{F} = \int |c(F)|^2 F \, dF. \tag{22.3}$$


These formulae can be transformed in such a way that, instead of the
amplitudes of the expansion of the wave function in terms of the eigenfunctions
of the operator $\hat{F}$, they contain directly the wave function $\psi(x)$ (i.e.
they can be transformed into the coordinate representation or any other
representation). For concreteness we assume that the operator $\hat{F}$ possesses a
discrete spectrum (in the case of a continuous spectrum the transformation
formulae are derived in an analogous way). Making use of the expression
(19.2) for the amplitudes, we obtain

$$\bar{F} = \sum_n c_n^* c_n F_n = \sum_n c_n F_n \int \psi^*(x) \psi_n(x) \, dV.$$

Since the eigenfunctions $\psi_n(x)$ satisfy the equation

$$\hat{F}\psi_n(x) = F_n\psi_n(x),$$

the last relation can be rewritten in the form

$$\bar{F} = \sum_n c_n \int \psi^* \hat{F} \psi_n \, dV = \int dV \, \psi^* \hat{F} \sum_n c_n \psi_n.$$

Taking into account (21.1), we obtain finally

$$\bar{F} = \int \psi^* \hat{F} \psi \, dV. \tag{22.4}$$


We note that this expression must be written in a somewhat more general
form, if the wave function $\psi$ is not normalized to unity. In this case

$$\bar{F} = \frac{\int \psi^* \hat{F} \psi \, dV}{\int \psi^* \psi \, dV}. \tag{22.5}$$

If the wave function $\psi$ is an eigenfunction of the operator $\hat{F}$,

$$\hat{F}\psi = F_m\psi,$$

then the quantity $F$ has a definite value equal to the eigenvalue $F_m$. In this
case, as was to be expected, the mean value of the quantity $F$ is the same as
this eigenvalue, $\bar{F} = F_m$.
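The equivalence of (22.2) and (22.4), in the form (22.5) valid for a wave function that is not normalized, can be checked on a small example. The sketch below is an illustration only: a 2×2 Hermitian matrix `F` (an assumption, with eigenvalues 4 and 1 and the eigenvectors written out by hand) replaces the operator, and sums over components replace the integrals.

```python
import math

def inner(u, v):
    """Scalar product sum_i u_i^* v_i."""
    return sum(complex(a).conjugate() * b for a, b in zip(u, v))

def apply(matrix, v):
    """Action of a matrix on a vector."""
    return [sum(m * x for m, x in zip(row, v)) for row in matrix]

# Hypothetical Hermitian matrix with eigenvalues 4 and 1.
F = [[2, 1 - 1j], [1 + 1j, 3]]
eigenvalues = [4.0, 1.0]
eigenvectors = [
    [(1 - 1j) / math.sqrt(6), 2 / math.sqrt(6)],
    [(-1 + 1j) / math.sqrt(3), 1 / math.sqrt(3)],
]

# A state that is deliberately not normalized to unity.
psi = [1, 1j]
norm = inner(psi, psi).real

# Mean value from the expansion amplitudes, cf. (22.2), divided by the norm.
c = [inner(psi_n, psi) for psi_n in eigenvectors]
mean_from_spectrum = sum(abs(cn) ** 2 * Fn
                         for cn, Fn in zip(c, eigenvalues)) / norm

# Mean value directly from the operator, cf. (22.5).
mean_from_operator = inner(psi, apply(F, psi)).real / norm

print(mean_from_spectrum, mean_from_operator)
```

Both routes give the same number, 3.5, which lies between the two eigenvalues as a weighted mean must.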

The relation (22.4) can be the starting point in choosing the operator 
which corresponds to a given physical quantity. It then follows immediately 
that in the coordinate representation the coordinate operator amounts to 
multiplication by this coordinate. 

Indeed, proceeding from the physical meaning of the wave function, we
can write the expression for the mean value of the coordinate in the form

$$\bar{x} = \int |\psi|^2 x \, dV = \int \psi^* x \psi \, dV. \tag{22.6}$$

Comparing this expression with (22.4), we see that the coordinate operator $\hat{x}$
is multiplication by the coordinate $x$. In an analogous way, if we have an
arbitrary function of coordinates $U(x, y, z)$, then its mean value is given by
the expression

$$\overline{U(x, y, z)} = \int |\psi|^2 U(x, y, z) \, dV = \int \psi^* U \psi \, dV. \tag{22.7}$$


It follows from this that the operator of an arbitrary function of coordinates,
taken in the coordinate representation, is multiplication by the function.
This, of course, corresponds to the statement we made earlier. In general the
operator corresponding to a physical quantity $F$ in its own $F$-representation is
multiplication by the quantity $F$. This general statement is easily explained in
the same way as we have done in the example of the coordinate. The mean
value of the quantity $F$, obtained by means of the function $c(F)$, i.e. by
means of the wave function in the $F$-representation, is given by the formula

$$\bar{F} = \int |c(F)|^2 F \, dF = \int c^*(F) F c(F) \, dF.$$

On the other hand, the general expression for the mean in terms of the operator
$\hat{F}$, taken in the $F$-representation, must have the form

$$\bar{F} = \int c^*(F) \hat{F} c(F) \, dF.$$

Comparing these expressions, we see that the operator $\hat{F}$ in its own representation
is just multiplication by $F$.


§23. Commutation of operators 


One of the most important problems arising in quantum mechanics is that 
of the possibility of the simultaneous measurement of values of physical 
quantities of a given quantum-mechanical system. 

In order that two quantities $F$ and $R$ may have sharp values in a state
described by a wave function $\psi_n(x)$, it is obvious that this wave function
must be an eigenfunction of both the operators $\hat{F}$ and $\hat{R}$, i.e. that the following
two equations must simultaneously be satisfied:

$$\hat{F}\psi_n(x) = F_n\psi_n(x), \qquad \hat{R}\psi_n(x) = R_n\psi_n(x). \tag{23.1}$$

We operate on the first equation with the operator $\hat{R}$, and on the second with
the operator $\hat{F}$:

$$\hat{R}\hat{F}\psi_n = \hat{R}F_n\psi_n = F_nR_n\psi_n,$$

$$\hat{F}\hat{R}\psi_n = \hat{F}R_n\psi_n = R_nF_n\psi_n.$$

The right-hand sides of these equations are equal and, consequently, the left-hand
sides are also equal, i.e.

$$\hat{R}\hat{F}\psi_n = \hat{F}\hat{R}\psi_n$$

or

$$(\hat{R}\hat{F} - \hat{F}\hat{R})\psi_n = 0. \tag{23.2}$$


If the common eigenfunctions $\psi_n$ form a complete system of functions,
then an arbitrary wave function $\psi$ can be expanded in this system of functions.
Operating on the function with the commutator $\hat{R}\hat{F} - \hat{F}\hat{R}$, we obtain,
obviously,

$$(\hat{R}\hat{F} - \hat{F}\hat{R})\psi = \sum_n c_n(\hat{R}\hat{F} - \hat{F}\hat{R})\psi_n = 0. \tag{23.3}$$

The last equation can be written symbolically in the form

$$\hat{R}\hat{F} - \hat{F}\hat{R} = 0. \tag{23.4}$$

Thus we have proved that if two quantum-mechanical quantities can have
sharp values simultaneously, then the corresponding operators must commute.
Of course, if these quantities simultaneously have sharp values only in certain
particular states (so that the common eigenfunctions $\psi_n$ do not form a
complete system of functions), then the corresponding operators do not
commute (see, for example, §30).

The converse of the theorem can also be proved: if two operators $\hat{F}$ and $\hat{R}$
commute, they will have common eigenfunctions. To prove this we operate
on the equation for the eigenfunctions of the operator $\hat{F}$ with the operator $\hat{R}$.
We make use of the fact that the operators $\hat{F}$ and $\hat{R}$ commute with each
other. Then we obtain

$$\hat{F}(\hat{R}\psi) = F(\hat{R}\psi).$$

We see that the function $\psi' = \hat{R}\psi$ is also an eigenfunction of the operator
$\hat{F}$ corresponding to the eigenvalue $F$. If there is no degeneracy, then the
function $\psi'$ describes the same state as the function $\psi$, and consequently,
can differ from $\psi$ only by the constant factor $R$, i.e.

$$\hat{R}\psi = R\psi.$$

Thus we have proved that the function $\psi$ will simultaneously be an eigenfunction
of the operators $\hat{F}$ and $\hat{R}$. The proof is easily generalized to the case
where there is degeneracy. However, in this case not every eigenfunction of
the operator $\hat{F}$ or $\hat{R}$ will simultaneously be an eigenfunction of both operators.
Nevertheless, for commuting operators one can always construct a
complete system of common eigenfunctions.

Summing up what has been said above, we can say that if to two quantum- 
mechanical quantities there correspond commuting operators, then these 
quantities will simultaneously have sharp values; but if the operators do not 
commute, then these quantities, generally speaking, cannot simultaneously 
have sharp values, with the exception of certain particular cases (see §30). 
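The connection between commutation and common eigenfunctions can be illustrated with 2×2 matrices. The sketch below is only a finite-dimensional stand-in: the matrices `A`, `B`, `SX`, `SZ` and the helper functions are illustrative assumptions, not operators discussed in the text.

```python
def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def commutator(a, b):
    """The matrix a b - b a."""
    ab, ba = matmul(a, b), matmul(b, a)
    n = len(a)
    return [[ab[i][j] - ba[i][j] for j in range(n)] for i in range(n)]

def apply(matrix, v):
    """Action of a matrix on a vector."""
    return [sum(m * x for m, x in zip(row, v)) for row in matrix]

def is_eigenvector(matrix, v, tol=1e-12):
    """True if matrix v is proportional to the (non-zero) vector v."""
    w = apply(matrix, v)
    i = max(range(len(v)), key=lambda k: abs(v[k]))   # largest component
    lam = w[i] / v[i]
    return all(abs(wj - lam * vj) < tol for wj, vj in zip(w, v))

# A commuting pair: both matrices have the eigenvectors (1, 1) and (1, -1).
A = [[2, 1], [1, 2]]
B = [[0, 1], [1, 0]]
common = [[1, 1], [1, -1]]
pair_commutes = commutator(A, B) == [[0, 0], [0, 0]]
shared = all(is_eigenvector(A, v) and is_eigenvector(B, v) for v in common)

# A non-commuting pair: an eigenvector of SZ is not an eigenvector of SX.
SX = [[0, 1], [1, 0]]
SZ = [[1, 0], [0, -1]]
other_pair_commutes = commutator(SX, SZ) == [[0, 0], [0, 0]]
sz_state_sharp_for_sx = is_eigenvector(SX, [1, 0])

print(pair_commutes, shared, other_pair_commutes, sz_state_sharp_for_sx)
```

In the commuting case both quantities have sharp values in each of the two common eigenstates; in the non-commuting case a state with a sharp value of one quantity gives no sharp value of the other.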

Let us illustrate this by a concrete example. The coordinate operator $\hat{x}$ and
the operator corresponding to the momentum component, $\hat{p}_x$, can be chosen
as an example of non-commuting operators. The corresponding quantities $x$
and $p_x$, as we know, do not simultaneously (i.e. in one and the same state)
have sharp values. On the contrary, the coordinate operator and a momentum
component operator corresponding to an orthogonal coordinate, for example
$\hat{x}$ and $\hat{p}_y$, commute with each other. The corresponding quantities $x$ and $p_y$
are simultaneously measurable. The coordinate in a given direction and
momentum components in a direction orthogonal to the given direction can
simultaneously have sharp values.

We can now define more precisely the concept ‘state of a system’ in 
quantum mechanics. A state of a system is given if the wave function describ- 
ing this system is given. However, in no circumstances can we measure direct- 
ly the wave function itself. Only the square of its modulus, interpreted as a 
probability, has a physical meaning. A way out of this apparent contradiction
is as follows. When we say that a state of a system is given, then this means
that the value of a definite set of quantum-mechanical quantities is given.
This set of quantities, the definition of which completely determines the state
of the system, is called a complete set of quantum-mechanical quantities. In
classical physics, in order to define the state of a system at an arbitrary
instant of time, we have to give the values of all the generalized momenta and
generalized coordinates at that instant of time. If the classical system has $n$
degrees of freedom, we have to give the values of $2n$ variables. For a microsystem,
i.e. a system described by quantum mechanics, it is obvious that the
complete set cannot include both the momenta and coordinates of the parti- 
cles, because these quantities have not simultaneously sharp values. To define 
the state of a system in quantum mechanics it is sufficient to give only the 
coordinates of the particle or only its momenta or, in general, any set of 
independent quantities which are simultaneously measurable and whose 
number is equal to the number of degrees of freedom of the system. Then the 
wave function describing the given state of the system will be an eigenfunc- 
tion of the operators of the quantities which enter into the complete set 
corresponding to the given eigenvalues. 

For example, if the system possesses three degrees of freedom, then the 
momentum components $p_x$, $p_y$, $p_z$ can be chosen as the quantities which
form the complete set. The corresponding wave function has the form (2.12). 

States characterized by a complete set of quantities defined at a given 
instant of time are said to be completely specified or ‘pure’ states. These
states are unambiguously described by a corresponding wave function. At a
given instant of time this wave function is chosen as the eigenfunction of the
operators of all the quantities which enter into the complete set. We note
that we shall also obtain a ‘pure’ state in the case where the wave function
corresponding to this state is represented by a certain superposition of eigenfunctions,
for example by a superposition of plane waves (3.3). We obtain
information on the development of the system in time by solving the
Schrödinger equation (6.8) with given initial conditions, thus defining the
wave function at subsequent instants of time.

It should be noted that as well as ‘pure’ states one sometimes has to deal
with so-called ‘mixed’ states (see §89). In these states the wave function of
the system is not an eigenfunction. One can speak only of the probability $P_n$
of the realization of one or other ‘pure’ state $\varphi_n$.

If we are interested in the probability of measuring the value $F_m$ of the
quantity $F$, then in the ‘pure’ state $\psi(x)$ this probability is determined by the
square of the modulus $|c_m|^2$ of the corresponding amplitude in the expansion
of the function $\psi$ in terms of the eigenfunctions $\psi_n(x)$ of the operator $\hat{F}$:

$$\psi(x) = \sum_n c_n\psi_n(x),$$

$$W(F_m) = |c_m|^2 = \left| \int \psi_m^* \psi \, dV \right|^2. \tag{23.5}$$


If the system is in a ‘mixed’ state, then in order to obtain the required probability
we have to expand the wave function $\varphi_n(x)$ in terms of the functions
$\psi_k(x)$:

$$\varphi_n(x) = \sum_k c_{kn}\psi_k(x),$$

where

$$c_{kn} = \int \varphi_n(x) \psi_k^*(x) \, dV. \tag{23.6}$$

The probability of observing the value $F_m$ of the quantity $F$ in the state
$\varphi_n$ is given by the square of the modulus of the expansion amplitude $c_{mn}$. In
its turn the state $\varphi_n(x)$ is realized with a probability $P_n$. Thus, finally, according
to the theorem of the multiplication of probabilities we obtain

$$W(F_m) = \sum_n P_n |c_{mn}|^2. \tag{23.7}$$


In order to compare the results obtained, we write the wave function $\psi$ in
the form of a superposition of the functions $\varphi_n(x)$:

$$\psi(x) = \sum_n b_n\varphi_n(x).$$

Then, as is easily seen from (23.5) and (23.6),

$$c_m = \sum_n b_n c_{mn}. \tag{23.8}$$







Substituting this value into the expression (23.5), we get

$$W(F_m) = \Big| \sum_n b_n c_{mn} \Big|^2 = \sum_n |b_n|^2 |c_{mn}|^2 + \sum_{n \neq k} b_n b_k^* c_{mn} c_{mk}^*, \tag{23.9}$$


where

$$\sum_n |b_n|^2 = \sum_n P_n = 1.$$


We see that the expression obtained differs from the result given by
formula (23.7) by the presence of the double sum which expresses interference
between the states. In the case of ‘mixed’ states there is no such interference.

The above reasoning can be generalized directly to the case of a continuous
spectrum.
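The difference between (23.7) and (23.9) is easy to see on a two-state example. The sketch below makes several assumptions for the sake of illustration: two-component vectors replace the wave functions, the states $\varphi_n$ are the two basis vectors, the eigenfunctions $\psi_k$ of the measured quantity are their symmetric and antisymmetric combinations (labelled "+" and "-"), and the weights are $P_n = |b_n|^2 = 1/2$.

```python
import math

def inner(u, v):
    """Scalar product sum_i u_i^* v_i."""
    return sum(complex(a).conjugate() * b for a, b in zip(u, v))

s = 1 / math.sqrt(2)

# Eigenfunctions psi_k of the measured quantity (hypothetical labels "+", "-").
psi_k = {"+": [s, s], "-": [s, -s]}

# The states phi_n entering the superposition or the mixture.
phi_n = [[1, 0], [0, 1]]

# Pure state: psi = sum_n b_n phi_n.  Mixture: phi_n with weight P_n = |b_n|^2.
b = [s, s]
P = [abs(bn) ** 2 for bn in b]

# Amplitudes c_mn of (23.6): overlap of phi_n with psi_m.
c = {m: [inner(psi_m, phi) for phi in phi_n] for m, psi_m in psi_k.items()}

# Pure state, (23.8) and (23.9): amplitudes are added first, then squared.
W_pure = {m: abs(sum(bn * cmn for bn, cmn in zip(b, c[m]))) ** 2 for m in c}

# Mixed state, (23.7): probabilities are added, no cross terms appear.
W_mixed = {m: sum(Pn * abs(cmn) ** 2 for Pn, cmn in zip(P, c[m])) for m in c}

# The difference is the interference (double-sum) term of (23.9).
interference = {m: W_pure[m] - W_mixed[m] for m in c}

print(W_pure, W_mixed, interference)
```

In the pure state the value labelled "+" is found with certainty, whereas in the mixture both values are equally probable; the interference term is $+1/2$ for one value and $-1/2$ for the other.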


§24. Heisenberg inequalities 


We have in the preceding section found the conditions under which the 
simultaneous measurement of two physical quantities is possible. We now 
assume that two physical quantities F and R do not simultaneously have sharp 
values. Then the operators $\hat{F}$ and $\hat{R}$ corresponding to these quantities do not
commute with each other. We assume that the following relation holds:

$$\hat{F}\hat{R} - \hat{R}\hat{F} = i\hat{B}, \tag{24.1}$$

where $\hat{B}$ is a certain Hermitian operator.

It is of interest to determine in a general form the minimum possible value
of the product of fluctuations of given quantities. We choose the mean
square deviations (dispersions) $\overline{\Delta F^2}$ and $\overline{\Delta R^2}$, where

$$\Delta F = F - \bar{F}, \qquad \Delta R = R - \bar{R},$$

as the measure characterizing the deviations of the individual results of measuring
the quantities $F$ and $R$ from their mean values. Correspondingly, for the
mean square deviations we have

$$\overline{\Delta F^2} = \overline{(F - \bar{F})^2} = \overline{F^2} - \bar{F}^2, \qquad \overline{\Delta R^2} = \overline{(R - \bar{R})^2} = \overline{R^2} - \bar{R}^2. \tag{24.2}$$

Without affecting the generality of the argument we can set $\bar{F} = 0$ and
$\bar{R} = 0$ (in other words, we can understand $F$ and $R$ to be the deviation of
these quantities from their mean value).
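Let us consider the integral

$$J(\alpha) = \int |(\alpha\hat{F} - i\hat{R})\psi|^2 \, dV. \tag{24.3}$$

Here $\psi$ is the wave function, the integration is carried out over the entire
region of variation of the independent variables, and $\alpha$ is an arbitrary real
parameter. The integral (24.3) is not negative: $J(\alpha) \geq 0$. We rewrite it in the
form

$$J(\alpha) = \int (\alpha\hat{F} - i\hat{R})\psi \cdot (\alpha\hat{F}^* + i\hat{R}^*)\psi^* \, dV.$$

Making use of the self-conjugate property of the operators $\hat{F}$ and $\hat{R}$, we
obtain

$$J(\alpha) = \int \psi^* (\alpha\hat{F} + i\hat{R})(\alpha\hat{F} - i\hat{R})\psi \, dV = \int \psi^* \left( \alpha^2\hat{F}^2 - i\alpha(\hat{F}\hat{R} - \hat{R}\hat{F}) + \hat{R}^2 \right) \psi \, dV.$$

Taking into account (24.1) and using expression (22.4) for the mean value,
we have

$$J(\alpha) = \alpha^2\overline{F^2} + \alpha\bar{B} + \overline{R^2} = \alpha^2\overline{\Delta F^2} + \alpha\bar{B} + \overline{\Delta R^2}.$$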


The condition for this trinomial quadratic in œ to be negative can be written 
in the form 


2 


4 AF2 AR? > B2 - (24.4) 


or 
(AF2AR2): > }|B| . (24.5) 


Formula (24.5) gives the relation which we have sought between the uncer- 
tainties AF and AR. It establishes the minimum possible value of the product 
of these errors. 

Let us consider a particular case, taking as the quantities F and R
respectively $p_x$ and $x$. Then it follows from (20.3) and (20.4) that
$\hat B = -\hbar$, and we have

$$\left(\overline{\Delta p_x^2}\right)^{1/2}\left(\overline{\Delta x^2}\right)^{1/2} \ge \tfrac12\hbar . \tag{24.6}$$

Thus the uncertainty relation (24.5) is of a general character. The uncertainty
relation for the coordinate and momentum is a particular case of the relation
(24.5).

Conjugate quantum-mechanical quantities cannot be measured simultaneously.
The minimum uncertainties in their values in a simultaneous measurement are
connected with the quantity $\bar B$. On the contrary, mutually commuting
quantum-mechanical quantities, for which $\hat B = 0$, can be measured
simultaneously with an arbitrary degree of accuracy.
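
The inequality (24.5) can be tried out numerically on a small example. The sketch below is only an illustration under stated assumptions: the operators are taken to be the 2×2 Pauli matrices (a choice made here, not in the text), states are random normalized two-component vectors, and all helper names are hypothetical.

```python
import random

def matmul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def mean(op, psi):
    """Mean value of the operator op in the normalized state psi."""
    return sum(psi[i].conjugate() * op[i][j] * psi[j]
               for i in range(2) for j in range(2))

def uncertainty_sides(f, r, psi):
    """Return the left and right sides of (24.5) for Hermitian f and r."""
    fr, rf = matmul(f, r), matmul(r, f)
    # From (24.1): B = (FR - RF) / i.
    b = [[(fr[i][j] - rf[i][j]) / 1j for j in range(2)] for i in range(2)]
    var_f = (mean(matmul(f, f), psi) - mean(f, psi) ** 2).real
    var_r = (mean(matmul(r, r), psi) - mean(r, psi) ** 2).real
    # max(..., 0.0) guards against tiny negative rounding errors.
    left = (max(var_f, 0.0) * max(var_r, 0.0)) ** 0.5
    return left, 0.5 * abs(mean(b, psi))

def random_state(rng):
    """Random normalized two-component complex vector."""
    c = [complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(2)]
    norm = sum(abs(x) ** 2 for x in c) ** 0.5
    return [x / norm for x in c]

# Illustrative operators (an assumption of this sketch): Pauli matrices.
SX = [[0, 1], [1, 0]]
SY = [[0, -1j], [1j, 0]]

rng = random.Random(0)
for _ in range(5):
    left, right = uncertainty_sides(SX, SY, random_state(rng))
    print(f"{left:.4f} >= {right:.4f}")
```

For every sampled state the first printed number should not be smaller than the second, in line with (24.5).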


§25. Poisson brackets 


In §20 we have considered one of the possible methods of finding the 
operators describing physical quantities. This problem was considered in a 
more consistent way by Dirac. He assumed that in quantum mechanics, as in 
classical mechanics, the concept of Poisson brackets can be introduced*. 
Thus, if to two classical quantities F, R there corresponds the Poisson bracket

$$[F, R] = \sum_i \left( \frac{\partial F}{\partial p_i}\frac{\partial R}{\partial q_i}
- \frac{\partial F}{\partial q_i}\frac{\partial R}{\partial p_i} \right) ,$$

then to the operators $\hat F$, $\hat R$ describing these quantities there corresponds the
quantum Poisson bracket $[\hat F, \hat R]$. Further it was assumed that the properties
of the quantum Poisson brackets correspond exactly to those of the classical
Poisson brackets except that for the quantum brackets the consecutive order
of the two factors is important. We write down the properties of the Poisson
brackets:


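$$[\hat F, \hat R] = -[\hat R, \hat F] , \tag{25.1}$$

$$[\hat F, C] = 0 , \tag{25.2}$$

where C is a constant.

$$[\hat F_1 + \hat F_2, \hat R] = [\hat F_1, \hat R] + [\hat F_2, \hat R] , \tag{25.3}$$

$$[\hat F, \hat R_1 + \hat R_2] = [\hat F, \hat R_1] + [\hat F, \hat R_2] , \tag{25.4}$$

$$[\hat F_1\hat F_2, \hat R] = [\hat F_1, \hat R]\hat F_2 + \hat F_1[\hat F_2, \hat R] , \tag{25.5}$$

$$[\hat F, \hat R_1\hat R_2] = [\hat F, \hat R_1]\hat R_2 + \hat R_1[\hat F, \hat R_2] , \tag{25.6}$$

$$[\hat F_1, [\hat F_2, \hat F_3]] + [\hat F_3, [\hat F_1, \hat F_2]] + [\hat F_2, [\hat F_3, \hat F_1]] = 0 . \tag{25.7}$$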
The choice of Poisson brackets as the basis for the construction of a
system of quantum-mechanical operators is associated with the fact that, as
we shall see, they can be expressed directly in terms of the commutators of
the corresponding operators. The last combination of operators is the basis
for their physical interpretation.

* For a discussion of Poisson brackets in classical mechanics see L.D.Landau and
E.M.Lifshitz, Mechanics (Pergamon Press, Oxford, 1960); H.Goldstein, Classical
mechanics (Addison-Wesley, Cambridge, Mass., 1950).

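Let us consider the Poisson bracket $[\hat F_1\hat F_2, \hat R_1\hat R_2]$, for the calculation of
which we can use expressions (25.5) and (25.6). Correspondingly we obtain

$$[\hat F_1\hat F_2, \hat R_1\hat R_2] = [\hat F_1, \hat R_1\hat R_2]\hat F_2 + \hat F_1[\hat F_2, \hat R_1\hat R_2] =$$
$$= [\hat F_1, \hat R_1]\hat R_2\hat F_2 + \hat R_1[\hat F_1, \hat R_2]\hat F_2 + \hat F_1[\hat F_2, \hat R_1]\hat R_2 + \hat F_1\hat R_1[\hat F_2, \hat R_2]$$

and

$$[\hat F_1\hat F_2, \hat R_1\hat R_2] = [\hat F_1\hat F_2, \hat R_1]\hat R_2 + \hat R_1[\hat F_1\hat F_2, \hat R_2] =$$
$$= \hat F_1[\hat F_2, \hat R_1]\hat R_2 + [\hat F_1, \hat R_1]\hat F_2\hat R_2 + \hat R_1\hat F_1[\hat F_2, \hat R_2] + \hat R_1[\hat F_1, \hat R_2]\hat F_2 .$$

Equating these two results, we find

$$(\hat F_1\hat R_1 - \hat R_1\hat F_1)[\hat F_2, \hat R_2] = [\hat F_1, \hat R_1](\hat F_2\hat R_2 - \hat R_2\hat F_2) .$$

Since the last equality must be satisfied identically, we have

$$[\hat F_1, \hat R_1] = iC(\hat F_1\hat R_1 - \hat R_1\hat F_1) ,$$
$$[\hat F_2, \hat R_2] = iC(\hat F_2\hat R_2 - \hat R_2\hat F_2) ,$$

where C is a real constant.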
The reality of C follows from the fact that the Poisson bracket of two real
variables must also be real. Thus, if $\hat F^\dagger = \hat F$, $\hat R^\dagger = \hat R$, then also
$[\hat F, \hat R]^\dagger = [\hat F, \hat R]$. However,

$$[\hat F, \hat R]^\dagger = -iC^*(\hat F\hat R - \hat R\hat F)^\dagger = -iC^*(\hat R\hat F - \hat F\hat R) = C^*C^{-1}[\hat F, \hat R]$$

and hence $C^* = C$. It follows from the classical theory of Poisson brackets
that the constant C has the dimensionality erg$^{-1}\cdot$sec$^{-1}$. Its numerical value
can be determined only by comparing the inferences of the theory with
experimental data. It turns out to be equal to $\hbar^{-1}$. Finally we have

$$\hat F\hat R - \hat R\hat F = \frac{\hbar}{i}\,[\hat F, \hat R] . \tag{25.8}$$


In the transition to classical mechanics (see ch. 5), i.e. for $\hbar \to 0$, the
commutator $\hat F\hat R - \hat R\hat F$ reduces to zero, as was to be expected. It is natural to
assume, if only in the simplest cases, that the quantum Poisson brackets
themselves have the same values as the classical brackets. For canonically
conjugate variables, the coordinates and momenta, in classical mechanics we have:

$$[p_i, p_k] = 0 , \qquad [x_i, x_k] = 0 , \qquad [p_i, x_k] = \delta_{ik} \quad (i, k = 1, 2, 3) . \tag{25.9}$$

Here and in what follows we make use of the notation

$$x_1 = x , \quad p_1 = p_x ; \qquad x_2 = y , \quad p_2 = p_y ; \qquad x_3 = z , \quad p_3 = p_z .$$


The same expressions can be written for the quantum operators of the
coordinate and of the momentum components. Hence the commutators of the
corresponding quantities assume the form

$$\hat x_i\hat x_k - \hat x_k\hat x_i = 0 ,$$
$$\hat p_i\hat p_k - \hat p_k\hat p_i = 0 , \tag{25.10}$$
$$\hat p_i\hat x_k - \hat x_k\hat p_i = \frac{\hbar}{i}\,\delta_{ik} \quad (i, k = 1, 2, 3) .$$


We shall make use of these equalities for the determination of the coordinate 
operator and the momentum operator. 
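
The bracket properties can be checked directly once the quantum Poisson bracket is expressed through the commutator as in (25.8). The following is a minimal sketch under assumptions made here: operators are represented by random 3×3 complex matrices, units are chosen so that ħ = 1, and the helper names are hypothetical. It tests the product rule (25.5) and the Jacobi identity (25.7).

```python
import random

HBAR = 1.0  # assumption of this sketch: units with hbar = 1

def mul(a, b):
    """Matrix product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def add(a, b):
    """Element-wise sum of two matrices."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def scale(c, a):
    """Multiply every element of the matrix a by the number c."""
    return [[c * x for x in row] for row in a]

def bracket(f, r):
    """Quantum Poisson bracket [F, R] = (i/hbar)(FR - RF), cf. (25.8)."""
    return scale(1j / HBAR, add(mul(f, r), scale(-1, mul(r, f))))

def max_abs(a):
    """Largest absolute value among the matrix elements."""
    return max(abs(x) for row in a for x in row)

def random_matrix(rng, n=3):
    return [[complex(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(n)]
            for _ in range(n)]

rng = random.Random(0)
F1, F2, F3, R = (random_matrix(rng) for _ in range(4))

# (25.5): [F1 F2, R] = [F1, R] F2 + F1 [F2, R]
lhs = bracket(mul(F1, F2), R)
rhs = add(mul(bracket(F1, R), F2), mul(F1, bracket(F2, R)))
product_rule_error = max_abs(add(lhs, scale(-1, rhs)))

# (25.7): the three nested brackets add up to zero
jacobi = add(add(bracket(F1, bracket(F2, F3)), bracket(F3, bracket(F1, F2))),
             bracket(F2, bracket(F3, F1)))
jacobi_error = max_abs(jacobi)

print(product_rule_error, jacobi_error)  # both at rounding-error level
```

Finite matrices cannot reproduce the canonical relation (25.10) itself, so the sketch only illustrates the algebraic properties of the bracket.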


§26. Coordinate and momentum operators and eigenfunctions 


We begin with the establishment of the form of operators in the coordinate
representation*. In this representation the wave function characterizing the
state of the particle depends on its coordinates: $\psi(x, y, z)$. The coordinates
x, y, z are independent variables. Hence the corresponding operators, in
correspondence with the conclusions of §20 and §22, reduce to multiplication
by these coordinates

$$\hat x_i = x_i \quad (i = 1, 2, 3) . \tag{26.1}$$

* See V.A.Fok, Nachala kvantovoi mekhaniki (Principles of quantum mechanics)
(KUBUCH, 1932) p. 32.




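We rewrite the commutation relations (25.10) in the form

$$(\hat p_k x_k - x_k\hat p_k)\psi = \frac{\hbar}{i}\,\psi \quad (k = 1, 2, 3) , \tag{26.2}$$
$$(\hat p_i x_k - x_k\hat p_i)\psi = 0 \quad (i \ne k) , \tag{26.3}$$
$$(\hat p_i\hat p_k - \hat p_k\hat p_i)\psi = 0 \quad (i, k = 1, 2, 3) . \tag{26.4}$$

Equations (26.2) to (26.4) can be satisfied for an arbitrary function $\psi$ by setting

$$\hat p_k\psi = \frac{\hbar}{i}\frac{\partial\psi}{\partial x_k} + \frac{\partial\alpha(x_1, x_2, x_3)}{\partial x_k}\,\psi , \tag{26.5}$$

where $\alpha(x_1, x_2, x_3)$ is an arbitrary real function. The reality of $\alpha$ is required
for the hermiticity of the operator $\hat p_k$. The function $\alpha$ can, without loss of
generality of the result, be set equal to zero. Indeed, the action of the
operator (26.5) on an arbitrary function $\psi$ transforms it into the function

$$\psi' = \frac{\hbar}{i}\frac{\partial\psi}{\partial x_k} + \frac{\partial\alpha}{\partial x_k}\,\psi .$$

On the other hand, if we act on the function $e^{(i/\hbar)\alpha}\psi$ with the operator
$(\hbar/i)(\partial/\partial x_k)$, we then obtain the function $e^{(i/\hbar)\alpha}\psi'$. Consequently, the
transition from the operator (26.5) to the operator $(\hbar/i)(\partial/\partial x_k)$ is equivalent to
the transition from the wave function $\psi$ to the function $e^{(i/\hbar)\alpha}\psi$:

$$\psi \to e^{(i/\hbar)\alpha}\psi , \qquad
\frac{\hbar}{i}\frac{\partial}{\partial x_k} = e^{(i/\hbar)\alpha}\left(\frac{\hbar}{i}\frac{\partial}{\partial x_k} + \frac{\partial\alpha}{\partial x_k}\right)e^{-(i/\hbar)\alpha} , \tag{26.6}$$

since

$$e^{(i/\hbar)\alpha}\left(\frac{\hbar}{i}\frac{\partial}{\partial x_k} + \frac{\partial\alpha}{\partial x_k}\right)e^{-(i/\hbar)\alpha}\,e^{(i/\hbar)\alpha}\psi
= e^{(i/\hbar)\alpha}\psi' = \frac{\hbar}{i}\frac{\partial}{\partial x_k}\left(e^{(i/\hbar)\alpha}\psi\right) .$$

Wave functions and operators connected with each other by the
transformations (26.6) have identical physical properties. This will be proved in its
most general form in §46.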

The operators (26.5) and $\hat p_k = (\hbar/i)(\partial/\partial x_k)$ (k = x, y, z) have the same
spectrum of eigenvalues. Hence without loss of generality we can use, instead
of the operators (26.5), the operators for the momentum components which
have, in the coordinate representation, the form (see §20)

$$\hat p_x = \frac{\hbar}{i}\frac{\partial}{\partial x} , \qquad
\hat p_y = \frac{\hbar}{i}\frac{\partial}{\partial y} , \qquad
\hat p_z = \frac{\hbar}{i}\frac{\partial}{\partial z} , \tag{26.7}$$

or in vector form

$$\hat{\mathbf p} = \frac{\hbar}{i}\nabla , \tag{26.8}$$

where $\nabla$ is the gradient operator.

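Let us now use the momentum representation, in which the wave function
depends on three momentum components: $p_x, p_y, p_z$. The corresponding
operators reduce to multiplication by the quantities $p_x, p_y, p_z$. The
coordinate operators in this representation are found on the basis of the same
commutation relations, and turn out to be equal to

$$\hat x = i\hbar\frac{\partial}{\partial p_x} , \qquad
\hat y = i\hbar\frac{\partial}{\partial p_y} , \qquad
\hat z = i\hbar\frac{\partial}{\partial p_z} \tag{26.9}$$

or

$$\hat{\mathbf r} = i\hbar\left(\mathbf i\frac{\partial}{\partial p_x} + \mathbf j\frac{\partial}{\partial p_y} + \mathbf k\frac{\partial}{\partial p_z}\right) = i\hbar\frac{\partial}{\partial\mathbf p} .$$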


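Making use of (26.7) it is easy to establish the commutation relations of
the operator $\hat{\mathbf p}$ and an arbitrary function U(x, y, z)

$$\hat{\mathbf p}U - U\hat{\mathbf p} = \frac{\hbar}{i}\nabla U . \tag{26.10}$$

The commutation of the operator $\hat{\mathbf r}$ with an arbitrary function $f(p_x, p_y, p_z)$ is
calculated in an analogous way

$$\hat{\mathbf r}f - f\hat{\mathbf r} = i\hbar\frac{\partial f}{\partial\mathbf p} . \tag{26.11}$$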
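
As a check on (26.10) and (26.11), both commutators can be applied to an arbitrary test function; the functions $\psi(\mathbf r)$ and $\varphi(\mathbf p)$ below are introduced only for this worked step and are not part of the original text.

```latex
\begin{aligned}
(\hat{\mathbf p}U - U\hat{\mathbf p})\psi
  &= \frac{\hbar}{i}\left[\nabla(U\psi) - U\nabla\psi\right]
   = \frac{\hbar}{i}\,(\nabla U)\,\psi , \\
(\hat{\mathbf r}f - f\hat{\mathbf r})\varphi
  &= i\hbar\left[\frac{\partial(f\varphi)}{\partial\mathbf p}
      - f\,\frac{\partial\varphi}{\partial\mathbf p}\right]
   = i\hbar\,\frac{\partial f}{\partial\mathbf p}\,\varphi .
\end{aligned}
```

In both lines the term in which the derivative acts on the test function cancels, which is why only the derivative of U (or of f) survives.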


The equations for the eigenfunctions and eigenvalues of the operators
$\hat p_x, \hat p_y, \hat p_z$ are of the form

$$\frac{\hbar}{i}\frac{\partial\psi_{p_x}}{\partial x} = p_x\psi_{p_x} , \qquad
\frac{\hbar}{i}\frac{\partial\psi_{p_y}}{\partial y} = p_y\psi_{p_y} , \qquad
\frac{\hbar}{i}\frac{\partial\psi_{p_z}}{\partial z} = p_z\psi_{p_z} . \tag{26.12}$$

We write down the solution of the first equation

$$\psi_{p_x} = a(y, z)\,e^{(i/\hbar)p_x x} ,$$

where a(y, z) is an arbitrary function. Analogous solutions are also obtained
for the functions $\psi_{p_y}$ and $\psi_{p_z}$. The functions $\psi_{p_x}, \psi_{p_y}, \psi_{p_z}$ satisfy the
necessary requirements, in particular the condition of finiteness (see §16) for real
values of $p_x, p_y, p_z$. Thus the momentum operator has a continuous spectrum
of eigenvalues. The wave function

$$\psi_{\mathbf p} = A\,e^{(i/\hbar)(\mathbf p\cdot\mathbf r)} , \tag{26.13}$$

where A is a constant, is an eigenfunction of the operators $\hat p_x, \hat p_y, \hat p_z$ and
describes the state with given momentum $\mathbf p$. A freely moving particle can be
in such a state. This conclusion is in complete agreement with the result of
§2.
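
The eigenvalue property of the plane wave can be confirmed numerically. The sketch below is a one-dimensional illustration under assumptions made here: units with ħ = 1, an arbitrarily chosen momentum value, and a central finite difference standing in for the derivative in (26.7). The function names are hypothetical.

```python
import cmath

HBAR = 1.0  # assumption of this sketch: units with hbar = 1

def plane_wave(p, x, amplitude=1.0):
    """One-dimensional analogue of (26.13): A * exp(i p x / hbar)."""
    return amplitude * cmath.exp(1j * p * x / HBAR)

def momentum_apply(psi, x, h=1e-5):
    """Apply (hbar/i) d/dx to psi at the point x by a central difference."""
    return (HBAR / 1j) * (psi(x + h) - psi(x - h)) / (2 * h)

p = 2.5  # illustrative momentum value
psi = lambda x: plane_wave(p, x)

for x in (-1.0, 0.0, 0.7):
    ratio = momentum_apply(psi, x) / psi(x)
    print(x, ratio)  # the ratio stays close to p at every point
```

The ratio of the transformed function to the original one does not depend on x, which is exactly the statement that the plane wave is an eigenfunction with eigenvalue p.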

The constant A is defined by the normalization condition. Since the
momentum operator has a continuous spectrum, its eigenfunctions are
conveniently normalized to the $\delta$-function. Let us first find the normalization
coefficient A in the case of one-dimensional motion.

Setting $\int\psi^*_{p_x'}\psi_{p_x}\,dx = \delta(p_x - p_x')$ and taking into account (III.5), we obtain
$A = (2\pi\hbar)^{-1/2}$, so that finally

$$\psi_{p_x} = (2\pi\hbar)^{-1/2}\,e^{(i/\hbar)p_x x} . \tag{26.14}$$

In the three-dimensional case, for the wave function (26.13) we have
correspondingly

$$\psi_{\mathbf p} = (2\pi\hbar)^{-3/2}\,e^{(i/\hbar)(\mathbf p\cdot\mathbf r)} . \tag{26.15}$$


Another method of normalizing plane waves, called normalization in a
‘box’, sometimes turns out to be more convenient. We define the wave
function in an arbitrarily large but finite volume V. As the normalization volume
we choose a cube with edge L and centre at the origin. We require that at the
walls of the cube the wave functions (26.12) satisfy the condition of
periodicity, i.e. that at corresponding points of opposite faces the wave functions
take on the same values. Under these conditions the momentum vector no
longer changes continuously, but runs over a discrete set of values

$$p_x = \frac{2\pi\hbar}{L}\,n_x , \qquad p_y = \frac{2\pi\hbar}{L}\,n_y , \qquad p_z = \frac{2\pi\hbar}{L}\,n_z , \tag{26.16}$$

where $n_x, n_y, n_z$ are positive or negative integers, including zero. Choosing the
edge of the cube, L, to be sufficiently large, the spacing between neighbouring
eigenvalues of the momentum vector can be made as small as one pleases.
The normalization coefficient defined by the condition

$$|A|^2\int\left|e^{(i/\hbar)(\mathbf p\cdot\mathbf r)}\right|^2 dV = 1$$

is equal to $A = L^{-3/2}$. Correspondingly the wave function for such a
normalization has the form

$$\psi_{\mathbf p} = L^{-3/2}\,e^{(i/\hbar)(\mathbf p\cdot\mathbf r)} = V^{-1/2}\,e^{(i/\hbar)(\mathbf p\cdot\mathbf r)} . \tag{26.17}$$
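
Normalization in a ‘box’ is easy to illustrate in one dimension. The following sketch makes several assumptions of its own: units with ħ = 1, an arbitrary edge length, a one-dimensional box (so the normalization factor is $L^{-1/2}$ rather than $L^{-3/2}$), and a simple rectangle-rule integration. It checks that waves with the allowed momenta (26.16) are periodic and orthonormal.

```python
import cmath
import math

HBAR = 1.0  # assumption of this sketch: units with hbar = 1
L = 3.0     # illustrative edge length of the box

def allowed_momentum(n):
    """Discrete momentum value p = 2*pi*hbar*n/L, cf. (26.16)."""
    return 2 * math.pi * HBAR * n / L

def box_wave(n, x):
    """One-dimensional analogue of (26.17): L**(-1/2) * exp(i p_n x / hbar)."""
    return L ** -0.5 * cmath.exp(1j * allowed_momentum(n) * x / HBAR)

def overlap(n1, n2, samples=2000):
    """Integral of conj(psi_n1) * psi_n2 over the box (rectangle rule)."""
    dx = L / samples
    total = 0j
    for k in range(samples):
        x = -L / 2 + k * dx
        total += box_wave(n1, x).conjugate() * box_wave(n2, x)
    return total * dx

print(abs(overlap(2, 2)))   # close to 1: normalized
print(abs(overlap(1, -3)))  # close to 0: orthogonal
```

Increasing L in this sketch shrinks the spacing 2πħ/L between neighbouring momenta, which mirrors the remark that the spectrum can be made as dense as desired.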
In §§12 and 13 we normalized wave functions of the form (26.13) by specifying
the probability current density $\mathbf j_{\mathbf p}$. As a matter of fact, in this state, according
to (7.6),

$$\mathbf j_{\mathbf p} = |A|^2\,\frac{\mathbf p}{m} . \tag{26.18}$$

Setting, for example, A = 1, we obtain

$$\mathbf j = \frac{\mathbf p}{m} = \mathbf v , \tag{26.19}$$

i.e. for such a normalization the probability current density is numerically
equal to the velocity of the particle. But if $A = v^{-1/2}$, then this corresponds to
a normalization to unit probability current density.

It is easily seen that the operators $\hat p_x, \hat p_y, \hat p_z$ are related in a simple way to
the operators of an infinitesimal translation along the x-, y- and z-axes
respectively. Indeed, let us shift our system or, what is equivalent, the origin, a
distance $\Delta x$ along the x-axis. Then the old and new coordinates are connected
by the relation

$$x' = x - \Delta x , \qquad y' = y , \qquad z' = z .$$

We express the function $\psi(x, y, z)$ in terms of the new coordinates x', y', z'.
Confining ourselves to the first-order term of a series expansion we obtain

$$\psi(x, y, z) = \psi(x' + \Delta x, y', z') = \psi(x', y', z') + \frac{\partial\psi}{\partial x'}\,\Delta x
= \left(1 + \Delta x\frac{\partial}{\partial x'}\right)\psi(x', y', z') .$$

It is natural to call the operator $1 + \Delta x(\partial/\partial x')$ the operator of displacement
by a distance $\Delta x$ along the x-axis. We denote this operator by $\hat R_{x'}$, so that

$$\psi(x, y, z) = \hat R_{x'}\,\psi(x', y', z') . \tag{26.20}$$

We see that the displacement operator $\hat R_x$ is connected with the operator of
the corresponding momentum component $\hat p_x$

$$\hat R_x = 1 + \frac{i}{\hbar}\,\Delta x\,\hat p_x . \tag{26.21}$$

The form of the momentum operator $\hat p_x$ could also be obtained proceeding
from the expression for the operator $\hat R_x$.*
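
The first-order character of the displacement operator in (26.20) can be seen in a short numerical sketch. The assumptions here are a one-dimensional Gaussian as the test function, a small shift, and a central difference for the derivative; none of these choices comes from the text.

```python
import math

def displaced(psi, x, dx, h=1e-5):
    """Apply (1 + dx * d/dx) to psi at x; derivative by central difference."""
    derivative = (psi(x + h) - psi(x - h)) / (2 * h)
    return psi(x) + dx * derivative

psi = lambda x: math.exp(-x * x)  # illustrative smooth test function

x_new, dx = 0.3, 1e-3
exact = psi(x_new + dx)                  # psi(x' + dx), evaluated directly
first_order = displaced(psi, x_new, dx)  # (1 + dx d/dx') psi(x')

print(exact, first_order, abs(exact - first_order))
```

The residual difference is of second order in the shift, which is why the operator is called the operator of an infinitesimal translation.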

We write down the equation for the eigenfunctions and eigenvalues of the
coordinate operator in the coordinate representation

$$\hat x\,\psi_{x_0}(x) = x_0\,\psi_{x_0}(x) . \tag{26.22}$$

Here $x_0$ is a particular value of the coordinate x. The operator $\hat x$ in its own
representation reduces to multiplication by x. From eq. (26.22) it follows
that

$$\psi_{x_0}(x) = 0 \quad \text{for } x \ne x_0 .$$

Furthermore, the functions $\psi_{x_0}(x)$ must satisfy the orthogonality and
normalization conditions

$$\int\psi^*_{x_0'}(x)\,\psi_{x_0}(x)\,dx = \delta(x_0' - x_0) .$$

From these relations it follows that the function $\psi_{x_0}(x)$ is of the form (see
Appendix III)

$$\psi_{x_0}(x) = \delta(x - x_0) . \tag{26.23}$$

The eigenfunctions of the operators $\hat y$ and $\hat z$ are written in an analogous way.
Since the coordinate operators $\hat x, \hat y, \hat z$ commute with each other, their values
are simultaneously measurable. Correspondingly, if the system has three
degrees of freedom, the three coordinate projections x, y, z can be chosen as
quantities forming a complete set. The wave function describing the state
with three defined coordinates $x_0, y_0, z_0$ is of the form

$$\psi_{\mathbf r_0}(\mathbf r) = \delta(x - x_0)\,\delta(y - y_0)\,\delta(z - z_0) = \delta(\mathbf r - \mathbf r_0) . \tag{26.24}$$

The eigenfunction of the momentum operator in the momentum
representation is written in an analogous way.


* See L.D.Landau and E.M.Lifshitz, Quantum mechanics (Pergamon Press, Oxford, 
1965). An analogous result also applies to the operators of the y- and z-momentum 
components. 





§27. The Hamiltonian operator 


The most important operator of quantum mechanics is the total energy
operator $\hat H$. As in classical mechanics, it is made up of the kinetic energy
operator and the potential energy operator. We construct, first of all, the
operator of the kinetic energy of a particle. In the non-relativistic
approximation, in which we are now interested, the kinetic energy is connected with
the momentum of the particle by the usual relation

$$T = \frac{p^2}{2m} = \frac{p_x^2 + p_y^2 + p_z^2}{2m} . \tag{27.1}$$

Replacing the momentum $\mathbf p$ of the particle in this relation by the operator $\hat{\mathbf p}$,
we obtain the operator $\hat T$ which we shall call the kinetic energy operator (see
also §20):

$$\hat T = \frac{1}{2m}\left(\hat p_x^2 + \hat p_y^2 + \hat p_z^2\right) = -\frac{\hbar^2}{2m}\nabla^2 . \tag{27.2}$$


It is obvious that the kinetic energy operator commutes with the momentum
operator.
We now consider the total energy operator $\hat H$. Since the potential energy
depends only on the coordinates x, y, z, the corresponding operator in the
coordinate representation is simply the function U(x, y, z). Consequently we
have

$$\hat H = -\frac{\hbar^2}{2m}\nabla^2 + U(x, y, z) . \tag{27.3}$$
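
To see the operator (27.3) at work, it can be applied to a function by finite differences. The sketch below is an illustration under assumptions made here rather than in the text: one spatial dimension, units with ħ = m = ω = 1, the harmonic potential $U = \tfrac12 m\omega^2x^2$, and its well-known ground-state function, for which the energy is ħω/2.

```python
import math

HBAR = M = OMEGA = 1.0  # assumption of this sketch: natural units

def potential(x):
    """Illustrative potential energy: harmonic oscillator."""
    return 0.5 * M * OMEGA ** 2 * x * x

def hamiltonian_apply(psi, x, h=1e-3):
    """Apply -(hbar^2/2m) d^2/dx^2 + U(x) to psi at x (central differences)."""
    second = (psi(x + h) - 2 * psi(x) + psi(x - h)) / (h * h)
    return -HBAR ** 2 / (2 * M) * second + potential(x) * psi(x)

def ground_state(x):
    """Unnormalized oscillator ground state exp(-m*omega*x^2 / (2*hbar))."""
    return math.exp(-M * OMEGA * x * x / (2 * HBAR))

for x in (-1.0, 0.0, 0.5, 1.5):
    # The ratio (H psi)/psi is the same at every point: the energy hbar*omega/2.
    print(x, hamiltonian_apply(ground_state, x) / ground_state(x))
```

A ratio that is independent of x signals an eigenfunction of the Hamiltonian; for a function that is not an eigenfunction the ratio would vary from point to point.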


Since the total energy operator in formula (27.3) is expressed in terms of the
momentum operator (but not the velocity operator), it represents the
quantum-mechanical Hamiltonian operator which is usually just called the
Hamiltonian. The expression for the Hamiltonian can easily be generalized to
the case where the particle moves in non-stationary external fields. Then

$$\hat H = -\frac{\hbar^2}{2m}\nabla^2 + U(\mathbf r, t) , \tag{27.4}$$

where $U(\mathbf r, t)$ is the so-called force function, connected with the force which
acts on the particle by the relation

$$\mathbf F = -\nabla U .$$

The formulae found for the Hamiltonian operator are inapplicable in the case
of the motion of a particle in a field of force which depends on its velocity.
An example of such a case is the motion of a charged particle in a magnetic
field.

To obtain the Hamiltonian operator in this case we make use of general
rules. We write down the Hamiltonian function of classical mechanics for
particles moving in an electromagnetic field. According to (41.4) of Part I, we
have

$$H = \frac{1}{2m}\left(\mathbf p - \frac{e}{c}\mathbf A\right)^2 + e\varphi , \tag{27.5}$$

where the vector $\mathbf p$ is the generalized momentum of the particle, $\mathbf A$ and $\varphi$ are
the vector and scalar potentials, and e is the charge of the particle. According
to the general rule, we replace the Hamiltonian function in formula (27.5) by
the Hamiltonian operator, and the generalized momentum by the momentum
operator. The vector potential and scalar potential, which depend only on the
coordinates and time, can be left unchanged, since in the coordinate
representation the application of the corresponding operators amounts to
multiplication by these functions. We then find

$$\hat H = \frac{1}{2m}\left(\frac{\hbar}{i}\nabla - \frac{e}{c}\mathbf A\right)^2 + e\varphi . \tag{27.6}$$

By means of the Hamiltonian operator found above the basic equation of
quantum mechanics, the Schrödinger equation, can be written in the form

$$i\hbar\frac{\partial\psi}{\partial t} = \hat H\psi . \tag{27.7}$$


The operator form of notation of the Schrödinger equation has a very
general character and is suitable for the description of the motion of a particle
in arbitrary stationary or non-stationary fields. In particular, in such a form it
is valid for the case of the motion of a particle in an electromagnetic field.
Just as for the classical Hamiltonian function, the Hamiltonian can be
transformed to an arbitrary curvilinear system of coordinates. For this one need
only transform the differential Laplacian operator $\nabla^2$ to this system.
Depending on the symmetry of the force field it is convenient to choose a system of
curvilinear coordinates in which the expression for the potential energy of the
particle assumes the simplest form. In particular, as we shall see in §35, it is
often convenient to write the Hamiltonian operator in spherical coordinates.

The Hamiltonian operator of a system of particles can be constructed
according to the same scheme which has already been successfully applied to
the case of one particle. Namely, one has to write the classical expression for
the Hamiltonian function and then replace all quantities involved in it by
their quantum-mechanical operators.
The classical expression for the Hamiltonian of a system of N particles has
the form

$$H = \sum_{k=1}^{N}\frac{p_k^2}{2m_k} + \sum_{k=1}^{N}U_k(\mathbf r_k) + U_{\rm int} , \tag{27.8}$$

where $\mathbf p_k$, $m_k$ and $U_k(\mathbf r_k)$ are respectively the momentum, mass and potential
energy of the kth particle in the external field; $U_{\rm int}$ is the potential energy of
interaction of the particles.

We obtain the Hamiltonian operator if we replace the momenta of the
particles by the corresponding operators $\hat{\mathbf p}_k = (\hbar/i)\nabla_k$, where the index k denotes
differentiation with respect to the coordinates of the kth particle. After this
replacement we obtain the Schrödinger equation. It has the form

$$i\hbar\frac{\partial\psi}{\partial t} = \hat H\psi , \qquad
\hat H = -\sum_{k=1}^{N}\frac{\hbar^2}{2m_k}\nabla_k^2 + \sum_{k=1}^{N}U_k(\mathbf r_k) + U_{\rm int} . \tag{27.9}$$

The expression for the Hamiltonian operator for a system of charged particles
in an external electromagnetic field is generalized directly from (27.6).


§28. Stationary states 


We assume that the Hamiltonian of the system does not depend explicitly 
on the time. In this case it is possible to separate the variables in the 
Schrödinger equation (27.7). We have already made use of this in §6. How- 
ever, we can now analyze more profoundly the solution obtained. 

We seek the solution of the Schrödinger equation (27.7) in the form

$$\psi(x, t) = \chi(t)\,\psi(x) , \tag{28.1}$$

where we understand x to be the entire set of coordinates on which the wave
function depends.

Substituting this expression into (27.7), we obtain

$$i\hbar\,\frac{\partial\chi(t)}{\partial t}\,\psi(x) = \chi(t)\,\hat H\psi(x) .$$

Separating variables in this equation gives:

$$\frac{i\hbar}{\chi(t)}\frac{\partial\chi(t)}{\partial t} = \frac{\hat H\psi(x)}{\psi(x)} .$$

The expression on the left-hand side of the equation can depend only on the
time t, while the expression on the right-hand side can depend only on the
coordinates of the system. It follows from the equality of these expressions
that each of them is equal to one and the same constant which we shall
denote by E. We then obtain

$$\chi(t) = C\,e^{-(i/\hbar)Et} , \qquad \hat H\psi(x) = E\psi(x) ,$$

where C is an arbitrary constant.

We see that the constant E has the meaning of an eigenvalue of the
operator $\hat H$, i.e. determines the possible values of the energy of the system, and the
function $\psi(x)$ describes a state with given energy.

The Hamiltonian operator can possess a discrete as well as a continuous 
spectrum, as we have seen in the examples already discussed. One also often 
encounters a mixed spectrum, i.e. a discrete spectrum in one energy interval, 
and a continuous spectrum in another. 

Assuming for definiteness that the operator $\hat H$ possesses a discrete
spectrum, we write the wave functions (28.1)

$$\psi_n(x, t) = \psi_n(x)\,e^{-(i/\hbar)E_n t} . \tag{28.2}$$

The states of a system described by wave functions of the type (28.2) are
said to be stationary. The wave functions of stationary states depend
harmonically on the time with frequencies $\omega_n = E_n/\hbar$. As we have already noted in
§6, in a stationary state the density of the probability of finding a particle at
a given point of space does not depend on the time. Indeed,

$$W_n(x, t) = |\psi_n(x, t)|^2 .$$

Substituting the expression (28.2) for the wave function, we find

$$W_n(x, t) = |\psi_n(x)|^2 = W_n(x, 0) . \tag{28.3}$$


This statement can easily be generalized. The probability of observing the
eigenvalue $F_k$ in the stationary state $\psi_n(x, t)$ does not depend on the time.
According to our general rules (see §21), in order to obtain the probability to
be determined we must expand the wave function $\psi_n(x, t)$ in terms of the
eigenfunctions $\psi_{F_k}$ of the operator $\hat F$ and take the square of the modulus of
the corresponding expansion amplitude $c_{F_k}$. According to formula (19.2)

$$c_{F_k}(t) = \int\psi_n(x, t)\,\psi^*_{F_k}(x)\,dV = e^{-(i/\hbar)E_n t}\int\psi_n(x)\,\psi^*_{F_k}(x)\,dV .$$

The corresponding probability $W(F_k, t)$ is equal to

$$W(F_k, t) = |c_{F_k}(t)|^2 = \left|\int\psi_n\psi^*_{F_k}\,dV\right|^2 = W(F_k, 0) . \tag{28.4}$$

An arbitrary solution of the Schrödinger equation $\psi(x, t)$ can be expanded
in terms of the wave functions (28.2). This function $\psi(x, t)$ describes a state
in which the energy of the system has no sharp value.
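
The difference between a stationary state and a superposition of stationary states can be made concrete at a single point of space. In the sketch below the amplitudes and energies are arbitrary illustrative numbers, units with ħ = 1 are assumed, and the function names are hypothetical.

```python
import cmath

HBAR = 1.0  # assumption of this sketch: units with hbar = 1

def stationary(amplitude, energy, t):
    """Value at a fixed point of psi_n(x) * exp(-i E_n t / hbar), cf. (28.2)."""
    return amplitude * cmath.exp(-1j * energy * t / HBAR)

def density(values):
    """Probability density |sum of the contributing wave functions|^2."""
    return abs(sum(values)) ** 2

# Illustrative values of psi_1(x), psi_2(x) at one point and their energies.
a1, e1 = 0.6, 1.0
a2, e2 = 0.8, 2.5

times = [0.0, 0.5, 1.0, 2.0]
single = [density([stationary(a1, e1, t)]) for t in times]
mixed = [density([stationary(a1, e1, t), stationary(a2, e2, t)]) for t in times]

print(single)  # constant in time, as in (28.3)
print(mixed)   # oscillates with frequency (e2 - e1) / hbar
```

The oscillating cross term in the second list is what is lost when the state has a sharp energy.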


§29. Integral form of the Schrödinger equation 


The Schrödinger differential equation can also be represented by an inte- 
gral equation. In a number of cases this latter formulation has a number of 
advantages both from the theoretical standpoint and from the point of view 
of purely mathematical convenience. The theoretical advantage of the integral 
representation of the equations of quantum mechanics is closely associated 
with the development of Feynman’s ideas in quantum field theory (see
ch. 14). 

In §58 we shall dwell in detail on the advantages of approximate methods 
of solving the Schrödinger equation in integral form. 

Let us consider a particle with Hamiltonian $\hat H$ depending, in general, on
the time. At the initial instant of time $t_1$ let the wave function

$$\psi_0 = \psi(\mathbf r_1, t_1) \tag{29.1}$$

be defined. The wave function satisfies the Schrödinger equation

$$i\hbar\frac{\partial\psi}{\partial t} = \hat H\psi . \tag{29.2}$$

The wave function of the particle satisfying eq. (29.2) for the boundary
condition (29.1) at the instant of time $t_2 > t_1$ can be written in the form

$$\psi(\mathbf r_2, t_2) = \int K(\mathbf r_2, t_2; \mathbf r_1, t_1)\,\psi(\mathbf r_1, t_1)\,d\mathbf r_1 . \tag{29.3}$$

The function $K(\mathbf r_2, t_2; \mathbf r_1, t_1)$ is the Green’s function of eq. (29.2) (see §19).
The interpretation of formula (29.3) is obvious: the Green’s function
represents the amplitude for the transition of the particle from the initial state
with wave function $\psi(\mathbf r_1, t_1)$ into the state with wave function $\psi(\mathbf r_2, t_2)$,
where $t_2 > t_1$.


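Since formula (29.3) defines the wave function only for $t_2 > t_1$, the
Green’s function can be extended to earlier times by the requirement

$$K(\mathbf r_2, t_2; \mathbf r_1, t_1) = 0 \quad \text{for } t_2 < t_1 .$$

For the relation (29.3) to be equivalent to (29.2) and (29.1), the Green’s
function must satisfy the equation

$$\left[i\hbar\frac{\partial}{\partial t_2} - \hat H(\mathbf r_2, t_2)\right]K(\mathbf r_2, t_2; \mathbf r_1, t_1) = i\hbar\,\delta(\mathbf r_2 - \mathbf r_1)\,\delta(t_2 - t_1) . \tag{29.4}$$

Indeed, for $t_2 > t_1$ the Green’s function satisfies the equation

$$\left[i\hbar\frac{\partial}{\partial t_2} - \hat H(\mathbf r_2, t_2)\right]K(\mathbf r_2, t_2; \mathbf r_1, t_1) = 0 . \tag{29.5}$$

Operating on both sides of (29.3) with the operator $i\hbar(\partial/\partial t_2) - \hat H$ for $t_2 > t_1$,
and taking into account eq. (29.5), we arrive at an identity. It is easily seen
that the Green’s function defined by eq. (29.4) satisfies the initial condition
(29.1) for $t_2 = t_1$, if (29.4) is integrated over the infinitesimal interval $2\Delta t \to 0$
about the instant $t_1$. We then have

$$\int_{t_1-\Delta t}^{t_1+\Delta t}\left[i\hbar\frac{\partial}{\partial t_2} - \hat H(\mathbf r_2, t_2)\right]K(\mathbf r_2, t_2; \mathbf r_1, t_1)\,dt_2
= i\hbar\,\delta(\mathbf r_2 - \mathbf r_1)\int_{t_1-\Delta t}^{t_1+\Delta t}\delta(t_2 - t_1)\,dt_2 = i\hbar\,\delta(\mathbf r_2 - \mathbf r_1) .$$

Evidently we have

$$\lim_{\Delta t\to 0}\int_{t_1-\Delta t}^{t_1+\Delta t}\hat H(\mathbf r_2, t_2)\,K(\mathbf r_2, t_2; \mathbf r_1, t_1)\,dt_2 = 0 ,$$

$$\lim_{\Delta t\to 0}\int_{t_1-\Delta t}^{t_1+\Delta t}i\hbar\,\frac{\partial K(\mathbf r_2, t_2; \mathbf r_1, t_1)}{\partial t_2}\,dt_2 = i\hbar\,K(\mathbf r_2, t_1; \mathbf r_1, t_1) .$$

Hence

$$K(\mathbf r_2, t_1; \mathbf r_1, t_1) = \delta(\mathbf r_2 - \mathbf r_1) . \tag{29.6}$$

Thus if the Green’s function satisfies eq. (29.4), then (29.3) represents the
solution of the Schrödinger equation with the corresponding initial conditions
(the Cauchy problem).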










In other words, if the transition amplitude is known, then the wave func- 
tion is also known. On the other hand, the transition amplitude has certain 
important features which make it in many respects more convenient, and as 
completely characteristic of the system as the wave function. 

Let us first consider the case where the Hamiltonian $\hat H$ of the system does
not depend explicitly on the time. Then one can find the general relation
between the wave functions of the stationary states $\psi_n(\mathbf r, t)$, the energy levels
of the system, and the transition amplitude. We shall denote the latter by $K_0$
for a system with Hamiltonian $\hat H_0$. The amplitude must satisfy the equation

$$\left[i\hbar\frac{\partial}{\partial t_2} - \hat H_0(\mathbf r_2)\right]K_0(2, 1) = i\hbar\,\delta(\mathbf r_2 - \mathbf r_1)\,\delta(t_2 - t_1) . \tag{29.7}$$

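If $\hat H_0$ does not depend explicitly on the time, then the wave function
satisfying eq. (29.2) can be written in the form

$$\psi_n = u_n(\mathbf r)\exp[-(i/\hbar)E_n t] ,$$

where the $\psi_n$ form a complete set of orthonormal functions. By virtue of this
property of the $\psi_n$, one can always write the expansion

$$K_0(2, 1) = K_0(\mathbf r_2, t_2; \mathbf r_1, t_1) = \sum_n C_n(\mathbf r_1, t_1)\,\psi_n(\mathbf r_2, t_2)\,\theta(t_2 - t_1) , \tag{29.8}$$

where $\theta(t_2 - t_1)$ is the step function

$$\theta(x) = \begin{cases} 1 & (x > 0) , \\ 0 & (x < 0) . \end{cases} \tag{29.9}$$

The behaviour of $K_0$ for $t_2 < t_1$ is taken into account by means of the
$\theta$-function. Substituting the expansion (29.8) into (29.4), we find

$$\left[i\hbar\frac{\partial}{\partial t_2} - \hat H_0(\mathbf r_2)\right]\sum_n C_n\psi_n(\mathbf r_2, t_2)\,\theta(t_2 - t_1) =$$
$$= i\hbar\sum_n C_n\psi_n\,\delta(t_2 - t_1) + \sum_n C_n\,\theta(t_2 - t_1)\left[i\hbar\frac{\partial}{\partial t_2} - \hat H_0(\mathbf r_2)\right]\psi_n
= i\hbar\sum_n C_n\psi_n\,\delta(t_2 - t_1) .$$

On the other hand using (29.7),

$$\left[i\hbar\frac{\partial}{\partial t_2} - \hat H_0(\mathbf r_2)\right]\sum_n C_n\psi_n\,\theta(t_2 - t_1) = i\hbar\,\delta(\mathbf r_2 - \mathbf r_1)\,\delta(t_2 - t_1) .$$

Hence $\sum_n C_n\psi_n = \delta(\mathbf r_2 - \mathbf r_1)$ at $t_2 = t_1$. But for the $\psi_n$ the completeness condition holds:

$$\sum_n\psi_n(\mathbf r_2, t_1)\,\psi^*_n(\mathbf r_1, t_1) = \delta(\mathbf r_2 - \mathbf r_1) .$$

Hence for the coefficient $C_n$ we find $C_n = \psi^*_n(\mathbf r_1, t_1)$ and, finally, we obtain

$$K_0(2, 1) = \theta(t_2 - t_1)\sum_n u_n(\mathbf r_2)\,u^*_n(\mathbf r_1)\exp[-(i/\hbar)E_n(t_2 - t_1)] . \tag{29.10}$$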


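The summation goes over into integration for a continuous spectrum. As an
example, let us find the explicit expression for the Green’s function of a free
particle. In this case

$$\psi_{\mathbf p} = (2\pi\hbar)^{-3/2}\exp\left(\frac{i}{\hbar}\,\mathbf p\cdot\mathbf r\right)\exp\left(-\frac{i}{\hbar}\frac{p^2}{2m}\,t\right) ,$$

so that

$$K_0(2, 1) = \theta(t_2 - t_1)\,(2\pi\hbar)^{-3}\int\exp\left[-\frac{i}{\hbar}\,\mathbf p\cdot(\mathbf r_1 - \mathbf r_2)\right]\exp\left(-\frac{ip^2(t_2 - t_1)}{2m\hbar}\right)d\mathbf p =$$
$$= \frac{\theta(t_2 - t_1)}{[2\pi i\hbar(t_2 - t_1)/m]^{3/2}}\exp\left(\frac{im|\mathbf r_2 - \mathbf r_1|^2}{2\hbar(t_2 - t_1)}\right) .$$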


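The transition amplitude $K_0(\mathbf r_2, t_2; \mathbf r_1, t_1)$, or more briefly $K_0(2, 1)$, has the
following important properties:

(1) The transition amplitude depends only on the difference $t_2 - t_1$, as is seen
from formula (29.10): $K_0(\mathbf r_2, t_2; \mathbf r_1, t_1) = K_0(\mathbf r_2, \mathbf r_1; t_2 - t_1)$.

(2) Owing to this the transition amplitude $K_0(\mathbf r_3, t_3; \mathbf r_1, t_1) = K_0(\mathbf r_3, \mathbf r_1; t_3 - t_1)$
can be written in the form

$$K_0(\mathbf r_3, \mathbf r_1; t_3 - t_1) = \int K_0(\mathbf r_3, \mathbf r_2; t_3 - t_2)\,K_0(\mathbf r_2, \mathbf r_1; t_2 - t_1)\,d\mathbf r_2 . \tag{29.11}$$

This means that the transition can be considered as a set of successive
transitions (1 → 2), (2 → 3) for all possible states 2.

Formula (29.11) expresses the principle of superposition. Its proof is
elementary:

$$\psi(\mathbf r_3, t_3) = \int K_0(\mathbf r_3, \mathbf r_2; t_3 - t_2)\,\psi(\mathbf r_2, t_2)\,d\mathbf r_2
= \iint K_0(\mathbf r_3, \mathbf r_2; t_3 - t_2)\,K_0(\mathbf r_2, \mathbf r_1; t_2 - t_1)\,\psi(\mathbf r_1, t_1)\,d\mathbf r_1\,d\mathbf r_2 ,$$

$$\psi(\mathbf r_3, t_3) = \int K_0(\mathbf r_3, \mathbf r_1; t_3 - t_1)\,\psi(\mathbf r_1, t_1)\,d\mathbf r_1 .$$

Comparison of these formulae gives (29.11) immediately.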
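(3) The Fourier component of the transition amplitude $K_0$ defines the
spectrum of eigenvalues of the energy of the system. Let us find the Fourier
component of the function $K_0(\mathbf r_2, \mathbf r_1; t_2 - t_1)$

$$K_0(\mathbf r_2, \mathbf r_1; \omega) = \int K_0(\mathbf r_2, \mathbf r_1; t_2 - t_1)\exp[i\omega(t_2 - t_1)]\,d(t_2 - t_1) =$$
$$= \sum_n u_n(\mathbf r_2)\,u^*_n(\mathbf r_1)\int\exp[-(i/\hbar)E_n(t_2 - t_1)]\exp[i\omega(t_2 - t_1)]\,\theta(t_2 - t_1)\,d(t_2 - t_1) =$$
$$= \sum_n u_n(\mathbf r_2)\,u^*_n(\mathbf r_1)\,I ,$$

where

$$I = \int\theta(t_2 - t_1)\exp[-(i/\hbar)E_n(t_2 - t_1)]\exp[i\omega(t_2 - t_1)]\,d(t_2 - t_1) .$$

The $\theta$-function can be written in the form of a contour integral

$$\theta(x) = \lim_{\varepsilon\to 0}\frac{1}{2\pi i}\int\frac{e^{i\alpha x}}{\alpha - i\varepsilon}\,d\alpha . \tag{29.12}$$

The contour of integration for x > 0 is shown in fig. V.8.

Fig. V.8

Then for I we find, with $\tau = t_2 - t_1$,

$$I = \lim_{\varepsilon\to 0}\frac{1}{2\pi i}\int\frac{d\alpha}{\alpha - i\varepsilon}\int\exp\left[-i\left(\frac{E_n}{\hbar} - \omega - \alpha\right)\tau\right]d\tau =$$
$$= \lim_{\varepsilon\to 0}\frac{1}{2\pi i}\int\frac{2\pi\,\delta(\alpha + \omega - E_n/\hbar)}{\alpha - i\varepsilon}\,d\alpha
= \lim_{\varepsilon\to 0}\frac{i}{\omega - E_n/\hbar + i\varepsilon} . \tag{29.13}$$

Hence for $K_0(\mathbf r_2, \mathbf r_1; \omega)$ we obtain

$$K_0(\mathbf r_2, \mathbf r_1; \omega) = \lim_{\varepsilon\to 0}\,i\sum_n\frac{u_n(\mathbf r_2)\,u^*_n(\mathbf r_1)}{\omega - E_n/\hbar + i\varepsilon} . \tag{29.14}$$

We see that the poles of the Fourier component of the transition amplitude,
$\omega = E_n/\hbar$, correspond to the energy eigenvalues. Thus knowing the transition
amplitude $K_0$ one can find directly the spectrum of energies E.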
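
The properties of the transition amplitude listed above can be tried out on a finite-dimensional analogue of (29.10). The sketch below rests on assumptions made purely for illustration: a hypothetical two-level system with chosen eigenvectors and energies, a discrete index in place of the coordinate $\mathbf r$ (so the integral in (29.11) becomes a sum), and units with ħ = 1.

```python
import cmath
import math

HBAR = 1.0  # assumption of this sketch: units with hbar = 1

# Hypothetical two-level system: orthonormal eigenvectors u_n and energies E_n.
C, S = math.cos(0.4), math.sin(0.4)
EIGENVECTORS = [[C, S], [-S, C]]
ENERGIES = [0.7, 1.9]

def amplitude(dt):
    """K0[a][b] = sum_n u_n[a] * conj(u_n[b]) * exp(-i E_n dt / hbar), dt >= 0."""
    k = [[0j, 0j], [0j, 0j]]
    for u, e in zip(EIGENVECTORS, ENERGIES):
        phase = cmath.exp(-1j * e * dt / HBAR)
        for a in range(2):
            for b in range(2):
                k[a][b] += u[a] * complex(u[b]).conjugate() * phase
    return k

def compose(k32, k21):
    """Discrete analogue of the integral over intermediate states in (29.11)."""
    return [[sum(k32[a][c] * k21[c][b] for c in range(2)) for b in range(2)]
            for a in range(2)]

t1, t2, t3 = 0.0, 0.8, 2.1
direct = amplitude(t3 - t1)
via_2 = compose(amplitude(t3 - t2), amplitude(t2 - t1))
error = max(abs(direct[a][b] - via_2[a][b]) for a in range(2) for b in range(2))

print(error)           # rounding-error level: composition property (29.11)
print(amplitude(0.0))  # unit matrix: discrete analogue of (29.6)
```

Because the amplitude depends on the times only through their difference, the intermediate instant can be placed anywhere between the initial and final instants without changing the result.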

Let us turn to the general case of a Hamiltonian depending on time. It can
usually be written in the form of a sum $\hat H = \hat H_0(\mathbf r) + U(\mathbf r, t)$, where $\hat H_0$ does
not depend on time. $U(\mathbf r, t)$ often represents a variable external field acting on
the particle. In this case the Green’s function K satisfies the equation

$$\left[i\hbar\frac{\partial}{\partial t_2} - \hat H_0(\mathbf r_2) - U(\mathbf r_2, t_2)\right]K(2, 1) = i\hbar\,\delta(\mathbf r_2 - \mathbf r_1)\,\delta(t_2 - t_1) \tag{29.15}$$

and reduces to zero for $t_2 < t_1$,

$$K(2, 1) = 0 \quad \text{for } t_2 < t_1 . \tag{29.16}$$

The differential equation (29.15) for the Green’s function can be put in
correspondence with the integral equation

$$K(2, 1) = K_0(2, 1) - \frac{i}{\hbar}\int K_0(2, 3)\,U(3)\,K(3, 1)\,d^4x_3 , \tag{29.17}$$

where $d^4x = dx\,dy\,dz\,dt$. In this integral equation $K_0(2, 1)$ is considered to be
a known function, and $K_0(2, 3)U(3)$ is its kernel. The equivalence can easily be
seen by acting on eq. (29.17) with the operator $[i\hbar(\partial/\partial t_2) - \hat H_0(\mathbf r_2)]$. Then, taking
into account (29.7), we obtain

$$\left[i\hbar\frac{\partial}{\partial t_2} - \hat H_0(\mathbf r_2)\right]K(2, 1) = i\hbar\,\delta(\mathbf r_2 - \mathbf r_1)\,\delta(t_2 - t_1) + U(2)\,K(2, 1) .$$

Thus we again arrive at eq. (29.15).

The initial condition (29.16) is contained in (29.17) since $K_0(2, 1) = 0$ for
$t_2 < t_1$. The integral form of the equation for the transition amplitude (29.17)
is especially convenient because it allows one to obtain K(2, 1) in the form of
a series of successive approximations (see §58).
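
To indicate what such a series looks like, eq. (29.17) can be iterated by substituting its own right-hand side for K under the integral. The expansion below is a sketch of the first terms only; the labels 3 and 4 denote intermediate space-time points, and nothing is claimed here about the convergence of the series.

```latex
\begin{aligned}
K(2,1) ={}& K_0(2,1)
 - \frac{i}{\hbar}\int K_0(2,3)\,U(3)\,K_0(3,1)\,d^4x_3 \\
 &+ \left(-\frac{i}{\hbar}\right)^{2}\iint K_0(2,3)\,U(3)\,K_0(3,4)\,U(4)\,K_0(4,1)\,d^4x_3\,d^4x_4
 + \dots
\end{aligned}
```

Each successive term contains one more factor of U, so truncating the series after a given term amounts to an approximation of that order in the external field.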


§30. The eigenvalues and eigenfunctions of the angular momentum operator 
and of the operator of the square of the angular momentum 


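Let us now form some operators which play important roles in our
subsequent considerations: the operators of the angular momentum components
and of the square of the angular momentum. Replacing mechanical quantities
in the classical definition of angular momentum by quantum-mechanical
operators according to the general rule, we find

$$\hat l_x = y\hat p_z - z\hat p_y = \frac{\hbar}{i}\left(y\frac{\partial}{\partial z} - z\frac{\partial}{\partial y}\right) ,$$
$$\hat l_y = z\hat p_x - x\hat p_z = \frac{\hbar}{i}\left(z\frac{\partial}{\partial x} - x\frac{\partial}{\partial z}\right) , \tag{30.1}$$
$$\hat l_z = x\hat p_y - y\hat p_x = \frac{\hbar}{i}\left(x\frac{\partial}{\partial y} - y\frac{\partial}{\partial x}\right) .$$

We shall call the set of operators $\hat l_x$, $\hat l_y$ and $\hat l_z$ the angular momentum operator
$\hat{\mathbf l}$. This quantity possesses all the properties of angular momentum. In
particular, as we shall show below, it obeys the same conservation laws as angular
momentum in classical mechanics.
Further, we construct the operator of the square of the angular momentum

$$\hat l^2 = \hat l_x^2 + \hat l_y^2 + \hat l_z^2 . \tag{30.2}$$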


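Let us consider the commutation relations for these operators. We first of
all note that not all of the operators of the different angular momentum
components commute with each other. Let us calculate, for example, the
commutator $\hat l_x\hat l_y - \hat l_y\hat l_x$. Making use of expressions (30.1) we have

$$\hat l_x\hat l_y = -\hbar^2\left(y\frac{\partial}{\partial z} - z\frac{\partial}{\partial y}\right)\left(z\frac{\partial}{\partial x} - x\frac{\partial}{\partial z}\right) =$$
$$= -\hbar^2\left(y\frac{\partial}{\partial x} + yz\frac{\partial^2}{\partial z\,\partial x} - xy\frac{\partial^2}{\partial z^2} - z^2\frac{\partial^2}{\partial y\,\partial x} + zx\frac{\partial^2}{\partial y\,\partial z}\right) .$$

On the other hand, interchanging the operators we find

$$\hat l_y\hat l_x = -\hbar^2\left(z\frac{\partial}{\partial x} - x\frac{\partial}{\partial z}\right)\left(y\frac{\partial}{\partial z} - z\frac{\partial}{\partial y}\right) =$$
$$= -\hbar^2\left(zy\frac{\partial^2}{\partial x\,\partial z} - z^2\frac{\partial^2}{\partial x\,\partial y} - xy\frac{\partial^2}{\partial z^2} + x\frac{\partial}{\partial y} + xz\frac{\partial^2}{\partial z\,\partial y}\right) .$$

Subtracting the lower equation from the upper, we obtain finally

$$\hat l_x\hat l_y - \hat l_y\hat l_x = \hbar^2\left(x\frac{\partial}{\partial y} - y\frac{\partial}{\partial x}\right) = i\hbar\,\hat l_z . \tag{30.3}$$

Carrying out the cyclic permutation of the coordinates x, y, z we obtain two
more relations:

$$\hat l_y\hat l_z - \hat l_z\hat l_y = i\hbar\,\hat l_x , \qquad \hat l_z\hat l_x - \hat l_x\hat l_z = i\hbar\,\hat l_y . \tag{30.3'}$$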
From the relations (30.3) it follows that the components of the angular
momentum of a particle $l_x$, $l_y$, $l_z$ cannot simultaneously have sharp values. An
exception to this is the state when the angular momentum is equal to zero,
so that then $l_x = l_y = l_z = 0$. On the other hand the angular momentum
component operators $\hat l_x$, $\hat l_y$ and $\hat l_z$ do commute with the operator of the
square of the angular momentum $\hat{\mathbf l}^2$, i.e. the following relations hold:

$$\begin{aligned}
\hat l_x\hat{\mathbf l}^2 - \hat{\mathbf l}^2\hat l_x &= 0,\\
\hat l_y\hat{\mathbf l}^2 - \hat{\mathbf l}^2\hat l_y &= 0,\\
\hat l_z\hat{\mathbf l}^2 - \hat{\mathbf l}^2\hat l_z &= 0.
\end{aligned} \tag{30.4}$$


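These relations are easily proved by means of (30.3). Let us prove, for
example, the first of these. From relation (30.3), multiplying on the right and on
the left by $\hat l_y$, we have

$$\hat l_x\hat l_y^2 - \hat l_y\hat l_x\hat l_y = i\hbar\,\hat l_z\hat l_y, \qquad \hat l_y^2\hat l_x - \hat l_y\hat l_x\hat l_y = -i\hbar\,\hat l_y\hat l_z.$$

Subtracting the second relation from the first, we obtain

$$\hat l_x\hat l_y^2 - \hat l_y^2\hat l_x = i\hbar\,(\hat l_y\hat l_z + \hat l_z\hat l_y).$$

Analogously,

$$\hat l_x\hat l_z^2 - \hat l_z^2\hat l_x = -i\hbar\,(\hat l_y\hat l_z + \hat l_z\hat l_y).$$

Also taking into account the fact that $\hat l_x\hat l_x^2 - \hat l_x^2\hat l_x = 0$ and adding up the
equalities obtained, we find

$$\hat l_x\hat{\mathbf l}^2 - \hat{\mathbf l}^2\hat l_x = 0.$$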


The two remaining relations (30.4) are proved in the same way. From these 
relations it follows that the square of the total angular momentum and one of 
its projections onto an arbitrary axis can simultaneously have definite values. 
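
The commutation relations (30.3), (30.3') and (30.4) can be checked numerically. The following is only a sketch under assumptions that are not part of the text: units with ħ = 1, and the standard 3×3 matrices that represent the angular momentum components for l = 1 in the basis m = +1, 0, -1. The helper names (`matmul`, `combine`, `commutator`, `max_abs`) are ours.

```python
# Illustrative check of (30.3), (30.3') and (30.4) for l = 1.
# Assumptions (not from the text): units with hbar = 1 and the standard
# 3x3 matrices of l_x, l_y, l_z in the basis m = +1, 0, -1.
from math import sqrt


def matmul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def combine(a, b, cb):
    """Return the matrix a + cb * b."""
    n = len(a)
    return [[a[i][j] + cb * b[i][j] for j in range(n)] for i in range(n)]


def commutator(a, b):
    """Return ab - ba."""
    return combine(matmul(a, b), matmul(b, a), -1)


def max_abs(a):
    """Largest modulus among the matrix elements."""
    return max(abs(x) for row in a for x in row)


s = 1 / sqrt(2)
LX = [[0, s, 0], [s, 0, s], [0, s, 0]]
LY = [[0, -1j * s, 0], [1j * s, 0, -1j * s], [0, 1j * s, 0]]
LZ = [[1, 0, 0], [0, 0, 0], [0, 0, -1]]
L2 = combine(combine(matmul(LX, LX), matmul(LY, LY), 1), matmul(LZ, LZ), 1)

# Residuals of l_x l_y - l_y l_x = i l_z and of its cyclic permutations.
residuals = [
    max_abs(combine(commutator(LX, LY), LZ, -1j)),
    max_abs(combine(commutator(LY, LZ), LX, -1j)),
    max_abs(combine(commutator(LZ, LX), LY, -1j)),
]
# Commutators of each component with the square of the angular momentum.
with_square = [max_abs(commutator(c, L2)) for c in (LX, LY, LZ)]

print("residuals of (30.3), (30.3'):", residuals)
print("commutators with l^2:", with_square)
print("diagonal of l^2:", [L2[i][i] for i in range(3)])
```

All residuals should vanish to rounding accuracy, and the diagonal of the squared operator should equal l(l+1) = 2 for every basis state.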

We note that commutation relations analogous to (30.3), (30.3’) also hold 
for the angular momentum operator and the coordinate operator, and the 
angular momentum operator and momentum operator. Omitting the simple 
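proof, we write down the two relations

$$\hat l_x\hat y - \hat y\hat l_x = i\hbar\,\hat z, \qquad \hat l_x\hat p_y - \hat p_y\hat l_x = i\hbar\,\hat p_z. \tag{30.5}$$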


The remaining four relations are obtained by the cyclic permutation of the 
indices. Relations (30.3), (30.4) and (30.5) are the same as the corresponding 
classical expressions if, of course, we pass from commutators to the classical 
Poisson brackets. 

Further, let us determine the possible values of the angular momentum 
projection onto an arbitrarily chosen direction in space and the possible 
values of the square of the angular momentum (i.e. the eigenvalues of the 




corresponding operators). In solving the equations for the eigenfunctions and 
eigenvalues it is convenient to use spherical coordinates. 

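We carry out the transition from the Cartesian coordinates x, y, z to the
variables $r$, $\vartheta$, $\varphi$ in formulae (30.1) and (30.2) according to the ordinary rules
of replacement of variables. Omitting these elementary calculations, we
simply give the result

$$\hat l_z = \frac{\hbar}{i}\frac{\partial}{\partial\varphi}, \tag{30.6}$$

$$\hat l_x = i\hbar\left(\sin\varphi\,\frac{\partial}{\partial\vartheta} + \cot\vartheta\,\cos\varphi\,\frac{\partial}{\partial\varphi}\right), \tag{30.7}$$

$$\hat l_y = -i\hbar\left(\cos\varphi\,\frac{\partial}{\partial\vartheta} - \cot\vartheta\,\sin\varphi\,\frac{\partial}{\partial\varphi}\right), \tag{30.8}$$

$$\hat{\mathbf l}^2 = -\hbar^2\left[\frac{1}{\sin\vartheta}\frac{\partial}{\partial\vartheta}\left(\sin\vartheta\,\frac{\partial}{\partial\vartheta}\right) + \frac{1}{\sin^2\vartheta}\frac{\partial^2}{\partial\varphi^2}\right] = -\hbar^2\nabla^2_{\vartheta,\varphi}, \tag{30.9}$$

where $\nabla^2_{\vartheta,\varphi}$ is the angular part of the Laplacian in spherical coordinates.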

Choosing an arbitrary direction in space as the z-axis, we define the eigen- 
functions and eigenvalues of the operator of the component of angular 
momentum in this direction. The equation for the eigenfunctions and
eigenvalues of the operator $\hat l_z$ is of the form

$$\frac{\hbar}{i}\frac{\partial\psi}{\partial\varphi} = l_z\psi. \tag{30.10}$$

The solution of this equation is

$$\psi = \psi(r, \vartheta)\exp(i l_z\varphi/\hbar), \tag{30.11}$$

where $\psi(r, \vartheta)$ is an arbitrary function.

The wave function which is the solution of eq. (30.10) must satisfy the 
condition of single-valuedness. Since $\varphi$ is a cyclic variable varying from 0 to
$2\pi$, the condition of single-valuedness is written in the form $\psi(\varphi) = \psi(\varphi + 2\pi)$
or

$$\exp[(i/\hbar)\,l_z\varphi] = \exp[(i/\hbar)\,l_z(\varphi + 2\pi)].$$

This last condition is fulfilled if $l_z = m\hbar$, where $m$ is a positive or negative
integer (including zero). In what follows it will be called the magnetic
quantum number.

Since the z-axis is not specified by any physical condition, the same result
also holds for the operators $\hat l_x$ and $\hat l_y$.




Thus the angular momentum component along an arbitrarily chosen direction
in space takes on integer (in units of $\hbar$) values. For a sharp value of the
projection $l_z$ the two other projections have no well defined value. This means
that if in a state with given $l_z$ the values of the projections $l_x$ and $l_y$ are
measured, then any possible value may be found for them.

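The eigenfunction of the operator $\hat l_z$ depending on the angle $\varphi$ and
normalized to unity by the condition

$$\int_0^{2\pi}\psi^*_{m'}(\varphi)\,\psi_m(\varphi)\,d\varphi = \delta_{mm'},$$

has the form

$$\psi_m(\varphi) = (2\pi)^{-1/2}\,e^{im\varphi}. \tag{30.12}$$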
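
The single-valuedness condition and the normalization of the eigenfunctions of the z-component can be illustrated numerically. This is a minimal sketch, assuming a simple rectangle rule over one period; the function names `psi` and `overlap` and the number of grid points are our own choices.

```python
# Sketch: eigenfunctions psi_m(phi) = exp(i m phi) / sqrt(2 pi) of the z-component.
import cmath
import math


def psi(m, phi):
    """Eigenfunction of the z-component for the quantum number m."""
    return cmath.exp(1j * m * phi) / math.sqrt(2 * math.pi)


def overlap(m1, m2, n=2000):
    """Rectangle-rule estimate of the integral of psi_m1* psi_m2 over [0, 2 pi)."""
    h = 2 * math.pi / n
    return sum(psi(m1, k * h).conjugate() * psi(m2, k * h) for k in range(n)) * h


# Single-valuedness: psi(phi) = psi(phi + 2 pi) holds for integer m only.
jump_integer = abs(psi(3, 0.0) - psi(3, 2 * math.pi))
jump_half_integer = abs(psi(0.5, 0.0) - psi(0.5, 2 * math.pi))

print("mismatch after a full turn, m = 3:", jump_integer)
print("mismatch after a full turn, m = 1/2:", jump_half_integer)
print("overlap (2, 2):", overlap(2, 2))
print("overlap (2, -1):", overlap(2, -1))
```

For integer m the function returns to its initial value after a full turn, while the hypothetical value m = 1/2 gives a finite mismatch; the overlaps reproduce the Kronecker delta of the normalization condition.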


Let us now determine the eigenvalues and eigenfunctions of the operator 
of the square of the angular momentum, $\hat{\mathbf l}^2$:

$$\hat{\mathbf l}^2\psi = \mathbf l^2\psi. \tag{30.13}$$


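Substituting into (30.13) the expression for $\hat{\mathbf l}^2$ given by formula (30.9), we
obtain the equation

$$\frac{1}{\sin\vartheta}\frac{\partial}{\partial\vartheta}\left(\sin\vartheta\,\frac{\partial\psi}{\partial\vartheta}\right) + \frac{1}{\sin^2\vartheta}\frac{\partial^2\psi}{\partial\varphi^2} + \frac{\mathbf l^2}{\hbar^2}\psi = 0. \tag{30.14}$$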








The equation for the eigenfunctions of the operator $\hat{\mathbf l}^2$ is the well-known
equation for the spherical harmonic functions*.

Equation (30.14) only has solutions satisfying the standard conditions
formulated in §16 for the values $\mathbf l^2/\hbar^2 = l(l + 1)$, where $l$ is a positive integer
(including zero). The quantum number $l$ is called the orbital angular momentum
quantum number. Thus the operator of the square of the angular
momentum has a discrete spectrum of eigenvalues

$$\mathbf l^2 = \hbar^2\,l(l + 1). \tag{30.15}$$
* See, for example, V.I.Smirnov, A course of higher mathematics, Vol. III (Pergamon
Press, Oxford, 1964) and V.A.Fok, Nachala kvantovoi mekhaniki (Principles of quantum
mechanics) (KUBUCH, 1932) p. 118.

The solution of eq. (30.14) for the eigenfunctions of the operator of the
square of the angular momentum is of the form
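
$$\psi_{lm}(\vartheta, \varphi) = Y_{lm}(\vartheta, \varphi) = (-1)^k\left[\frac{(2l+1)\,(l-|m|)!}{4\pi\,(l+|m|)!}\right]^{1/2} P_l^{|m|}(\cos\vartheta)\,e^{im\varphi}, \tag{30.16}$$

where $m$ is an integer taking on the values $m = 0, \pm 1, \pm 2, \ldots, \pm l$; $k = m$ for
$m \ge 0$ and $k = 0$ for $m < 0$.

We denoted by $P_l^m$ the associated Legendre polynomial

$$P_l^m(\xi) = (1 - \xi^2)^{m/2}\,\frac{d^m P_l(\xi)}{d\xi^m}, \qquad P_l(\xi) = \frac{1}{2^l\,l!}\frac{d^l}{d\xi^l}(\xi^2 - 1)^l. \tag{30.17}$$

The constant factor in formula (30.16) is defined by the condition of
normalization of the function $Y_{lm}$ to unity:

$$\int_0^{\pi}\!\!\int_0^{2\pi} Y^*_{l'm'}(\vartheta, \varphi)\,Y_{lm}(\vartheta, \varphi)\,\sin\vartheta\,d\vartheta\,d\varphi = \delta_{ll'}\,\delta_{mm'}. \tag{30.18}$$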
0 0 


From formulae (30.15) and (30.16) it follows that to each eigenvalue of
the square of the angular momentum there correspond $2l+1$ eigenfunctions
$Y_{lm}$ (differing in the number $m$). Thus the eigenvalues of the square of the
angular momentum are degenerate. The meaning of this degeneracy, and
consequently of the number $m$, is easily understood. We act on the wave
function $Y_{lm}$ with the operator $\hat l_z$. We then obtain

$$\hat l_z Y_{lm}(\vartheta, \varphi) = \hbar m\,Y_{lm}(\vartheta, \varphi). \tag{30.19}$$

We see that the wave function $Y_{lm}$ is simultaneously an eigenfunction of
the operators $\hat l_z$ and $\hat{\mathbf l}^2$. Hence it is clear that the quantum number $m$ involved
in (30.16) characterizes the value of the angular momentum projection onto
the z-axis in a given state, and the wave function $Y_{lm}$ describes a state with a
given total angular momentum and a given projection on the z-axis.
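
The normalization of the spherical functions can be verified numerically. The sketch below is ours, not the book's: it assumes the standard constant $(2l+1)(l-|m|)!/[4\pi(l+|m|)!]$ for the squared normalization factor, builds the Legendre polynomials from the Rodrigues formula with plain coefficient lists, and integrates over $\xi = \cos\vartheta$ with Simpson's rule.

```python
# Sketch: numerical check that the spherical functions are normalized to unity.
# Assumption: squared normalization constant (2l+1)(l-|m|)! / (4 pi (l+|m|)!).
from math import factorial, pi


def poly_mul(a, b):
    """Product of two polynomials given by coefficient lists (lowest power first)."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out


def poly_diff(a):
    """Derivative of a polynomial given by its coefficient list."""
    return [k * a[k] for k in range(1, len(a))] or [0.0]


def poly_eval(a, x):
    """Value of the polynomial at x (Horner scheme)."""
    result = 0.0
    for c in reversed(a):
        result = result * x + c
    return result


def legendre_derivative(l, m):
    """Coefficients of d^m P_l / d xi^m, with P_l from the Rodrigues formula."""
    p = [1.0]
    for _ in range(l):
        p = poly_mul(p, [-1.0, 0.0, 1.0])  # multiply by (xi^2 - 1)
    for _ in range(l + m):
        p = poly_diff(p)
    scale = 2 ** l * factorial(l)
    return [c / scale for c in p]


def norm_integral(l, m, n=2000):
    """Integral of |Y_lm|^2 over the sphere (Simpson rule in xi = cos(theta))."""
    m = abs(m)
    c2 = (2 * l + 1) * factorial(l - m) / (4 * pi * factorial(l + m))
    dpoly = legendre_derivative(l, m)
    h = 2.0 / n
    total = 0.0
    for k in range(n + 1):
        xi = -1.0 + k * h
        weight = 1 if k in (0, n) else (4 if k % 2 else 2)
        # (P_l^m)^2 = (1 - xi^2)^m (d^m P_l / d xi^m)^2, clipped at the end points
        total += weight * max(0.0, 1.0 - xi * xi) ** m * poly_eval(dpoly, xi) ** 2
    # |Y_lm|^2 does not depend on phi, so the phi integration gives 2 pi.
    return 2 * pi * c2 * total * h / 3


checks = {(l, m): norm_integral(l, m) for l in range(3) for m in range(-l, l + 1)}
for key, value in checks.items():
    print(key, round(value, 8))
```

Each of the 2l+1 functions belonging to a given l should give a value close to unity.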

Summarizing we can say that the value of the total angular momentum is
defined according to formula (30.15) by the orbital angular momentum
quantum number $l$ running over a sequence of integer values. For a fixed
value of the square of the angular momentum the projection of the angular
momentum onto an arbitrarily oriented z-axis can take on $2l+1$ values from
$-l$ up to $+l$ (in units of $\hbar$). Any other values of this angular momentum
projection for a given $l$ are impossible. Since the z-axis is oriented quite
arbitrarily, it is natural that the angular momentum projections onto the x-axis
and y-axis for a given $l$ also take on values from $-l$ up to $+l$. For $l = 0$ the
angular momentum projection onto any axis is also equal to zero. This is the
only state in which the angular momentum projections onto different axes
simultaneously have sharp values. In this case the function $Y_{lm}$ ($l = 0$) reduces
to a constant which is an eigenfunction of all the operators $\hat l_x$, $\hat l_y$, $\hat l_z$.

We note that the eigenvalue of the square of the total angular momentum
$\mathbf l^2 = \hbar^2 l(l+1)$ is always larger than the square of the maximum projection of
the angular momentum which is equal to $\hbar^2 l^2$. If these quantities were the
same, then this would mean that in a state in which the angular momentum 
projection onto a certain axis has its maximum value the other two projec- 
tions would be equal to zero. This is, however, impossible, since for a sharp 
value of one of the angular momentum projections the other two cannot have 
well defined values, not even zero. 
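
The counting of states and the comparison between the eigenvalue of the square and the square of the maximum projection can be tabulated in a few lines. This is a sketch in units of ħ; the function names are ours.

```python
# Sketch: allowed projections and eigenvalues of the square (units of hbar).
def allowed_m(l):
    """Projections allowed for the orbital quantum number l."""
    return list(range(-l, l + 1))


def l_squared(l):
    """Eigenvalue of the square of the angular momentum, in units of hbar^2."""
    return l * (l + 1)


for l in range(4):
    ms = allowed_m(l)
    # columns: l, allowed m, degeneracy 2l+1, l(l+1), square of the largest projection
    print(l, ms, len(ms), l_squared(l), max(ms) ** 2)
```

For every l > 0 the fourth column exceeds the fifth, in line with the argument above; only for l = 0 do both vanish.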

Finally, we shall show that the angular momentum operator is related to 
the operator of an infinitesimal rotation of the system about the origin. Let 
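us rotate the coordinate system through a small angle $\delta\varphi$ about, for example,
the z-axis. The old and new coordinates of a point are connected by the
relations

$$\begin{aligned}
x' &= x + y\,\delta\varphi, & x &= x' - y'\,\delta\varphi,\\
y' &= -x\,\delta\varphi + y, & y &= x'\,\delta\varphi + y',\\
z' &= z, & z &= z'.
\end{aligned}$$

Consequently, upon rotation the wave function $\psi(x, y, z)$ expressed in terms
of the new variables has the form

$$\begin{aligned}
\psi(x, y, z) &= \psi(x' - y'\,\delta\varphi,\; y' + x'\,\delta\varphi,\; z')\\
&= \psi(x', y', z') - y'\,\delta\varphi\,\frac{\partial\psi}{\partial x'} + x'\,\delta\varphi\,\frac{\partial\psi}{\partial y'}\\
&= \left[1 + \delta\varphi\left(x'\frac{\partial}{\partial y'} - y'\frac{\partial}{\partial x'}\right)\right]\psi(x', y', z')\\
&= \left(1 + \frac{i}{\hbar}\,\delta\varphi\,\hat l_z\right)\psi(x', y', z') = \hat W\psi(x', y', z').
\end{aligned}$$

It is natural to call the operator $\hat W$ the rotation operator. We found the
operator $\hat W_z$ of the rotation through a small angle $\delta\varphi$ about the z-axis to be
connected with the operator $\hat l_z$ by the relation

$$\hat W_z = 1 + \frac{i}{\hbar}\,\hat l_z\,\delta\varphi. \tag{30.20}$$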


Such a relation also holds, of course, for any other axis. 
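
The first-order character of this relation can be illustrated numerically. The sketch below rests on our own assumptions: an arbitrary smooth test function, an arbitrary sample point, and central differences for the derivatives. It compares the function taken at the linearly transformed coordinates with the result of applying $1 + \delta\varphi\,(x'\partial/\partial y' - y'\partial/\partial x')$.

```python
# Sketch: the rotation operator to first order in the angle, checked by finite differences.
import math


def psi(x, y, z):
    """Arbitrary smooth test function (an illustrative choice, not from the text)."""
    return math.exp(-(x - 1.0) ** 2 - 0.5 * (y + 0.3) ** 2 - z * z) * (1 + 0.2 * x * y)


def transformed(xp, yp, zp, dphi):
    """psi taken at the old coordinates x = x' - y' dphi, y = y' + x' dphi, z = z'."""
    return psi(xp - yp * dphi, yp + xp * dphi, zp)


def first_order(xp, yp, zp, dphi, h=1e-5):
    """[1 + dphi (x' d/dy' - y' d/dx')] psi(x', y', z') with central differences."""
    dpsi_dx = (psi(xp + h, yp, zp) - psi(xp - h, yp, zp)) / (2 * h)
    dpsi_dy = (psi(xp, yp + h, zp) - psi(xp, yp - h, zp)) / (2 * h)
    return psi(xp, yp, zp) + dphi * (xp * dpsi_dy - yp * dpsi_dx)


point = (0.7, 0.4, 0.1)
errors = {dphi: abs(transformed(*point, dphi) - first_order(*point, dphi))
          for dphi in (1e-2, 1e-3)}
for dphi, err in errors.items():
    print("dphi =", dphi, " difference =", err)
```

The difference should shrink roughly as the square of the angle, as expected for an expression that is exact to first order.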


§31. Differentiation of operators with respect to time 


We now construct the operator $\hat{\dot F}$ corresponding to the derivative with
respect to time of the quantum-mechanical quantity described by the operator $\hat F$.
It is clear that the ordinary definition of the derivative of a function is
inapplicable to the quantum-mechanical quantity described by the operator $\hat F$. To
define the notion of the derivative we again make use of the analogy with
classical mechanics. As is known, in classical mechanics the derivative with
respect to time of a mechanical quantity F can be expressed in terms of the
classical Poisson bracket

$$\frac{dF}{dt} = \frac{\partial F}{\partial t} + [H, F],$$

where H is the Hamiltonian.

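Passing from classical quantities to quantum-mechanical operators and
from the classical Poisson bracket to the quantum one, we obtain the
expression for the operator $\hat{\dot F}$

$$\hat{\dot F} = \frac{\partial\hat F}{\partial t} + [\hat H, \hat F]. \tag{31.1}$$

If the operator $\hat F$ does not depend explicitly on time, then the operator $\hat{\dot F}$ has
the form

$$\hat{\dot F} = [\hat H, \hat F] = \frac{i}{\hbar}\,(\hat H\hat F - \hat F\hat H). \tag{31.2}$$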

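The expressions for the derivative of the sum $\hat F = \hat D + \hat R$ and of the product
$\hat L = \hat D\hat R$ of two operators $\hat D$ and $\hat R$

$$\hat{\dot F} = \hat{\dot D} + \hat{\dot R}, \tag{31.3}$$

$$\hat{\dot L} = \hat{\dot D}\hat R + \hat D\hat{\dot R} \tag{31.4}$$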


follow immediately from the properties of the quantum Poisson brackets. 
By means of formula (31.1) for the derivative of the quantum operator 
one can find the expression for the derivative with respect to time of the 
mean value of the quantity F. 
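Differentiating the expression (22.4) for the mean, we find

$$\frac{d\bar F}{dt} = \int\psi^*\frac{\partial\hat F}{\partial t}\psi\,dV + \int\frac{\partial\psi^*}{\partial t}\hat F\psi\,dV + \int\psi^*\hat F\frac{\partial\psi}{\partial t}\,dV.$$

We express the derivatives $\partial\psi/\partial t$ and $\partial\psi^*/\partial t$ in terms of the wave functions
by means of the Schrödinger equation and the equation which is conjugate to
it. We then have

$$\frac{d\bar F}{dt} = \int\psi^*\frac{\partial\hat F}{\partial t}\psi\,dV - \frac{i}{\hbar}\int\psi^*\hat F\hat H\psi\,dV + \frac{i}{\hbar}\int(\hat H^*\psi^*)\,\hat F\psi\,dV.$$

Here

$$\int(\hat H^*\psi^*)\,\hat F\psi\,dV = \int(\hat F\psi)\,\hat H^*\psi^*\,dV,$$

because the integral does not change when the factors in the integrand are
exchanged. It follows from the Hermitian property of the operator $\hat H$ that

$$\int(\hat F\psi)\,\hat H^*\psi^*\,dV = \int\psi^*\hat H\hat F\psi\,dV.$$

We finally obtain

$$\frac{d\bar F}{dt} = \int\psi^*\left[\frac{\partial\hat F}{\partial t} + \frac{i}{\hbar}\,(\hat H\hat F - \hat F\hat H)\right]\psi\,dV. \tag{31.5}$$

Comparing the above expression with the definition of the mean of the
derivative $\bar{\dot F}$, we arrive at the important equality $d\bar F/dt = \bar{\dot F}$.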

As an example let us define the operators $\hat{\dot x}$ and $\hat{\dot p}_x$. Since the coordinate
operator and momentum operator do not depend explicitly on the time, we
have

$$\hat{\dot x} = [\hat H, \hat x], \qquad \hat{\dot p}_x = [\hat H, \hat p_x]. \tag{31.6}$$

In such a form the operator equations (31.6) are analogous to the classical
Hamilton equations. We evaluate the commutators on the right-hand sides of
the equations (31.6), assuming that the Hamiltonian has the form

$$\hat H = \frac{1}{2m}\,(\hat p_x^2 + \hat p_y^2 + \hat p_z^2) + U(x, y, z, t).$$

Taking into account that the coordinate and momentum operators are

$$\hat x = x, \qquad \hat p_x = \frac{\hbar}{i}\frac{\partial}{\partial x},$$

we obtain

$$[\hat H, \hat x] = \frac{i}{2m\hbar}\,(\hat p_x^2 x - x\hat p_x^2),$$


since x and U(x, y, z, t) commute. 







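Calculating the commutator of the operators $\hat p_x^2$ and $x$, we find

$$\hat p_x^2 x - x\hat p_x^2 = -2i\hbar\,\hat p_x.$$

We finally obtain

$$\hat{\dot x} = [\hat H, \hat x] = \frac{1}{m}\,\hat p_x. \tag{31.7}$$

We see that the velocity operator $\hat{\dot x}$ is connected with the momentum
operator $\hat p_x$ by the same relation as in classical mechanics. We find the operator
$\hat{\dot p}_x$ in the same way:

$$[\hat H, \hat p_x] = \frac{i}{\hbar}\,(U\hat p_x - \hat p_x U) = -\frac{\partial U}{\partial x}.$$

Thus we have

$$\hat{\dot p}_x = -\frac{\partial U}{\partial x}. \tag{31.8}$$

We have obtained the operator equation of motion in the form of Newton's
equation. Equations (31.7) and (31.8) can also be written for the mean
values of the corresponding quantities

$$\frac{d\bar x}{dt} = \frac{1}{m}\,\bar p_x, \qquad \frac{d\bar p_x}{dt} = -\overline{\frac{\partial U}{\partial x}}. \tag{31.9}$$

These last relations are called the Ehrenfest theorems. Expressing $\bar p_x$ in terms
of $d\bar x/dt$, we find

$$m\,\frac{d^2\bar x}{dt^2} = -\overline{\frac{\partial U}{\partial x}}. \tag{31.10}$$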


In such a form this equation is very close in appearance to Newton’s equa- 
tion of classical mechanics. 
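
As a brief worked illustration of the equation of motion for the mean coordinate (our own example, not taken from the text), consider the oscillator potential $U = kx^2/2$, where the force constant $k$ is an assumed parameter. In this case the mean of the force is the force at the mean position, so the mean coordinate obeys the classical equation exactly; for an anharmonic potential this is in general no longer so.

```latex
% Worked example (illustrative): U(x) = k x^2 / 2, k an assumed force constant
\begin{aligned}
\frac{\partial U}{\partial x} &= kx
  \quad\Longrightarrow\quad
  \overline{\frac{\partial U}{\partial x}} = k\,\bar x, \\
m\,\frac{d^2\bar x}{dt^2} &= -k\,\bar x .
\end{aligned}
% For U(x) = \beta x^4 / 4 the right-hand side is -\beta\,\overline{x^3},
% which in general differs from -\beta\,\bar x^{\,3}.
```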


§32. Constants of the motion 


Suppose the operator $\hat F$ does not depend explicitly on the time and
commutes with the Hamiltonian $\hat H$. In this case, according to (31.2), the
operator corresponding to the derivative with respect to time is equal to zero, and
from relation (31.5) it follows that the mean value of the quantity F does not
change with time

$$\frac{d\bar F}{dt} = 0. \tag{32.1}$$


The probability that, in measuring F, we shall obtain a possible value $F_n$ is
also constant in time. Indeed, this probability is given by the square of the
modulus of the coefficient of expansion $|c_n(t)|^2$ of the wave function $\psi(x, t)$
describing the state of the system at the instant of time $t$ in terms of the
eigenfunctions of the operator $\hat F$. Since, however, the operator $\hat F$ commutes
with the operator $\hat H$, both operators have the common eigenfunctions

$$\psi_n(x, t) = \psi_n(x)\exp[(-i/\hbar)\,E_n t]$$

(see §23). The expansion of $\psi(x, t)$ in terms of the eigenfunctions of
the operator $\hat F$ can be written in the form

$$\psi(x, t) = \sum_n c_n(0)\exp[(-i/\hbar)\,E_n t]\,\psi_n(x) = \sum_n c_n(t)\,\psi_n(x).$$

Consequently,

$$|c_n(t)|^2 = |c_n(0)|^2 = \text{const}.$$
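
The constancy of these probabilities is easy to see in a small numerical sketch. The energies and initial coefficients below are hypothetical values chosen only for illustration, and units with ħ = 1 are assumed.

```python
# Sketch: the expansion coefficients acquire only phase factors in time.
# The energies E_n and the initial coefficients c_n(0) are hypothetical values.
import cmath

HBAR = 1.0                       # assumed units
energies = [0.5, 1.5, 2.5]       # hypothetical E_n
c0 = [0.6, 0.64j, 0.48]          # hypothetical c_n(0); sum of |c_n|^2 equals 1


def coefficients(t):
    """c_n(t) = c_n(0) exp(-i E_n t / hbar)."""
    return [c * cmath.exp(-1j * e * t / HBAR) for c, e in zip(c0, energies)]


def probabilities(t):
    """Probabilities |c_n(t)|^2 of finding the n-th eigenvalue."""
    return [abs(c) ** 2 for c in coefficients(t)]


for t in (0.0, 1.0, 3.7):
    print(t, [round(p, 12) for p in probabilities(t)])
```

The printed probabilities should coincide at all times, although the coefficients themselves change.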


In quantum mechanics such quantities, as in classical mechanics, are usually 
called constants of the motion. From the above it is clear that a quantum- 
mechanical quantity is a constant of the motion if: (1) its operator does not 
depend explicitly on the time, (2) this operator commutes with the Hamil- 
tonian. 

Knowing the operators corresponding to different quantum-mechanical 
quantities and the Hamiltonian, one can find the conservation laws. 

Finding conservation laws in quantum mechanics is as important for the 
study of the motion of a system as in classical mechanics. As in classical 
mechanics*, the laws of conservation of momentum and angular momentum 
are closely associated with the properties of homogeneity and isotropy of 
space. Thus from the isotropy of space it follows that the Hamiltonian of a
closed system or of a system in a centrally symmetric force field must not
change when an arbitrary infinitesimal rotation is made. Mathematically this
is expressed by the fact that the Hamiltonian $\hat H$ must commute with the
rotation operator $\hat W$. But, as we know (see §30), the operator corresponding to
rotation through a small angle about a certain axis (for example the z-axis) is
related in a simple way to the operator of the component of angular momentum
along this axis. Therefore a consequence of the commutation of the
operator $\hat W_z$ with the Hamiltonian $\hat H$ is the commutation of the operator $\hat l_z$
with the Hamiltonian; hence the law of conservation of this quantity follows.
The fact that we have considered rotation only through a small angle is not
important, since a rotation through a finite angle can be resolved into a
succession of small rotations.

* See, for example, L.D.Landau and E.M.Lifshitz, Mechanics (Pergamon Press,
Oxford, 1960).

Thus we see that the conservation of angular momentum is associated 
with the isotropy of space. 

Similarly it is easily seen that momentum conservation is associated with 
the homogeneity of space. Indeed, from the homogeneity of space it follows 
that the displacement operator must not change the Hamiltonian of a closed 
system, i.e. it must commute with the Hamiltonian. However, since the dis- 
placement operator R is related to the operator of the corresponding momen- 
tum component (see §26), we arrive immediately at the momentum conser- 
vation law. 

The law of conservation of energy in a closed system or a system in a 
stationary external field can be associated with the arbitrariness of the choice 
of the zero of time (homogeneity in time). This means that the laws of 
motion of the system must not depend on the choice of the zero of time. 

We introduce the operator corresponding to the translation over a small
time interval $\delta t$, $\hat P(\delta t)$, defined by the relation

$$\hat P(\delta t)\,\psi(x, t) = \psi(x, t + \delta t). \tag{32.3}$$

Expanding the function $\psi(x, t + \delta t)$ in a series in terms of the small interval
$\delta t$ and confining ourselves to terms of the first order of small quantities, we
obtain

$$\hat P(\delta t)\,\psi(x, t) = \left(1 + \delta t\,\frac{\partial}{\partial t}\right)\psi(x, t).$$

Hence it follows that the operator $\hat P(\delta t)$ is of the form

$$\hat P(\delta t) = 1 + \delta t\,\frac{\partial}{\partial t}. \tag{32.4}$$

The requirement of the independence of the laws of motion of the system
of the choice of zero time is expressed by the commutation of the operator
$\hat P(\delta t)$ with the Hamiltonian of the system

$$\hat P(\delta t)\,\hat H = \hat H\,\hat P(\delta t). \tag{32.5}$$

Using the expression (32.4) for $\hat P(\delta t)$, we can rewrite relation (32.5) in
the form

$$\frac{\partial\hat H}{\partial t} = 0. \tag{32.6}$$

But eq. (32.6) just expresses the energy conservation law. Indeed, the
operator $\hat H$ commutes with itself, and the condition $\hat{\dot H} = 0$, denoting the energy
conservation law, amounts to (32.6).

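To the existence of a constant of the motion there corresponds a simple
property of the wave function. If the operator $\hat I$ corresponds to a certain
conserved quantity, then the Schrödinger equation will be satisfied not only
by the wave function $\psi$ but also by the wave function

$$\psi' = e^{i\alpha\hat I}\psi, \tag{32.7}$$

where $\alpha$ is an arbitrary real number. By definition

$$e^{i\alpha\hat I} = 1 + i\alpha\hat I + \frac{(i\alpha)^2}{2!}\,\hat I^2 + \ldots$$

Substituting $\psi'$ into the Schrödinger equation, we find

$$i\hbar\,\frac{\partial\psi'}{\partial t} = i\hbar\,\frac{\partial}{\partial t}\left(e^{i\alpha\hat I}\psi\right) = \hat H e^{i\alpha\hat I}\psi. \tag{32.8}$$

But, since $\hat I$, as the operator of a conserved quantity, satisfies the
conditions $\hat I\hat H - \hat H\hat I = 0$, $\partial\hat I/\partial t = 0$, we have

$$i\hbar\,\frac{\partial}{\partial t}\left(e^{i\alpha\hat I}\psi\right) = e^{i\alpha\hat I}\,i\hbar\,\frac{\partial\psi}{\partial t} = e^{i\alpha\hat I}\hat H\psi = \hat H e^{i\alpha\hat I}\psi,$$

and eq. (32.8) is satisfied directly.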


Let us consider some simple examples. We begin with the case of a free 
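particle. Then the Hamiltonian is of the form

$$\hat H = \frac{1}{2m}\,(\hat p_x^2 + \hat p_y^2 + \hat p_z^2).$$

Evidently, $[\hat H, \hat p_x] = [\hat H, \hat p_y] = [\hat H, \hat p_z] = 0$. Consequently,

$$\hat{\dot p}_x = \hat{\dot p}_y = \hat{\dot p}_z = 0. \tag{32.9}$$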










If at a certain initial instant the free particle was in a state with definite 
momentum, then this value of the momentum is conserved in time. 

As another example let us consider a particle moving in the field produced 
by an infinite uniform plane (xy-plane). The potential energy of a particle in 
such a field depends only on the distance from the plane U = U(|z|), so that 
the Hamiltonian is of the form

$$\hat H = -\frac{\hbar^2}{2m}\nabla^2 + U(|z|).$$

The operators $\hat p_x$, $\hat p_y$ and $\hat l_z$ commute with such a Hamiltonian. This means that
in the case of motion in the field of a uniform plane (xy) the components of
the momentum of the particle, $p_x$ and $p_y$, and the z-component of the angular
momentum, $l_z$, are conserved.


§33. Parity 


The conservation laws considered above — the laws of conservation of 
energy, momentum and angular momentum — are the quantum-mechanical 
analogues of the conservation laws of classical mechanics. It turns out that in 
quantum mechanics there are also conservation laws which have no classical 
analogue. One such law is closely associated with the properties of space and 
is of a very general character. Namely, the Hamiltonian of a closed system 
must not change under the following transformations of the coordinates: 
(1) translation of the origin by an arbitrary segment; 

(2) rotation through an arbitrary angle; 
(3) inversion, i.e. the substitution $x_i \to -x_i$ in which the signs of all
coordinates are changed.

As we have seen in the preceding section, the first two transformations are 
associated with the laws of conservation of momentum and angular momen- 
tum. In quantum mechanics it turns out that inversion is associated with still 
another general conservation law. As for the translation and rotation
operators, which have been introduced earlier, one can also introduce the
corresponding inversion operator $\hat I$

$$\hat I\psi(\mathbf r, t) = a\,\psi(-\mathbf r, t), \tag{33.1}$$

where $a$ is a constant.
When the inversion operator $\hat I$ is applied twice we arrive at the initial state.
Hence it follows that $a^2 = 1$, i.e. $a = \pm 1$. Thus, in general, the following is
fulfilled:

$$\hat I\psi(\mathbf r, t) = \pm\,\psi(-\mathbf r, t), \tag{33.2}$$


i.e. the wave function itself, and not only the argument $\mathbf r$ on which it depends,
can change sign directly under inversion. Whether the transformation of the
wave function under inversion will have $a = +1$ or $a = -1$ depends on the
intrinsic properties of the particles described by this wave function.

The particles which are described by wave functions satisfying the
condition

$$\hat I\psi(\mathbf r, t) = \psi(-\mathbf r, t)$$

are said to possess even intrinsic parity. On the contrary, particles which are
described by wave functions satisfying the condition

$$\hat I\psi(\mathbf r, t) = -\psi(-\mathbf r, t)$$

have odd intrinsic parity.
We assume that the Hamiltonian of a closed system has the form

$$\hat H = \sum_i\left(-\frac{\hbar^2}{2m_i}\nabla_i^2\right) + \frac{1}{2}\sum_{i\ne k}U(|\mathbf r_i - \mathbf r_k|).$$

It is easily seen that this Hamiltonian does not change under the substitution
$\mathbf r_i \to -\mathbf r_i$, i.e. it satisfies the condition $\hat I\hat H\psi = \hat H\hat I\psi$. This means that the
operator $\hat I$ commutes with the Hamiltonian

$$\hat I\hat H = \hat H\hat I. \tag{33.3}$$

We determine the eigenvalues $\lambda$ of the inversion operator

$$\hat I\psi_\lambda(x) = \lambda\,\psi_\lambda(x). \tag{33.4}$$

We apply the inversion operator to this equation once more. Since under the
two-fold inversion we come back to the initial value of the coordinates, this
transformation is simply

$$\hat I^2\psi_\lambda = \psi_\lambda = \lambda\,\hat I\psi_\lambda = \lambda^2\psi_\lambda. \tag{33.5}$$

Whence we find that the eigenvalues $\lambda$ are equal to $\pm 1$. A state with $\lambda = +1$ is
said to have even parity or to be even. On the other hand, a state with $\lambda = -1$
has odd parity or is odd. If the parity operator commutes with the Hamiltonian
operator, then the parity conservation law holds. The parity conservation
law, like other conservation laws, imposes definite restrictions upon
possible changes of the states of a system. Namely, if the system was in an 







even state, then it will remain in such a state, and not pass over into an odd 
state. Naturally, the situation is analogous in the case of a system in an odd 


state. 
Let us determine the parity of the state of a particle with angular
momentum $l$. The fact that the angular momentum and parity can be determined
simultaneously follows from the commutation of the corresponding operators:

$$\hat I\hat l_x - \hat l_x\hat I = 0; \quad \hat I\hat l_y - \hat l_y\hat I = 0; \quad \hat I\hat l_z - \hat l_z\hat I = 0; \quad \hat I\hat{\mathbf l}^2 - \hat{\mathbf l}^2\hat I = 0. \tag{33.6}$$

From the expressions for the angular momentum operators $\hat l_x$, $\hat l_y$, $\hat l_z$ it is clear
that they do not change under inversion. In the spherical system of
coordinates the inversion has the form

$$r \to r; \qquad \vartheta \to \pi - \vartheta; \qquad \varphi \to \varphi + \pi. \tag{33.7}$$

The dependence of the wave function of a particle with definite angular
momentum $l$ on the angles $\vartheta$, $\varphi$ is given by the spherical function $Y_{lm}(\vartheta, \varphi)$
(see §30). Under inversion (33.7) we have $\cos\vartheta \to -\cos\vartheta$ and
$e^{im\varphi} \to (-1)^m e^{im\varphi}$. From formula (30.17) it is easily established how the
associated Legendre polynomial $P_l^m(\xi)$ is transformed under a change of sign of
its argument. Since $P_l(-\xi) = (-1)^l P_l(\xi)$, we obtain that
$P_l^m(-\xi) = (-1)^{l+m} P_l^m(\xi)$. Taking into account the factor $(-1)^m$ which is given by
the function $e^{im\varphi}$, we find that under inversion the wave function on the
whole is multiplied by the factor $(-1)^l$. Taking into account also the factor
$a = \pm 1$ associated with the intrinsic properties of particles, we get

$$\lambda = (-1)^l\,a. \tag{33.8}$$


Thus the states with even $l$ have even parity if $a = 1$, and odd parity if $a = -1$.
The states with odd $l$ have, correspondingly, odd parity if $a = 1$, and even
parity if $a = -1$. If we have a system of non-interacting particles, then the
parity of the system is determined by the product of the parities of the
individual particles. Indeed, in §14 we have seen that the wave function of a
system of non-interacting particles can be written in the form of the product
of the wave functions of the individual particles. Hence it follows immediately
that under inversion the parities of individual particles are multiplied. If
each of the particles is in a state with definite angular momentum (motion in
a central field), then the parity of the entire system can be written in the
form

$$\lambda = (-1)^{\sum_k l_k}\prod_k a_k, \tag{33.9}$$

where the second factor is determined by the product of the intrinsic parities
of the particles.
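
The rule for the parity of a system of non-interacting particles, each with definite orbital angular momentum, can be written as a short function. This is a sketch; the function name and the input format (two lists of equal length) are our own choices.

```python
# Sketch: parity of a system of non-interacting particles in a central field.
def system_parity(orbital_l, intrinsic_a):
    """Return (-1)**sum(l_k) times the product of the intrinsic parities a_k."""
    if len(orbital_l) != len(intrinsic_a):
        raise ValueError("one intrinsic parity is needed per particle")
    parity = 1
    for l, a in zip(orbital_l, intrinsic_a):
        if a not in (1, -1):
            raise ValueError("intrinsic parity must be +1 or -1")
        parity *= a * (-1 if l % 2 else 1)
    return parity


# Two particles with even intrinsic parity, one in an s-state and one in a p-state.
print(system_parity([0, 1], [1, 1]))   # -1: the state is odd
# A single particle with odd intrinsic parity in a p-state.
print(system_parity([1], [-1]))        # 1: the state is even
```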

In addition to other conservation laws the parity conservation law is one 
of the most general laws of nature. The impossibility of transitions of a closed 
quantum-mechanical system from states with one parity into states with 
another parity — so-called forbidden transitions — is confirmed by a vast 
amount of experimental data in atomic as well as nuclear physics. However, 
it has been established (see §122) that the parity conservation law is not a 
universal physical law. The parity conservation law is violated in certain 
processes involving elementary particles. 


§ 34. The uncertainty relation for time and energy 


The relation between the uncertainty in the energy $\Delta E$ and a time interval
$\Delta t$ can be derived from the general apparatus of quantum mechanics, as was
shown by Mandelshtam and Tamm*. Indeed, the total energy of a closed
system can have no definite values which are constant in time. As we have
explained in §32, its mean value and the probabilities of observing one or
other possible value are constant in time. In other words, the form of the
energy distribution function is conserved in time.

Knowing the distribution function one can define in the usual way the value
of the root-mean-square deviation of the energy $\Delta E$, which is naturally also
conserved in time. The energy will have a definite value ($\Delta E = 0$) only if
the system is in a stationary state. A characteristic indication of a stationary
state is the constancy in time of the physical quantities of a given system.

Let us assume that a closed system is in a state with indefinite energy $E$ at
the initial instant of time. Further, let R be a quantity whose operator $\hat R$ does
not depend explicitly on time. For the given quantity one can, in the usual
way, define its root-mean-square deviation $\Delta R$ and the mean value $\bar R$. Making
use of (24.5) and (31.5), we write the relations

$$\Delta E\,\Delta R \ge \tfrac{1}{2}\left|\overline{(\hat H\hat R - \hat R\hat H)}\right|, \tag{34.1}$$

$$\hbar\,\frac{d\bar R}{dt} = i\,\overline{(\hat H\hat R - \hat R\hat H)}. \tag{34.2}$$

* L.I.Mandelshtam and I.E.Tamm, Izvestiya Akad. Nauk SSSR, physical series, 9 
(1945) 122. 







Substituting (34.2) into (34.1), we obtain correspondingly

$$\Delta E\,\Delta R \ge \frac{\hbar}{2}\left|\frac{d\bar R}{dt}\right|. \tag{34.3}$$

This relation connects the uncertainty in the energy $\Delta E$, the uncertainty $\Delta R$
in the value of R, and the rate of change of the mean value of the quantity R.
Relation (34.3) can be rewritten in a somewhat more convenient form if one
introduces the interval $\Delta t$, the time for which the mean value of R changes
by an amount of the order of magnitude of its root-mean-square deviation
$\Delta R$

$$\Delta t = \frac{\Delta R}{|d\bar R/dt|}. \tag{34.4}$$

Then we have

$$\Delta E\,\Delta t \ge \tfrac{1}{2}\hbar. \tag{34.5}$$


In particular, it follows from (34.3) that for the value of R to change with 
time, R must possess a dispersion different from zero. 

Thus we see that there is a definite relation between the dispersion of the 
total energy of the system and the rate of change of arbitrary quantities 
characterizing the system under consideration. 

As a simple example let us consider a one-dimensional wave packet. We
take the coordinate x as the quantity R, R = x. Then $\Delta R$ is the width of the
packet, and $\Delta t$ is the time of flight of the packet past a certain point of
space. The relation (34.5) shows that the transit time essentially depends on
the dispersion of the total energy $\Delta E$.

From the inequality (34.5) then also follows a definite relation between
the lifetime of a given state and the uncertainty in the energy, $\Delta E$, of this
state. Thus, assuming $\Delta t$ to be equal to $\tau$, the half-life, we obtain that in order
of magnitude

$$\tau \approx \frac{\hbar}{\Gamma}, \tag{34.6}$$

where $\Gamma$ is the uncertainty in the energy of the initial state and gives the
width of the corresponding spectral line. The problem of the connection of
the decay law with the energy distribution function is considered in more
detail in a work of Krylov and Fok*. In this work it is also shown that
relation (34.5) cannot be applied to the measurement processes, because it is
derived by making use of the Schrödinger equation. This follows, for example,
from the fact that a given object during the process of measurement is no
longer a closed quantum-mechanical system.

* I.S.Krylov and V.A.Fok, J. Exp. Theor. Phys. (USSR), 17 (1947) 93.
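
To give a feeling for the orders of magnitude in the relation between the lifetime of a state and the width of the level, here is a small sketch. The lifetime of 10^-8 s is an assumed illustrative figure, not a value from the text, and the result is an order-of-magnitude estimate only.

```python
# Sketch: order-of-magnitude level width from the lifetime, width ~ hbar / tau.
HBAR_EV_S = 6.582119569e-16  # reduced Planck constant in eV*s


def level_width_ev(lifetime_s):
    """Order-of-magnitude width (in eV) of a level with the given lifetime (in s)."""
    if lifetime_s <= 0:
        raise ValueError("the lifetime must be positive")
    return HBAR_EV_S / lifetime_s


tau = 1e-8  # assumed lifetime in seconds, for illustration only
print("width for tau = 1e-8 s:", level_width_ev(tau), "eV")
```

The shorter the lifetime, the broader the level: reducing the lifetime by a factor of ten increases the estimated width by the same factor.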

For the measurement processes the corresponding inequality must be
formulated in the form of a certain physical principle (stated by Bohr)

$$\Delta(E - E')\,\Delta t \ge \hbar, \tag{34.7}$$

where $E$ and $E'$ are the values of the energy of the object before and after the
measurement process, and $\Delta(E - E')$ is the absolute value of the uncertainty
in the measurement of the energy of the object, i.e. the corresponding error
of the measurement if it was carried out during a time $\Delta t$.

Relation (34.7) is very important in the analysis of the results of
measurements, i.e. for an experimental check of the results given by quantum
mechanics. We shall illustrate it by the simple example of a free particle. For the
measurement of the quantities $E$, $p$, $v$ (energy, momentum, velocity) of the
particle it is necessary to consider the collision of this particle with another
system (apparatus). Assuming for simplicity that the motion is one-dimensional,
we write the momentum conservation law

$$p + k - p' - k' = 0. \tag{34.8}$$


Here we denote by k and k’ the momentum of the apparatus before and after 
the collision. We shall denote by primes the quantities referring to the 
systems after the collision. It can be assumed that the momentum of the 
apparatus before and after the collision is accurately measured. Then from 
relation (34.8) there follows the equality of the errors in the measurement of 
the momentum of the particle before and after the collision 


$$\Delta p = \Delta p'. \tag{34.9}$$


The error in the measurement of the energy can be expressed in terms of the
error in the measurement of the momentum, since

$$\Delta E = \frac{dE}{dp}\,\Delta p = v\,\Delta p, \qquad \Delta E' = \frac{dE'}{dp'}\,\Delta p' = v'\,\Delta p'.$$

In view of the equality of the errors $\Delta p$ and $\Delta p'$, we have

$$\Delta(E - E') = |v - v'|\,\Delta p. \tag{34.10}$$

We multiply (34.10) by the time of measurement $\Delta t$. We then obtain

$$\Delta(E - E')\,\Delta t = |v - v'|\,\Delta p\,\Delta t. \tag{34.11}$$




But the quantity $|v - v'|\,\Delta t$ represents an additional error in the coordinate,
which appeared during the time of measurement $\Delta t$. The total uncertainty in
the coordinate $\Delta x$ can be written in the form

$$\Delta x = (\Delta x)_0 + |v - v'|\,\Delta t,$$

where $(\Delta x)_0$ is the uncertainty in the coordinate of the particle which existed
before the collision being considered. In particular, $(\Delta x)_0$ can be made
arbitrarily small. The fact that the value of $(\Delta p)_0$ will then be large is of no
importance, because $(\Delta p)_0$ is in no way connected with the error $\Delta p$
considered.

Heisenberg's uncertainty relation $\Delta p\,\Delta x \ge \hbar$ must be fulfilled irrespective
of the value of $(\Delta x)_0$. Consequently

$$|v - v'|\,\Delta t\,\Delta p \ge \hbar. \tag{34.12}$$

Comparing with (34.11), we arrive at the inequality

$$\Delta(E - E')\,\Delta t \ge \hbar \tag{34.13}$$

in accordance with (34.7). We see that the error in the measurement of the
energy tends to zero provided the measurement process lasts a sufficiently
long time (in the limit $\Delta t \to \infty$).

We note in addition that, as follows from (34.12), the measurement of the
momentum for a given value of the error $\Delta p$ leads to a change in the velocity
of the particle,

$$|v - v'| > \frac{\hbar}{\Delta p\,\Delta t},$$

and, consequently, to a change in the momentum. Only if the measurement
is carried out during an infinitely long time ($\Delta t \to \infty$) does the momentum not
change. Of course, a measurement of the momentum over a long period of
time can make sense only if the particle is free. Thus we see that the process
of measurement of the momentum in small time intervals is irreproducible.
The measurement brings the micro-object into a completely new state
(see §5).
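To get a feeling for the orders of magnitude involved in the relation between the energy error and the duration of the measurement, one may evaluate the bound $\hbar/\Delta t$ numerically. The following is only an illustrative sketch: the function name and the sample durations are our own assumptions, and the value of $\hbar$ in eV s is the standard tabulated one.

```python
# Order-of-magnitude illustration of Delta(E - E') * Delta t > hbar.
# The function name and the sample durations are illustrative assumptions.
HBAR_EV_S = 6.582119569e-16  # reduced Planck constant in eV*s


def min_energy_error_ev(measurement_time_s: float) -> float:
    """Lower bound hbar / Delta t (in eV) for a measurement lasting Delta t seconds."""
    if measurement_time_s <= 0:
        raise ValueError("measurement time must be positive")
    return HBAR_EV_S / measurement_time_s


if __name__ == "__main__":
    for dt in (1e-15, 1e-12, 1e-9, 1e-6):
        print(f"dt = {dt:.0e} s  ->  energy error > {min_energy_error_ev(dt):.3e} eV")
```

The bound falls off as $1/\Delta t$, which is the numerical counterpart of the statement that the energy error tends to zero for a sufficiently long measurement.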











Motion in a 
Centrally Symmetric Field 


§35. The Schrödinger equation


We can now apply the mathematical apparatus of quantum mechanics,
developed in the preceding chapter, to the study of the properties of real
systems. It is natural to consider, first of all, the hydrogen atom, the simplest
atomic system. In the hydrogen atom the potential energy of interaction of
the electron with the nucleus depends only on the distance between them,
$|\mathbf{r}_1 - \mathbf{r}_2|$. The problem of the motion of two particles with the interaction law
$U(|\mathbf{r}_1 - \mathbf{r}_2|)$ amounts, as we have explained in §14, to the problem of the
motion of one particle with reduced mass $\mu$ in a field $U(r)$. In view of the
large difference in the masses, the reduced mass $\mu$ is very close to the mass of
the electron. If also the size of the proton is neglected, then the hydrogen
atom represents an electron moving in the Coulomb field of a motionless
centre. Such a field is a particular case of a centrally symmetric field in which
the potential energy depends only on the distance from the force centre. We
shall first consider the motion of an electron in a centrally symmetric field of
the most general form, after which we shall pass over to the case of the
Coulomb field.
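How close the reduced mass is to the electron mass can be checked with a short calculation. This is a sketch only: the masses are the standard tabulated values (not taken from the text) and the function name is our own.

```python
# Reduced mass of the electron-proton system compared with the electron mass.
# Mass values are standard tabulated ones; the names are illustrative.
M_ELECTRON_KG = 9.1093837e-31
M_PROTON_KG = 1.67262192e-27


def reduced_mass(m1: float, m2: float) -> float:
    """Reduced mass mu = m1 * m2 / (m1 + m2) of a two-body system."""
    return m1 * m2 / (m1 + m2)


mu = reduced_mass(M_ELECTRON_KG, M_PROTON_KG)
ratio = mu / M_ELECTRON_KG  # differs from unity by roughly m_e / m_p

if __name__ == "__main__":
    print(f"mu / m_e = {ratio:.6f}")
```

The ratio differs from unity only in the fourth decimal place, which is why the nucleus may be treated as a motionless centre to a good approximation.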

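The Schrödinger equation for the stationary states of a particle moving in
a force field with potential energy $U(r)$ has the form

$$\nabla^2\psi + \frac{2\mu}{\hbar^2}\,[E - U(r)]\,\psi = 0. \tag{35.1}$$

In the case of a centrally symmetric potential field it is convenient to trans-
form the Schrödinger equation to spherical coordinates, since the potential
energy depends only on the distance from the origin r. Expressing the
Laplacian operator in spherical coordinates, we have

$$\frac{1}{r^2}\frac{\partial}{\partial r}\left(r^2\frac{\partial\psi}{\partial r}\right) + \frac{1}{r^2\sin\vartheta}\frac{\partial}{\partial\vartheta}\left(\sin\vartheta\,\frac{\partial\psi}{\partial\vartheta}\right) + \frac{1}{r^2\sin^2\vartheta}\frac{\partial^2\psi}{\partial\varphi^2} + \frac{2\mu}{\hbar^2}\,[E - U(r)]\,\psi = 0. \tag{35.2}$$

This equation is conveniently transformed by introducing into it explicitly
the operator of the square of the angular momentum $\hat{l}^2$. Substituting its value
according to formula (30.9), we have

$$\frac{1}{r^2}\frac{\partial}{\partial r}\left(r^2\frac{\partial\psi}{\partial r}\right) - \frac{\hat{l}^2}{\hbar^2 r^2}\,\psi + \frac{2\mu}{\hbar^2}\,[E - U(r)]\,\psi = 0. \tag{35.3}$$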

We shall first of all show that in the case of motion in a centrally symmet-
ric field two more conservation laws are satisfied, in addition to the energy
conservation law: the total angular momentum conservation law and the law
of conservation of the z-component of the angular momentum, where the
z-axis is arbitrarily oriented in space. When we speak here of the conservation
of total angular momentum we mean the quantity described by the operator
$\hat{l}^2$ (the square of the angular momentum). For this, according to general rules,
we consider the conditions for the commutation of the operators $\hat{l}_z$ and $\hat{l}^2$
with the Hamiltonian. It is obvious that in our case the Hamiltonian $\hat{H}$ can be
written in the form

$$\hat{H} = -\frac{\hbar^2}{2\mu}\,\frac{1}{r^2}\frac{\partial}{\partial r}\left(r^2\frac{\partial}{\partial r}\right) + \frac{\hat{l}^2}{2\mu r^2} + U(r). \tag{35.4}$$

The operator $\hat{l}^2$ involves only the angular variables $\vartheta$, $\varphi$ and the differential
operators with respect to these variables. Hence the operator $\hat{l}^2$ commutes
with any operator of differentiation with respect to r, as well as with the
operator of the coordinate r itself:

$$\hat{H}\hat{l}^2 - \hat{l}^2\hat{H} = 0. \tag{35.5}$$

An analogous relation also holds for the operator $\hat{l}_z$ in view of the fact
that, as we have seen in §30, it commutes with the operator $\hat{l}^2$ (30.4):

$$\hat{H}\hat{l}_z - \hat{l}_z\hat{H} = 0. \tag{35.6}$$







Since, in motion in a centrally symmetric field, three quantities are
conserved (the energy, the square of the angular momentum $l^2$, and the
projection $l_z$ of the angular momentum onto an arbitrary axis), we shall
consider states with given values of the three quantities.

It should be noted that in motion in a centrally symmetric field the laws
of conservation of energy, of total angular momentum and of the
z-component of angular momentum also hold in classical mechanics.

We have considered previously the states of a system with given values of
the total angular momentum and its projection onto the z-axis. The
eigenvalues of the operators $\hat{l}^2$ and $\hat{l}_z$ were characterized by the azimuthal and
magnetic quantum numbers $l$ and $m$, while the spherical functions $Y_{lm}(\vartheta, \varphi)$
with the indices $l$, $m$ were the eigenfunctions of these operators.

Equation (35.3) allows one to separate the variables. Its angular part is the
same as eq. (30.14). It describes the motion with given values $l$ and $m$. Hence
it is natural to seek the solution of (35.3) in the form

$$\psi(r, \vartheta, \varphi) = R(r)\,Y_{lm}(\vartheta, \varphi). \tag{35.7}$$

Substituting expression (35.7) into eq. (35.3) and taking into account that
$\hat{l}^2 Y_{lm} = \hbar^2 l(l + 1)\,Y_{lm}$, we arrive at the following equation for the radial part
of the wave function $R(r)$:

$$\frac{1}{r^2}\frac{d}{dr}\left(r^2\frac{dR}{dr}\right) + \frac{2\mu}{\hbar^2}\left[E - U(r) - \frac{\hbar^2 l(l+1)}{2\mu r^2}\right]R = 0. \tag{35.8}$$





We see that the expression for the radial component R of the wave func-
tion $\psi$ depends essentially on the form of the potential energy $U(r)$. At the
same time the angular part $Y_{lm}(\vartheta, \varphi)$ of the wave function is determined only
by the value of the angular momentum of the particle (the number $l$) and its
z-component (the number $m$). States with a given angular momentum are
denoted by small letters:

    l =  0   1   2   3   4   5   ...
         s   p   d   f   g   h   ...


Also the parity of the state is determined by the value of the quantum
number $l$. In §33 we have shown that in a state with given total angular
momentum and given z-component of angular momentum, the parity is
equal to $(-1)^l$, i.e. under inversion the spherical function $Y_{lm}$ goes over into
$(-1)^l Y_{lm}$. Since the radial wave function, which depends on the absolute
value of the radius vector, does not change under inversion, the transforma-
tion law mentioned also refers to the total wave function

$$\psi(r, \vartheta, \varphi) \to (-1)^l\,\psi(r, \vartheta, \varphi).$$

Thus the states s, d, g, ... are even, while p, f, h, ... are odd (for even intrinsic
parity).
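For the special case $m = 0$ the parity rule can be checked directly, since $Y_{l0}$ is proportional to the Legendre polynomial $P_l(\cos\vartheta)$ and inversion replaces $\cos\vartheta$ by $-\cos\vartheta$. The sketch below is an illustration under that restriction (it does not treat $m \neq 0$); the function names are our own, and $P_l$ is generated by the standard three-term recurrence.

```python
# Check of P_l(-x) = (-1)^l P_l(x), i.e. the parity (-1)^l for the m = 0 states.
# Function names are illustrative; only the m = 0 case is covered.

def legendre(l: int, x: float) -> float:
    """Legendre polynomial P_l(x) from (n+1) P_{n+1} = (2n+1) x P_n - n P_{n-1}."""
    p_prev, p = 1.0, x  # P_0 and P_1
    if l == 0:
        return p_prev
    for n in range(1, l):
        p_prev, p = p, ((2 * n + 1) * x * p - n * p_prev) / (n + 1)
    return p


def has_parity_minus_one_to_l(l: int, tol: float = 1e-12) -> bool:
    """True if P_l(-x) equals (-1)^l P_l(x) on a grid of x = cos(theta) values."""
    xs = [i / 20 for i in range(-20, 21)]
    return all(abs(legendre(l, -x) - (-1) ** l * legendre(l, x)) <= tol for x in xs)


if __name__ == "__main__":
    for l, letter in enumerate("spdfgh"):
        print(letter, "even" if l % 2 == 0 else "odd", has_parity_minus_one_to_l(l))
```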
The probability that an electron in the state $\psi(r, \vartheta, \varphi) = R(r)\,Y_{lm}(\vartheta, \varphi)$
will be observed in an infinitesimal volume element with coordinates $r$, $\vartheta$, $\varphi$
is given by the formula

$$dW(r, \vartheta, \varphi) = |\psi(r, \vartheta, \varphi)|^2\,r^2\,dr\,d\Omega, \tag{35.9}$$

where $d\Omega = \sin\vartheta\,d\vartheta\,d\varphi$. If this expression is integrated with respect to all
values of the angles $\vartheta$, $\varphi$, then we shall obtain the probability of observing the
electron in a spherical layer between $r$ and $r + dr$:

$$dW(r) = |R(r)|^2\,r^2\,dr. \tag{35.10}$$

Integrating (35.9) with respect to all values of the radius r from 0 to $\infty$, we
find the probability $dW(\vartheta, \varphi)$ of observing the electron in the solid angle $d\Omega$
in the direction defined by the angles $\vartheta$, $\varphi$:

$$dW(\vartheta, \varphi) = |Y_{lm}|^2\,d\Omega. \tag{35.11}$$

It follows from the definition of the spherical function (30.16) that the
last expression does not depend on the angle $\varphi$. This means that in the plane
perpendicular to the z-axis the distribution of the probability of finding the
particle is completely symmetric. It should be noted that we understand the
z-axis to be an arbitrarily chosen direction in space; the projection of the
angular momentum onto this direction is fixed. Thus it follows from (35.11)
that

$$dW_{lm} \sim |P_l^m(\cos\vartheta)|^2\,d\Omega. \tag{35.12}$$


The probability distribution (35.12) is determined by the two quantum
numbers $l$ and $m$, i.e. it depends on the value of the total angular momentum
and its projection on the z-axis.

The state with $l = 0$ (s-state) possesses spherical symmetry, because for
$l = 0$ (consequently also $m = 0$) $P_0^0 = \text{const}$:

$$dW_{00} = \frac{1}{4\pi}\,d\Omega. \tag{35.13}$$

In the p-state ($l = 1$) the probability distribution is given by the following
expressions:

$$dW_{1,\pm 1} = \frac{3}{8\pi}\sin^2\vartheta\,d\Omega, \qquad dW_{10} = \frac{3}{4\pi}\cos^2\vartheta\,d\Omega. \tag{35.14}$$
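That the distributions (35.13) and (35.14) are each normalized to unity over the full solid angle can be confirmed by direct numerical integration. The sketch below uses a simple midpoint rule in $\vartheta$; the function names and the number of integration steps are our own choices, and only $l = 0$ and $l = 1$ are covered.

```python
# Numerical check that the s- and p-state angular distributions integrate to 1.
# Function names and the step count are illustrative assumptions.
import math


def dw_domega(l: int, m: int, theta: float) -> float:
    """Angular probability density dW_lm/dOmega for l = 0 and l = 1."""
    if (l, m) == (0, 0):
        return 1.0 / (4.0 * math.pi)
    if l == 1 and abs(m) == 1:
        return 3.0 / (8.0 * math.pi) * math.sin(theta) ** 2
    if (l, m) == (1, 0):
        return 3.0 / (4.0 * math.pi) * math.cos(theta) ** 2
    raise ValueError("only l = 0 and l = 1 are tabulated in this sketch")


def total_probability(l: int, m: int, steps: int = 2000) -> float:
    """Integrate dW_lm over the sphere; the phi integral contributes 2*pi."""
    h = math.pi / steps
    total = 0.0
    for i in range(steps):
        theta = (i + 0.5) * h  # midpoint of the i-th theta interval
        total += dw_domega(l, m, theta) * math.sin(theta) * h
    return 2.0 * math.pi * total


if __name__ == "__main__":
    for l, m in ((0, 0), (1, -1), (1, 0), (1, 1)):
        print(l, m, round(total_probability(l, m), 6))
```

Summing the three p-state densities over $m$ gives the constant $3/4\pi$, so that a shell filled equally in all $m$ is again spherically symmetric.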


The distributions (35.12) for different $l$ and $m$ are presented graphically in
fig. V.9 in the form of polar diagrams. The probability $dW_{lm}/d\Omega$ is plotted on
the radius vector drawn at the angle $\vartheta$ to the z-axis.

Let us consider eq. (35.8) for the radial component of the wave function 
in more detail. First of all, it follows from this equation that the energy of 
the particle does not depend on the z-component of the angular momentum. 
This is associated with the fact that in a spherically symmetric field all direc- 
tions are equivalent. Thus the isotropy of space leads to degeneracy of the 
levels of the system in which the energy does not depend on the quantum 
number m. It should be noted that degeneracy is always due to definite 
symmetry properties of the system considered. 

Instead of the function R it is convenient to introduce the function $\chi(r)$:

$$R(r) = \frac{\chi(r)}{r}. \tag{35.15}$$

For $\chi(r)$ we find

$$\frac{d^2\chi}{dr^2} + \frac{2\mu}{\hbar^2}\left[E - U(r) - \frac{\hbar^2 l(l+1)}{2\mu r^2}\right]\chi = 0. \tag{35.16}$$

The condition of the finiteness of the wave function for $r = 0$ leads to the
requirement

$$\chi(0) = 0. \tag{35.17}$$

The equation for the radial function (35.8) amounts to the equation of a
one-dimensional motion with an effective potential energy equal to

$$U_{\text{eff}}(r) = U(r) + \frac{\hbar^2 l(l+1)}{2\mu r^2}. \tag{35.18}$$

As in classical mechanics, the quantity $\hbar^2 l(l + 1)/2\mu r^2$ is called the centrifugal
energy.
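As an illustration of (35.18), one may take the attractive Coulomb potential $U = -Ze^2/r$ and locate the minimum of the effective potential energy numerically. The sketch assumes units in which $\hbar = \mu = e = 1$, so that $U_{\text{eff}} = -Z/r + l(l+1)/2r^2$; the function names, the search interval and the ternary-search routine are our own choices.

```python
# Minimum of the effective potential U_eff = -Z/r + l(l+1)/(2 r^2)
# in units hbar = mu = e = 1 (an assumption of this sketch).

def effective_potential(r: float, l: int, z: float = 1.0) -> float:
    """Coulomb attraction plus centrifugal energy."""
    return -z / r + l * (l + 1) / (2.0 * r * r)


def argmin_unimodal(f, lo: float, hi: float, iterations: int = 200) -> float:
    """Ternary search for the minimum of a function with a single minimum on [lo, hi]."""
    for _ in range(iterations):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if f(m1) < f(m2):
            hi = m2
        else:
            lo = m1
    return 0.5 * (lo + hi)


def centrifugal_minimum(l: int, z: float = 1.0):
    """Return (r_min, U_eff(r_min)); defined only for l >= 1."""
    if l < 1:
        raise ValueError("for l = 0 there is no centrifugal energy and no minimum")
    r_min = argmin_unimodal(lambda r: effective_potential(r, l, z), 1e-3, 1e3)
    return r_min, effective_potential(r_min, l, z)


if __name__ == "__main__":
    for l in (1, 2, 3):
        print(l, centrifugal_minimum(l))
```

The numerical minimum can be compared with the elementary result $r_{\min} = l(l+1)/Z$, $U_{\text{eff}}(r_{\min}) = -Z^2/2l(l+1)$, which follows from setting the derivative of $U_{\text{eff}}$ equal to zero.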

Without fixing the detailed form of the potential energy U(r), one can 
nevertheless make definite conclusions about the behaviour of the wave func- 
tion near the origin and at very large distances from the force centre. 

[Fig. V.9: polar diagrams of the angular probability distributions for s-, p-, d- and f-electrons, one panel for each value of m.]

Let us first study the region of small distances, $r \to 0$. We assume that near
the origin the potential energy of interaction $U(r)$ changes so slowly that the
following condition holds:

$$\lim_{r \to 0} r^2 U(r) = 0. \tag{35.19}$$

This condition means that $|U(r)|$ for $r \to 0$ increases more slowly than $1/r^2$. It
is fulfilled, in particular, for the electron in the Coulomb field of the nucleus.
Then in eq. (35.16) for $r \to 0$ the terms $E\chi$ and $U(r)\chi$ can be disregarded in
comparison with the term $\hbar^2 l(l + 1)\chi/2\mu r^2$, and we obtain

$$\frac{d^2\chi}{dr^2} - \frac{l(l+1)}{r^2}\,\chi = 0.$$

We seek the solution of the last equation in the form $\chi = A r^{\gamma}$. Substituting
this expression into the equation, we have

$$\gamma(\gamma - 1) = l(l + 1). \tag{35.20}$$

Equation (35.20) has two roots: $\gamma_1 = l + 1$; $\gamma_2 = -l$. We must discard the
second root, since it corresponds to a function R which increases indefinitely
for $r \to 0$. Thus we find that at small distances $\chi(r) \sim r^{l+1}$, and the radial part
of the wave function is expressed by the formula

$$R(r) = A r^l. \tag{35.21}$$

The probability of finding the particle at a given distance r from the
centre, independent of the angles $\vartheta$ and $\varphi$, is given by the square of the
modulus of the radial function, i.e. by the quantity $|R|^2 r^2\,dr$.

It follows from (35.21) that for small r this probability is proportional to
$r^{2l+2}\,dr$ and is smaller the larger $l$ becomes. The centrifugal force acts as if to
throw the particle out from the centre.

Further, we study the asymptotic behaviour of the wave function at large
distances from the origin. At such distances the force acting on the particle
tends to zero, and consequently, the potential energy $U(r)$ tends to a
constant. If not specified otherwise, we shall choose this constant as the
zero of potential energy, i.e. we shall assume that $\lim_{r \to \infty} U(r) = 0$. Then
in eq. (35.16), for large r, the terms $U\chi$ and $\hbar^2 l(l + 1)\chi/2\mu r^2$ can be disre-
garded* in comparison with the term $E\chi$. In this case eq. (35.16) assumes the
form

$$\frac{d^2\chi}{dr^2} + k^2\chi = 0, \qquad k^2 = \frac{2\mu E}{\hbar^2}. \tag{35.22}$$

The solution of this last equation can obviously be written as

$$\chi = A_1 e^{ikr} + A_2 e^{-ikr}, \tag{35.23}$$

where $A_1$ and $A_2$ are integration constants.

* From a more detailed analysis it follows that this is legitimate if the potential
energy at infinity decreases according to the law $1/r^n$, where $n > 1$. See, for example,
L.D. Landau and E.M. Lifshitz, Quantum mechanics (Pergamon Press, Oxford, 1965);
V.A. Fok, Nachala kvantovoi mekhaniki (Principles of quantum mechanics) (KUBUCH, 1932)
p. 126.
Let us consider, first of all, the solutions corresponding to positive values
of the energy. For $E > 0$ the quantity k defined by formula (35.22) is real.
The radial part of the wave function (35.15) amounts to the sum of the two
functions

$$R(r) = A_1\,\frac{e^{ikr}}{r} + A_2\,\frac{e^{-ikr}}{r}. \tag{35.24}$$

Since both terms are bounded in modulus, neither of the constants $A_1$ and
$A_2$ need be set equal to zero. At a large distance from the force centre the radial
function represents the superposition of a converging and a diverging spheri-
cal wave.

A definite conclusion can also be made about the energy spectrum of a
particle for an arbitrary form of the energy of interaction $U(r)$. Indeed, the
function (35.23) does not reduce to zero at infinity, which corresponds to an
infinite motion, i.e. a motion in which the particle or the system goes off to
infinity. The integral of the square of the modulus of function (35.24), taken
over all space, diverges. But, as we have noted in §16, such functions
correspond to a continuous spectrum. Consequently, for $E > 0$ the energy
spectrum is continuous. If the radial component of the current density is
equal to zero, then function (35.24) must be real. Correspondingly we assume
that

$$A_1 = \frac{A'}{2i}\,e^{i\alpha}, \qquad A_2 = -\frac{A'}{2i}\,e^{-i\alpha}, \tag{35.25}$$

$A'$ and $\alpha$ being real.
Then corresponding to (35.24) the radial function R assumes the form

$$R = A'\,\frac{\sin(kr + \alpha)}{r}, \tag{35.26}$$

where the phase $\alpha$ depends on $k$, $l$, as well as on the actual form of the func-
tion $U(r)$. In the following section we shall show that for a free particle
($U = 0$)

$$\alpha = -\tfrac{1}{2} l\pi.$$

In accordance with this we assume that

$$\alpha = -\tfrac{1}{2} l\pi + \delta_l, \tag{35.27}$$

where the phases $\delta_l$ are directly connected with the action of the force field
on the particle and reduce to zero for free motion.

We now consider the region of negative energies, $E < 0$. Since the kinetic
energy of the particle is always positive, the total energy can be negative only
in the case of attraction of the particle towards the centre. If $E < 0$, the
quantity k has purely imaginary values, i.e. $k = i\kappa$, where $\kappa = (-2\mu E/\hbar^2)^{1/2}$.
The radial function (35.24) is written in the form

$$R = A_1\,\frac{e^{-\kappa r}}{r} + A_2\,\frac{e^{\kappa r}}{r}. \tag{35.28}$$

In order to satisfy the requirement of the finiteness of the wave function
for $r \to \infty$, we have to assume that the constant $A_2$ is equal to zero:

$$R = A_1\,\frac{e^{-\kappa r}}{r}. \tag{35.29}$$

Then the radial wave function R tends to zero as $r \to \infty$. This means that the
probability of finding the particle at an infinitely large distance from the
force centre is equal to zero. Consequently the motion of the particle is finite.
We see that there is a similarity between the conclusions of quantum and
classical mechanics: for a positive total energy ($E > U(\infty)$) the particles go off
to infinity, while for a negative total energy they perform a finite motion.

Let us now consider the energy spectrum for $E < 0$. As we have explained,
a finite motion corresponds to these energies and the corresponding wave
functions (35.29) are quadratically integrable. Such wave functions, as was
pointed out in §16, belong to a discrete spectrum. Consequently, for $E < 0$
we have a discrete energy spectrum.

The general solution of the Schrödinger equation (35.2) can be written in
the form of a superposition of the wave functions (35.7):

$$\psi(r, \vartheta, \varphi) = \sum_{l,m} B_{lm}\,R_l(r)\,Y_{lm}(\vartheta, \varphi). \tag{35.30}$$

For a solution which does not depend on the angle $\varphi$ we obtain a simpler
expression (superposition of states with $m = 0$):

$$\psi(r, \vartheta) = \sum_{l} c_l\,R_l(r)\,P_l(\cos\vartheta). \tag{35.31}$$


§36. The free motion of a particle with given angular momentum 


So far we have represented a freely moving particle by a plane wave
$e^{i(\mathbf{k}\cdot\mathbf{r} - \omega t)}$, where $\mathbf{k}$ is the wave vector of the particle, $\mathbf{k} = \mathbf{p}/\hbar$, and $\omega = E/\hbar$.
This wave function describes a stationary state with a definite value of the
momentum and energy $E = p^2/2m$ of the particle. For what follows, we need
to find the wave functions of the stationary states of a freely moving particle,
in which, in addition to a definite value of the energy E, the values of the
angular momentum and the z-component of the angular momentum are also
given. In classical mechanics a free particle moving with a definite momentum 
also possesses automatically a definite angular momentum. In quantum 
mechanics the situation is fundamentally altered. In a state with a given 
momentum the angular momentum is an indefinite quantity. On the other 
hand, in a state where the angular momentum and its projection onto the 
z-axis are given the direction of the momentum is indefinite. This is asso- 
ciated with the fact that the corresponding quantities cannot simultaneously 
have sharp values. 

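In order to find the wave function required, let us consider the motion of
a free particle in spherical coordinates. Setting $U(r) = 0$ in the Schrödinger
equation (35.3), we have

$$-\frac{\hbar^2}{2m}\,\frac{1}{r^2}\frac{\partial}{\partial r}\left(r^2\frac{\partial\psi}{\partial r}\right) + \frac{\hat{l}^2}{2mr^2}\,\psi = E\psi. \tag{36.1}$$

We seek the wave function of the free particle in the form

$$\psi_{klm}(r, \vartheta, \varphi) = R_{kl}(r)\,Y_{lm}(\vartheta, \varphi). \tag{36.2}$$

In this case the radial function $R_{kl}$ must satisfy eq. (35.8) in which one must
set $U = 0$:

$$\frac{1}{r^2}\frac{d}{dr}\left(r^2\frac{dR_{kl}}{dr}\right) + \left(k^2 - \frac{l(l+1)}{r^2}\right)R_{kl} = 0. \tag{36.3}$$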
Here we have expressed the energy E in terms of the wave number k. For
$l = 0$ the equation is rewritten in the form

$$\frac{1}{r^2}\frac{d}{dr}\left(r^2\frac{dR_{k0}}{dr}\right) + k^2R_{k0} = 0. \tag{36.4}$$

The solution of the above equation which does not go to infinity at the
origin is the function

$$R_{k0} = A\,\frac{\sin kr}{r}, \tag{36.5}$$

where A is a constant. To find the solution of eq. (36.3) for $l \neq 0$ we introduce
a new function given by the formula

$$R_{kl} = s^{-1/2}Z(s), \tag{36.6}$$

where $s = kr$. For such a substitution eq. (36.3) is easily transformed to the
form

$$\frac{d^2Z}{ds^2} + \frac{1}{s}\frac{dZ}{ds} + \left(1 - \frac{(l + \frac{1}{2})^2}{s^2}\right)Z = 0. \tag{36.7}$$

The solution of eq. (36.7) satisfying the condition of finiteness of the
wave function at the origin is a Bessel function of half-integer order

$$Z(s) = C J_{l+1/2}(s). \tag{36.8}$$

Correspondingly for the radial function we have

$$R_{kl} = (kr)^{-1/2}\,C J_{l+1/2}(kr). \tag{36.9}$$

At a large distance from the origin ($r \to \infty$) one can make use of the known
asymptotic expression for the Bessel function and obtain the asymptotic
value of $R_{kl}(r)$:

$$R_{kl}(r) \approx C\left(\frac{2}{\pi}\right)^{1/2}\frac{\sin(kr - \frac{1}{2}l\pi)}{kr}. \tag{36.10}$$

The constant C is determined by the normalization condition. At small
distances from the force centre ($r \to 0$) the radial function (36.9) assumes the
form

$$R_{kl} \sim r^l \tag{36.11}$$

in accordance with the general expression (35.21).
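The two limiting forms (36.10) and (36.11) can be checked numerically. The sketch below relies on the standard identity $x^{-1/2}J_{l+1/2}(x) = (2/\pi)^{1/2}\,j_l(x)$, where $j_l$ is the spherical Bessel function, and generates $j_l$ from $j_0$ and $j_1$ by upward recurrence. The function name and the sample arguments are our own; upward recurrence is adequate only for moderate $l$ and not too small $x$.

```python
# Spherical Bessel functions j_l(x) and their limiting behaviour.
# Up to the factor (2/pi)^(1/2) C, j_l(kr) is the radial function (36.9).
import math


def spherical_bessel_j(l: int, x: float) -> float:
    """j_l(x) by upward recurrence j_{n+1} = (2n+1)/x * j_n - j_{n-1}."""
    if x <= 0:
        raise ValueError("x must be positive in this sketch")
    j_prev = math.sin(x) / x  # j_0
    if l == 0:
        return j_prev
    j = math.sin(x) / x**2 - math.cos(x) / x  # j_1
    for n in range(1, l):
        j_prev, j = j, (2 * n + 1) / x * j - j_prev
    return j


if __name__ == "__main__":
    x_large, x_small = 200.0, 0.1
    for l in range(4):
        asymptotic = math.sin(x_large - 0.5 * l * math.pi) / x_large
        print(l, spherical_bessel_j(l, x_large), asymptotic,
              spherical_bessel_j(l, x_small) / x_small**l)
```

For large argument the computed values approach $\sin(x - \frac{1}{2}l\pi)/x$, and for small argument the ratio $j_l(x)/x^l$ tends to a constant, in agreement with (36.10) and (36.11).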













§37. The spherical well 


As a simple and at the same time important example we shall consider the
motion of a particle in a centrally symmetric field defined by the expression

$$U(r) = \begin{cases} -U_0 & (r < a), \\ 0 & (r > a). \end{cases}$$

A field of this type is called a spherically symmetric potential well. The
potential well shown in fig. V.10 represents an idealized model of a system in
which the interaction with the centre is realized by so-called short-range
forces. Short-range forces are understood to be forces which decrease with
distance so rapidly that they can be assumed to be practically equal to zero at
distances exceeding a certain distance, a, called the range of the short-range
force. The importance of the consideration of systems with short-range
forces is clear, for example, from the fact that the forces of interaction
between nucleons, nuclear forces, are of such a type.

[Fig. V.10: the potential well U(r), with the range a marked on the r-axis.]


The idealization of a system by means of the model of a spherical potential
well amounts to the assumptions of total isotropy of the forces and the
constancy of the potential energy for $r < a$.

For simplicity let us consider the motion of a particle with angular
momentum $l = 0$. It is obvious that two different modes of motion are possi-
ble. For $E < 0$ the total energy of the particle is smaller than the potential
energy at infinity, which corresponds to a finite motion. On the contrary, for
$E > 0$ there is an infinite motion. To the first case, to which we now confine
ourselves, there corresponds a discrete energy spectrum, while to the second
case there corresponds a continuous spectrum.

The wave function of a particle with $l = 0$ depends only on the coordinate
r, and not on the angles $\vartheta$ and $\varphi$. Upon the substitution $\chi(r) = rR(r)$ the
Schrödinger equation will have the form (35.16)

$$\frac{d^2\chi}{dr^2} + \frac{2m}{\hbar^2}\,(E + U_0)\,\chi = 0 \qquad (r < a), \tag{37.1}$$

$$\frac{d^2\chi}{dr^2} + \frac{2m}{\hbar^2}\,E\chi = 0 \qquad (r > a). \tag{37.2}$$


We write the solution of eq. (37.1) in the form

$$\chi(r) = A\sin\kappa r + B\cos\kappa r, \tag{37.3}$$

where

$$\kappa = \left[\frac{2m}{\hbar^2}\,(U_0 - |E|)\right]^{1/2}.$$

For the wave function R to be finite at the origin it is necessary to set
$\chi(0) = 0$. Consequently, inside the well the solution of eq. (37.1) has the form

$$\chi(r) = A\sin\kappa r. \tag{37.4}$$

The solution outside the well which reduces to zero at infinity is expressed by
the formula

$$\chi(r) = B e^{-\kappa' r}, \tag{37.5}$$

where $\kappa'$ denotes the quantity $\kappa' = [(2m/\hbar^2)\,|E|]^{1/2}$.

It follows from the continuity of the wave function that solution (37.4)
must go over continuously into solution (37.5) at the surface of the sphere
$r = a$. The derivative of the wave function also must be continuous at this
surface. Hence we can equate to each other the logarithmic derivatives of the
functions (37.4) and (37.5) for $r = a$. We then obtain

$$\kappa\cot\kappa a = -\kappa'. \tag{37.6}$$


This relation can be rewritten in the form

$$\sin\kappa a = \pm\left[1 + \frac{\kappa'^2}{\kappa^2}\right]^{-1/2}, \tag{37.7}$$

or, taking into account the expressions for $\kappa$ and $\kappa'$, we have

$$\sin\kappa a = \pm\left(\frac{\hbar^2}{2mU_0a^2}\right)^{1/2}\kappa a. \tag{37.8}$$

The roots of eq. (37.8) determine the energy levels of the particle in the
well. Equation (37.8) is conveniently solved graphically. Namely, the roots of
eq. (37.8) are the intersections of the straight lines

$$y = \left(\frac{\hbar^2}{2mU_0a^2}\right)^{1/2}\kappa a \qquad\text{and}\qquad y = -\left(\frac{\hbar^2}{2mU_0a^2}\right)^{1/2}\kappa a$$

with the curve $\sin\kappa a$ (see fig. V.11). Only crossover points for which $\cot\kappa a$
has negative values may be chosen in correspondence with (37.6). From the
graph in fig. V.11 it is seen that the roots of eq. (37.8) do not always exist. In
order that a bound state (energy level) may exist the well must be sufficiently
deep. Let us determine the minimum depth $U_{0\,\min}$ corresponding to the
appearance of the first energy level. As is seen from fig. V.11, the first level
will appear when the straight line passes through the peak of the sinusoidal
curve at $\kappa a = \frac{1}{2}\pi$. The tangent of the slope angle is then equal to $2/\pi$. Consequently,
the minimum potential energy $U_{0\,\min}$ for which there is a bound state of the
particle in the spherical well is determined by the condition

$$\left(\frac{\hbar^2}{2mU_{0\,\min}a^2}\right)^{1/2} = \frac{2}{\pi},$$

whence

$$U_{0\,\min} = \frac{\pi^2\hbar^2}{8ma^2}. \tag{37.9}$$

[Fig. V.11: graphical solution of eq. (37.8).]

We find the first energy level in the potential well of minimum depth
$U_{0\,\min}$ from the condition $\kappa a = \frac{1}{2}\pi$, or

$$\left[\frac{2ma^2}{\hbar^2}\,(U_{0\,\min} - |E_1|)\right]^{1/2} = \frac{\pi}{2}.$$

Taking into account the value of $U_{0\,\min}$, we find that $E_1 = 0$, i.e. the energy of
the particle in the first level is equal to zero and there are no other levels in
the well. Also, the energy of the first level decreases with increasing depth of 
the well and becomes negative. In the graph this corresponds to a decrease in 
the slope of the straight line with respect to the abscissa. For a certain slope 
another root will appear in addition to the root corresponding to the first 
level. This new root corresponds to the appearance of a second energy level in 
the well. The number of crossover points in the graph increases with increas- 
ing depth of the well, which corresponds to an increase in the number of 
allowed energy levels of the particle in the potential well. 

In conclusion we stress that the absence of bound states for a particle in a 
potential well of depth Ug < Ug min represents a specific quantum-mechanical 
effect which has no analogue in classical physics. Indeed, however small the 
depth of the well in classical physics, a particle which falls into it with an 
initial kinetic energy less than the depth of the well will be confined in it. In 
quantum mechanics this proposition, in general, does not hold. 
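The graphical construction of this section can be replaced by a short numerical calculation. The sketch below assumes units in which $\hbar = m = a = 1$, so that the depth $U_0$ and the energy are measured in units of $\hbar^2/ma^2$; all function names are our own. It counts the levels with $l = 0$ from the thresholds $\kappa a = \frac{1}{2}\pi, \frac{3}{2}\pi, \ldots$ reached at $E = 0$, and finds the lowest level by bisection on the matching condition $\kappa\cot\kappa a = -\kappa'$, written in a form free of poles.

```python
# Bound l = 0 states of the spherical well in units hbar = m = a = 1
# (an assumption of this sketch; energies are in units of hbar^2 / (m a^2)).
import math


def minimum_depth() -> float:
    """Depth at which the first level appears: U_0min = pi^2 / 8."""
    return math.pi**2 / 8.0


def count_s_levels(u0: float) -> int:
    """Number of l = 0 levels; a new level appears each time sqrt(2 U_0) passes (n - 1/2) pi."""
    return int(math.floor(math.sqrt(2.0 * u0) / math.pi + 0.5))


def ground_state_energy(u0: float):
    """Lowest l = 0 level E < 0, or None if the well is too shallow."""
    z0 = math.sqrt(2.0 * u0)  # value of kappa * a at E = 0
    if z0 <= math.pi / 2.0:
        return None

    def g(z: float) -> float:
        # kappa cot(kappa a) + kappa' = 0, multiplied by sin(kappa a)
        return z * math.cos(z) + math.sqrt(max(z0 * z0 - z * z, 0.0)) * math.sin(z)

    lo, hi = math.pi / 2.0, min(math.pi, z0)  # g(lo) > 0 > g(hi)
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if g(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    z = 0.5 * (lo + hi)
    return 0.5 * z * z - u0  # E = kappa^2 / 2 - U_0


if __name__ == "__main__":
    print("U_0min =", minimum_depth())
    for u0 in (1.0, 2.0, 50.0):
        print(u0, count_s_levels(u0), ground_state_energy(u0))
```

A well only slightly shallower than $U_{0\,\min}$ returns no level at all, which is the quantum-mechanical effect stressed above.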


§38. Motion in a Coulomb field 


As we have already pointed out, the most important example of the 
motion of a particle in a centrally symmetric field is the motion of an elec- 
tron in the Coulomb field of the atomic nucleus. The simplest atomic system 
of such a kind, consisting of a nucleus and an electron, is the hydrogen atom, 
and also the ion of any atom in which only one electron remains. The meso- 
hydrogen atom, consisting of a proton and a negatively charged meson, is 
another example. 

The problem of the motion of two bodies, a nucleus and an electron,
reduces to the problem of the motion of one particle with reduced mass $\mu$ in
the Coulomb field (see §14).

It is clear that the theory of the hydrogen atom and hydrogen-like systems 
is extremely important, since these systems are the simplest atomic systems. 
Furthermore it turns out that in the case of the motion of a particle in the 
Coulomb field of a nucleus one can obtain a complete analytical solution of 


the Schrödinger equation. This makes it possible to follow the appearance of
general quantum-mechanical regularities in atomic systems.

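The potential energy of an electron moving in the field of a nucleus with
charge Ze is given by the formula

$$U(r) = -\frac{Ze^2}{r}. \tag{38.1}$$

We write the Schrödinger equation for the radial wave function (35.8):

$$\frac{d^2R}{dr^2} + \frac{2}{r}\frac{dR}{dr} - \frac{l(l+1)}{r^2}\,R + \frac{2\mu}{\hbar^2}\left(E + \frac{Ze^2}{r}\right)R = 0. \tag{38.2}$$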


We are at first interested in states belonging to a discrete energy spectrum. 
In correspondence with §16 these states correspond to a finite motion of the 
electron and, consequently, their energy is negative, E < 0 (see §35). 

In solving eq. (38.2) it is convenient to use dimensionless quantities. This 
will make all formulae less cumbersome. 

We choose as basic quantities the charge of the electron $e$, its reduced mass
$\mu$, and the Planck constant $\hbar$. From these quantities one can make a combina-
tion having the dimensionality of a length

$$a = \frac{\hbar^2}{\mu e^2}. \tag{38.3}$$

As we shall see below, this length is a characteristic atomic dimension. If
the reduced mass $\mu$ is set equal to the mass of the electron m, then
$a = 0.529 \times 10^{-8}$ cm.

The system of units based on the quantities $e$, $\mu$ and $\hbar$ is called the
Coulomb system.

The quantity $e^2/\hbar$, equal to 1/137 of the velocity of light, will be the unit
of velocity, while the quantity

$$E_0 = \frac{\mu e^4}{\hbar^2} = \frac{e^2}{a} \tag{38.4}$$

will be the unit of energy. For $\mu = m$, $E_0 = 4.36 \times 10^{-11}$ erg = 27.21 eV. We
introduce into eq. (38.2) the dimensionless variable $\rho$ and the energy $\varepsilon$:

$$\rho = r/a, \qquad \varepsilon = -E/E_0. \tag{38.5}$$

Then this equation is rewritten in the form

$$\frac{d^2R}{d\rho^2} + \frac{2}{\rho}\frac{dR}{d\rho} + \left(-2\varepsilon + \frac{2Z}{\rho} - \frac{l(l+1)}{\rho^2}\right)R = 0. \tag{38.6}$$
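The numerical values of the Coulomb units follow directly from (38.3) and (38.4). The sketch below evaluates them in the Gaussian (CGS) system for $\mu = m$; the constants are the standard tabulated values and the variable names are our own.

```python
# Coulomb units of length, energy and velocity for mu = m (Gaussian CGS values).
# Constants are standard tabulated values; variable names are illustrative.
HBAR = 1.054571817e-27      # erg * s
M_ELECTRON = 9.1093837e-28  # g
E_CHARGE = 4.80320471e-10   # esu
C_LIGHT = 2.99792458e10     # cm / s
ERG_PER_EV = 1.602176634e-12

length_unit_cm = HBAR**2 / (M_ELECTRON * E_CHARGE**2)   # a = hbar^2 / (mu e^2)
energy_unit_erg = E_CHARGE**2 / length_unit_cm          # E_0 = e^2 / a
energy_unit_ev = energy_unit_erg / ERG_PER_EV
velocity_unit = E_CHARGE**2 / HBAR                      # e^2 / hbar
c_over_velocity_unit = C_LIGHT / velocity_unit

if __name__ == "__main__":
    print(f"a   = {length_unit_cm:.4e} cm")
    print(f"E_0 = {energy_unit_erg:.4e} erg = {energy_unit_ev:.3f} eV")
    print(f"c / (e^2/hbar) = {c_over_velocity_unit:.2f}")
```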





At small distances the function R behaves, according to (35.21), as $\rho^l$. At
large distances this function has the form $R \sim \exp[-(2\varepsilon)^{1/2}\rho]$ (see (35.29)).
Corresponding to this we shall seek the solution of eq. (38.6) in the form

$$R(\rho) = \rho^l e^{-\beta\rho}\,v(\rho), \tag{38.7}$$

where $\beta = (2\varepsilon)^{1/2}$. Substituting expression (38.7) into (38.6), we obtain after
simple calculations

$$\rho\,\frac{d^2v}{d\rho^2} + 2(l + 1 - \beta\rho)\,\frac{dv}{d\rho} + 2v\,(Z - \beta - \beta l) = 0. \tag{38.8}$$


We introduce a new variable

$$\xi = 2\beta\rho. \tag{38.9}$$

Denoting differentiation with respect to this new variable by a prime, we
have

$$\xi v'' + v'(2l + 2 - \xi) + v\left(\frac{Z}{\beta} - l - 1\right) = 0. \tag{38.10}$$


The radial wave function R must remain finite over the entire region of
variation of the variable $\xi$, for $\xi \to \infty$ as well as for $\xi \to 0$.
We seek the solution of eq. (38.10) in the form of a series

$$v(\xi) = \sum_{k=0}^{\infty} a_k\xi^k. \tag{38.11}$$

Substituting (38.11) into eq. (38.10) and gathering terms with the same
powers of $\xi$, we obtain

$$\sum_{k=0}^{\infty}\xi^k\left[(k + 1)(2l + 2 + k)\,a_{k+1} + \left(\frac{Z}{\beta} - l - 1 - k\right)a_k\right] = 0. \tag{38.12}$$

Equation (38.12) will be satisfied for arbitrary values of $\xi$ if the coefficients
of all powers of $\xi$ are equal to zero. Hence, equating the square bracket to
zero we arrive at the following recurrence formula:

$$a_{k+1} = \frac{k + l + 1 - (Z/\beta)}{(k + 1)(2l + 2 + k)}\,a_k. \tag{38.13}$$


We note that the function $v$ defined by the series (38.11), with coefficients
$a_k$ which satisfy (38.13), can be expressed in terms of the confluent hyper-
geometric function*

$$v = A\,F\!\left(l + 1 - \frac{Z}{\beta},\; 2l + 2,\; \xi\right). \tag{38.14}$$

It is easily shown, by analogy with what was done in §10, that the series
(38.11) diverges as $e^{\xi}$ for $\xi \to \infty$. This means that if the wave function were
expressed by the series (38.11) it would not satisfy the condition of being
finite at arbitrarily large distances from the force centre. In order to define
the function which possesses the necessary properties and is a solution of eq.
(38.10) we must, as was done in solving the problem of the oscillator, cut off
the series at a certain term, i.e. reduce it to a polynomial. If for a certain value
of the number $k = n_r$ the coefficient $a_{n_r+1}$ reduces to zero, then according to
(38.13) all subsequent coefficients $a_{n_r+2}$, $a_{n_r+3}$ and so on also reduce to zero.
In this case the infinite series reduces to a polynomial of the $n_r$th degree. For
large values of $\xi$ the function $v(\xi)$ will increase according to the power law
$v(\xi) \sim \xi^{n_r}$, while the wave function will tend to zero at infinity on account of
the exponential factor. For $\xi \to 0$ the polynomial $v(\xi)$ tends to the constant
quantity $a_0$ and the wave function (38.7) correspondingly reduces to zero or
tends to a constant. Thus we see that the wave function will satisfy the
standard boundary conditions.
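The way in which the recurrence formula (38.13) cuts the series off can be followed explicitly in exact rational arithmetic. The sketch below assumes that $Z/\beta$ is equal to an integer $n > l$ and sets $a_0 = 1$; both the normalization and the function names are our own choices.

```python
# Coefficients a_k of the series (38.11) from the recurrence (38.13),
# assuming Z / beta = n (an integer greater than l) and a_0 = 1.
from fractions import Fraction


def series_coefficients(n: int, l: int, terms: int = 8):
    """Return [a_0, ..., a_{terms-1}] generated by (38.13) with Z / beta = n."""
    if not 0 <= l < n:
        raise ValueError("the sketch requires 0 <= l < n")
    coefficients = [Fraction(1)]
    for k in range(terms - 1):
        factor = Fraction(k + l + 1 - n, (k + 1) * (2 * l + 2 + k))
        coefficients.append(coefficients[-1] * factor)
    return coefficients


def polynomial_degree(coefficients) -> int:
    """Index of the last non-vanishing coefficient."""
    return max(k for k, a in enumerate(coefficients) if a != 0)


if __name__ == "__main__":
    for n, l in ((1, 0), (2, 0), (2, 1), (3, 0), (3, 1)):
        a = series_coefficients(n, l)
        print(n, l, polynomial_degree(a), [str(c) for c in a[:4]])
```

In every case the degree of the resulting polynomial is $n - l - 1$; for $n = 2$, $l = 0$ the series reduces to $1 - \frac{1}{2}\xi$.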

Let us now consider the conditions under which the coefficient of the
series $a_{n_r+1}$ reduces to zero. For this it is necessary, according to (38.13), that

$$n_r + l + 1 - \frac{Z}{\beta} = 0. \tag{38.15}$$

Since $n_r$ is an integer (including zero), the sum $(n_r + l + 1)$ is also an integer.
We denote it by $n$: $n = n_r + l + 1$. The integer $n$ is called the principal quantum
number, and $n_r$ is called the radial quantum number. For a fixed value of the
angular momentum quantum number $l$ we have

$$n \geq l + 1.$$

It is obvious that relation (38.15) determines the ordering of the energy
levels of the system. Taking into account the value of $\beta$, we find

$$\varepsilon = \frac{Z^2}{2n^2}. \tag{38.16}$$

* V.I. Smirnov, A course of higher mathematics (Pergamon Press, Oxford, 1964).





Passing over from atomic units to ordinary units, (38.4) and (38.5), we obtain

$$E_n = -\frac{\mu e^4 Z^2}{2\hbar^2 n^2} \approx -13.6\,\frac{Z^2}{n^2}\ \text{eV}. \tag{38.17}$$





This formula, first obtained by N. Bohr before the appearance of modern
quantum mechanics, determines the discrete energy levels in the hydrogen
atom and hydrogen-like ions. We see that the energy levels depend only on
the principal quantum number $n$. The lowest energy level (the ground state)
of the particle in the Coulomb field corresponds to the value $n = 1$. The
spacing between levels decreases with increasing $n$, the levels coming nearer to
each other. As $n \to \infty$, $\Delta E \to 0$ and the discrete spectrum goes over into a
continuous one.
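The crowding of the levels towards $E = 0$ is easily seen numerically. The sketch below evaluates $E_n = -Z^2E_0/2n^2$ with the tabulated value of the Coulomb energy unit $E_0$ for $\mu = m$; the function names are our own, and the small correction due to the reduced mass is ignored.

```python
# Energy levels of hydrogen-like systems, E_n = -Z^2 E_0 / (2 n^2),
# with E_0 taken for mu = m (the reduced-mass correction is ignored).
E0_EV = 27.211386  # Coulomb unit of energy in eV


def level_ev(n: int, z: int = 1) -> float:
    """Energy of the level with principal quantum number n, in eV."""
    if n < 1:
        raise ValueError("the principal quantum number starts at 1")
    return -z * z * E0_EV / (2.0 * n * n)


def spacing_ev(n: int, z: int = 1) -> float:
    """Distance between the levels n and n + 1, in eV."""
    return level_ev(n + 1, z) - level_ev(n, z)


if __name__ == "__main__":
    for n in (1, 2, 3, 10, 100):
        print(f"n = {n:3d}  E_n = {level_ev(n):10.5f} eV  spacing = {spacing_ev(n):.3e} eV")
```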

The radial function $R_{nl}$ is given by the formula

$$R_{nl} = \text{const}\cdot\xi^{l} e^{-\xi/2}\, v(\xi), \tag{38.18}$$

where the polynomial $v(\xi)$, with coefficients determined by the recurrence
formula (38.13), coincides except for a constant factor with the generalized
Laguerre polynomial*. Hence in our case the radial function assumes the
form

$$R_{nl} = A_{nl}\, \xi^{l} e^{-\xi/2} L_{n+l}^{2l+1}(\xi). \tag{38.19}$$

The generalized Laguerre polynomial $L_k^s(\xi)$ is expressed in terms of the
derivatives of the Laguerre polynomials which are determined by the relation

$$L_k(\xi) = e^{\xi}\frac{d^k}{d\xi^k}\left(e^{-\xi}\xi^k\right), \tag{38.20}$$

so that

$$L_k^s(\xi) = \frac{d^s}{d\xi^s} L_k(\xi). \tag{38.21}$$


The coefficients $A_{nl}$ of (38.19) are determined from the normalization
condition**. The radial wave functions belonging, for example, to the two
lowest energy levels have the form

$$R_{10}(\rho) = 2\left(\frac{Z}{a}\right)^{3/2} e^{-Z\rho}, \tag{38.22}$$


* See the reference on p. 138.
** The calculation of the normalization integral is carried out, for example, in the
book of L.D. Landau and E.M. Lifshitz, Quantum mechanics (Pergamon Press, Oxford,
1965).










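$$R_{20}(\rho) = \left(\frac{Z}{2a}\right)^{3/2} (2 - Z\rho)\, e^{-Z\rho/2}, \tag{38.23}$$

$$R_{21}(\rho) = \left(\frac{Z}{2a}\right)^{3/2} \frac{Z\rho}{\sqrt{3}}\, e^{-Z\rho/2}. \tag{38.24}$$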


Here the variable $\rho$ (see (38.9)) is again introduced instead of $\xi$. We stress that
the wave function is determined by the whole set of values of the three
quantum numbers $n$, $l$ and $m$, whereas the energy levels (38.17) depend only
on the principal quantum number $n$. Thus the energy levels of the hydrogen
atom are degenerate. We have seen in §35 that degeneracy in the magnetic
quantum number $m$ is a general property of motion in a centrally symmetric
field. However, in a Coulomb field the energy levels turn out to be degenerate
in the angular momentum quantum number $l$ as well. This degeneracy is
characteristic only of motion in a Coulomb field: with a slight change in the
law of force the energy becomes dependent on the angular momentum quantum
number. Hence the degeneracy characteristic of the Coulomb field is called an
accidental degeneracy. Let us find the multiplicity of the degeneracy of the
$n$th energy level. Since for a given $n$ the angular momentum quantum number
runs over all integers from 0 up to $n - 1$ and, in its turn, to each $l$ there
correspond $2l + 1$ possible values of the quantum number $m$, the degeneracy is
equal to

$$\sum_{l=0}^{n-1} (2l + 1) = n^2. \tag{38.25}$$

To each energy level $E_n$ there belong $n^2$ different wave functions.
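
The counting that leads to (38.25) can be checked by direct enumeration. The following is only a minimal sketch: it lists the allowed pairs $(l, m)$ for a given $n$ and counts them. The function names are illustrative and do not come from the text.

```python
def hydrogen_states(n):
    """Return all (l, m) pairs allowed for the principal quantum number n.

    l runs over 0, 1, ..., n - 1 and, for each l, m runs over -l, ..., +l.
    """
    return [(l, m) for l in range(n) for m in range(-l, l + 1)]


def degeneracy(n):
    """Number of different wave functions belonging to the level E_n."""
    return len(hydrogen_states(n))


for n in range(1, 5):
    # The second and third columns should agree, as stated by (38.25).
    print(n, degeneracy(n), n ** 2)
```

Spin is not taken into account here, exactly as in the counting above.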

Let us consider in more detail the energy levels of the hydrogen atom.
They are given by formula (38.17), in which one must set $Z = 1$. The energy
of the ground state determines the ionization potential of the hydrogen atom.
According to the quantum theory of light emission, the differences between
the energy states determine the frequency of electromagnetic waves emitted
by the atom (see §103):

$$\hbar\omega = E_m - E_n. \tag{38.26}$$

The quantity $E_n/\hbar$ is called the spectral term. The differences between these
spectral terms determine the frequencies of radiation. Substituting expression
(38.17) into formula (38.26), we obtain

$$\nu = \frac{\omega}{2\pi} = R\left(\frac{1}{n^2} - \frac{1}{m^2}\right), \qquad m > n. \tag{38.27}$$




The quantity $R$ is called the Rydberg constant:

$$R = \frac{\mu e^4}{4\pi\hbar^3} = 3.27 \times 10^{15}\ \text{sec}^{-1}. \tag{38.28}$$

All frequencies referring to transitions to one and the same lower level
form a spectral series. Thus if we set $n = 1$ in formula (38.27), we obtain the
Lyman series. It lies in the ultraviolet part of the spectrum. The transitions to
the level $n = 2$ lie in the visible part of the spectrum. The whole set of these
spectral lines forms the Balmer series. The spectral series corresponding to
transitions to the levels $n = 3$ and so on lie in the infrared region of the
spectrum. For hydrogen-like ions the corresponding spectral lines are shifted
toward shorter wavelengths, because the frequencies increase by a factor of
$Z^2$.
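
As a rough numerical illustration of the spectral series, the sketch below evaluates a few transition wavelengths from the value of the Rydberg constant quoted above. It assumes that $R$ multiplies $(1/n^2 - 1/m^2)$ directly to give the frequency $\nu = \omega/2\pi$ in sec$^{-1}$. The speed of light and all function names are assumptions introduced here, not taken from the text.

```python
R = 3.27e15         # Rydberg constant in sec^-1, the value quoted in the text
C_LIGHT = 2.998e10  # speed of light in cm/sec (assumed value, not from the text)


def line_frequency(n, m, Z=1):
    """Frequency nu = Z^2 R (1/n^2 - 1/m^2) of the transition m -> n (m > n)."""
    if m <= n:
        raise ValueError("the upper level m must exceed the lower level n")
    return Z ** 2 * R * (1.0 / n ** 2 - 1.0 / m ** 2)


def wavelength_nm(n, m, Z=1):
    """Wavelength of the transition m -> n in nanometres."""
    return C_LIGHT / line_frequency(n, m, Z) * 1.0e7  # cm -> nm


for name, n in (("Lyman", 1), ("Balmer", 2), ("n = 3 series", 3)):
    # First three lines of each series.
    lines = [round(wavelength_nm(n, m)) for m in range(n + 1, n + 4)]
    print(name, lines, "nm")
```

With the rounded constant used here the wavelengths come out slightly longer than the measured ones, but the ordering of the series (ultraviolet, visible, infrared) and the $Z^2$ scaling for hydrogen-like ions are reproduced.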
Further, we find the probability (35.10) of observing the electron in different
quantum states at a given distance $r$ from the nucleus. The ground state of
the electron in the hydrogen atom is described by the wave function
$\psi_{100} = R_{10} Y_{00}$. For $l = 0$, $m = 0$ the angular part of the wave function reduces to a
constant (see §30), i.e. the state is spherically symmetric. The probability of
observing the electron in the ground state $\psi_{100}$ at a given distance from the
nucleus is given by the expression

$$dW_{10} = |\psi_{100}|^2\, 4\pi r^2\, dr.$$

Making use of (38.22), we obtain

$$dW_{10} = \frac{4}{a^3}\, e^{-2r/a}\, r^2\, dr. \tag{38.29}$$


We see that the probability is different from zero over all space, although
it decreases rapidly with increasing $r$. A simple calculation shows that the
curve $dW_{10}/dr$ has a maximum at the distance $r = a$, where the quantity $a$ is
determined by formula (38.3) and is called the Bohr radius. The form of the
function $|R_{nl}|^2 r^2$ for different $n$ and $l$ is shown in fig. V.12. The distance
from the centre $\rho = r/a$ is measured along the abscissa, and the probability
density $a^3 |R_{nl}|^2 \rho^2$ is measured along the ordinate. We note that the number
of zeros of the radial wave function $R_{nl}$ is equal to the value of the radial
quantum number $n_r$. At large distances the radial wave function has the form

$$R_{nl} \sim e^{-Zr/na} \left(\frac{2Zr}{na}\right)^{n-1} + \ldots \tag{38.30}$$


The probability density calculated by means of this function rapidly decreases
at distances above the order of magnitude of $na/Z$. Hence it is seen that the
quantity $na/Z$ characterizes the size of the atom, because the probability of
observing the electron at larger distances is very small.

Fig. V.12. Radial probability density plotted against $\rho$ (one panel is labelled: d-states ($l = 2$) and 4f-state).
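
The statement that $dW_{10}/dr$ has its maximum at $r = a$ can be checked numerically. The sketch below is only illustrative: it tabulates the ground-state density (38.29) on a grid, with $a$ taken as the unit of length, and also verifies that the total probability is unity. The helper names and grid sizes are assumptions made here.

```python
import math


def radial_density_1s(r, a=1.0):
    """dW_10/dr = (4 / a^3) r^2 exp(-2 r / a), i.e. formula (38.29)."""
    return 4.0 / a ** 3 * r ** 2 * math.exp(-2.0 * r / a)


def argmax_on_grid(f, lo, hi, steps=200000):
    """Return the grid point in [lo, hi] at which f takes its largest value."""
    best_x, best_f = lo, f(lo)
    for i in range(1, steps + 1):
        x = lo + (hi - lo) * i / steps
        fx = f(x)
        if fx > best_f:
            best_x, best_f = x, fx
    return best_x


def total_probability(f, hi=40.0, steps=200000):
    """Midpoint-rule integral of f from 0 to hi (hi chosen large enough)."""
    h = hi / steps
    return sum(f((i + 0.5) * h) for i in range(steps)) * h


r_max = argmax_on_grid(radial_density_1s, 0.0, 10.0)
print("maximum of dW/dr at r/a =", r_max)
print("total probability =", total_probability(radial_density_1s))
```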

Up to now we have considered the bound states of an electron in the
Coulomb field of the nucleus. Other negatively charged particles, for example
$\pi$-mesons and muons, can also be in a bound state in the Coulomb field. As we
have already mentioned, such systems are called mesic atoms.

As the simplest example let us consider the mesic atom of hydrogen or, as
it is called, mesohydrogen. The energy levels of mesohydrogen and the wave
functions of the meson are given by formulae (38.17)-(38.19) in which,
however, the reduced mass $\mu$ of the electron must be replaced by the reduced
mass $\mu'$ of the meson. The effective size of the atom of mesohydrogen is
determined by the value of $a' = (\mu/\mu')a$, which is substantially smaller than
the effective size of the hydrogen atom. In particular, the mass of the
$\pi^-$-meson is equal to 273 electron masses and, correspondingly,
$a' \approx 0.2 \times 10^{-10}$ cm. In the mesohydrogen atom the $\pi$-meson is situated at
considerably smaller distances from the nucleus than the electron is in the
hydrogen atom. The presence of the nuclear interaction of the $\pi$-meson with
the nucleus leads to a displacement of the energy levels (38.17) obtained for
the pure Coulomb field. Experimental investigation of this displacement allows
one to draw certain conclusions on the character of the nuclear interaction of
$\pi$-mesons and nucleons. It should be noted that the lifetime of mesic atoms is
restricted by the lifetime of the mesons themselves. As is known, mesons are
unstable particles undergoing decay with a mean lifetime $\tau$ which is
characteristic of the given kind of meson.
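
The estimate of $a'$ can be reproduced with a few lines of arithmetic. In the sketch below the Bohr radius, the proton mass and the hydrogen ground-state energy are assumed numerical values that are not given in this section; only the meson mass of 273 electron masses is taken from the text.

```python
BOHR_RADIUS_CM = 0.529e-8  # Bohr radius a of hydrogen in cm (assumed value)
M_PROTON = 1836.0          # proton mass in electron masses (assumed value)
M_PION = 273.0             # pi-meson mass in electron masses, as in the text
RYDBERG_EV = 13.6          # hydrogen ground-state binding energy in eV (assumed)


def reduced_mass(m1, m2):
    """Reduced mass of two particles, in the same units as m1 and m2."""
    return m1 * m2 / (m1 + m2)


def mesic_radius_cm(m_particle, m_nucleus=M_PROTON):
    """a' = (mu / mu') a for a particle of mass m_particle bound to the nucleus."""
    mu_electron = reduced_mass(1.0, m_nucleus)
    mu_particle = reduced_mass(m_particle, m_nucleus)
    return BOHR_RADIUS_CM * mu_electron / mu_particle


def mesic_ground_energy_ev(m_particle, m_nucleus=M_PROTON):
    """Ground-state energy from (38.17) with mu replaced by mu' (Z = 1)."""
    mu_electron = reduced_mass(1.0, m_nucleus)
    mu_particle = reduced_mass(m_particle, m_nucleus)
    return -RYDBERG_EV * mu_particle / mu_electron


print("a' =", mesic_radius_cm(M_PION), "cm")
print("E_1 =", mesic_ground_energy_ev(M_PION), "eV")
```

Note that the reduced mass of the meson is noticeably smaller than its mass, because the meson is not negligibly light compared with the proton.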

Up to now we have restricted ourselves to the consideration of the discrete 
energy spectrum, i.e. we have considered the energy to be negative. 

Let us now consider the continuous energy spectrum $E > 0$, $\varepsilon = -E/E_0 < 0$
(38.5). We introduce the following notation taking into account (38.7),
(38.9) and (38.15):

$$\beta = (2\varepsilon)^{1/2} = i(2E/E_0)^{1/2} = ik, \qquad n = Z/\beta = -iZ/k, \qquad \xi = 2ik\rho. \tag{38.31}$$


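Making use of (38.7), (38.14) and (38.31), we write the radial wave function
of the continuous spectrum in the form

$$R_{kl} = \frac{C_k}{(2l+1)!}\,(2k\rho)^l\, e^{-ik\rho}\, F\!\left(\frac{iZ}{k} + l + 1;\ 2l + 2;\ 2ik\rho\right). \tag{38.32}$$

Here $C_k$ is a normalization factor.
If the functions $R_{kl}$ are normalized to the $\delta$-function in $k$, then this factor
is equal to

$$C_k = \left(\frac{2}{\pi}\right)^{1/2} k\, e^{\pi Z/2k} \left|\Gamma\!\left(l + 1 - \frac{iZ}{k}\right)\right|. \tag{38.33}$$

The asymptotic expression of the radial wave function for large $\rho$ is
determined by the formula*

$$R_{kl} \approx \left(\frac{2}{\pi}\right)^{1/2} \frac{1}{\rho} \sin\left(k\rho + \frac{Z}{k}\ln 2k\rho - \tfrac{1}{2}l\pi + \delta_l\right), \tag{38.34}$$

where

$$\delta_l = \arg \Gamma\!\left(l + 1 - \frac{iZ}{k}\right)$$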


* See L.D. Landau and E.M. Lifshitz, Quantum mechanics (Pergamon Press, Oxford,
1965); V.A. Fok, Nachala kvantovoi mekhaniki (Principles of quantum mechanics),
(KUBUCH, 1932) p. 155.













($\Gamma$ is the gamma-function of a complex variable. Its argument is equal to
$\delta_l$).

The expression of the wave function (38.34) (Coulomb field) differs from
the general asymptotic expression of the radial wave function in a centrally
symmetric field (35.26) by the presence of the slowly increasing logarithmic
term in the argument of the sine.





The Quasi-classical Approximation 


§39. The limiting transition to classical mechanics 


We have more than once referred to the existence of the correspondence
principle and the rules for the transition of the relations of quantum
mechanics into the formulae of classical mechanics for $\hbar \to 0$. We shall now
define more precisely the conditions of this transition and we shall at the
same time obtain an important approximate method of solving the Schrödinger
equation* (the WKB method).

If one sets $\hbar = 0$ in the Schrödinger equation

$$i\hbar\frac{\partial\psi}{\partial t} = \left(-\frac{\hbar^2}{2m}\nabla^2 + U\right)\psi, \tag{39.1}$$

it becomes meaningless. Hence to carry out the limiting transition mentioned
above we write the wave function $\psi$ in the form

$$\psi = e^{(i/\hbar)S}. \tag{39.2}$$


* The Wentzel-Kramers-Brillouin method. G. Wentzel, Z. Phys. 38 (1926) 518;
L. Brillouin, Comptes Rendus 183 (1926) 24, J. de Physique 7 (1926) 353; H.A. Kramers,
Z. Phys. 39 (1926) 828; H. Jeffreys, Proc. London Math. Soc. (2) 23 (1923) 428.




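Substituting this expression into eq. (39.1), we obtain an equation for the
function $S$:

$$-\frac{\partial S}{\partial t} = \frac{1}{2m}(\nabla S)^2 - \frac{i\hbar}{2m}\nabla^2 S + U. \tag{39.3}$$

We now formally expand the function $S$ in powers of $\hbar/i$:

$$S = S_0 + \left(\frac{\hbar}{i}\right) S_1 + \left(\frac{\hbar}{i}\right)^2 S_2 + \ldots \tag{39.4}$$

We substitute the expansion (39.4) into eq. (39.3) and equate the coefficients
of the same powers of $\hbar$. We obtain two equations

$$-\frac{\partial S_0}{\partial t} = \frac{1}{2m}(\nabla S_0)^2 + U, \tag{39.5}$$

$$-\frac{\partial S_1}{\partial t} = \frac{1}{m}\nabla S_0 \cdot \nabla S_1 + \frac{1}{2m}\nabla^2 S_0 \tag{39.6}$$
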

to within terms proportional to the first power of $\hbar$. Equation (39.5) is the
same as the Hamilton-Jacobi equation of classical mechanics* for the action
function $S_0$. This means that in the zeroth order approximation the motion
of the particle follows the classical trajectory. To elucidate the meaning of
eq. (39.6) we write the expression for the probability density of finding the
particle at a given point of space in the form

$$\rho = |\psi|^2 = e^{2S_1}. \tag{39.7}$$

Multiplying (39.6) by $2\rho$ and taking into account that

$$\frac{\partial\rho}{\partial t} = 2\frac{\partial S_1}{\partial t}\rho, \qquad \nabla\rho = 2\rho\,\nabla S_1,$$

we obtain

$$-\frac{\partial\rho}{\partial t} = \frac{1}{m}\left(\nabla S_0 \cdot \nabla\rho + \rho\nabla^2 S_0\right) = \nabla\cdot\left(\frac{\rho}{m}\nabla S_0\right). \tag{39.8}$$

Equation (39.8), equivalent to eq. (39.6), represents a continuity equation.
It shows that the probability density moves in space with the same velocity
$v = m^{-1}\nabla S_0$ and on the same trajectory as the particle would move in
classical mechanics. We note that, since the velocity is directed along the
normal to the surfaces $S_0 = \text{const}$, the trajectories of the classical particle
are orthogonal to the surfaces $S_0 = \text{const}$. In the quasi-classical approximation
it is natural to call the surfaces $S = \text{const}$ the equi-phase surfaces of the
wave function.

* For the Hamilton-Jacobi equation in classical mechanics see L.D. Landau and
E.M. Lifshitz, Mechanics (Pergamon Press, Oxford, 1960); H. Goldstein, Classical
mechanics (Addison-Wesley, Cambridge, Mass., 1950).

We now find the wave function of the stationary states of the particle in
the quasi-classical approximation, confining ourselves to one-dimensional
motion, so that $\psi = \psi(x, t)$. For a stationary state we have

$$\psi(x, t) = e^{-(i/\hbar)Et}\,\psi(x). \tag{39.9}$$

In correspondence with this, in formulae (39.2) and (39.4) we set

$$S_0(x, t) = -Et + S_0(x), \tag{39.10}$$

while the functions $S_1$, $S_2$, ... can be assumed to be independent of time.
From eq. (39.5) we obtain

$$E = \frac{1}{2m}\left(\frac{dS_0}{dx}\right)^2 + U(x), \tag{39.11}$$

whence

$$S_0(x) = \pm\int [2m(E - U(x))]^{1/2}\,dx = \pm\int p(x)\,dx, \tag{39.12}$$

where

$$p(x) = [2m(E - U(x))]^{1/2}.$$

As was to be expected, we have obtained the ordinary formulae of classical
mechanics.


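We can now determine the function $S_1$ from eq. (39.6). Taking into
account that it is constant in time, we find

$$\frac{dS_0}{dx}\frac{dS_1}{dx} + \frac{1}{2}\frac{d^2S_0}{dx^2} = 0 \tag{39.13}$$

or

$$\frac{dS_1}{dx} = -\frac{1}{2}\frac{d^2S_0/dx^2}{dS_0/dx} = -\frac{1}{2p}\frac{dp}{dx}. \tag{39.14}$$

On integrating we obtain

$$S_1 = -\tfrac{1}{2}\ln p \tag{39.15}$$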







(we shall take the integration constant into account directly in the expression 
for the wave function). 

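From the definition (39.2) and the expressions (39.10) and (39.15) we
easily find the wave function of the particle with an accuracy to within terms
up to the first order in powers of $\hbar/i$, for $E > U$ and $E < U$ respectively:

$$\psi = \frac{C_1}{\sqrt{p}}\exp\left(\frac{i}{\hbar}\int p(x)\,dx\right) + \frac{C_2}{\sqrt{p}}\exp\left(-\frac{i}{\hbar}\int p(x)\,dx\right), \tag{39.16}$$

$$\psi = \frac{C_1'}{\sqrt{|p|}}\exp\left(-\frac{1}{\hbar}\int |p(x)|\,dx\right) + \frac{C_2'}{\sqrt{|p|}}\exp\left(\frac{1}{\hbar}\int |p(x)|\,dx\right). \tag{39.16'}$$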

The character of the wave function obtained depends critically on the
sign of the difference $(E - U)$. If $E > U$, then the momentum is real. This
corresponds to a motion of the particle in the region allowed by classical
mechanics. In this case the wave function has the character of an oscillatory
function. The period of oscillation is smaller, the larger the value of the
momentum $p$. The factor $p^{-1/2} \sim v^{-1/2}$ has a simple meaning. The probability of
finding the particle in a region from $x$ to $x + dx$ is proportional to the time
during which the particle is in this region: $|\psi(x)|^2\,dx \sim v^{-1}\,dx \sim dt$, i.e. the
same result is obtained as in classical mechanics. The wave function in the
region of forbidden energies, for $E < U$, has a completely different character.
Here the momentum becomes imaginary, and the wave function goes over
into a sum of exponential expressions. At the point $E = U$ (called the turning
point) $p = 0$ and the expression obtained for the wave function is meaningless.

As is clear from what follows, the quasi-classical approximation becomes
inapplicable near the turning point. However, without knowing the wave
function at the turning point one cannot join the wave functions at the
boundary of the allowed and forbidden regions. In other words, one cannot
determine the constants figuring in the oscillating and exponential
expressions, and without this the quasi-classical wave functions have no
practical value. However, before considering the calculation of the wave
function at the turning point it should be explained why the quasi-classical
solution is meaningless at this point. For this we estimate the limits of
applicability of the expressions (39.16) and (39.16'). First of all we note that
in substituting $S = S_0$ into eq. (39.3) we have dropped the term
$(i\hbar/2m)\nabla^2 S_0$ as negligible. For this to be correct the following
inequality must be satisfied

$$\left|\frac{i\hbar}{2m}\nabla^2 S_0\right| \ll \frac{1}{2m}(\nabla S_0)^2, \tag{39.17}$$








or, taking into account that $\nabla S_0 = \mathbf{p}$,

$$\hbar\,|\nabla\cdot\mathbf{p}| \ll p^2. \tag{39.18}$$

For one dimension the above inequality can be rewritten in the form

$$\hbar\left|\frac{dp}{dx}\right| \ll p^2. \tag{39.19}$$

Introducing the wave number $k$ instead of the momentum $p$ and the
corresponding wavelength $\lambda = \hbar/p = k^{-1}$ we have

$$\left|\frac{d\lambda}{dx}\right| \ll 1. \tag{39.20}$$

Thus we see that the Schrödinger equation reduces to eqs. (39.5) and
(39.6) when the condition (39.20) is satisfied. Namely, for the applicability
of the quasi-classical approximation it is necessary that the de Broglie
wavelength should change sufficiently slowly from point to point in space. In
other words, the relative change of the wave number over the extent of a
wavelength must be small in comparison with unity: $\lambda k^{-1}|dk/dx| \ll 1$. We
note also that the relative change in the derivative of the wave number $k$ over
the extent of a de Broglie wavelength must be small. Indeed, in obtaining
eqs. (39.5) and (39.6) we have also made use of the condition

$$\hbar\,|\nabla^2 S_1| \ll |\nabla^2 S_0|. \tag{39.21}$$

Taking into account that the problem is one-dimensional and introducing
the wave number $k$, we rewrite this inequality in the form

$$\lambda\left|\frac{d^2k}{dx^2}\right| \ll \left|\frac{dk}{dx}\right|. \tag{39.22}$$


For the motion of a particle in a potential field $U(x)$ it is convenient to
express the wave number in terms of the potential energy according to the
formula

$$k = \hbar^{-1}[2m(E - U(x))]^{1/2}$$

and to write the condition (39.20) in the form

$$\frac{m\hbar}{p^3}\left|\frac{dU}{dx}\right| \ll 1 \tag{39.23}$$





or

$$\frac{m\lambda^3}{\hbar^2}\left|\frac{dU}{dx}\right| \ll 1. \tag{39.24}$$


Making use of (39.22) one can also write the corresponding inequality for
the second derivative of the function $U$. Hence it is seen that the
quasi-classical approximation is valid:

(1) when the de Broglie wavelength is sufficiently small (i.e. when the particle
is moving sufficiently quickly);

(2) for a sufficiently slow change in the potential energy from point to point,
when no considerable change in the momentum of the particle takes place
over a length of the order of magnitude of $\lambda$.

It becomes clear from (39.23) why the quasi-classical expression for the
wave function makes no sense at the turning point. Near the turning point the
momentum of the particle becomes small and the quasi-classical approximation
becomes inapplicable.
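
How the criterion fails on approaching a turning point can be seen in a simple numerical sketch. It assumes a hypothetical linear potential $U = E - f\xi$ (with $\xi$ the distance from the turning point into the allowed region) and units in which $\hbar = m = f = 1$; both the potential and the function names are illustrative assumptions. The quantity evaluated is $m\hbar|dU/dx|/p^3$, which must be small compared with unity.

```python
import math


def momentum(xi, m=1.0, f=1.0):
    """p = [2 m (E - U)]^(1/2) for the assumed linear potential U = E - f * xi."""
    return math.sqrt(2.0 * m * f * xi)


def quasi_classical_parameter(xi, hbar=1.0, m=1.0, f=1.0):
    """m * hbar * |dU/dx| / p^3; the approximation requires this to be << 1."""
    return m * hbar * f / momentum(xi, m, f) ** 3


for xi in (0.01, 0.1, 1.0, 10.0, 100.0):
    # The parameter grows without bound as xi -> 0 (the turning point).
    print(xi, quasi_classical_parameter(xi))
```

Far from the turning point the parameter falls off as $\xi^{-3/2}$, so the approximation improves steadily with distance.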

The formulations ‘sufficiently small’ and ‘sufficiently slow’ in (1) and (2)
stress the fact that, since the criteria (39.20)-(39.24) involve the mass of the
particle and the actual characteristics of the field (the quantity $dU/dx$),
for different fields and different particles the quasi-classical approximation
will be valid for motions with different energies. For qualitative estimates
one can rewrite (39.20) in a simplified form. Namely, assuming that the
change in the wavelength takes place in a region of action of the field having
extent $a$, one can write instead of (39.20) $\lambda \ll a$ or

$$E \gg \frac{\hbar^2}{2ma^2}. \tag{39.25}$$

For $\alpha$-particles ($m = 6.7 \times 10^{-24}$ g) with energy $E = 1$ MeV passing through
an atomic shell ($a \sim 10^{-8}$ cm) the inequality (39.25) is fulfilled to a good
approximation. On the contrary, for the same $\alpha$-particles with an energy of
10 MeV undergoing direct collision with a nucleus ($a \sim 10^{-13}$ cm) the
quasi-classical consideration is inapplicable. In the region of substantially
larger energies the application of the quasi-classical approximation turns out
to be possible in considering certain processes connected with nuclear collisions.
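
The two $\alpha$-particle estimates can be reproduced numerically. The sketch below is illustrative only: it evaluates $\hbar^2/2ma^2$ in MeV and compares it with the particle energy. The values of $\hbar$ and of the MeV-to-erg conversion factor are assumed here (CGS units); the mass and the lengths are those quoted in the text.

```python
HBAR = 1.055e-27       # erg * sec (assumed value)
ERG_PER_MEV = 1.602e-6  # erg per MeV (assumed value)
M_ALPHA = 6.7e-24      # mass of the alpha-particle in g, as in the text


def threshold_energy_mev(mass_g, extent_cm):
    """The quantity hbar^2 / (2 m a^2), expressed in MeV."""
    return HBAR ** 2 / (2.0 * mass_g * extent_cm ** 2) / ERG_PER_MEV


def energy_ratio(energy_mev, mass_g, extent_cm):
    """E divided by hbar^2 / (2 m a^2); must be >> 1 for the approximation."""
    return energy_mev / threshold_energy_mev(mass_g, extent_cm)


print("atomic shell, 1 MeV:", energy_ratio(1.0, M_ALPHA, 1.0e-8))
print("nucleus, 10 MeV:   ", energy_ratio(10.0, M_ALPHA, 1.0e-13))
```

In the first case the ratio is enormous, while in the second it is only of the order of unity, which is why the quasi-classical consideration fails there.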





§40. The solution of the Schrödinger equation near a turning point 


Let us now return to the consideration of the behaviour of the wave func- 
tion near a turning point. 







The idea here is as follows: since near a turning point the quasi-classical 
approximation turns out to be unsatisfactory, it is necessary to find a solution 
of the Schrödinger equation without making use of this approximation. The 
possibility of obtaining such a solution for an arbitrary form of the potential 
energy is associated with the fact that the expression for the potential energy 
near a turning point makes possible an essential simplification (see below). If 
the solution sought is found, then one has to determine its asymptotic behav- 
iour at large distances from the turning point in both directions, in those 
regions where the quasi-classical approximation is already valid. Requiring 
that the quasi-classical solution be the same as this asymptotic expression, we 
shall be able to determine the corresponding constants. 





Fig. V.13 


To carry out this programme, we note that one can expand the potential
energy $U(x)$ in the vicinity of the turning point (fig. V.13) in a series with
respect to the small displacement $\xi = x - a$ and retain the linear term of this
expansion. In this case we assume that at the turning point the curve $U(x)$ is
smooth, as is shown in fig. V.13. We shall also assume that the region $x > a$
extends to infinity. Near the point $x = a$ we can write

$$U(x) = U(a) + \left.\frac{dU}{dx}\right|_{x=a}(x - a) + \ldots \tag{40.1}$$

The potential energy at the point $a$ is the same as the total energy of the
particle, $U(a) = E$. We denote by $f$ the force acting on the particle at the
turning point, $f = -dU/dx|_{x=a}$, and introduce the new variable
$\xi = x - a = (E - U)/f$. We write the Schrödinger equation near the point $x = a$
as follows:

$$\frac{d^2\psi}{d\xi^2} + \frac{2mf}{\hbar^2}\,\xi\,\psi = 0. \tag{40.2}$$

The Schrödinger equation was considered in such a form in §13. Equation
(40.2) is the same as eq. (13.5) for $E = 0$. Consequently, the wave function
satisfying eq. (40.2) and finite for $\xi \to \pm\infty$ is expressed in terms of the Airy
function. We shall use immediately the asymptotic expressions (13.9) and
(13.10). This means that we consider values of $\xi$ which are sufficiently large
for the asymptotic expressions to be used, and which are at the same time
such that expansion (40.1) is still applicable. Such a region, as a rule, exists
for fields satisfying the quasi-classical conditions.

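Correspondingly, in the region $x > a$ the solution of eq. (40.2) can be
written in the form

$$\psi = \frac{2C}{(2mf\xi)^{1/4}}\sin\left(\frac{2}{3\hbar}(2mf)^{1/2}\xi^{3/2} + \tfrac{1}{4}\pi\right), \tag{40.3}$$

where $C$ is the normalization constant.
For motion in the field (40.1) the momentum $p$ is of the form

$$p = [2m(E - U)]^{1/2} = (2mf\xi)^{1/2}. \tag{40.4}$$

We express the action in terms of the variable $\xi$:

$$\int_a^x p\,dx = \int_0^\xi p\,d\xi = (2mf)^{1/2}\int_0^\xi \xi^{1/2}\,d\xi = \tfrac{2}{3}(2mf)^{1/2}\xi^{3/2}. \tag{40.5}$$
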
Making use of (40.5) we can write the wave function (40.3) in the form

$$\psi(x) = 2Cp^{-1/2}\sin\left(\hbar^{-1}\int_a^x p\,dx + \tfrac{1}{4}\pi\right) = 2Cp^{-1/2}\cos\left(\hbar^{-1}\int_a^x p\,dx - \tfrac{1}{4}\pi\right). \tag{40.6}$$

We see that the function (40.6) has the quasi-classical form (see (39.16)).
We now find the function in the region $x < a$. Again making use of the
asymptotic expression (13.9) and expressions (40.4) and (40.5), we have

$$\psi(x) = \frac{C}{(2mf|\xi|)^{1/4}}\exp\left(-\frac{2}{3\hbar}\left(2mf|\xi|^3\right)^{1/2}\right) = C|p|^{-1/2}\exp\left(-\hbar^{-1}\int_x^a |p|\,dx\right), \tag{40.7}$$

where $C$ is the same normalization constant as in formula (40.6). Thus we
have obtained the expression for the quasi-classical wave function valid on
both the left and right side of the turning point $x = a$.
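We finally write the expressions for the quasi-classical wave function:

$$\psi(x) = \begin{cases} C|p|^{-1/2}\exp\left(-\hbar^{-1}\displaystyle\int_x^a |p|\,dx\right) & (x < a), \\[2ex] 2Cp^{-1/2}\cos\left(\hbar^{-1}\displaystyle\int_a^x p\,dx - \tfrac{1}{4}\pi\right) & (x > a). \end{cases} \tag{40.8}$$

The constant $C$ is defined by the normalization condition.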

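Analogously, if the allowed region lies on the left of the turning point $b$,
i.e. $U(x) < E$ for $x < b$ and $U(x) > E$ for $x > b$ (see, for example, fig. V.14 for
$x > a$), then the wave function is written in the form

$$\psi(x) = \begin{cases} C'|p|^{-1/2}\exp\left(-\hbar^{-1}\displaystyle\int_b^x |p|\,dx\right) & (x > b), \\[2ex] 2C'p^{-1/2}\cos\left(\hbar^{-1}\displaystyle\int_x^b p\,dx - \tfrac{1}{4}\pi\right) & (x < b). \end{cases} \tag{40.9}$$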


Thus we have found in the quasi-classical approximation the function $\psi(x)$
satisfying the Schrödinger equation. The solution obtained is still not
complete, since a linear equation of second order has two linearly independent
solutions. In the case of a wave function depending on one independent
variable the other solution of the Schrödinger equation can easily be obtained.
That is, if $\psi_1$ and $\psi_2$ are two linearly independent functions satisfying the
one-dimensional Schrödinger equation corresponding to an energy $E$, then
they are always connected by the relation

$$\frac{1}{\psi_1}\frac{d^2\psi_1}{dx^2} = \frac{1}{\psi_2}\frac{d^2\psi_2}{dx^2} = \frac{2m}{\hbar^2}(U - E). \tag{40.10}$$

Integrating, we obtain

$$\psi_1\frac{d\psi_2}{dx} - \psi_2\frac{d\psi_1}{dx} = \text{const}. \tag{40.11}$$










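We seek the solution linearly independent of (40.8) in the form

$$\psi(x) = \begin{cases} B_1|p|^{-1/2}\exp\left(\hbar^{-1}\displaystyle\int_x^a |p|\,dx\right) & (x < a), \\[2ex] B_2\,p^{-1/2}\cos\left(\hbar^{-1}\displaystyle\int_a^x p\,dx + \alpha\right) & (x > a). \end{cases} \tag{40.12}$$

The expressions (40.8) and (40.12) are to be substituted into relation (40.11).
Then, by virtue of inequality (39.20), it is sufficient to restrict oneself to
differentiation with respect to the arguments of the exponential and
trigonometric functions. Equating the expressions obtained for $x < a$ and
$x > a$, we find $B_1 = B_2\sin(\tfrac{1}{4}\pi + \alpha)$. Finally we set $B_1 = B$,
$\alpha = \tfrac{1}{4}\pi$ (so that $B_2 = B$). Thus the solution linearly independent of
(40.8) can be chosen in the form

$$\psi(x) = \begin{cases} B|p|^{-1/2}\exp\left(\hbar^{-1}\displaystyle\int_x^a |p|\,dx\right) & (x < a), \\[2ex] B\,p^{-1/2}\cos\left(\hbar^{-1}\displaystyle\int_a^x p\,dx + \tfrac{1}{4}\pi\right) & (x > a). \end{cases} \tag{40.13}$$
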
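Correspondingly, the solution linearly independent of (40.9) is written as

$$\psi(x) = \begin{cases} B'|p|^{-1/2}\exp\left(\hbar^{-1}\displaystyle\int_b^x |p|\,dx\right) & (x > b), \\[2ex] B'p^{-1/2}\cos\left(\hbar^{-1}\displaystyle\int_x^b p\,dx + \tfrac{1}{4}\pi\right) & (x < b). \end{cases} \tag{40.14}$$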


The expressions obtained will be unsuitable in the case where at the
turning point, for example at point $b$, the potential energy becomes infinite
in a discontinuous way. In this case $\psi = 0$ in the region $x > b$. The phase
of the wave function for $x < b$ can be determined if the conditions of
applicability of the quasi-classical approximation (39.20) remain valid up to
the point $x = b$. Then, taking into account that $\psi(b) = 0$, we obtain

$$\psi(x) = Ap^{-1/2}\sin\left(\hbar^{-1}\int_x^b p\,dx\right). \tag{40.15}$$


§41. Motion in a potential well in the quasi-classical approximation


We apply the results obtained to the motion of a particle in a potential 
well. We then find an approximate formula for the energy spectrum. Its 
comparison with accurate formulae will clearly allow us to judge the degree 
of accuracy and the merits of the quasi-classical approximation. At the same 
time the solution of the problem posed is of great interest in another respect. 
It makes it possible to elucidate the connection between quantum mechanics 
and the old Bohr theory. 

Let us consider, first of all, a potential well with infinitely high walls (see
§8). The wave function in the quasi-classical approximation is given by a
formula of the type (40.15). Namely,

$$\psi(x) = Ap^{-1/2}\sin\left(\hbar^{-1}\int_a^x p\,dx\right). \tag{41.1}$$

In the potential well there will be two turning points $a$ and $b$ at which the
wave function must reduce to zero. Thus at the two turning points the
condition $\psi = 0$ or

$$Ap^{-1/2}\sin\left(\hbar^{-1}\int_a^b p\,dx\right) = 0 \tag{41.2}$$

must be fulfilled. This condition is fulfilled if

$$\hbar^{-1}\int_a^b p\,dx = n\pi, \tag{41.3}$$

where the $n$ are the positive integers beginning with unity. Since the
momentum is constant and equal to $(2mE)^{1/2}$, we find

$$E_n = \frac{\pi^2\hbar^2 n^2}{2ml^2}, \tag{41.4}$$

where $l = b - a$ is the width of the well.

We see that in the simplest case of an infinitely deep potential well the 
quasi-classical approximation leads to an accurate expression for the energy 
spectrum (see §8). 

Let us now consider the general case of a potential well as shown in
fig. V.14. We assume that the forbidden region extends infinitely to the right
and left of the turning points. Then the quasi-classical wave function will not
contain any exponentially increasing terms and will be given by formulae of
the type (40.8) or (40.9).

Fig. V.14

The two wave functions (40.8) and (40.9) describing the motion of a particle
in a well must be identical:

$$2Cp^{-1/2}\cos\left(\hbar^{-1}\int_a^x p\,dx - \tfrac{1}{4}\pi\right) = 2C'p^{-1/2}\cos\left(\hbar^{-1}\int_x^b p\,dx - \tfrac{1}{4}\pi\right) \qquad (a < x < b). \tag{41.5}$$

This is possible only in the case where the sum of the two phases is equal to
an integer multiple of $\pi$:

$$\hbar^{-1}\int_a^b p\,dx - \tfrac{1}{2}\pi = n\pi, \tag{41.6}$$

where $n$ is an integer. Then

$$C' = (-1)^n C. \tag{41.7}$$


If one introduces the integral over the period of the classical motion of a
particle from $a$ to $b$ and back, $\int_a^b p\,dx = \tfrac{1}{2}\oint p\,dx$, then from
(41.6) we obtain

$$\oint p\,dx = 2\pi\hbar\left(n + \tfrac{1}{2}\right). \tag{41.8}$$


The above expression is none other than the Bohr quantization rule, from
which the stationary states of a particle in the quasi-classical case are
determined. Thus Bohr’s theory, with its inconsistent imposition of
quantization conditions upon purely classical quantities, turns out to be
completely valid within the limits of the quasi-classical approximation. We
note that the number $n$ is equal to the number of roots of the quasi-classical
wave function between the turning points $a$ and $b$, because as $x$ changes from
$a$ to $b$ the phase of the wave function increases from $-\tfrac{1}{4}\pi$ to
$(n + \tfrac{1}{2})\pi - \tfrac{1}{4}\pi$ and, consequently, the cosine reduces to zero $n$ times.
The larger the quantum number $n$, i.e. the smaller the de Broglie wavelength,
the better the conditions of applicability of the quasi-classical approximation
(39.20) are satisfied. Consequently, we expect that the energy levels obtained
from condition (41.8) coincide, for large values of $n$, with their exact values
as calculated from the solution of the Schrödinger equation. However, in some
cases, such as, for example, the harmonic oscillator, formula (41.8) gives the
correct value of the energy level for any value of $n$. The integral on the
left-hand side of eq. (41.8) represents the area bounded in the phase plane by
the classical phase trajectory of a particle with energy $E$. According to
(41.8), this area is equal to $2\pi\hbar n$ for $n \gg 1$. Since an energy level of the
system corresponds to each node of the wave function, the number $n$ gives the
number of states with energies less than or equal to $E$. Thus to each quantum
state in the phase plane there corresponds an area equal to $2\pi\hbar$. The number
of states corresponding to an area $\Delta p\,\Delta x$ in the phase plane will thus be
equal to

$$\frac{\Delta p\,\Delta x}{2\pi\hbar}. \tag{41.9}$$





Generalizing this formula to the three-dimensional case, we obtain for the
number of states corresponding to the volume
$\Delta x\,\Delta y\,\Delta z\,\Delta p_x\,\Delta p_y\,\Delta p_z$ in phase space

$$\frac{\Delta x\,\Delta y\,\Delta z\,\Delta p_x\,\Delta p_y\,\Delta p_z}{(2\pi\hbar)^3}. \tag{41.10}$$


This formula was the basis of our exposition of statistical physics. We see that
the quasi-classical approximation is like a bridge connecting classical and
quantum mechanics. It enables one to understand the meaning of Bohr’s
theory and the correspondence principle, and it makes it possible to eliminate
all apparent contradictions between the different aspects of the behaviour of
real particles. By means of the quasi-classical approximation we can find
directly the conditions and the degree of accuracy with which one can pass
over to the classical description of the motion of particles in many problems.
At the same time it gives a relatively simple method of describing quantum
systems approximately, such as, in particular, finding energy levels for
particles with a high energy.
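
The quantization rule (41.8) lends itself to a simple numerical treatment. The sketch below is a minimal illustration under stated assumptions: units with $\hbar = m = 1$, a single-well potential, bisection for the turning points and for the energy, and a midpoint rule for the action integral. All function names are introduced here for illustration. For the harmonic oscillator $U = \tfrac{1}{2}x^2$ it should reproduce $E_n = n + \tfrac{1}{2}$, in line with the remark that (41.8) is exact in this case.

```python
import math


def find_root(g, lo, hi, iterations=100):
    """Bisection for g(x) = 0 on [lo, hi]; g must change sign on the interval."""
    g_lo = g(lo)
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        g_mid = g(mid)
        if (g_mid > 0) == (g_lo > 0):
            lo, g_lo = mid, g_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def action_integral(U, E, a, b, m=1.0, steps=2000):
    """Integral of p dx from a to b, with p = [2 m (E - U)]^(1/2).

    The substitution x = c + h sin(t) smooths the square-root behaviour of
    p at the turning points before the midpoint rule is applied.
    """
    c, h = 0.5 * (a + b), 0.5 * (b - a)
    dt = math.pi / steps
    total = 0.0
    for i in range(steps):
        t = -0.5 * math.pi + (i + 0.5) * dt
        x = c + h * math.sin(t)
        p = math.sqrt(max(2.0 * m * (E - U(x)), 0.0))
        total += p * h * math.cos(t) * dt
    return total


def quantized_energy(U, n, x_min, E_hi, hbar=1.0, m=1.0, reach=50.0):
    """Solve the rule (41.8) for E in a well with its minimum at x_min.

    E_hi is an assumed upper bound for the level; reach is an assumed
    distance beyond which U certainly exceeds E_hi on both sides.
    """
    def mismatch(E):
        a = find_root(lambda x: U(x) - E, x_min - reach, x_min)
        b = find_root(lambda x: U(x) - E, x_min, x_min + reach)
        closed = 2.0 * action_integral(U, E, a, b, m)  # integral over a period
        return closed - 2.0 * math.pi * hbar * (n + 0.5)

    return find_root(mismatch, U(x_min) + 1e-9, E_hi)


def harmonic(x):
    """Harmonic oscillator potential with m = omega = 1."""
    return 0.5 * x * x


for n in range(4):
    print(n, quantized_energy(harmonic, n, 0.0, 50.0))
```

For other smooth wells the same routine gives only approximate levels, which are expected to improve as $n$ increases.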





158 THE QUASI-CLASSICAL APPROXIMATION Ch. 5 


§42. Potential barrier penetration 


In §13 we considered the passage of a microparticle through a rectangular
potential barrier. In this section we shall obtain more general formulae for
the case of the passage of particles through potential barriers of arbitrary
form (fig. V.15). We assume that the energies $E$ of the particles are
sufficiently large, and that the curve of the potential energy is sufficiently
smooth, so that, with the exception only of small regions around the turning
points $a$ and $b$, the conditions of applicability of the quasi-classical
approximation are everywhere fulfilled.








Fig. V.15 


Let a particle moving from left to right along the $x$-axis be incident on a
barrier. Then in the region behind the turning point $a$, i.e. for $x > a$, there
must be only a wave propagating in the positive direction of the $x$-axis (in
this region there is no reflected wave). The quasi-classical wave function for
$x > a$ can be written in the form of a superposition of the expressions (40.8)
and (40.13):

$$\psi(x) = 2Cp^{-1/2}\cos\left(\hbar^{-1}\int_a^x p\,dx - \tfrac{1}{4}\pi\right) + Bp^{-1/2}\cos\left(\hbar^{-1}\int_a^x p\,dx + \tfrac{1}{4}\pi\right). \tag{42.1}$$

Since at large distances from the point $a$ the momentum $p$ changes little, each
term of (42.1) is a superposition of two plane waves propagating in opposite
directions. It is easily seen that the superposition (42.1) describes a wave
propagating from left to right only under the condition $B = -2Ci$. In this case

$$\psi = 2Cp^{-1/2}\exp\left(i\hbar^{-1}\int_a^x p\,dx - \tfrac{1}{4}i\pi\right) \qquad (x > a). \tag{42.2}$$




Let us find the quasi-classical wave function in the region $b < x < a$.
Taking the superposition of (40.8) and (40.13) and taking into account the
relation between $B$ and $C$, we obtain

$$\psi(x) = C|p|^{-1/2}\left[\exp\left(-\hbar^{-1}\int_x^a |p|\,dx\right) + 2i^{-1}\exp\left(\hbar^{-1}\int_x^a |p|\,dx\right)\right] \qquad (b < x < a). \tag{42.3}$$


This relation is conveniently rewritten in the form 


$$\psi(x) = C|p|^{-1/2}\left[e^{-L}\exp\left(\hbar^{-1}\int_b^x |p|\,dx\right) - 2i\,e^{L}\exp\left(-\hbar^{-1}\int_b^x |p|\,dx\right)\right] \qquad (b < x < a), \tag{42.4}$$


where 


$$L = \hbar^{-1}\int_b^a |p|\,dx.$$


Making use now of the relations (40.9) and (40.14), one easily obtains the 
expression for the wave function in the region in front of the barrier 


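$$\begin{aligned}
\psi(x) &= C p^{-1/2}\left[4i^{-1}e^{L}\cos\left(\hbar^{-1}\int_x^b p\,dx - \tfrac{1}{4}\pi\right) + e^{-L}\cos\left(\hbar^{-1}\int_x^b p\,dx + \tfrac{1}{4}\pi\right)\right] \\
&= 2i^{-1}C p^{-1/2}\left[\left(e^{L} - \tfrac{1}{4}e^{-L}\right)\exp\left(i\hbar^{-1}\int_x^b p\,dx - \tfrac{1}{4}i\pi\right) + \left(e^{L} + \tfrac{1}{4}e^{-L}\right)\exp\left(-i\hbar^{-1}\int_x^b p\,dx + \tfrac{1}{4}i\pi\right)\right] \qquad (x < b).
\end{aligned} \tag{42.5}$$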










Thus we have written the wave function in the region in front of the 
barrier in the form of a superposition of incident and reflected waves. We 
now determine the transmission coefficient, D, of particles through the 
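barrier. Making use of (42.2), we calculate the current density of particles
transmitted through the barrier, $j_t$:

$$j_t = 4|C|^2 m^{-1}. \tag{42.6}$$

The incident wave is the second term of (42.5), which propagates in the
positive direction of the x-axis. The current density of incident particles,
in correspondence with (42.5), is therefore equal to

$$j_{\rm inc} = 4|C|^2 m^{-1}\left(e^{L} + \tfrac{1}{4}e^{-L}\right)^2. \tag{42.7}$$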


Consequently, we obtain the following expression for the transmission coeffi- 
cient D: 


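$$D = \frac{j_t}{j_{\rm inc}} = \frac{e^{-2L}}{\left(1 + \tfrac{1}{4}e^{-2L}\right)^2}. \tag{42.8}$$

For a sufficiently wide potential barrier, $e^{-2L} \ll 1$, we obtain

$$D = e^{-2L} = \exp\left(-2\hbar^{-1}\int_b^a |p|\,dx\right). \tag{42.9}$$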


We note that if the potential energy on one side of the barrier changes 
rapidly enough, so that the quasi-classical approximation is inapplicable, then 
in the expression (42.9) a factor will appear before the exponential function. 
However, the basic exponential factor does not change. Formula (42.9) is 
widely used for the calculation of the probabilities of transmission of particles 
through potential barriers. 
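
As a numerical illustration of formulae (42.8) and (42.9), the following minimal sketch evaluates the exponent L by direct integration. The parabolic barrier, the units ħ = m = 1 and all function names are assumptions made for this example only; they do not come from the text.

```python
import math

def barrier_exponent(U, E, b, a, m=1.0, hbar=1.0, n=20000):
    """L = (1/hbar) * integral from b to a of |p| dx, |p| = sqrt(2m(U - E)), midpoint rule."""
    h = (a - b) / n
    total = 0.0
    for i in range(n):
        x = b + (i + 0.5) * h
        total += math.sqrt(max(2.0 * m * (U(x) - E), 0.0))
    return total * h / hbar

def transmission_full(L):
    """Transmission coefficient according to (42.8)."""
    return math.exp(-2.0 * L) / (1.0 + 0.25 * math.exp(-2.0 * L)) ** 2

def transmission_wide(L):
    """Transmission coefficient according to (42.9), wide-barrier limit."""
    return math.exp(-2.0 * L)

# Hypothetical parabolic barrier U(x) = U0 * (1 - x^2 / w^2), units hbar = m = 1.
U0, w, E = 2.0, 3.0, 1.0

def U(x):
    return U0 * (1.0 - (x / w) ** 2)

x0 = w * math.sqrt(1.0 - E / U0)   # turning points: b = -x0, a = +x0
L_num = barrier_exponent(U, E, -x0, x0)
# Closed form of the same integral for this particular barrier.
L_exact = math.pi * w * (U0 - E) * math.sqrt(2.0 / U0) / 2.0

print(L_num, L_exact)
print(transmission_full(L_num), transmission_wide(L_num))
```

For this barrier the two expressions for D differ only by the factor (1 + ¼e^{-2L})^{-2}, which is very close to unity.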

As an example let us consider the theory of α-decay. It is well known that
all heavy nuclei with mass numbers of the order of magnitude of 200 turn out
to be unstable with respect to α-decay. The probability of decay strongly
depends on the energy of the emitted α-particles and varies over a very wide
range. Thus if the decay probability is characterized by a half-life τ, then τ is
equal to 1.6×10⁻⁴ sec in the case of ²¹⁴Po, which emits α-particles with an
energy of 7.8 MeV, whereas τ is equal to 1.4×10¹⁰ years in the case of ²³²Th,
which emits α-particles with an energy of 4 MeV. Such a strong dependence
of the probability of α-decay on the energy is accounted for by the fact that
the particle, in order to get out of the nucleus, must pass through a potential
barrier*. Indeed, simplifying the treatment, we can assume that the initial
nucleus already contains an α-particle. Then the problem reduces to the
calculation of the probability that the α-particle will leave the initial, parent
nucleus.

* R. Gurney and E. Condon, Nature 122 (1928) 439; Phys. Rev. 33 (1929) 127.

We denote by U(r) the energy of interaction of the α-particle with the
remaining daughter nucleus. At small distances U(r) amounts to the potential
of nuclear forces, which we shall consider constant and equal to $U_0$, while at
large distances it is just the Coulomb interaction between the α-particle and
the remaining nucleus (see fig. V.16):

$$U(r) = \begin{cases} 2Ze^2/r & (r > r_0), \\ U_0 & (r < r_0), \end{cases}$$

where $r_0$ is a distance of the order of magnitude of the nuclear size.








Fig. V.16 


Making use of formula (42.9), we can find the probability of the α-particle
passing through the potential barrier. The fact that formula (42.9) is derived
for one-dimensional motion is of no importance here, because we have
established in §35 that a radial motion is equivalent to a one-dimensional
motion with a certain effective potential energy.

For simplicity we consider the case where l = 0, so that the centrifugal
energy is not involved in the calculations (see also the next section). We then
have

$$D = \exp\left\{-2\hbar^{-1}\int_{r_0}^{a}\left[2\mu\left(\frac{2Ze^2}{r} - E\right)\right]^{1/2} dr\right\}, \tag{42.10}$$

where the turning point a is determined by the condition $a = 2Ze^2/E$, and μ is
the reduced mass of the α-particle and the daughter nucleus. We calculate the
integral L:

$$L = \left(\frac{4\mu Ze^2}{\hbar^2}\right)^{1/2}\int_{r_0}^{a}\left(1 - \frac{Er}{U_{\max}r_0}\right)^{1/2} r^{-1/2}\,dr,$$

where $U_{\max} = 2Ze^2/r_0$. On substituting $(Er/U_{\max}r_0)^{1/2} = \sin\alpha$, we easily obtain

$$L = \frac{2Ze^2}{\hbar v}\left(\pi - 2\alpha_0 - \sin 2\alpha_0\right), \tag{42.11}$$





where $\sin\alpha_0 = (E/U_{\max})^{1/2}$ and $v = (2E/\mu)^{1/2}$ is the velocity of the α-particle, if
the difference between the reduced mass μ and the mass of the α-particle is
neglected. Thus the probability for the α-particle to pass through the barrier
is given by the expression

$$D = \exp\left[-\frac{4Ze^2}{\hbar v}\left(\pi - 2\alpha_0 - \sin 2\alpha_0\right)\right]. \tag{42.12}$$


We obtain the probability λ for α-decay if we multiply the probability D of
passing through the barrier by the probability ν for α-decay in the absence of
the barrier, λ = νD. The quantity ν cannot be calculated with any accuracy. It
is significant, however, that the very strong dependence of the decay probability
on the energy of the α-particle is involved in the factor D. Qualitatively
this dependence is well confirmed by experiment.

Finally we note that similar reasoning is also applicable in the case of 
spontaneous fission of heavy nuclei. 
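
The strength of the energy dependence contained in D can be seen from a rough numerical sketch of (42.10)–(42.12). The constants (ħc, e² in Gaussian units, the α-particle rest energy), the daughter charge numbers Z = 82 and Z = 88, and the radius r₀ = 9 fm are assumptions chosen for illustration; they are not given in the text, and μ is replaced by the α-particle mass.

```python
import math

HBAR_C = 197.327    # hbar * c in MeV * fm (assumed value)
E2 = 1.43996        # e^2 in MeV * fm, Gaussian units (assumed value)
M_ALPHA = 3727.38   # alpha-particle rest energy in MeV (assumed value)

def gamow_L_closed(Z, E, r0, mu=M_ALPHA):
    """Exponent L from the closed form (42.11); E in MeV, r0 in fm."""
    u_max = 2.0 * Z * E2 / r0
    alpha0 = math.asin(math.sqrt(E / u_max))
    v_over_c = math.sqrt(2.0 * E / mu)
    factor = 2.0 * Z * E2 / (HBAR_C * v_over_c)
    return factor * (math.pi - 2.0 * alpha0 - math.sin(2.0 * alpha0))

def gamow_L_numeric(Z, E, r0, mu=M_ALPHA, n=20000):
    """Same exponent by midpoint integration of the integrand in (42.10)."""
    a = 2.0 * Z * E2 / E               # outer turning point
    h = (a - r0) / n
    total = 0.0
    for i in range(n):
        r = r0 + (i + 0.5) * h
        total += math.sqrt(max(2.0 * mu * (2.0 * Z * E2 / r - E), 0.0))
    return total * h / HBAR_C

R0 = 9.0  # fm, hypothetical nuclear radius
L_po = gamow_L_closed(82, 7.8, R0)   # daughter of 214Po, E = 7.8 MeV
L_th = gamow_L_closed(88, 4.0, R0)   # daughter of 232Th, E = 4 MeV
D_po = math.exp(-2.0 * L_po)
D_th = math.exp(-2.0 * L_th)

print(L_po, L_th)
print(D_po / D_th)
```

With these assumed parameters a change of the energy by a factor of about two changes D by more than twenty orders of magnitude, which is the qualitative behaviour described above.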


§43. Quasi-classical motion in a centrally symmetric field 


Let us find an approximate expression for the radial component of the
wave function R(r) or the function χ(r) = rR if the potential energy U(r)
satisfies the condition for the quasi-classical approximation. We can make use
of the relations already derived, since the function χ(r), as we know (see §35),
is described by the one-dimensional Schrödinger wave equation with the
effective potential energy

$$U_{\rm eff}(r) = U(r) + \frac{\hbar^2 l(l+1)}{2mr^2}.$$

However, in this case, of course, the fact that the coordinate r, as distinct
from x, varies from 0 to ∞ must be taken into account. For l = 0 we have
$U_{\rm eff}(r) = U(r)$. If the quasi-classical conditions are fulfilled up to the point
r = 0, then the corresponding wave function can easily be obtained. Indeed,
the condition for the finiteness of the wave function at zero gives χ(0) = 0
and, making use of (40.15), we obtain

$$\chi(r) = C\left[2m(E - U)\right]^{-1/4}\sin\left(\hbar^{-1}\int_0^r\left[2m(E - U(r))\right]^{1/2} dr\right). \tag{43.1}$$


In the more general case l ≠ 0 the effective potential energy $U_{\rm eff}(r)$ must
satisfy the quasi-classical condition. If it is assumed that at small distances the
centrifugal energy $\hbar^2 l(l+1)/2mr^2$ plays the basic role and that $p \sim \hbar l/r$, then
it follows from the condition (39.20) that l ≫ 1. The corresponding wave
function for r > a, where a is the turning point, can be written in the form
(40.8) with the condition, however, that in the centrifugal energy the quantity
l(l + 1) is replaced by* $(l + \tfrac{1}{2})^2$:

$$\chi(r) = A p_r^{-1/2}\cos\left(\hbar^{-1}\int_a^r p_r\,dr - \tfrac{1}{4}\pi\right), \tag{43.2}$$

where

$$p_r = \left\{2m\left[E - U(r) - \frac{\hbar^2\left(l + \tfrac{1}{2}\right)^2}{2mr^2}\right]\right\}^{1/2}. \tag{43.3}$$

Thus the term $\hbar^2/8mr^2$ is added to the centrifugal energy. This addition leads
to a more correct value of the phase of the wave function. Thus for free
motion, U = 0, formula (43.2) gives for the phase of the wave function at
large distances the value which we have obtained in §35**.
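
The effect of the replacement of l(l + 1) by (l + ½)² on the phase can be checked numerically for free motion. The sketch below assumes that the result of §35 referred to here is the standard asymptotic form χ ∝ sin(kr − ½lπ); the closed form of the phase integral for U = 0 and all names are introduced for this illustration only.

```python
import math

def wkb_phase(k, lam, r):
    """Phase of (43.2) for U = 0: integral from a to r of sqrt(k^2 - lam^2/r'^2) dr' - pi/4.

    Here p_r = hbar * sqrt(k^2 - lam^2 / r^2) and the turning point is a = lam / k.
    The integral is evaluated in closed form.
    """
    s = math.sqrt(k * k * r * r - lam * lam)
    return s - lam * math.acos(lam / (k * r)) - math.pi / 4.0

k, l, r = 1.0, 2, 1.0e5

# cos(k r - l*pi/2 - pi/2) = sin(k r - l*pi/2): the assumed exact asymptotic phase.
exact_phase = k * r - l * math.pi / 2.0 - math.pi / 2.0

phase_half = wkb_phase(k, l + 0.5, r)                   # with (l + 1/2)^2
phase_ll1 = wkb_phase(k, math.sqrt(l * (l + 1)), r)     # with l(l + 1)

print(phase_half - exact_phase)
print(phase_ll1 - exact_phase)
```

With (l + ½)² the quasi-classical phase approaches the assumed exact value as r grows, whereas with l(l + 1) a constant phase error remains.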


* H.A. Kramers, Z. Phys. 39 (1926) 828. 


** See, for example, L.D.Landau and E.M.Lifshitz, Quantum mechanics (Pergamon 
Press, Oxford, 1965). 











The Matrix Form of Quantum Mechanics 


§44. Operators and matrices 


The mathematical apparatus of quantum mechanics developed in the 
preceding chapters (the method of linear Hermitian operators) is not the only 
mathematical apparatus used in quantum mechanics. It turns out that all 
mechanical quantities in quantum mechanics can be related to so-called 
Hermitian matrices as well as to operators. A matrix R is understood to be 
the whole set of quantities forming the table 


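$$R = \begin{pmatrix} R_{11} & R_{12} & \cdots & R_{1n} \\ R_{21} & R_{22} & \cdots & R_{2n} \\ \vdots & \vdots & & \vdots \\ R_{n1} & R_{n2} & \cdots & R_{nn} \end{pmatrix}. \tag{44.1}$$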


The number of rows and columns in the table need not, in the general case, 
be the same. Each of the quantities (generally speaking complex) appearing 
in the table is called a matrix element. A matrix element has two indices: the 




first denotes the ordinal number of the row, and the second denotes the 
ordinal number of the column. The concept of matrices is usually introduced 
in connection with the linear transformation of vectors in an n-dimensional 
space*. 

We shall see below that in quantum mechanics it is possible to make a 
geometrical interpretation of the wave function as a vector in a certain imagi- 
nary space. In the meantime we shall convince ourselves by means of very 
general reasoning of the fact that any linear operator $\hat F$ can be related to a
matrix F with definite values of the matrix elements.

The definition of an operator means that the result of its action on the
function ψ(x) is given by

$$\hat R\psi(x) = \varphi(x). \tag{44.2}$$


We pass over from the x-representation to the F-representation. For this
we expand the functions ψ(x) and φ(x) in terms of the eigenfunctions $\psi_m(x)$
of the operator $\hat F$. We assume that the operator $\hat F$ has a discrete spectrum. For
example, such a representation can be the energy representation
(E-representation)

$$\psi(x) = \sum_m c_m\psi_m(x), \qquad \varphi(x) = \sum_n b_n\psi_n(x). \tag{44.3}$$

The whole set of amplitudes $c_m$ (or $b_n$) determines the wave function ψ
(or φ) in the F-representation. Sometimes this set is conveniently denoted in
the form of a column

$$\begin{pmatrix} c_1 \\ c_2 \\ \vdots \end{pmatrix}, \qquad \begin{pmatrix} b_1 \\ b_2 \\ \vdots \end{pmatrix}.$$

We substitute the expansion (44.3) into (44.2), and obtain

$$\sum_n b_n\psi_n(x) = \sum_m c_m\hat R\psi_m(x). \tag{44.4}$$

Multiplying the left-hand and right-hand sides of this equality by $\psi_l^*(x)$ and
integrating over the entire region of variation of the independent variables, we
find

$$b_l = \sum_m R_{lm}c_m, \tag{44.5}$$

where

$$R_{lm} = \int\psi_l^*(x)\hat R\psi_m(x)\,dV. \tag{44.6}$$

The relation (44.5) determines directly the transformation of the function ψ
into the function φ in the F-representation under the action of the operator
$\hat R$. The operator $\hat R$ in this representation is given by formula (44.6), i.e. in the
form of a matrix. Thus the definition of the matrix R is equivalent to the
definition of the operator $\hat R$ itself.

* For more details see for example V.I.Smirnov, A course of higher mathematics
(Pergamon Press, Oxford, 1964).

The matrix element $R_{ik}$ is sometimes called the matrix element corresponding
to the transition from the kth state into the ith state. Such a terminology
is based on the following reasoning. We assume that the initial state of
the system is the kth state, $\psi(x) = \psi_k(x)$. Under the action of the operator $\hat R$
the transformation (44.2) takes place. Making use of (44.3) and (44.5) and
taking into account that in the given case $c_m = \delta_{mk}$, $b_n = R_{nk}$, we obtain

$$\varphi(x) = \hat R\psi_k = \sum_n b_n\psi_n = \sum_n R_{nk}\psi_n(x), \tag{44.7}$$

and, consequently, the square of the modulus of the matrix element $R_{ik}$
determines the probability of finding the system in the ith state.

Knowing the matrix corresponding to the quantity R, one can also easily
find the mean value of this quantity in a certain state ψ. According to the
general formula (22.4) we have

$$\bar R = \int\psi^*\hat R\psi\,dV.$$

Substituting here the expansion (44.3) instead of ψ, we obtain

$$\bar R = \sum_m\sum_n c_m^*c_n\int\psi_m^*\hat R\psi_n\,dV = \sum_m\sum_n c_m^*R_{mn}c_n. \tag{44.8}$$
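
The construction (44.6) and the mean value (44.8) can be illustrated numerically. The sketch below assumes, purely as an example, the basis of a particle in a box of width a = 1 with ψ_n = √2 sin(nπx) and takes the coordinate x as the operator; neither choice comes from the text, and the helper names are hypothetical.

```python
import math

def basis(n, x, a=1.0):
    """Assumed basis function psi_n(x) = sqrt(2/a) * sin(n*pi*x/a) on 0 < x < a."""
    return math.sqrt(2.0 / a) * math.sin(n * math.pi * x / a)

def x_operator(psi, x):
    """Action of the coordinate operator: (x psi)(x)."""
    return x * psi(x)

def matrix_element(l, m, op, a=1.0, steps=4000):
    """R_lm = integral of psi_l * (op psi_m) dx as in (44.6); the basis is real."""
    h = a / steps
    total = 0.0
    for i in range(steps):
        x = (i + 0.5) * h
        total += basis(l, x, a) * op(lambda t: basis(m, t, a), x)
    return total * h

N = 3
R = [[matrix_element(l, m, x_operator) for m in range(1, N + 1)]
     for l in range(1, N + 1)]

def mean_value(R, c):
    """Mean value (44.8): sum over m, n of c_m^* R_mn c_n."""
    size = len(c)
    return sum(c[m].conjugate() * R[m][n] * c[n]
               for m in range(size) for n in range(size))

# State with equal amplitudes of the first two basis functions.
c = [complex(1.0 / math.sqrt(2.0)), complex(1.0 / math.sqrt(2.0)), 0j]
mean_x = mean_value(R, c)

print(R[0][0], R[0][1], R[1][0])
print(mean_x.real)
```

The computed matrix is symmetric and real, in agreement with the Hermitian property of matrix elements, and the mean value follows directly from the amplitudes without any further integration.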
We note that if we determine the matrix elements (44.6) by means of the
wave functions $\psi_n$ which are the eigenfunctions of the operator $\hat R$, then we
also determine the matrix R of the operator in its own representation

$$R_{ml} = \int\psi_m^*\hat R\psi_l\,dV = R_l\int\psi_m^*\psi_l\,dV = R_l\delta_{ml}. \tag{44.9}$$

We see that in this case only the matrix elements with m = l are different from
zero. Matrices of such a form are called diagonal

$$R = \begin{pmatrix} R_1 & 0 & \cdots & 0 \\ 0 & R_2 & \cdots & 0 \\ \vdots & \vdots & & \vdots \\ 0 & 0 & \cdots & R_n \end{pmatrix}. \tag{44.10}$$


Thus in its own representation any operator is described by a diagonal matrix, 
and the diagonal elements are equal to the eigenvalues of this operator. 

The definition of the matrix elements $R_{ml}$ is completely equivalent to the
definition of the operator $\hat R$. As we shall see, it enables one to determine the
eigenvalues and eigenfunctions of this operator. On the other hand, if the
operator $\hat R$ is known, then also the matrix elements $R_{ml}$ can be determined.

The Hermitian character of the operator $\hat R$ imposes a certain restriction
upon the form of the matrix elements $R_{ml}$. Namely, for the matrix element
$R_{ml}^*$, the complex conjugate of the element $R_{ml}$, we have

$$R_{ml}^* = \left(\int\psi_m^*\hat R\psi_l\,dV\right)^* = \int\psi_m\hat R^*\psi_l^*\,dV.$$

By definition of the Hermitian operator

$$\int\psi_m\hat R^*\psi_l^*\,dV = \int\psi_l^*\hat R\psi_m\,dV,$$

so that

$$R_{ml}^* = R_{lm}.$$

We see that the Hermitian property of matrix elements

$$R_{lm} = R_{ml}^* \tag{44.11}$$

follows from the requirement of the Hermitian character of the operator.

Every physical quantity in quantum mechanics can be related not only to a
Hermitian operator $\hat R$ but also to a Hermitian matrix R whose matrix elements
are determined by formula (44.6).

As we shall see below, the matrix form of quantum mechanics is, in 
certain cases, more convenient than the operator form. The representation of 
quantum mechanics in the matrix form will enable us to formulate the equa- 
tions of quantum mechanics like the equations of classical physics. The wave 
function will no longer appear in them. The equations in this form will 
coincide with those of classical mechanics, with only the basic difference that
in these equations classical quantities will be replaced by the corresponding
matrices. However, before passing over to a systematic exposition of quantum
mechanics in the matrix form it is necessary to present the basic notions of
matrix calculus.


§45. The fundamentals of matrix calculus 


In the preceding section we have defined an arbitrary matrix R as the
whole set of quantities $R_{mn}$ arranged in a definite order in the form of the
table

$$R = \begin{pmatrix} R_{11} & R_{12} & \cdots \\ R_{21} & R_{22} & \cdots \\ \vdots & \vdots & \\ R_{n1} & R_{n2} & \cdots & R_{nn} \end{pmatrix}. \tag{45.1}$$

Matrices in which the number of columns is equal to the number of rows
are called square matrices. A matrix can be finite, if the number of columns
and the number of rows is finite, as well as infinite if the number is arbitrarily
large. The matrix elements $R_{11}, R_{22}, \ldots, R_{nn}, \ldots$ forming the diagonal of the
matrix are called the diagonal elements. The matrix for which all elements are
equal to zero is called the null matrix 0. The matrix for which all the
diagonal elements are equal to unity and all the other elements are equal to
zero is called the unit matrix. We shall denote this matrix by

$$1 = \begin{pmatrix} 1 & 0 & 0 & \cdots \\ 0 & 1 & 0 & \cdots \\ 0 & 0 & 1 & \cdots \\ \vdots & \vdots & \vdots & \end{pmatrix}, \tag{45.2}$$

i.e. $(1)_{mn} = \delta_{mn}$. A matrix can be considered as a certain hypercomplex
number, just as the set of two numbers a and b can be treated as one complex
number a + ib. As in the case of complex numbers, one can construct an
algebra of hypercomplex numbers (matrices) by defining the action of addition
and multiplication of matrices.

The matrix R, each element of which is equal to the sum of the corresponding
matrix elements of the matrices F and D,

$$R = F + D \quad\text{if}\quad R_{ml} = F_{ml} + D_{ml}, \tag{45.3}$$

is called the sum of the two matrices F and D. One can add or subtract only
matrices having the same number of rows and columns. Two matrices F and D
are equal to each other if the corresponding matrix elements are equal to each
other:

$$F = D \quad\text{if}\quad F_{ml} = D_{ml}. \tag{45.4}$$

Further we define the product of the number k and the matrix D as a
matrix F each element of which is equal to the product of the number k and
the corresponding matrix element of the matrix D:

$$F = kD \quad\text{if}\quad F_{ml} = kD_{ml}. \tag{45.5}$$

The matrix L is called the product of the matrices F and D,

$$L = FD,$$

if the matrix element $L_{mn}$ is equal to

$$L_{mn} = \sum_l F_{ml}D_{ln}. \tag{45.6}$$

This means that each element $L_{mn}$ of the matrix L is equal to the sum of the
products of the elements of the mth row of the matrix F and the corresponding
elements of the nth column of the matrix D.

The matrix F can be multiplied by the matrix D only in the case where the
number of columns of the matrix F is equal to the number of rows of the
matrix D. We stress that the product of matrices, like the product of operators,
is non-commuting, i.e. in the general case

$$FD \neq DF.$$

If the unit matrix (45.2) is taken as one of the factors, then we arrive at
the equalities

$$F\cdot 1 = 1\cdot F = F, \tag{45.7}$$



i.e. multiplication by the unit matrix is commutative. We note that when two 
matrices with matrix elements different from zero are multiplied a null matrix 
can be obtained. Let, for example, 


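$$F = \begin{pmatrix} 1 & 0 \\ 1 & 0 \end{pmatrix}, \qquad D = \begin{pmatrix} 0 & 0 \\ 1 & 1 \end{pmatrix}.$$

Then

$$L = FD = \begin{pmatrix} 0 & 0 \\ 0 & 0 \end{pmatrix} = 0.$$

On the other hand

$$L' = DF = \begin{pmatrix} 0 & 0 \\ 2 & 0 \end{pmatrix} \neq 0.$$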
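
The multiplication rule (45.6) and the non-commutativity of the product are easy to verify directly. In the sketch below the function name and the two matrices are illustrative choices, not taken from the text.

```python
def matmul(A, B):
    """Product L = A B with L[m][n] = sum over l of A[m][l] * B[l][n], as in (45.6)."""
    if len(A[0]) != len(B):
        raise ValueError("number of columns of A must equal number of rows of B")
    return [[sum(A[m][l] * B[l][n] for l in range(len(B)))
             for n in range(len(B[0]))]
            for m in range(len(A))]

# Illustrative matrices with non-zero elements.
F = [[1, 0], [1, 0]]
D = [[0, 0], [1, 1]]

FD = matmul(F, D)   # the null matrix
DF = matmul(D, F)   # differs from FD

print(FD)
print(DF)
```

The product of two non-zero matrices vanishes in one order and not in the other, so that in general FD and DF must be distinguished.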


Analogously to (45.6) one can form the product of three and more
matrices. Thus if FD = L, then the product RFD is equal to

$$(RFD)_{mn} = \sum_l R_{ml}L_{ln} = \sum_l\sum_p R_{ml}F_{lp}D_{pn}. \tag{45.8}$$

We arrive at exactly the same expression if we take the product of the
matrix RF and the matrix D. Thus for the matrix product the associative law

$$(RF)D = R(FD) \tag{45.9}$$

holds.
It is easily shown that the distributive law

$$R(F + D) = RF + RD \tag{45.10}$$

is also valid.

The definition of the basic actions of matrices, given by formulae
(45.3)–(45.6), corresponds completely to the analogous relations for linear
operators. For formulae (45.3)–(45.5) this statement is obvious. One can
easily verify this correspondence also for formula (45.6). Let the operator $\hat L$
be equal to the product of the operators $\hat F$ and $\hat D$, i.e. $\hat L = \hat F\hat D$. We find the
matrix corresponding to this operator in an arbitrary representation. We
assume that the operator $\hat D$ transforms the function ψ into the function φ,
and the operator $\hat F$ correspondingly transforms φ into χ, so that

$$\varphi = \hat D\psi, \qquad \chi = \hat F\varphi. \tag{45.11}$$

We rewrite these equations, expanding the wave functions ψ, φ and χ in a
series in terms of a certain system of functions $\psi_n$. Let the expansions

$$\psi = \sum_n c_n\psi_n, \qquad \varphi = \sum_l b_l\psi_l, \qquad \chi = \sum_m d_m\psi_m$$

hold. Comparing with (44.2) and (44.5), we obtain

$$b_l = \sum_n D_{ln}c_n, \qquad d_m = \sum_l F_{ml}b_l.$$

Substituting $b_l$ into the second of these equations, we have

$$d_m = \sum_l\sum_n F_{ml}D_{ln}c_n. \tag{45.12}$$

On the other hand, taking into account that $\chi = \hat L\psi$, we can write that

$$d_m = \sum_n L_{mn}c_n. \tag{45.13}$$

Comparing (45.13) with (45.12), we arrive at eq. (45.6).

If a matrix F has an unequal number of columns and rows, then, by 
striking off a certain number of columns or rows, one obtains a square matrix 
with an equal number of rows and columns. From this table one can calculate 
the determinant det F of the matrix F. 

The highest possible order of this determinant is obtained when the mini- 
mum number of rows or columns is struck off. The highest order of the non- 
zero determinant obtained from the matrix is called the rank of the matrix. 

In certain applications the sum of the diagonal elements of the matrix,
often called the trace, plays an important role. By definition

$$\operatorname{Tr}F = \sum_n F_{nn}.$$

The matrix F is called non-singular if one can construct the inverse matrix.
The inverse matrix, which we shall denote by $F^{-1}$, satisfies the equations

$$FF^{-1} = 1, \qquad F^{-1}F = 1. \tag{45.14}$$








To find the elements of the matrix $F^{-1}$ it is necessary to find the solution of
the system of linear equations which is obtained from the definition (45.14):

$$\sum_k F_{mk}(F^{-1})_{kn} = \delta_{mn}, \qquad \sum_k (F^{-1})_{mk}F_{kn} = \delta_{mn} \tag{45.15}$$

for all possible values m and n.
The system of equations (45.15) can be solved only in the case where the
determinant, det F, of the matrix F differs from zero. (It is assumed that the
matrix F is a square matrix.)
Making use of (45.14) it is easy to find the matrix which is the inverse of
the product of matrices RFD... (if it exists)

$$(RFD\ldots)^{-1} = \ldots D^{-1}F^{-1}R^{-1}. \tag{45.16}$$


Further we define the conjugate matrix $F^\dagger$ (or the Hermitian conjugate) of
the original matrix F

$$(F^\dagger)_{mn} = (F_{nm})^*. \tag{45.17}$$

We denote the complex conjugate matrix element $(F_{nm})^*$ by $F_{nm}^*$.
It follows directly from the definition (45.17) that

$$(F + D)^\dagger = F^\dagger + D^\dagger. \tag{45.18}$$

It is also easy to define the matrix conjugate to the product of matrices. Thus,
if L = FD, then

$$(L^\dagger)_{mn} = (L_{nm})^* = \sum_k F_{nk}^*D_{km}^* = \sum_k (D^\dagger)_{mk}(F^\dagger)_{kn} \tag{45.19}$$

and, consequently,

$$(FD)^\dagger = D^\dagger F^\dagger. \tag{45.20}$$

In particular, if L = kD, where k is a number, then

$$L^\dagger = k^*D^\dagger. \tag{45.21}$$

Of course, the relation (45.20) is immediately generalized to the product of
any number of matrices

$$(FDR\ldots)^\dagger = \ldots R^\dagger D^\dagger F^\dagger. \tag{45.22}$$
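
The reversal of the order of the factors in (45.16) and (45.20) can be checked on a small example. The 2×2 complex matrices and helper names below are arbitrary illustrative choices.

```python
def matmul(A, B):
    """Matrix product according to (45.6)."""
    return [[sum(A[m][l] * B[l][n] for l in range(len(B)))
             for n in range(len(B[0]))]
            for m in range(len(A))]

def dagger(A):
    """Hermitian conjugate: (A^dagger)[m][n] = conj(A[n][m]), as in (45.17)."""
    return [[A[n][m].conjugate() for n in range(len(A))]
            for m in range(len(A[0]))]

def inverse_2x2(A):
    """Inverse of a 2x2 matrix; requires a non-zero determinant."""
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    if det == 0:
        raise ValueError("matrix is singular")
    return [[A[1][1] / det, -A[0][1] / det],
            [-A[1][0] / det, A[0][0] / det]]

def close(A, B, tol=1e-12):
    """True if all corresponding elements agree within tol."""
    return all(abs(A[m][n] - B[m][n]) < tol
               for m in range(len(A)) for n in range(len(A[0])))

# Arbitrary non-singular complex matrices (illustrative values).
F = [[1 + 2j, 3 + 0j], [0.5j, 2 - 1j]]
D = [[2 + 0j, 1j], [1 - 1j, 4 + 0j]]

FD = matmul(F, D)
print(close(dagger(FD), matmul(dagger(D), dagger(F))))
print(close(inverse_2x2(FD), matmul(inverse_2x2(D), inverse_2x2(F))))
```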


If the matrix F is the same as its conjugate matrix $F^\dagger$, $F = F^\dagger$, then it is
Hermitian or self-adjoint. This definition is analogous to the definition of the
Hermitian property of an operator (see (44.11)). For the matrix elements of
a Hermitian matrix we have

$$F_{nm} = (F^\dagger)_{nm} = F_{mn}^*. \tag{45.23}$$

The matrix F is called unitary if $F^\dagger = F^{-1}$. This condition can also be
rewritten as follows:

$$F^\dagger F = FF^\dagger = 1. \tag{45.24}$$

If the matrices F and D are unitary, then their product is also unitary. Indeed,

$$(FD)^\dagger = D^\dagger F^\dagger = D^{-1}F^{-1} = (FD)^{-1}. \tag{45.25}$$


We shall also encounter the simplest functions of matrices, where the definition
of a function of the matrix means the definition of the law according
to which one matrix is put into correspondence with another. Since the rules
for the multiplication and addition of matrices are defined, one can easily
introduce the notion of an integer rational function f(D) of the matrix D.
Furthermore, we shall also deal with more complex functions, for example of
the form $e^D$, where D is a matrix. The function $e^D$ is understood to be the
following series:

$$e^D = 1 + D + \frac{1}{2!}D^2 + \ldots + \frac{1}{n!}D^n + \ldots. \tag{45.26}$$

We shall show, for example, that the matrix R, defined by the function
$R = e^{iF}$, where F is an arbitrary Hermitian matrix, is unitary. By a direct
calculation it is easily checked that the matrix $R^{-1}$, the inverse of the matrix R, is
the matrix $e^{-iF}$. On the other hand, for the matrix conjugate to R we have

$$R^\dagger = \left(1 + iF - \frac{1}{2!}F^2 - \frac{i}{3!}F^3 + \ldots\right)^\dagger = 1 - iF - \frac{1}{2!}F^2 + \frac{i}{3!}F^3 + \ldots = e^{-iF}$$

(because $F^\dagger = F$). Thus $R^\dagger = R^{-1}$ and, consequently, the matrix R is unitary.
The rules of action formulated in this section also remain valid for matrices
with an infinitely large number of columns and rows, provided all sums
(45.6) are convergent.
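
The unitarity of $e^{iF}$ for a Hermitian matrix F can be verified by summing the series (45.26) numerically. The sketch below truncates the series after a fixed number of terms; the truncation length, the particular matrix and the helper names are assumptions made for illustration.

```python
def matmul(A, B):
    """Matrix product according to (45.6)."""
    return [[sum(A[m][l] * B[l][n] for l in range(len(B)))
             for n in range(len(B[0]))]
            for m in range(len(A))]

def dagger(A):
    """Hermitian conjugate according to (45.17)."""
    return [[A[n][m].conjugate() for n in range(len(A))]
            for m in range(len(A[0]))]

def identity(n):
    return [[1.0 + 0j if i == j else 0j for j in range(n)] for i in range(n)]

def mat_exp(A, terms=40):
    """Truncated series (45.26): 1 + A + A^2/2! + ... (terms - 1 powers of A)."""
    n = len(A)
    result = identity(n)
    term = identity(n)
    for k in range(1, terms):
        # term becomes A^k / k!
        term = [[x / k for x in row] for row in matmul(term, A)]
        result = [[result[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return result

def close(A, B, tol=1e-10):
    return all(abs(A[m][n] - B[m][n]) < tol
               for m in range(len(A)) for n in range(len(A[0])))

# Illustrative Hermitian matrix: F equals its own Hermitian conjugate.
F = [[1 + 0j, 1 - 1j], [1 + 1j, -2 + 0j]]
iF = [[1j * x for x in row] for row in F]
minus_iF = [[-1j * x for x in row] for row in F]

R = mat_exp(iF)
print(close(matmul(dagger(R), R), identity(2)))   # R^dagger R = 1
print(close(dagger(R), mat_exp(minus_iF)))        # R^dagger = exp(-iF)
```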
As we have already mentioned, the introduction of matrices is closely
associated with the concept of the linear transformation of an n-dimensional
vector. The concept of an n-dimensional vector is a natural generalization of
the ordinary concept of a vector. The vector x in n-dimensional space is
defined by the whole set of n, in general complex, numbers $x_1, x_2, \ldots, x_n$,
which are called the components of this vector. Each of the components can be
represented by a segment on one of the n mutually perpendicular axes in
n-dimensional space. It is hardly necessary to mention that an n-dimensional
space is not associated with physical reality, and that a vector in n-dimensional
space is a mathematical generalization. As in the case of ordinary
vectors, one can introduce the concept of the scalar product of two vectors x
and y in n-dimensional space. Namely, the scalar product of the vectors x and
y is defined as

$$(x\cdot y) = \sum_{i=1}^n x_i^*y_i. \tag{45.27}$$

If the vectors x and y have real components, then the definition (45.27) is the
same as the ordinary definition of the scalar product. On the contrary, if the
components of one or both of the vectors x and y are complex, then the definition
(45.27) leads to new important consequences.
We form the scalar product of the vector x and the same vector x

$$(x\cdot x) = \sum_{i=1}^n x_i^*x_i = \sum_{i=1}^n |x_i|^2. \tag{45.28}$$

It represents a generalization of the concept of the square of a vector to the
case of complex values of the components. The quantity

$$\left(\sum_{i=1}^n |x_i|^2\right)^{1/2} \tag{45.29}$$

is said to be the length or norm of the vector. The scalar product of two
vectors in n-dimensional space is, obviously, non-commuting:

$$(x\cdot y) = \sum_{i=1}^n x_i^*y_i \neq (y\cdot x) = \sum_{i=1}^n y_i^*x_i.$$

If the scalar product of two vectors is equal to zero, (x·y) = 0, then such
vectors are called mutually orthogonal.
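
A short numerical sketch of the definitions (45.27)–(45.29) follows. The two three-component vectors are arbitrary illustrative values, and the function names are hypothetical.

```python
import math

def scalar_product(x, y):
    """(x . y) = sum over i of conj(x_i) * y_i, as in (45.27)."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same number of components")
    return sum(xi.conjugate() * yi for xi, yi in zip(x, y))

def norm(x):
    """Length of the vector according to (45.29)."""
    return math.sqrt(scalar_product(x, x).real)

x = [1 + 1j, 2 + 0j, -1j]
y = [3 + 0j, 1 - 1j, 2j]

xy = scalar_product(x, y)
yx = scalar_product(y, x)

print(xy, yx)     # the two orders give complex-conjugate values
print(norm(x))
```

In this example the two orders of the factors give complex-conjugate results, which coincide only when the scalar product is real.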

Let us consider two systems of coordinates k and k' each with mutually
orthogonal axes. Let the components of a certain vector in the system k be
$x_i$ and in the system k' be $x_i'$. As in three-dimensional Euclidean geometry,
there is a linear relation between the components expressed by

$$x_i' = \sum_k a_{ik}x_k. \tag{45.30}$$

The whole set of numbers $a_{ik}$ forms the matrix

$$\|a\| = \begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nn} \end{pmatrix},$$

which is called the linear transformation matrix.
For the linear orthogonal transformation (45.30) the following condition
holds:

$$(x\cdot x) = \sum_{i=1}^n |x_i|^2 = (x'\cdot x') = \sum_{i=1}^n |x_i'|^2. \tag{45.31}$$

The transformation (45.30), satisfying the requirement (45.31), i.e. leaving
the square of the length of the vector unchanged, is also called a unitary
transformation.

The concept of a vector with complex values of the components in an 
n-dimensional space can be generalized directly to the case of a space with an 
infinite number of dimensions, n → ∞. A space with an infinite number of
dimensions, for which the definition (45.28) of the square of the length of a 
vector is valid, is called a Hilbert space. A vector in a Hilbert space has an 
infinite number of components each of which can be real as well as complex. 

Vectors in an n-dimensional space (with a finite as well as an infinite 
number of dimensions) are often written in the form of a matrix 


$$x = \begin{pmatrix} x_1 \\ x_2 \\ x_3 \\ \vdots \end{pmatrix}.$$

Then the transformation (45.30) can be written in the form x' = ax. Indeed,
for the components we have $x_i' = \sum_k a_{ik}x_k$, which is the same as (45.30).







§46. Geometric interpretation of the wave function. 
Canonical transformations 


The mathematical apparatus given briefly above, in particular the vector 
calculus in Hilbert space, in spite of its unusual and abstract character turns 
out to correspond exactly to the quantum-mechanical description of the 
properties of microsystems. We shall consider the wave function ψ characterizing
the state of the system as a vector ψ in a Hilbert space with an infinite
number of dimensions. To each quantum-mechanical quantity F characterizing
a property of the system there corresponds a definite system of coordinate
axes or, what is the same, a system of unit basis vectors
$\psi_1(x), \psi_2(x), \ldots, \psi_n(x), \ldots$. This system of basis vectors (basis functions) is none other than
the system of eigenfunctions of the operator $\hat F$ which correspond to the
possible eigenvalues $F_1, F_2, \ldots$ (we assume that the spectrum is discrete; the
generalization to a continuous spectrum is given further on). The components of
the vector ψ in the chosen system of coordinates will be the amplitudes
$c_1, c_2, \ldots, c_n, \ldots$ defined by the relation

$$\psi(x) = \sum_k c_k\psi_k(x). \tag{46.1}$$

The amplitudes $c_k$, as we know (see §19), are equal to

$$c_k = \int\psi_k^*(x)\psi(x)\,dV.$$

On the other hand, this equality can be considered as the scalar product of
the vector $\psi_k$ and the vector ψ

$$c_k = (\psi_k\cdot\psi) = \int\psi_k^*\psi\,dV. \tag{46.2}$$

The definition (46.2) corresponds to (45.27). Thus if we have two vectors
φ(x) and ψ(x) with components $b_k$ and $c_k$ respectively, then the scalar
product of the vector φ and the vector ψ is equal to

$$(\varphi\cdot\psi) = \sum_k b_k^*c_k = \int\varphi^*\psi\,dV. \tag{46.3}$$

As we know, the whole set of amplitudes $c_k$ represents the wave function in
the F-representation. Thus the whole set of components of the vector ψ in
the coordinate system whose unit vectors are the eigenfunctions $\psi_n$ of the
operator $\hat F$ is the wave function in the F-representation.

The system of basis vectors $\psi_1, \psi_2, \ldots$ is a system of unit vectors which are
mutually orthogonal. This follows from the condition of normalization and
orthogonality of the eigenfunctions of the operator $\hat F$ (see §18)

$$\int\psi_k^*(x)\psi_m(x)\,dV = (\psi_k\cdot\psi_m) = \delta_{km}. \tag{46.4}$$


Let us now consider the transition from one representation to another,
for example, the transition from the representation in which the matrix F is
diagonal (F-representation) to the representation in which the matrix D is
diagonal (D-representation). Geometrically this means the transition from the
coordinate system formed by the basis vectors $\psi_n$ to the coordinate system
formed by the basis vectors $\varphi_k$. The functions $\psi_n$ and $\varphi_k$ are the eigenfunctions
of the operators $\hat F$ and $\hat D$ respectively. We obtain the transformation
formulae if we expand the functions $\varphi_k$ in terms of the system of basis functions
$\psi_l$ (assuming that there is a discrete spectrum)

$$\varphi_k = \sum_l S_{lk}\psi_l(x). \tag{46.5}$$

It is obvious that

$$S_{lk} = \int\psi_l^*(x)\varphi_k(x)\,dV = (\psi_l\cdot\varphi_k). \tag{46.6}$$

We shall call the matrix S the matrix of the transformation from one
representation to another (or, correspondingly, from one coordinate system
to another). We can obtain definite conclusions concerning the properties of
the matrix S immediately if we take into account that the system of functions
$\varphi_k$ as well as the system of functions $\psi_l$ is a system of normalized orthogonal
functions. Consequently,

$$\int\varphi_m^*\varphi_k\,dV = \sum_l\sum_i S_{lm}^*S_{ik}\delta_{il} = \sum_l S_{lm}^*S_{lk} = \sum_l (S^\dagger)_{ml}S_{lk} = \delta_{mk}, \tag{46.7}$$

or in matrix form

$$S^\dagger S = 1. \tag{46.8}$$

Carrying out the inverse expansion of the functions $\psi_m$ in terms of the functions
$\varphi_k$ it is easy to obtain

$$SS^\dagger = 1. \tag{46.9}$$


It follows from eqs. (46.8) and (46.9) that the matrix S is unitary,
$S^\dagger = S^{-1}$. The transformation from one representation to another, carried
out by the unitary matrix S, is called a unitary or canonical transformation.
Geometrically it corresponds to a 'rotation' in the Hilbert space. It is also
easy to obtain the direct relation between the components of an arbitrary
vector ψ in different coordinate systems. Let

$$\psi = \sum_l c_l\psi_l = \sum_k c_k'\varphi_k.$$

Making use of (46.5), we have

$$\sum_l c_l\psi_l = \sum_k\sum_l c_k'S_{lk}\psi_l.$$

Equating the expressions at equal $\psi_l$, we obtain

$$c_l = \sum_k S_{lk}c_k'. \tag{46.10}$$

This expression can be rewritten in the form of a matrix equation, if the
whole sets of amplitudes $c_l$ and $c_k'$ are considered as single-column matrices c
and c'. Then

$$c = Sc'. \tag{46.11}$$

Multiplying from the left by $S^\dagger$, we obtain

$$c' = S^\dagger c. \tag{46.12}$$

Further let us find how an arbitrary matrix R is transformed in the transition
to another representation. We assume that in the F-representation the
following relation holds:

$$b_l = \sum_m R_{lm}c_m, \tag{46.13}$$

or in the matrix form

$$b = Rc. \tag{46.14}$$

In the transition to another representation the amplitudes c and b are
transformed into the amplitudes c' and b' according to (46.11) and (46.12).
We make use of the relation (46.11) and express c and b in eq. (46.14) in
terms of c' and b'. We then obtain

$$Sb' = RSc'.$$

Multiplying this equation from the left by $S^\dagger$, we find

$$b' = S^\dagger RSc' = R'c'.$$

Thus the matrix R', i.e. the matrix R in the new representation, has the form

$$R' = S^\dagger RS$$

or

$$(R')_{mn} = \sum_{k,l}(S^\dagger)_{mk}R_{kl}S_{ln}. \tag{46.15}$$


Let us consider certain properties of the unitary transformation. We shall
show, first of all, that if a matrix D is Hermitian in one representation, then it
will also be Hermitian in another representation. Indeed, according to (46.15)
and (45.22)

$$D' = S^\dagger DS, \qquad (D')^\dagger = S^\dagger D^\dagger(S^\dagger)^\dagger = S^\dagger D^\dagger S.$$

Since $D^\dagger = D$, it turns out that $D' = D'^\dagger$. The unitary transformation also
conserves the form of matrix equations. Let us, for example, consider the
equations

$$F + D = R; \qquad PL = T.$$

Multiplying the equations from the left by $S^\dagger$ and from the right by S and
making use of (45.9) and (46.9), we obtain

$$S^\dagger FS + S^\dagger DS = S^\dagger RS, \qquad S^\dagger PSS^\dagger LS = S^\dagger TS.$$

Using (46.15), we rewrite these equations in the new representation

$$F' + D' = R', \qquad P'L' = T'.$$

We see that the form of the equations has not changed. We shall show also
that in a unitary transformation the trace of the matrix does not change

$$\operatorname{Tr}F' = \sum_n F'_{nn} = \sum_{n,l,k}(S^\dagger)_{nl}F_{lk}S_{kn} = \sum_{l,k}F_{lk}\sum_n S_{kn}(S^\dagger)_{nl} = \sum_{l,k}F_{lk}\delta_{kl} = \operatorname{Tr}F. \tag{46.16}$$

Here we have made use of (46.9).

The unitary transformation also conserves the determinant of the matrix. 
Indeed, since the determinant of a product of matrices is equal to the product 
of the determinants det FDR = det F det D det R, we have 


$$\det F' = \det S^\dagger \det F \det S = \det F \det(S^\dagger S) = \det F. \qquad (46.17)$$




The squared modulus of the determinant of a finite unitary matrix is equal to unity. We show this:

$$|\det S|^2 = (\det S)(\det S)^*,$$

but $(\det S)^* = \det S^\dagger$, because the determinant does not change when the matrix is transposed. Consequently, we obtain

$$|\det S|^2 = \det S \det S^\dagger = \det(S S^\dagger) = 1. \qquad (46.18)$$
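The invariance properties (46.15) to (46.18) can be checked numerically on a small example. The following is only an illustrative sketch: the 2x2 matrices `S` and `F`, the helper names and the numerical values are assumptions chosen here, not taken from the text.

```python
import cmath
import math

def matmul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def dagger(a):
    """Hermitian conjugate: transpose followed by complex conjugation."""
    n = len(a)
    return [[a[j][i].conjugate() for j in range(n)] for i in range(n)]

def trace(a):
    return sum(a[i][i] for i in range(len(a)))

def det2(a):
    """Determinant of a 2x2 matrix."""
    return a[0][0] * a[1][1] - a[0][1] * a[1][0]

# Hypothetical unitary matrix S: a rotation dressed with phase factors.
theta, alpha, gamma = 0.7, 0.3, 0.5
c, s = math.cos(theta), math.sin(theta)
phase = cmath.exp(1j * gamma)
S = [[phase * cmath.exp(1j * alpha) * c, -phase * s],
     [phase * s, phase * cmath.exp(-1j * alpha) * c]]

# Hypothetical Hermitian matrix F in the old representation.
F = [[1.0 + 0j, 2.0 - 1.0j],
     [2.0 + 1.0j, -3.0 + 0j]]

F_new = matmul(matmul(dagger(S), F), S)        # F' = S^dagger F S, eq. (46.15)

hermitian_defect = abs(F_new[0][1] - F_new[1][0].conjugate())
trace_defect = abs(trace(F_new) - trace(F))    # eq. (46.16)
det_defect = abs(det2(F_new) - det2(F))        # eq. (46.17)
det_S_modulus_squared = abs(det2(S)) ** 2      # eq. (46.18)

print(hermitian_defect, trace_defect, det_defect, det_S_modulus_squared)
```

All three defects should vanish to rounding accuracy, and the last number should be unity even though det S itself is a complex phase.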


§47. The eigenfunctions and eigenvalues of an operator given in matrix form 


Let us assume that the operator D in the F-representation is given in the 
form of a matrix D. We shall see how one can find the eigenfunctions and 
eigenvalues of this operator. The equation for the eigenfunctions and eigen- 
values in the F-representation has the form 


$$\sum_k D_{nk} c_k^{(m)} = D_m c_n^{(m)}. \qquad (47.1)$$

Here $D_m$ is the mth eigenvalue of the matrix D, and the whole set of amplitudes $c_1^{(m)}, c_2^{(m)}, \ldots$ is the eigenfunction of the operator D in the F-representation corresponding to the mth eigenvalue. If the eigenfunction is written in the form of a matrix with one column, $c^{(m)}$, then eq. (47.1) can be rewritten in the form

$$D c^{(m)} = D_m c^{(m)}. \qquad (47.2)$$
It is easily seen that the magnitudes of the eigenvalues do not depend on the choice of the representation. Indeed, eq. (47.1) in another representation is written, according to (46.15) and (46.11), as follows:

$$D' c'^{(m)} = D'_m c'^{(m)}, \qquad (47.3)$$

where $D' = S^\dagger D S$, $c'^{(m)} = S^\dagger c^{(m)}$. Substituting these values into (47.3), we obtain

$$S^\dagger D S S^\dagger c^{(m)} = D'_m S^\dagger c^{(m)}.$$

Multiplying from the left by S, we again arrive at eq. (47.2), whence it is seen that $D'_m = D_m$.

Thus the problem of finding the eigenvalues of a matrix D reduces to 
finding a unitary transformation which will bring the matrix D into diagonal 
form. The diagonal elements of such a matrix are, as we know (see § 44), its 




eigenvalues. Thus if S is the required unitary transformation, then 


$$S^\dagger D S = D' \qquad (47.4)$$

or, multiplying on the left by S,

$$DS = SD'.$$

Taking into account that $(D')_{mn} = D_n \delta_{mn}$, we obtain

$$\sum_k D_{mk} S_{kn} = D_n S_{mn} \qquad (47.5)$$

or

$$\sum_k (D_{mk} - D_n \delta_{km}) S_{kn} = 0. \qquad (47.6)$$


In these equations the matrix elements $S_{kn}$ as well as the eigenvalues $D_n$ are unknown. If D is a square matrix having N columns and as many rows as columns, then for each $D_n$ we have a system of N equations (for m = 1, 2, ..., N). Since the system consists of homogeneous linear equations, it has a non-trivial solution on condition that its determinant reduces to zero:

$$\det \begin{pmatrix} D_{11} - D_n & D_{12} & \cdots & D_{1N} \\ D_{21} & D_{22} - D_n & \cdots & D_{2N} \\ \vdots & \vdots & & \vdots \\ D_{N1} & D_{N2} & \cdots & D_{NN} - D_n \end{pmatrix} = 0. \qquad (47.7)$$


This is an equation of the Nth degree with respect to the unknown $D_n$. On solving it we find N roots which will be the eigenvalues of the matrix D. In particular, certain values can be equal to each other; then degeneracy occurs. All eigenvalues of a Hermitian matrix D will be real. Indeed, the matrix D' is also Hermitian (see §46) and, consequently, $(D')_{nn} = (D')^*_{nn}$ or $D_n = D_n^*$. Substituting the values $D_1, D_2, \ldots, D_N$ into the system (47.6), we determine the whole set of matrix elements $S_{kn}$ ($S_{1n}, S_{2n}, \ldots$) for each $D_n$, i.e. in the end we determine the unitary transformation matrix S. Comparing (47.5) with (47.1), we see that each column

$$\begin{pmatrix} S_{1n} \\ S_{2n} \\ \vdots \\ S_{Nn} \end{pmatrix}$$




of the matrix S is an eigenfunction of the operator D in the F-representation corresponding to a given eigenvalue $D_n$. Knowing the matrix S, we can find the whole set of eigenfunctions of the operator $\hat D$ in the x-representation. Indeed, if in the initial F-representation the whole set $\psi_1(x), \psi_2(x), \ldots$ of eigenfunctions of the operator F were the basis functions, then in the new representation in which the matrix D is diagonal (D-representation) the eigenfunctions $\varphi_1(x), \varphi_2(x), \ldots$ of the operator D will be the basis functions. The relation between them is given by formula (46.5), which determines the functions $\varphi_n(x)$:

$$\varphi_n(x) = \sum_k S_{kn} \psi_k(x). \qquad (47.8)$$
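The procedure of this section (solve the secular equation (47.7), insert each root into the homogeneous system (47.6), and assemble the normalized solutions as columns of S) can be sketched for the simplest case N = 2. The function name `diagonalize_hermitian_2x2` and the example matrix are assumptions made for illustration only.

```python
import math

def matmul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def dagger(a):
    """Hermitian conjugate: transpose followed by complex conjugation."""
    n = len(a)
    return [[a[j][i].conjugate() for j in range(n)] for i in range(n)]

def diagonalize_hermitian_2x2(D):
    """Return (eigenvalues, S) such that S^dagger D S is diagonal."""
    a, d = D[0][0].real, D[1][1].real
    b = D[0][1]
    # Roots of the secular equation (47.7):
    # lam^2 - (a + d) lam + (a d - |b|^2) = 0.
    mean, half_diff = (a + d) / 2, (a - d) / 2
    root = math.sqrt(half_diff ** 2 + abs(b) ** 2)
    eigenvalues = [mean - root, mean + root]
    if abs(b) < 1e-14:
        # D is already diagonal; only the ordering of the columns matters.
        S = [[1, 0], [0, 1]] if a <= d else [[0, 1], [1, 0]]
        return eigenvalues, S
    columns = []
    for lam in eigenvalues:
        # First equation of the system (47.6): (a - lam) v0 + b v1 = 0.
        v = [b, lam - a]
        norm = math.sqrt(abs(v[0]) ** 2 + abs(v[1]) ** 2)
        columns.append([v[0] / norm, v[1] / norm])
    # Column n of S holds the n-th normalized eigenvector.
    S = [[columns[n][k] for n in range(2)] for k in range(2)]
    return eigenvalues, S

# Hypothetical Hermitian matrix used only as an example.
D = [[2.0 + 0j, 1.0 - 1.0j],
     [1.0 + 1.0j, 3.0 + 0j]]
eigenvalues, S = diagonalize_hermitian_2x2(D)
D_diag = matmul(matmul(dagger(S), D), S)   # S^dagger D S, eq. (47.4)
print(eigenvalues)
print(D_diag)
```

The off-diagonal elements of `D_diag` should vanish to rounding accuracy, and its diagonal should reproduce the two real roots of the secular equation.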


If we have two matrices F and D, then by means of one and the same 
unitary transformation S they can be brought into diagonal form simultan- 
eously only in the case where they commute, i.e. FD = DF. Indeed, suppose 
that F and D are brought into diagonal form, F’ and D’ respectively. We form 
the matrix F'D': 


$$(F'D')_{mn} = \sum_k F'_{mk} D'_{kn} = F_m D_n \delta_{mn} = (D'F')_{mn}. \qquad (47.9)$$

Consequently,

$$F'D' = D'F'.$$


Since in the unitary transformation the form of the matrix equation does not 
change (see §46), then in the initial representation we have FD = DF. 

Thus we have proved that commutation of matrices is necessary in order 
that they may be brought simultaneously into diagonal form. It is easy to 
show that this condition is also sufficient. 
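A sketch of the sufficiency argument in the simplest situation may be helpful. It assumes, for brevity, that all eigenvalues $F_m$ of the matrix F are different (no degeneracy); this restriction is an assumption made here and is not stated in the text. Choose the representation in which F is diagonal, $F_{mn} = F_m \delta_{mn}$, and write out the commutation condition:

```latex
(FD - DF)_{mn} = \sum_k \left( F_m \delta_{mk} D_{kn} - D_{mk} F_n \delta_{kn} \right)
              = (F_m - F_n)\, D_{mn} = 0 .
```

For $m \neq n$ the factor $F_m - F_n$ differs from zero, so that $D_{mn} = 0$: in the representation that diagonalizes F, the matrix D is diagonal as well. In the degenerate case D is only block-diagonal, and an additional unitary transformation inside each block is needed.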

In this section we have assumed everywhere that we have been dealing 
with finite matrices. However, if the number N of columns and rows tends to 
infinity, then the mathematical problem becomes substantially more compli- 
cated. The system (47.6) will now be a system of an infinitely large number 
of equations. Equation (47.7) will also be of an infinitely high power. It can 
be shown, however, that in this case also any Hermitian matrix may be 
brought into a diagonal form with real eigenvalues by means of a unitary trans- 
formation. We shall not give the proof of this statement. 


§48. Continuous matrices. The Dirac notation 


Up to now we have assumed that variables run over a discrete sequence of 
values. It is clear, however, that the preceding results must be generalized to 
the case of continuous variables. 

It turns out that this generalization can be carried out directly. All the 
results obtained above remain valid if all the sums in them are replaced by the 
corresponding integrals. For example, formula (45.6) for the matrix element 
of the product of two matrices will now have the form 


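$$E = FD,$$
$$E_{\alpha\beta} = \int F_{\alpha\gamma} D_{\gamma\beta}\, d\gamma. \qquad (48.1)$$

The integration is carried out over the entire region of variation of the corresponding variable. The unit matrix 1 is now defined by the equality

$$(1)_{\alpha\beta} = \delta(\alpha - \beta), \qquad (48.2)$$

i.e. $\delta_{\alpha\beta}$ is replaced by the δ-function. As is easily seen, the following relation holds for an arbitrary matrix F:

$$\int F_{\alpha\gamma}\,\delta(\gamma - \beta)\, d\gamma = F_{\alpha\beta}.$$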


The formulae of §3, which express the transformation of the wave func- 
tion from the coordinate representation to the momentum representation and 
vice versa, can also be written in matrix form. 

We note, first of all, that in its own representation the coordinate q must be expressed by a diagonal matrix (see §44). Confining ourselves for simplicity to the one-dimensional case, we have in correspondence with (48.2)

$$q_{xx'} = x\,\delta(x - x'). \qquad (48.3)$$

Let us find the matrix representing the momentum of a particle in this representation. As in §26, we shall proceed from the relation

$$pq - qp = \frac{\hbar}{i}. \qquad (48.4)$$

We shall show that this relation is satisfied if the matrix p is chosen in the form

$$p_{xx'} = \frac{\hbar}{i} \frac{\partial}{\partial x} \delta(x - x'). \qquad (48.5)$$


First of all, equating the matrix elements of the left-hand and right-hand sides of relation (48.4), we have

$$\int (p_{xx''} q_{x''x'} - q_{xx''} p_{x''x'})\, dx'' = \frac{\hbar}{i}\,\delta(x - x').$$

Substituting (48.3) and (48.5), we obtain

$$\int \left[ x''\,\delta(x'' - x') \frac{\partial}{\partial x}\delta(x - x'') - x\,\delta(x - x'') \frac{\partial}{\partial x''}\delta(x'' - x') \right] dx'' = \delta(x - x').$$

We take the integrals in accordance with the rules of action on δ-functions (see Appendix III, Vol. 1). Since $y\,\delta'(y) = -\delta(y)$ (see eq. (III.8)), we find that

$$-(x - x') \frac{\partial}{\partial x}\delta(x - x') = \delta(x - x').$$


Consequently, we have proved that the matrices (48.3) and (48.5) satisfy 
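relation (48.4). If we act, according to rules (44.5), with the matrix $p_{xx'}$ on a certain function $\psi(x)$, then we obtain the function $\varphi(x)$ equal to

$$\varphi(x) = \int p_{xx'} \psi(x')\, dx' = \frac{\hbar}{i} \int \frac{\partial}{\partial x}\delta(x - x')\, \psi(x')\, dx' = -\frac{\hbar}{i} \int \psi(x') \frac{\partial}{\partial x'}\delta(x - x')\, dx'.$$

Integrating by parts, we obtain

$$\varphi(x) = \frac{\hbar}{i} \int \delta(x - x') \frac{\partial \psi}{\partial x'}\, dx' = \frac{\hbar}{i} \frac{\partial \psi}{\partial x}.$$

We see that the action of the matrix $p_{xx'}$ is equivalent to the action of the operator $\hat p = (\hbar/i)\,\partial/\partial x$. The formula of transformation of the wave function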
from the coordinate representation to the momentum representation is of the 


form 


$$c(p) = (2\pi\hbar)^{-1/2} \int \psi(x)\, e^{-(i/\hbar)px}\, dx. \qquad (48.6)$$

In matrix form this relation, according to (46.12), can be rewritten as

$$c(p) = \int S^\dagger_{px} \psi(x)\, dx, \qquad (48.7)$$

where $S^\dagger_{px} = (2\pi\hbar)^{-1/2} e^{-(i/\hbar)px}$ is the unitary matrix of the transformation from the x-representation to the p-representation. It is natural that the inverse transformation is accomplished by the matrix $S_{xp}$:

$$\psi(x) = \int S_{xp}\, c(p)\, dp, \qquad (48.8)$$

where

$$S_{xp} = (2\pi\hbar)^{-1/2} e^{(i/\hbar)px}.$$


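The form of the matrix of the coordinate q in the p-representation is easily determined by means of the matrix $S_{xp}$. According to (46.15), we have

$$q_p = S^\dagger q_x S$$

or

$$(q_p)_{p'p''} = \iint S^\dagger_{p'x} (q_x)_{xx'} S_{x'p''}\, dx\, dx'.$$

Substituting the value of the matrices S and $q_x$ (48.3) into the integral, we obtain

$$(q_p)_{p'p''} = \frac{1}{2\pi\hbar} \iint e^{-(i/\hbar)p'x}\, x\,\delta(x - x')\, e^{(i/\hbar)p''x'}\, dx\, dx' = \frac{1}{2\pi\hbar} \int e^{-(i/\hbar)p'x}\, x\, e^{(i/\hbar)p''x}\, dx =$$
$$= -\frac{\hbar}{i} \frac{\partial}{\partial p'} \frac{1}{2\pi\hbar} \int e^{(i/\hbar)(p'' - p')x}\, dx = -\frac{\hbar}{i} \frac{\partial}{\partial p'}\delta(p'' - p'),$$

so that we have obtained the matrix of the coordinate in the p-representation

$$(q_p)_{p'p''} = -\frac{\hbar}{i} \frac{\partial}{\partial p'}\delta(p'' - p'). \qquad (48.9)$$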
Of course, this result could also be obtained directly from relation (48.4), 
since the matrix p is diagonal in its own representation. 

Knowing the expressions for the matrices q and p, we can find the matrix 
of an arbitrary function of q and p. Thus, if H(p, q) is a certain function of 
p and q, then the matrix H can be obtained if, instead of p and q, we substi- 
tute the corresponding matrices and carry out the necessary operations 
according to the rules of matrix addition and multiplication. Here the matrix 
H, as well as the corresponding operator, is understood in the sense of an 
expansion in a power series in terms of q and p. 

Suppose that H(p, q) is a known function — the Hamiltonian of a system. 
In the coordinate representation the matrices q and p are given by expressions 
(48.3) and (48.5). Consequently, the matrix H is also known in this representation. By means of a certain unitary transformation S this matrix can be transformed to the diagonal form

$$H' = S^\dagger H(p, q) S, \qquad H'_{nm} = E_n \delta_{nm}.$$

For definiteness we shall assume that the matrix H' has a discrete spectrum. Otherwise $\delta(n - m)$, not $\delta_{nm}$, must be written. We define the matrix S. Since HS = SH', we have

$$\int H_{xx'} S_{x'n}\, dx' = \sum_m S_{xm} H'_{mn} = E_n S_{xn}. \qquad (48.10)$$
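We consider the integrals standing on the left for different functions H(p, q). First of all,

$$\int q_{xx'} S_{x'n}\, dx' = x S_{xn},$$
$$\int p_{xx'} S_{x'n}\, dx' = \frac{\hbar}{i} \frac{\partial}{\partial x} S_{xn}.$$

Further, if U(q) is a certain function of q, then its matrix, as is easily seen, has the form $U_{xx'} = U(x)\,\delta(x - x')$, and the integral is equal to

$$\int U_{xx'} S_{x'n}\, dx' = U(x) S_{xn}.$$

We also obtain analogous results for functions of p. For example, the matrix of the quantity $p^2$ is equal, according to the rules of matrix multiplication, to

$$(p^2)_{xx'} = \left(\frac{\hbar}{i}\right)^2 \frac{\partial^2}{\partial x^2}\delta(x - x').$$

Correspondingly, the integral

$$\int (p^2)_{xx'} S_{x'n}\, dx' = \left(\frac{\hbar}{i}\right)^2 \int \frac{\partial^2}{\partial x^2}\delta(x - x')\, S_{x'n}\, dx' = \left(\frac{\hbar}{i}\right)^2 \frac{\partial^2}{\partial x^2} S_{xn}.$$

Consequently, for an arbitrary function H(p, q) we obtain

$$\int H_{xx'} S_{x'n}\, dx' = H\!\left(x, \frac{\hbar}{i}\frac{\partial}{\partial x}\right) S_{xn} = \hat H S_{xn}. \qquad (48.11)$$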


Relation (48.10) is evidently none other than the Schrödinger equation 
written in matrix form in the x-representation: 


$$\int H_{xx'} \psi_n(x')\, dx' = E_n \psi_n(x).$$


This equation is also easily rewritten in the p-representation in operator form as well as in matrix form. Setting

$$\hat H = \frac{\hat p^2}{2m} + \hat U$$

and denoting the function $\psi_n(x)$ in the p-representation by $c_n(p)$ (see (48.6)), we obtain

$$\left(\frac{p^2}{2m} + \hat U(\hat q)\right) c_n(p) = E_n c_n(p), \qquad (48.12)$$

where $\hat U(\hat q)$ is the operator of the potential energy in the p-representation. In matrix form eq. (48.12) is

$$\frac{p^2}{2m}\, c_n(p) + \int U_{pp'}\, c_n(p')\, dp' = E_n c_n(p). \qquad (48.12')$$

Here $U_{pp'}$, the matrix of the operator U, is constructed by means of (48.9) or, what is the same, is defined by

$$U_{pp'} = \frac{1}{2\pi\hbar} \int U(x)\, e^{-(i/\hbar)(p - p')x}\, dx.$$

Making use of (48.11), we rewrite eq. (48.10) in the form

$$\hat H S_{xn} = E_n S_{xn}.$$


We see that the matrix S is constructed from the eigenfunctions $\psi_n(x)$ of the operator $\hat H$:

$$S_{xn} = \psi_n(x). \qquad (48.13)$$


As is known, if we have a certain operator $\hat F$, then the matrix of this operator in the energy representation is given by the relation

$$F_{nm} = \int \psi_n^* \hat F \psi_m\, dx.$$

On the other hand, the same relation can be considered as the unitary transformation from the coordinate representation, in which the quantity F is defined by the matrix $F_{xx'}$, to the energy representation. The matrix of the unitary transformation is given by formula (48.13). Indeed,

$$F_{nm} = \iint S^\dagger_{nx} F_{xx'} S_{x'm}\, dx\, dx'$$

and

$$\int F_{xx'} S_{x'm}\, dx' = \hat F S_{xm}$$

by analogy with (48.11). Consequently,

$$F_{nm} = \int S^\dagger_{nx} \hat F S_{xm}\, dx = \int \psi_n^*(x)\, \hat F\, \psi_m(x)\, dx,$$


and we have again arrived at the preceding relation. Thus all the relations 
obtained in this chapter for matrices are generalized directly to the case of 
operators given in differential form. Keeping this fact in mind, we shall hence- 
forth in using the word ‘operator’ understand that the operator can be given 
in differential form as well as in matrix form. 

Finally, we shall dwell briefly on a certain notation proposed by Dirac, 
since it is frequently encountered in the literature. 

The wave function $\psi$ or, more precisely, the set of its components in a certain coordinate system (in a certain representation) is called by Dirac a ket-vector and denoted by $|\;\rangle$. For example, the wave function $\psi_{nlm}$, describing the state with given quantum numbers n, l, m, is denoted by $|nlm\rangle$. On the other hand, the complex-conjugate function is said to be a bra-vector and is denoted by $\langle\;|$ ($\psi^*_{nlm}$ is correspondingly denoted by $\langle nlm|$). The terms bra and ket come from the word 'bracket' $\langle\;\rangle$. In matrix notation the ket-vector corresponds to a column, while the bra-vector corresponds to a row. The scalar product of the bra-vector $\psi_b^* = \langle b|$ and the ket-vector $\psi_a = |a\rangle$ is denoted by $\langle b|a\rangle$, i.e.

$$\int \psi_b^*(x)\, \psi_a(x)\, dx = \langle b|a\rangle. \qquad (48.14)$$


On the other hand, this scalar product can evidently be treated as the wave function $\psi_a$ in the b-representation. Indeed, if we write the expansion

$$\psi_a(x) = \int c_a(b)\, \psi_b(x)\, db \qquad (48.15)$$

(for a discrete spectrum the integral is replaced by the sum), then $c_a(b)$ represents the wave function of the state a in the b-representation:

$$c_a(b) = \int \psi_b^*(x)\, \psi_a(x)\, dx = \langle b|a\rangle. \qquad (48.16)$$

Correspondingly, the wave function of the state a in the x-representation, $\psi_a(x)$, in the Dirac notation is of the form

$$\psi_a(x) = \langle x|a\rangle. \qquad (48.17)$$

In this notation the expression (48.15) can be rewritten as

$$\langle x|a\rangle = \int \langle x|b\rangle \langle b|a\rangle\, db. \qquad (48.18)$$
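For a discrete spectrum the integral in (48.18) becomes a sum, and the relation can be checked with finite column vectors. The sketch below is a toy illustration under assumed data: a two-dimensional space, a hypothetical orthonormal basis `basis_b` whose entries play the role of $\langle x|b\rangle$, and a hypothetical state `ket_a`.

```python
import math

def braket(bra, ket):
    """Scalar product <bra|ket>: the first argument is conjugated, cf. (48.14)."""
    return sum(u.conjugate() * v for u, v in zip(bra, ket))

# Components <x|b> of two orthonormal vectors b = 0, 1 (hypothetical basis).
c, s = math.cos(0.4), math.sin(0.4)
basis_b = [[c + 0j, 1j * s],
           [1j * s, c + 0j]]

# Components <x|a> of a normalized state a (hypothetical).
ket_a = [0.6 + 0j, 0.8j]

# <b|a>: the same state written in the b-representation, discrete form of (48.16).
amplitudes = [braket(b, ket_a) for b in basis_b]

# <x|a> = sum_b <x|b><b|a>, discrete form of (48.18).
reconstructed = [sum(b[x] * amp for b, amp in zip(basis_b, amplitudes))
                 for x in range(2)]

print(amplitudes)
print(reconstructed)
```

The list `reconstructed` should coincide with `ket_a`, and exchanging the two arguments of `braket` should give the complex-conjugate number.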




From (48.16) there follows the relation

$$\langle b|a\rangle = \langle a|b\rangle^*, \qquad (48.19)$$

connecting the wave function of the state a in the b-representation with the wave function of the state b in the a-representation. The wave function describing a state with given momentum in the coordinate representation, $\psi_{\mathbf p}(\mathbf r)$, in the Dirac notation has the form

$$\psi_{\mathbf p}(\mathbf r) = (2\pi\hbar)^{-3/2} e^{(i/\hbar)\mathbf p \cdot \mathbf r} = \langle \mathbf r|\mathbf p\rangle. \qquad (48.20)$$


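Correspondingly we write the expansion of an arbitrary function $\psi(\mathbf r)$ in terms of plane waves as

$$\langle \mathbf r|\psi\rangle = \int \langle \mathbf r|\mathbf p\rangle \langle \mathbf p|\psi\rangle\, d\mathbf p \qquad (48.21)$$

or

$$\langle \mathbf r| = \int \langle \mathbf r|\mathbf p\rangle \langle \mathbf p|\, d\mathbf p. \qquad (48.22)$$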


The eigenfunction of the angular momentum operator $\hat L^2$ in the coordinate representation in the Dirac notation has the form

$$Y_{lm}(\theta, \varphi) = \langle \theta, \varphi | l, m\rangle = \langle r^{-1}\mathbf r | l, m\rangle. \qquad (48.23)$$

The function $\langle r^{-1}\mathbf r | l, m\rangle$ carries out the transition from the representation lm to the coordinate representation. On the contrary, the function $Y^*_{lm}(\theta, \varphi) = \langle r^{-1}\mathbf r | l, m\rangle^* = \langle l, m | r^{-1}\mathbf r\rangle$ (see (48.19)) carries out the transition from the coordinate representation to the angular representation.

In the case where the angles $\theta, \varphi$ define the direction of the momentum vector, the function

$$Y_{lm}(\theta, \varphi) = \langle \theta, \varphi | l, m\rangle = \langle p^{-1}\mathbf p | l, m\rangle \qquad (48.24)$$

carries out the transition from the momentum representation to the representation lm. It is the eigenfunction of the operator $\hat L^2$ in the momentum representation.

The matrix element $F_{ba}$ in the Dirac notation is of the form

$$F_{ba} = \int \psi_b^* \hat F \psi_a\, dV = \langle b|F|a\rangle. \qquad (48.25)$$

The quantities a and b, characterizing the state of the system, can run over a discrete as well as a continuous set of values. If each of the states a and b is characterized by a set of quantum numbers, for example n', l', m' and n, l, m, then the matrix element, usually denoted by $F_{n'l'm',\,nlm}$ or by $F^{n'l'm'}_{nlm}$, has in the Dirac notation the form $\langle n'l'm'|F|nlm\rangle$.







§49. The Schrödinger representation, the Heisenberg representation and the
interaction representation


In this section we shall discuss certain problems connected with the further 
development and generalization of the mathematical apparatus of quantum 
mechanics. We shall consider methods for the description of the development of a process in time.

Up to now we have based our considerations exclusively on the Schrödinger equation

$$i\hbar \frac{\partial \psi}{\partial t} = \hat H \psi,$$

according to which the wave function $\psi(x, t)$ of the system could be found at an arbitrary instant of time t if its initial value $\psi(x, 0)$ was known. In this approach the development of the process in time corresponds to a change in the wave function of the system $\psi(x, t)$.

The development of a process in time can be described by means of the operator $\hat V(t)$ acting on the wave function defined at a certain initial instant of time:

$$\psi(x, t) = \hat V(t)\, \psi(x, 0). \qquad (49.1)$$

Here we have taken as the initial time the instant t = 0. Of course, we could as well take as the initial time an arbitrary instant $t = t_0$. Substituting expression (49.1) into the Schrödinger equation we obtain the equation for the operator $\hat V(t)$

$$i\hbar \frac{\partial \hat V(t)}{\partial t} = \hat H \hat V(t) \qquad (49.2)$$

under the condition (see (49.1)) that $\hat V(0) = 1$. If the operator $\hat H$ does not depend explicitly on time, then the solution of eq. (49.2) can be written formally as

$$\hat V(t) = e^{-(i/\hbar)\hat H t}, \qquad (49.3)$$

where the exponent is understood in the sense of an expansion in a power series.

The operator $\hat V(t)$ is evidently unitary:

$$\hat V^\dagger \hat V = 1.$$
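The formal solution (49.3) can be tried out on a two-level system by summing the power series for the exponential directly. This is only a sketch under stated assumptions: the 2x2 Hamiltonian `H`, the time `t`, the units with ħ = 1 and the helper `exp_series` are all hypothetical choices made for illustration.

```python
import math

def matmul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def dagger(a):
    """Hermitian conjugate: transpose followed by complex conjugation."""
    n = len(a)
    return [[a[j][i].conjugate() for j in range(n)] for i in range(n)]

def identity(n):
    return [[1.0 + 0j if i == j else 0j for j in range(n)] for i in range(n)]

def exp_series(A, terms=60):
    """exp(A) as a truncated power series, the sense in which (49.3) is understood."""
    n = len(A)
    result = identity(n)
    term = identity(n)
    for k in range(1, terms):
        # term now holds A^k / k!
        term = [[x / k for x in row] for row in matmul(term, A)]
        result = [[result[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return result

HBAR = 1.0                                 # units with hbar = 1 (assumption)
H = [[1.0, 0.5], [0.5, -1.0]]              # hypothetical time-independent Hamiltonian
t = 2.0

A = [[-1j * h * t / HBAR for h in row] for row in H]
V = exp_series(A)                          # V(t) = exp(-(i/hbar) H t)
VdV = matmul(dagger(V), V)                 # should be the unit matrix

psi0 = [1.0 + 0j, 0j]                      # initial state psi(0)
psi_t = [sum(V[i][j] * psi0[j] for j in range(2)) for i in range(2)]   # eq. (49.1)
norm_t = sum(abs(x) ** 2 for x in psi_t)   # normalization at time t

print(VdV)
print(norm_t)
```

Because V is unitary, `norm_t` should stay equal to the initial norm, which is the conservation of normalization referred to in the text.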
The unitary property of the operator $\hat V(t)$ has a simple meaning: it corresponds to the conservation in time of the normalization condition of the wave function

$$\int \psi^*(x, 0)\, \psi(x, 0)\, dV = \int \psi^*(x, t)\, \psi(x, t)\, dV = \int [\hat V \psi(x, 0)]^*\, \hat V \psi(x, 0)\, dV = \int \psi^*(x, 0)\, \hat V^\dagger \hat V\, \psi(x, 0)\, dV.$$

Thus the description of the development of a system in time amounts to the fact that the wave function or the state vector $\psi(x, t)$ changes in time. This change can be characterized by means of the unitary operator $\hat V(t)$ acting on the initial wave function $\psi(x, 0)$ and transforming it at every given instant to the function $\psi(x, t)$. Here the operators characterizing the system, for example the operators $\hat x$, $\hat H$ or any operators $\hat F(\hat x, \hat p)$, do not explicitly change in time.

If the state of the system is characterized by means of a Hilbert space, 
then the trend of development of the system can be described in the follow- 
ing way. Let a system of unit vectors in Hilbert space be given. This system is 
defined by a system of eigenfunctions of the operators forming a complete 
set for the given system. At the initial instant of time the state of the system 
is defined by the state vector $\psi(x, 0)$. The development of the system in time corresponds to a rotation of the state vector $\psi$ in Hilbert space. Its length $(\psi \cdot \psi)$ has a constant value. Such a description of the system, in which the wave function changes in time whereas the operators are time independent, is called the Schrödinger representation. We note that the word 'representation' has in this case a more general meaning than that which it has had up to now, and characterizes a method of describing the change of a state in time. In particular, one can define the state of a system in the Schrödinger coordinate representation, in the Schrödinger momentum representation, in the Schrödinger energy representation and so on. Up to now, when speaking of the definition of the wave function in one or another representation, we have borne in mind the corresponding Schrödinger representation. The operators $\hat x$, $\hat p$ and in general $\hat F$, as well as the operators of the corresponding derivatives with respect to time, $\dot{\hat x}$, $\dot{\hat p}$ and $\dot{\hat F}$, do not change in time in the Schrödinger representation (we assume that there are no non-stationary external fields). Indeed, according to (31.2) the operator $\dot{\hat F}$, for example, has the form

$$\dot{\hat F} = [\hat H, \hat F]$$

and does not change in time, since the operators $\hat H$ and $\hat F$ do not change. The matrix elements of the operator $\dot{\hat F}$ can also easily be determined:

$$(\dot F)_{mn} = [\hat H, \hat F]_{mn}. \qquad (49.4)$$


In the energy representation (the Schrödinger energy representation), i.e. in a representation such that the matrix H is diagonal, relation (49.4) has the form

$$\dot F_{mn} = \frac{i}{\hbar} (H_{mm} F_{mn} - F_{mn} H_{nn}) = i\omega_{mn} F_{mn}, \qquad (49.5)$$

where

$$\omega_{mn} = \frac{1}{\hbar}(E_m - E_n), \qquad F_{mn} = \int \psi_m^* \hat F \psi_n\, dV.$$

The matrix $(\dot F_{mn})$, as well as the matrix $(F_{mn})$, does not depend explicitly on the time.

In addition to the Schrödinger representation, use is often made in quantum mechanics of another representation, called the Heisenberg representation.

In the Heisenberg representation the development of a system in time is described by means of time-dependent operators. In this case the wave function $\Phi(x)$ itself is assumed to be dependent only on coordinates, but to be time-independent. The development in the Heisenberg representation can be pictured as a rotation of the system of basis vectors in the Hilbert space with respect to the motionless state vector $\Phi(x)$.

In the general case the transition to the Heisenberg representation is carried out by means of the unitary transformation

$$\Phi(x) = \hat V^{-1}(t)\, \psi(x, t) = \psi(x, 0), \qquad (49.6)$$

where $\Phi(x)$ is the wave function (the state vector) in the Heisenberg representation.

Making use of expression (49.6) and taking into account that

$$\hat V^{-1}(t) = \hat V^\dagger(t) = e^{(i/\hbar)\hat H t},$$

we obtain

$$\Phi(x) = \psi(x, 0) = e^{(i/\hbar)\hat H t}\, \psi(x, t). \qquad (49.7)$$

In accord with the general rules (46.15) an arbitrary operator $\hat F$ given in the Schrödinger representation will, in the Heisenberg representation (we denote it by $\hat F_H$), have the following form:

$$\hat F_H = \hat V^\dagger(t)\, \hat F\, \hat V(t)$$

or

$$\hat F_H = e^{(i/\hbar)\hat H t}\, \hat F\, e^{-(i/\hbar)\hat H t}. \qquad (49.8)$$




At the initial instant of time the expressions for the wave functions as well as for the operators are the same in the two representations. We note that the operator $\hat H$ in the Heisenberg representation will be the same as in the Schrödinger representation: $\hat H_H = \hat H$. This immediately follows from formula (49.8) if it is taken into account that the operator $\hat H$ commutes with all terms of the expansion of the function $e^{(i/\hbar)\hat H t}$. We define the matrix elements of the operator $\hat F_H$ by means of the eigenfunctions of the operator $\hat H$ (the Heisenberg energy representation):

$$(F_H)_{mn} = \sum_{k,l} \left(e^{(i/\hbar)\hat H t}\right)_{mk} F_{kl} \left(e^{-(i/\hbar)\hat H t}\right)_{ln} = e^{(i/\hbar)E_m t}\, F_{mn}\, e^{-(i/\hbar)E_n t} = e^{i\omega_{mn} t} F_{mn}. \qquad (49.9)$$

In the energy representation only the diagonal matrix elements of the exponential operators differ from zero.
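Formula (49.9) can be verified on a small example in which the Hamiltonian is already diagonal. The sketch below assumes three hypothetical energy levels, a hypothetical Hermitian matrix `F` and units with ħ = 1; it compares the full matrix product of (49.8) with the element-by-element phase factor of (49.9).

```python
import cmath

def matmul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

HBAR = 1.0                      # units with hbar = 1 (assumption)
energies = [0.5, 1.5, 3.0]      # hypothetical energy levels E_n
t = 1.3
n_levels = len(energies)

# Hypothetical Hermitian matrix F_mn in the Schrodinger energy representation.
F = [[1.0 + 0j, 0.2 - 0.1j, 0.3j],
     [0.2 + 0.1j, -0.5 + 0j, 0.7 + 0j],
     [-0.3j, 0.7 + 0j, 2.0 + 0j]]

# exp(+(i/hbar) H t) and exp(-(i/hbar) H t) are diagonal in this representation.
U_plus = [[cmath.exp(1j * energies[m] * t / HBAR) if m == k else 0j
           for k in range(n_levels)] for m in range(n_levels)]
U_minus = [[cmath.exp(-1j * energies[m] * t / HBAR) if m == k else 0j
            for k in range(n_levels)] for m in range(n_levels)]

F_H = matmul(matmul(U_plus, F), U_minus)          # eq. (49.8)

# Right-hand side of (49.9): exp(i omega_mn t) F_mn with omega_mn = (E_m - E_n)/hbar.
F_H_direct = [[cmath.exp(1j * (energies[m] - energies[k]) * t / HBAR) * F[m][k]
               for k in range(n_levels)] for m in range(n_levels)]

max_diff = max(abs(F_H[m][k] - F_H_direct[m][k])
               for m in range(n_levels) for k in range(n_levels))
print(max_diff)
```

The diagonal elements carry no time dependence, since $\omega_{nn} = 0$, while each off-diagonal element oscillates harmonically with its own frequency $\omega_{mn}$.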

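If the operator $\hat F$ is a function of $\hat x$ and $\hat p$, then, from formula (49.8), we shall obtain the operator $\hat F$ in the Heisenberg representation by using the operators $\hat x$ and $\hat p$ in this representation:

$$\hat F_H = F(\hat x_H, \hat p_H). \qquad (49.10)$$

Indeed, if, for example, $\hat F = \hat p^2$, then

$$(\hat p^2)_H = e^{(i/\hbar)\hat H t}\, \hat p^2\, e^{-(i/\hbar)\hat H t} = e^{(i/\hbar)\hat H t}\, \hat p\, e^{-(i/\hbar)\hat H t}\, e^{(i/\hbar)\hat H t}\, \hat p\, e^{-(i/\hbar)\hat H t} = \hat p_H^2.$$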


In an analogous way it is easily verified that

$$\hat p_H \hat x_H - \hat x_H \hat p_H = \frac{\hbar}{i}. \qquad (49.11)$$


We shall obtain the equation of motion in the Heisenberg representation by differentiating (49.8):

$$\frac{\partial \hat F_H}{\partial t} = \frac{i}{\hbar}(\hat H \hat F_H - \hat F_H \hat H) = [\hat H, \hat F_H]. \qquad (49.12)$$

If the matrix elements of the left-hand and right-hand sides of this equation are taken with the functions $\psi_n(x)$, we shall obtain, analogously to (49.5),

$$\left(\frac{\partial F_H}{\partial t}\right)_{mn} = i\omega_{mn} (F_H)_{mn}. \qquad (49.13)$$


Of course, we shall arrive at exactly the same expression if we proceed from expression (49.9) and if we define the derivative of the matrix $F_H$ with respect to time t as a matrix each element of which is equal to the derivative with respect to time of the corresponding matrix element of the matrix $F_H$:

$$\left(\frac{\partial F_H}{\partial t}\right)_{mn} = \frac{\partial (F_H)_{mn}}{\partial t} = i\omega_{mn} (F_H)_{mn}. \qquad (49.14)$$

Thus if the operator $\hat F_H$ describes a certain physical quantity, then the operator $\partial \hat F_H/\partial t$ correspondingly describes its derivative with respect to time.

We note that eqs. (31.7) and (31.8) can also be expressed in the Heisenberg representation. Making use of (49.11) and (49.12), we have

$$\frac{\partial \hat x_H}{\partial t} = \frac{\hat p_H}{m}, \qquad \frac{\partial \hat p_H}{\partial t} = -\frac{\partial \hat U}{\partial \hat x_H} \qquad (49.15)$$

or, in the Heisenberg energy representation,

$$\left(\frac{\partial x_H}{\partial t}\right)_{mn} = \frac{\partial}{\partial t}(x_H)_{mn} = i\omega_{mn} (x_H)_{mn} = \frac{1}{m}(p_H)_{mn}, \qquad (49.16)$$

$$\left(\frac{\partial p_H}{\partial t}\right)_{mn} = \frac{\partial}{\partial t}(p_H)_{mn} = i\omega_{mn} (p_H)_{mn} = -\left(\frac{\partial U}{\partial x}\right)_{mn}. \qquad (49.17)$$

The matrix relations (49.16) and (49.17) correspond in their external 
appearance to the classical laws of Newton. 

Quantum mechanics was initially formulated by Heisenberg only as matrix mechanics. Heisenberg associated each mechanical variable with a certain matrix whose elements depend harmonically on time. The relations between matrices were taken in the form of classical relations, for example (49.17).

The Schrödinger and Heisenberg representations do not exhaust all the 
methods of describing quantum systems which are used in practice. 

One very often has to deal in quantum mechanics with systems whose Hamiltonian can be divided into two parts: one of these, $\hat H_0$, represents the Hamiltonian of the system, while the other, $\hat H'$, describes the interaction of the given system with external fields or with other systems.

In this case it frequently turns out to be convenient to make use of the so-called interaction representation, introduced by Dirac.

The interaction representation is in a certain sense intermediate between the Schrödinger and Heisenberg representations. Namely, we define the wave function in the interaction representation by the relation

$$\varphi(x, t) = e^{(i/\hbar)\hat H_0 t}\, \psi(x, t). \qquad (49.18)$$

Analogously, we define an arbitrary operator $\hat F$ in the interaction representation as

$$\hat F_{int} = e^{(i/\hbar)\hat H_0 t}\, \hat F\, e^{-(i/\hbar)\hat H_0 t}. \qquad (49.19)$$




In contrast to (49.7) and (49.8), transformation formulae (49.18) and (49.19) do not involve the total Hamiltonian but only the Hamiltonian of the system without interaction, $\hat H_0$. The equation satisfied by the function $\varphi(x, t)$ is easily obtained. For this we differentiate relation (49.18) with respect to time and make use of the Schrödinger equation:

$$i\hbar \frac{\partial \varphi(x, t)}{\partial t} = -\hat H_0 \varphi(x, t) + e^{(i/\hbar)\hat H_0 t} (\hat H_0 + \hat H')\, \psi(x, t) = e^{(i/\hbar)\hat H_0 t}\, \hat H' \psi(x, t) = e^{(i/\hbar)\hat H_0 t}\, \hat H' e^{-(i/\hbar)\hat H_0 t}\, \varphi(x, t) \qquad (49.20)$$

or, taking into account (49.19), we obtain

$$i\hbar \frac{\partial \varphi(x, t)}{\partial t} = \hat H'_{int}\, \varphi(x, t), \qquad (49.21)$$


i.e. we have obtained the Schrödinger equation with the Hamiltonian $\hat H'_{int}$.

From relation (49.19) we find the law of change in time of an operator given in the interaction representation:

$$\frac{\partial \hat F_{int}}{\partial t} = \frac{i}{\hbar}(\hat H_0 \hat F_{int} - \hat F_{int} \hat H_0) = [\hat H_0, \hat F_{int}]. \qquad (49.22)$$

We note that the operator $\hat H_0$ has one and the same form in both the Schrödinger representation and the interaction representation.

If the operator $\hat F$ depends on $\hat x$ and $\hat p$, then, analogously to (49.10), it is easily shown that

$$\hat F_{int} = F(\hat x_{int}, \hat p_{int}). \qquad (49.23)$$

Thus we see that in the interaction representation the dependence on time of the wave function is defined by the interaction operator $\hat H'_{int}$, whereas the time-dependence associated with the operator $\hat H_0$ is directly transferred to the operators.


§50. The linear harmonic oscillator 
In §10 we considered the linear harmonic oscillator by means of the 
Schrödinger wave equation. In fact, this problem was initially studied by the 


matrix method*. We shall give this solution here. On the one hand, it is a good illustration of the use of matrix methods, and on the other hand we shall need a number of the expressions obtained in what follows.

* M. Born, W. Heisenberg and P. Jordan, Z. Phys. 35 (1925) 557.

We proceed from the known expression for the Hamiltonian of the system

$$\hat H = \frac{\hat p^2}{2m} + \frac{m\omega^2 \hat x^2}{2}. \qquad (50.1)$$

Here $\omega$ is the 'classical frequency of the oscillator'. The operators $\hat p$ and $\hat x$ in (50.1) are understood to be certain matrices whose mutual relation is given by eqs. (49.15). We solve the problem in the Heisenberg energy representation. Then corresponding to (49.9) we have

$$(x_H)_{nm} = x_{nm}\, e^{i\omega_{nm} t}. \qquad (50.2)$$

The indices m, n here denote the energy levels of the system. From (49.15) and (50.1) it follows that the operator $\hat x_H$ satisfies the following equation of motion:

$$\frac{\partial^2 \hat x_H}{\partial t^2} + \omega^2 \hat x_H = 0. \qquad (50.3)$$

We see that the equations of motion in the Heisenberg representation have the same form as the ordinary equations of motion of classical mechanics. However, in the former the classical coordinate x is replaced by the quantum-mechanical operator $\hat x_H$. Correspondingly, for the matrix elements $x_{nm}$ we obtain, taking account of (50.2),

$$(\omega^2 - \omega_{nm}^2)\, x_{nm} = 0. \qquad (50.4)$$

It follows from (50.4) that only matrix elements $x_{nm}$ for which the condition $\omega_{nm} = \pm\omega$ or $(E_n - E_m)/\hbar = \pm\omega$ is fulfilled are different from zero. We number the states in such a way that $\omega_{n,n-1} = +\omega$ and $\omega_{n,n+1} = -\omega$. Consequently,

$$x_{nm} = 0 \ \ \text{for} \ \ m \neq n \pm 1 \quad \text{and} \quad x_{nm} \neq 0 \ \ \text{for} \ \ m = n \pm 1. \qquad (50.5)$$


The matrix elements $x_{n,n\pm1}$ can be determined by proceeding from the commutation relation

$$\hat p \hat x - \hat x \hat p = -i\hbar.$$

We evaluate the following:

$$\sum_k (p_{nk} x_{km} - x_{nk} p_{km}) = -i\hbar\,\delta_{nm}.$$

According to (49.16) and (49.9), $p_{nk} = im\omega_{nk} x_{nk}$. Substituting $p_{nk}$ into the above expression and taking into account (50.5), we have for m = n

$$x_{n,n+1}\, x_{n+1,n} - x_{n,n-1}\, x_{n-1,n} = \frac{\hbar}{2m\omega}.$$

Here we have made use of the fact that $\omega_{nk} = -\omega_{kn}$, and instead of $\omega_{n,n-1}$ and $\omega_{n,n+1}$ we have substituted $+\omega$ and $-\omega$ respectively. Since the matrix of the coordinate is Hermitian, $x_{nm} = x^*_{mn}$, and we rewrite the relation obtained in the form

$$|x_{n+1,n}|^2 - |x_{n,n-1}|^2 = \frac{\hbar}{2m\omega}. \qquad (50.6)$$
It is clear from (50.6) that the squares of the moduli of the matrix elements form an arithmetical progression with the difference $\hbar/2m\omega$. Since all terms of the progression are positive, it must begin with a certain positive term to which we can assign the index n = 0. It is obvious that then we have $x_{1,0} \neq 0$, $x_{0,-1} = 0$. Consequently, from (50.6) the equation

$$|x_{1,0}|^2 = \frac{\hbar}{2m\omega}$$

follows. Correspondingly, for an arbitrary positive integer n we find

$$|x_{n,n-1}|^2 = \frac{n\hbar}{2m\omega}. \qquad (50.7)$$

From (50.7) we obtain directly

$$x_{n,n-1} = \left(\frac{n\hbar}{2m\omega}\right)^{1/2} e^{i\beta}, \qquad x_{n-1,n} = \left(\frac{n\hbar}{2m\omega}\right)^{1/2} e^{-i\beta}, \qquad (50.8)$$

where $\beta$ is an arbitrary phase. Making use of the arbitrariness in the choice of $\beta$, one can set it equal to zero. Taking into account (50.2), we correspondingly obtain for the time-dependent matrix elements

$$(x_H)_{n,n-1} = \left(\frac{n\hbar}{2m\omega}\right)^{1/2} e^{i\omega t}, \qquad (x_H)_{n-1,n} = \left(\frac{n\hbar}{2m\omega}\right)^{1/2} e^{-i\omega t}. \qquad (50.9)$$


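Let us determine the energy levels of the system, $E_n$, by means of the matrices (50.8). The $E_n$ are defined as the diagonal matrix elements of the operator $\hat H$. It follows from (50.1) that

$$H_{nn} = \frac{1}{2m} \sum_k p_{nk} p_{kn} + \frac{m\omega^2}{2} \sum_k x_{nk} x_{kn}.$$

Substituting $p_{mn} = im\omega_{mn} x_{mn}$, we find

$$H_{nn} = \frac{m}{2} \left[ -\sum_k \omega_{nk}\omega_{kn}\, x_{nk} x_{kn} + \omega^2 \sum_k x_{nk} x_{kn} \right].$$

Making use of the obvious equalities $\omega_{nk} = -\omega_{kn}$, $x_{nk} = x^*_{kn}$ and taking into account that the matrix elements $x_{nk}$ differ from zero only when $k = n \pm 1$, we obtain

$$H_{nn} = \frac{m}{2} \left( \sum_k \omega_{nk}^2 |x_{nk}|^2 + \omega^2 \sum_k |x_{nk}|^2 \right) = \frac{m}{2} \sum_k (\omega_{nk}^2 + \omega^2)\, |x_{nk}|^2 =$$
$$= \frac{m}{2} \left[ (\omega^2 + \omega_{n,n-1}^2)\, |x_{n,n-1}|^2 + (\omega^2 + \omega_{n,n+1}^2)\, |x_{n,n+1}|^2 \right].$$

Making use of (50.8) and $\omega_{n,n\pm1}^2 = \omega^2$, we have finally

$$E_n = H_{nn} = m\omega^2 \left[ \frac{n\hbar}{2m\omega} + \frac{(n+1)\hbar}{2m\omega} \right] = \hbar\omega\left(n + \tfrac{1}{2}\right), \qquad (50.10)$$

which, naturally, is the same as (10.13).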
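The result (50.10) can be reproduced numerically from the matrix elements (50.8). The sketch below is an illustration under stated assumptions: units with ħ = m = ω = 1, phase β = 0, and a truncation of the infinite matrices to a finite size `N` (a choice made here, which spoils only the last row and column).

```python
import math

HBAR = M = OMEGA = 1.0   # illustrative units (assumption)
N = 6                    # truncation size (assumption; the true matrices are infinite)

def matmul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

x = [[0j] * N for _ in range(N)]
p = [[0j] * N for _ in range(N)]
for n in range(1, N):
    amp = math.sqrt(n * HBAR / (2 * M * OMEGA))   # eq. (50.8) with beta = 0
    x[n][n - 1] = x[n - 1][n] = amp
    # p_nk = i m omega_nk x_nk with omega_{n,n-1} = +omega, omega_{n-1,n} = -omega
    p[n][n - 1] = 1j * M * OMEGA * amp
    p[n - 1][n] = -1j * M * OMEGA * amp

pp, xx = matmul(p, p), matmul(x, x)
H = [[pp[i][j] / (2 * M) + M * OMEGA ** 2 * xx[i][j] / 2 for j in range(N)]
     for i in range(N)]                           # eq. (50.1)

px, xp = matmul(p, x), matmul(x, p)
commutator = [[px[i][j] - xp[i][j] for j in range(N)] for i in range(N)]

# The last row and column are affected by the truncation and are left out.
levels = [H[n][n].real for n in range(N - 1)]
print(levels)
print([commutator[n][n] for n in range(N - 1)])
```

Apart from the truncated last level, the diagonal of `H` should follow ħω(n + 1/2), the off-diagonal elements should vanish, and the diagonal of the commutator should equal −iħ.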
Instead of the operators $\hat{p}$ and $\hat{x}$, it often turns out to be convenient to
introduce the operator $\hat{a}$ and the conjugate operator $\hat{a}^\dagger$ defined by the
relations

$$\hat{a} = \frac{1}{\sqrt{2}}\left[\left(\frac{m\omega}{\hbar}\right)^{1/2}\hat{x} + i\,(m\omega\hbar)^{-1/2}\,\hat{p}\right], \qquad
\hat{a}^\dagger = \frac{1}{\sqrt{2}}\left[\left(\frac{m\omega}{\hbar}\right)^{1/2}\hat{x} - i\,(m\omega\hbar)^{-1/2}\,\hat{p}\right]. \tag{50.11}$$

It follows from (50.8) that the operators $\hat{a}$ and $\hat{a}^\dagger$ have the following matrix
elements different from zero:

$$(a^\dagger)_{n,n-1} = (a)_{n-1,n} = n^{1/2}, \tag{50.12}$$

while all the remaining matrix elements are equal to zero. We see, consequently,
that for the operator $\hat{a}^\dagger$ only the matrix elements corresponding to the
transition $n - 1 \to n$, i.e. to a transition with an increase of the quantum
number $n$ by one, are different from zero. For the operator $\hat{a}$ the matrix
elements corresponding to the transition $n \to n - 1$ are different from zero. In
this connection the operators $\hat{a}$ and $\hat{a}^\dagger$ are called the excitation annihilation
operator and the excitation creation operator. From (50.12), the following
commutation relation holds for the operators $\hat{a}^\dagger$ and $\hat{a}$:

$$\hat{a}\hat{a}^\dagger - \hat{a}^\dagger\hat{a} = 1. \tag{50.13}$$

The operator $\hat{H}$ expressed in terms of the operators $\hat{a}$ and $\hat{a}^\dagger$ has the form

$$\hat{H} = \tfrac{1}{2}\hbar\omega\,(\hat{a}^\dagger\hat{a} + \hat{a}\hat{a}^\dagger) \tag{50.14}$$

and, taking into account (50.12), we again arrive at (50.10). Making use of
the matrix method, one can also obtain the expressions for the wave functions
of the oscillator*.
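The relations (50.12) to (50.14) lend themselves to a direct numerical check. The sketch below is illustrative only: it builds a truncated matrix of the annihilation operator (the helper names `matmul` and `annihilation_matrix` are hypothetical) and evaluates the diagonal of the commutator and of $\hat{H}/\hbar\omega$. In a finite truncation the last diagonal element necessarily deviates, because the element $a_{n,n+1}$ that would complete it has been cut off.

```python
import math


def matmul(a, b):
    """Product of two square matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def annihilation_matrix(size):
    """Truncated matrix of a: the only non-zero elements are a_{n-1,n} = sqrt(n)."""
    a = [[0.0] * size for _ in range(size)]
    for n in range(1, size):
        a[n - 1][n] = math.sqrt(n)
    return a


size = 6
a = annihilation_matrix(size)
a_dag = [list(column) for column in zip(*a)]  # real matrix: transpose = conjugate

a_adag = matmul(a, a_dag)
adag_a = matmul(a_dag, a)

# Diagonal of the commutator (50.13) and of H / (hbar * omega) from (50.14).
commutator_diag = [a_adag[n][n] - adag_a[n][n] for n in range(size)]
energy_diag = [0.5 * (adag_a[n][n] + a_adag[n][n]) for n in range(size)]

print(commutator_diag)
print(energy_diag)
```

All but the last entry of `commutator_diag` should be unity and all but the last entry of `energy_diag` should equal $n + \tfrac12$, in agreement with (50.10).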


§51. The matrix elements of the angular momentum operator** 


In studying the properties of angular momentum in §30 of ch. 3, we 
proceeded directly from expressions (30.1) and (30.2) for the angular momen- 
tum operators. In the present section we shall base our exposition only on 
the commutation relations (30.3), (30.3’). It turns out that this statement of 
the problem is of a more general character. In particular, the concrete 
expressions for the operators, (30.1) and (30.2), cannot be used for the study 
of the properties of intrinsic angular momentum (spin) which will be consid- 
ered in ch. 8. However, the commutation relations of the form of (30.3) also 
remain valid for the intrinsic angular momentum (see §60). The study of the 
properties of the angular momentum based on the corresponding commutation
relations is conveniently carried out in matrix form. We shall denote the
matrices corresponding to the projections of the angular momentum onto the
x-, y- and z-axes by $\hat{J}_x$, $\hat{J}_y$, $\hat{J}_z$. The change in notation is connected with the
fact that the results obtained in this section will be valid not only for the
angular momentum associated with spatial motion, the orbital angular
momentum $\hat{\mathbf{l}} = (\hbar/i)(\mathbf{r} \times \nabla)$, but also for the angular momentum which is not
associated with spatial motion, the spin, as well as for the total angular
momentum (see §62). We also introduce the matrix $\hat{J}^2$ corresponding to the
square of the angular momentum, $\hat{J}^2 = \hat{J}_x^2 + \hat{J}_y^2 + \hat{J}_z^2$. Thus we take as our basis
the following commutation relations:

$$\begin{aligned}
\hat{J}_x\hat{J}_y - \hat{J}_y\hat{J}_x &= i\hbar\hat{J}_z, \\
\hat{J}_y\hat{J}_z - \hat{J}_z\hat{J}_y &= i\hbar\hat{J}_x, \\
\hat{J}_z\hat{J}_x - \hat{J}_x\hat{J}_z &= i\hbar\hat{J}_y.
\end{aligned} \tag{51.1}$$


First of all, we obtain the following additional commutation rules from these 


*L.D.Landau and E.M.Lifshitz, Quantum mechanics (Pergamon Press, Oxford, 
1965). 

** The problems touched upon in this and next sections of this chapter are consid- 
ered in more detail in the books: L.D.Landau and E.M.Lifshitz, Quantum mechanics 
(Pergamon Press, Oxford, 1965), and E.Condon and H.Shortley, The theory of atomic 
spectra (University Press, Cambridge, 1951). 




relations (the proof is analogous to that presented in §30): 


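$$\begin{aligned}
\hat{J}_x\hat{J}^2 - \hat{J}^2\hat{J}_x &= 0, \\
\hat{J}_y\hat{J}^2 - \hat{J}^2\hat{J}_y &= 0, \\
\hat{J}_z\hat{J}^2 - \hat{J}^2\hat{J}_z &= 0.
\end{aligned} \tag{51.2}$$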


We choose the representation in which the matrices $\hat{J}^2$, $\hat{J}_z$ and $\hat{H}$ are
diagonal. Indeed, in §47 we have proved that mutually commuting matrices
can simultaneously be brought into diagonal form. The commutation of a
given matrix with the matrix $\hat{H}$ expresses the law of conservation of the
corresponding quantity (see §32). Hence the assumption of the commutation
of the matrices $\hat{J}^2$ and $\hat{J}_z$ with $\hat{H}$ only means the fulfillment of certain
conservation laws.

We number the columns and rows of the matrices considered by the
indices $m$, $j$, $n$. The real number $m$ defines the projection of the angular
momentum onto the z-axis, $J_z = m\hbar$. The number $j$ characterizes the value of
the total angular momentum, and the number $n$ is associated with the energy
level of the system. Since all the matrices considered commute with the
matrices $\hat{J}^2$ and $\hat{H}$, they are diagonal in the indices $j$ and $n$. Consequently,
making use of the Dirac notation, we can write the matrix elements of the
matrices in which we are interested in the form

$$\begin{aligned}
\langle m'j'n'|\hat{H}|mjn\rangle &= E_{nj}\,\delta_{mm'}\delta_{jj'}\delta_{nn'}, \\
\langle m'j'n'|\hat{J}^2|mjn\rangle &= J_j^2\,\delta_{jj'}\delta_{mm'}\delta_{nn'}, \\
\langle m'j'n'|\hat{J}_z|mjn\rangle &= m\hbar\,\delta_{jj'}\delta_{mm'}\delta_{nn'}, \\
\langle m'j'n'|\hat{J}_x|mjn\rangle &= (J_x)_{m'm}\,\delta_{jj'}\delta_{nn'}, \\
\langle m'j'n'|\hat{J}_y|mjn\rangle &= (J_y)_{m'm}\,\delta_{jj'}\delta_{nn'}.
\end{aligned} \tag{51.3}$$

Here we have denoted the eigenvalue of the square of the angular momentum
by $J_j^2$. For what follows it will be convenient for us to introduce also the
matrices $\hat{J}_+ = \hat{J}_x + i\hat{J}_y$ and $\hat{J}_- = \hat{J}_x - i\hat{J}_y$. It is obvious that these matrices, as
well as the matrices $\hat{J}_x$ and $\hat{J}_y$, are diagonal in the indices $j$ and $n$. Taking
this into account, we shall in what follows drop the indices $j$ and $n$.

Our problem is the determination of the spectrum of possible values of the
projection of the angular momentum onto an arbitrarily oriented axis, the
establishment of the relation of these quantities to the absolute value of the
angular momentum $(J_j^2)^{1/2}$, and finding the matrices $(J_x)_{m'm}$ and $(J_y)_{m'm}$.
First of all we shall show that the spectrum of possible values of the
projection of the angular momentum for a given total angular momentum is
bounded both above and below. For this we make use of the matrix relation

$$\hat{J}^2 - \hat{J}_z^2 = \hat{J}_x^2 + \hat{J}_y^2.$$

Equating the diagonal matrix elements of the left-hand and right-hand sides,
we obtain

$$J_j^2 - m^2\hbar^2 = \sum_k \left[(J_x)_{mk}(J_x)_{km} + (J_y)_{mk}(J_y)_{km}\right]
= \sum_k \left[|(J_x)_{mk}|^2 + |(J_y)_{mk}|^2\right]. \tag{51.4}$$

Here we have made use of the Hermitian property of the matrices $\hat{J}_x$ and $\hat{J}_y$.

Thus the right-hand side of eq. (51.4) is undoubtedly not negative. Whence
the inequality

$$m^2\hbar^2 \le J_j^2 \tag{51.5}$$

or $-(J_j^2)^{1/2} \le m\hbar \le (J_j^2)^{1/2}$ follows. We denote the values of the quantum
number $m$ corresponding respectively to the largest and smallest possible
values of the component of the angular momentum along the z-axis as $m_1$
and $m_2$. We find the spectrum of possible values of the number $m$ by means
of the matrices $\hat{J}_+$ and $\hat{J}_-$. For this we find the commutator of these matrices
with the matrix $\hat{J}_z$. Making use of (51.1), we obtain

$$\begin{aligned}
\hat{J}_z\hat{J}_+ - \hat{J}_+\hat{J}_z &= \hbar\hat{J}_+, \\
\hat{J}_z\hat{J}_- - \hat{J}_-\hat{J}_z &= -\hbar\hat{J}_-.
\end{aligned} \tag{51.6}$$


We evaluate the first of these relations:

$$(\hat{J}_z\hat{J}_+)_{m'm''} - (\hat{J}_+\hat{J}_z)_{m'm''} = \hbar\,(J_+)_{m'm''}.$$

Calculating the matrix element of the product according to rule (45.6) and
taking into account that the matrix $\hat{J}_z$ is diagonal, we find

$$\hbar\,(m' - m'')\,(J_+)_{m'm''} = \hbar\,(J_+)_{m'm''}. \tag{51.7}$$


It follows from eq. (51.7) that the matrix $\hat{J}_+$ has non-zero matrix elements
$(J_+)_{m'm''}$ only under the condition that $m' - m'' = 1$, i.e. for transitions
corresponding to an increase of the quantum number $m$ by unity, $m \to m + 1$. In an
analogous way it is easily shown from the second equation, (51.6), that the
matrix $\hat{J}_-$ has non-zero matrix elements only for transitions with a decrease
of the quantum number $m$ by unity, i.e. $m \to m - 1$. Thus we arrive at the
conclusion that, if for a given $j$ a certain value $m\hbar$ of the z-component of the
angular momentum is possible, then the values $(m + 1)\hbar$, $(m - 1)\hbar$, $(m + 2)\hbar$,
$(m - 2)\hbar$ and so on are also possible. However, we have explained before that
the spectrum of possible values of the number $m$ must be bounded:
$m_2 \le m \le m_1$. Setting $m'' = m_1$ in equality (51.7) and taking into account
that in it $m'$ cannot assume the value $m_1 + 1$, we see that it is fulfilled only
when the matrix element $(J_+)_{m_1+1,\,m_1}$ reduces to zero. Consequently,

$$\langle m_1 + 1|\hat{J}_+|m_1\rangle = 0. \tag{51.8}$$

We have an analogous situation for the minimum possible value of the number
$m$. The corresponding equality is fulfilled here when the matrix element
$(J_-)_{m_2-1,\,m_2}$ reduces to zero,

$$\langle m_2 - 1|\hat{J}_-|m_2\rangle = 0. \tag{51.9}$$


Thus the possible values of the angular momentum component are equal to
$m_2\hbar$, $(m_2 + 1)\hbar$, $(m_2 + 2)\hbar$, ..., $(m_1 - 1)\hbar$, $m_1\hbar$. Here the difference $m_1 - m_2$
can only be equal to a positive integer or zero. We shall show that
the values of the numbers $m_1$ and $m_2$ determine the quantity $J_j^2$. Indeed, the
matrix $\hat{J}^2$ can be written in the form

$$\hat{J}^2 = \hat{J}_-\hat{J}_+ + \hat{J}_z^2 + \hbar\hat{J}_z. \tag{51.10}$$

Taking the diagonal matrix elements of the left-hand and right-hand sides,
corresponding to the transition $m_1 \to m_1$, we have

$$J_j^2 = \sum_k (J_-)_{m_1 k}(J_+)_{k m_1} + \hbar^2 m_1^2 + \hbar^2 m_1.$$

Here only $k = m_1 + 1$ is possible, but then the matrix element of $\hat{J}_+$ (51.8)
reduces to zero. Consequently,

$$J_j^2 = \hbar^2 m_1(m_1 + 1).$$

On the other hand, eq. (51.10) can also be rewritten in the form:

$$\hat{J}^2 = \hat{J}_+\hat{J}_- + \hat{J}_z^2 - \hbar\hat{J}_z. \tag{51.11}$$

If in the above expression one equates the diagonal matrix elements $m_2 \to m_2$,
then

$$J_j^2 = \hbar^2 m_2(m_2 - 1)$$

and, consequently,

$$m_1(m_1 + 1) = m_2(m_2 - 1).$$

This equation is satisfied under the condition that $m_2 = m_1 + 1$ or
$m_2 = -m_1$. Since, however, $m_2 \le m_1$ always, we must retain only the second
root, $m_2 = -m_1$. Consequently, the maximum (equal to $m_1\hbar$) and minimum
(equal to $m_2\hbar$) possible values of the projection of the angular momentum
onto the z-axis are equal in absolute value. As we have shown, the square of
the total angular momentum is equal to $\hbar^2 m_1(m_1 + 1)$. On the other hand,
we decided to characterize this quantity by the quantum number $j$. Hence it
is natural to set $m_1 = j$. In this case we have

$$J_j^2 = \hbar^2 j(j + 1). \tag{51.12}$$


The possible values of the angular momentum component $J_z$ are
correspondingly equal to

$$J_z = j\hbar,\ (j - 1)\hbar,\ (j - 2)\hbar,\ \ldots,\ (-j + 1)\hbar,\ -j\hbar. \tag{51.13}$$

On the whole the angular momentum component assumes $2j + 1$ values. We
note that, since $2j + 1$ is a positive integer, the quantum number $j$ can take on
only integer or half-integer values, $j = 0, \tfrac{1}{2}, 1, \tfrac{3}{2}$ and so on. For the orbital
angular momentum this number, as we have explained in §30, takes on only
integer values $j = l$. We shall see however, in ch. 8, that for intrinsic angular
momentum $j$ can also take on half-integer values.

Since the z-axis has in no way been singled out beforehand, the angular
momentum projection onto any other axis is also given by formula (51.13).
We note that if the number $j$ is integer, then the angular momentum
projections onto any axis are also integer (in units of $\hbar$); but if $j$ is half-integer, then
the angular momentum projections take on half-integer values.

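Let us now find the matrices $\hat{J}_x$ and $\hat{J}_y$. For this we can make use, for
example, of the relation (51.10), taking the diagonal matrix elements of the
left-hand and right-hand sides. Also taking into account (51.12) we have

$$\hbar^2 j(j + 1) = \sum_k (J_-)_{mk}(J_+)_{km} + \hbar^2 m^2 + \hbar^2 m,$$

where only the term of the sum with $k = m + 1$ differs from zero.
From the Hermitian property of the matrices $\hat{J}_x$ and $\hat{J}_y$ it follows that

$$(J_+)_{km} = (J_-)^*_{mk}.$$

Consequently, the preceding equation gives

$$|(J_+)_{m+1,m}|^2 = \hbar^2\left[j(j + 1) - m(m + 1)\right] = \hbar^2 (j - m)(j + m + 1).$$

For the matrix element $(J_+)_{m+1,m}$ we have

$$\langle m + 1|\hat{J}_+|m\rangle = \hbar\left[(j - m)(j + m + 1)\right]^{1/2} e^{i\delta}.$$

Without restricting the generality, the phase $\delta$ can be put equal to zero. We
finally obtain

$$\begin{aligned}
\langle m + 1|\hat{J}_+|m\rangle &= (J_x + iJ_y)_{m+1,m} = \hbar\left[(j - m)(j + m + 1)\right]^{1/2}, \\
\langle m|\hat{J}_-|m + 1\rangle &= (J_x - iJ_y)_{m,m+1} = \hbar\left[(j - m)(j + m + 1)\right]^{1/2},
\end{aligned} \tag{51.14}$$

i.e.

$$(J_+)_{m+1,m} = (J_-)_{m,m+1}. \tag{51.15}$$

From the definition of the matrices $\hat{J}_+$ and $\hat{J}_-$ it follows that

$$\hat{J}_x = \tfrac{1}{2}(\hat{J}_+ + \hat{J}_-), \qquad \hat{J}_y = -\tfrac{1}{2}i\,(\hat{J}_+ - \hat{J}_-).$$

Making use of (51.14), we get

$$\begin{aligned}
\langle m + 1|\hat{J}_x|m\rangle &= \langle m|\hat{J}_x|m + 1\rangle = \tfrac{1}{2}\hbar\left[(j - m)(j + m + 1)\right]^{1/2}, \\
\langle m + 1|\hat{J}_y|m\rangle &= -\langle m|\hat{J}_y|m + 1\rangle = -\tfrac{1}{2}i\hbar\left[(j - m)(j + m + 1)\right]^{1/2}.
\end{aligned} \tag{51.16}$$

As an example we write the matrices which are obtained for $j = 1$ (rows and
columns are ordered $m = 1, 0, -1$):

$$\hat{J}_x = \begin{pmatrix} (J_x)_{1,1} & (J_x)_{1,0} & (J_x)_{1,-1} \\ (J_x)_{0,1} & (J_x)_{0,0} & (J_x)_{0,-1} \\ (J_x)_{-1,1} & (J_x)_{-1,0} & (J_x)_{-1,-1} \end{pmatrix}
= \frac{\hbar}{\sqrt{2}} \begin{pmatrix} 0 & 1 & 0 \\ 1 & 0 & 1 \\ 0 & 1 & 0 \end{pmatrix},$$

$$\hat{J}_y = \frac{\hbar}{\sqrt{2}} \begin{pmatrix} 0 & -i & 0 \\ i & 0 & -i \\ 0 & i & 0 \end{pmatrix}, \qquad
\hat{J}_z = \hbar \begin{pmatrix} 1 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & -1 \end{pmatrix}, \tag{51.17}$$

$$\hat{J}^2 = 2\hbar^2 \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}.$$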
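The matrix elements (51.14) and (51.16) determine the matrices for any $j$, not only for $j = 1$. The following minimal sketch, in units of $\hbar$, constructs them and can be used to confirm the commutation relations (51.1) and the eigenvalue (51.12). The function names and the convention of passing $2j$ as an integer `two_j` are assumptions of this sketch, not part of the text.

```python
import math


def matmul(a, b):
    """Product of two square matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def combine(a, b, ca, cb):
    """Return the matrix ca * a + cb * b."""
    return [[ca * x + cb * y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def max_abs_difference(a, b):
    """Largest modulus of the element-wise difference of two matrices."""
    return max(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))


def angular_momentum_matrices(two_j):
    """Return (Jx, Jy, Jz) in units of hbar for j = two_j / 2.

    Rows and columns are ordered m = j, j - 1, ..., -j, as in (51.17).
    """
    j_total = two_j / 2
    dim = two_j + 1
    m_values = [j_total - index for index in range(dim)]
    j_plus = [[0j] * dim for _ in range(dim)]
    for col in range(1, dim):
        m = m_values[col]
        # <m + 1| J_+ |m> from (51.14); the row belonging to m + 1 is col - 1.
        j_plus[col - 1][col] = math.sqrt((j_total - m) * (j_total + m + 1))
    j_minus = [[j_plus[c][r].conjugate() for c in range(dim)] for r in range(dim)]
    jx = combine(j_plus, j_minus, 0.5, 0.5)      # Jx = (J+ + J-) / 2
    jy = combine(j_plus, j_minus, -0.5j, 0.5j)   # Jy = -(i/2) (J+ - J-)
    jz = [[m_values[r] if r == c else 0.0 for c in range(dim)] for r in range(dim)]
    return jx, jy, jz


jx, jy, jz = angular_momentum_matrices(2)  # j = 1
for row in jx:
    print([round(value.real, 4) for value in row])
```

For `two_j = 2` the printed rows of `jx` should coincide with the first matrix of (51.17) divided by $\hbar$.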


§52. The addition of angular momenta 


We now determine the possible values of an angular momentum $\mathbf{J}$ which is
equal to the sum of two angular momenta, $\mathbf{J} = \mathbf{J}_1 + \mathbf{J}_2$. Let $\mathbf{J}_1$ and $\mathbf{J}_2$ be the
angular momenta referring to two sub-systems whose mutual interaction can
be disregarded. This means that the operators $\hat{\mathbf{J}}_1$ and $\hat{\mathbf{J}}_2$ act on variables
referring to different sub-systems and, consequently, commute with each other,
$\hat{\mathbf{J}}_1\hat{\mathbf{J}}_2 = \hat{\mathbf{J}}_2\hat{\mathbf{J}}_1$. Since each of the operators $\hat{\mathbf{J}}_1$ and $\hat{\mathbf{J}}_2$ satisfies the commutation
relations (51.1), the operator $\hat{\mathbf{J}}$ also satisfies the same commutation relations.
The state of the system will be defined if the quantum numbers $j_1$, $j_2$ and $m_1$,
$m_2$ characterizing the total angular momenta $\hat{\mathbf{J}}_1$ and $\hat{\mathbf{J}}_2$ and their projections
onto an arbitrarily oriented z-axis are defined (we digress from other
quantities contained in the total set, since they are not essential for what follows).
For given $j_1$ and $j_2$ each of the numbers $m_1$ and $m_2$ runs respectively over
$(2j_1 + 1)$ and $(2j_2 + 1)$ values. Consequently there correspond $(2j_1 + 1)(2j_2 + 1)$
states to given numbers $j_1$ and $j_2$. However, the state of the system may be
characterized by the numbers $j_1$, $j_2$, $j$ and $m$, where $j$ and $m$ are the quantum
numbers corresponding to the total angular momentum $\hat{\mathbf{J}}$ and its projection
on the z-axis, instead of by the four numbers $j_1$, $j_2$, $m_1$, $m_2$. This means, in
essence, the transition from the representation $j_1, j_2, m_1, m_2$ to the
representation $j_1, j_2, j, m$. Indeed, the operators corresponding to the four latter
quantities can enter into the total set just as well as the operators
corresponding to the quantities $j_1$, $j_2$, $m_1$, $m_2$. Since the obvious equality $\hat{J}_z = \hat{J}_{1z} + \hat{J}_{2z}$
holds, the quantum number $m$ is equal to the sum $m = m_1 + m_2$. Of
course, such a simple relation does not exist for the squares of the angular
momenta and we have to determine the possible values of the number $j$ for
given $j_1$ and $j_2$. First of all we note that the maximum value of the number $j$
is obtained if we take the largest $m_1$, equal to $j_1$, and the largest $m_2$, equal to
$j_2$. Consequently, in this case $j = j_1 + j_2$. Further, we consider the following
possible value of the number $m$: $m = j_1 + j_2 - 1$. Such a value of $m$ can be
realized either for $m_1 = j_1$, $m_2 = j_2 - 1$, or for $m_1 = j_1 - 1$, $m_2 = j_2$. Thus
two independent states correspond to the given value of $m = j_1 + j_2 - 1$.
Consequently, two possible values of the number $j$ must correspond to the
given $m$. But because the largest possible value of $j$ is equal to $j_1 + j_2$ and
because the number $m$ cannot be larger than the number $j$, it is clear that to
the chosen $m$ there can correspond only $j = j_1 + j_2$ and $j = j_1 + j_2 - 1$.
Choosing $m$ one unit smaller, we obtain three states corresponding to given $m$:

(1) $m_1 = j_1$, $m_2 = j_2 - 2$;

(2) $m_1 = j_1 - 1$, $m_2 = j_2 - 1$;

(3) $m_1 = j_1 - 2$, $m_2 = j_2$.

By analogy with the foregoing we arrive at the fact that the number $j$ can
assume the values $j = j_1 + j_2$, $j = j_1 + j_2 - 1$ and $j = j_1 + j_2 - 2$. Going on with
this reasoning we find that for given $j_1$ and $j_2$ the number $j$ can assume the
values

$$j = j_1 + j_2,\ j_1 + j_2 - 1,\ j_1 + j_2 - 2,\ \ldots,\ j_1 - j_2. \tag{52.1}$$


On the whole the number $j$ assumes $2j_2 + 1$ values (under the condition that
$j_2 \le j_1$; otherwise the indices 1 and 2 must be exchanged). The total number
of states corresponding to given $j_1$ and $j_2$ is equal to

$$\sum_{j = j_1 - j_2}^{j_1 + j_2} (2j + 1) = (2j_1 + 1)(2j_2 + 1),$$

as it should be. This result was obtained earlier in the so-called ‘vector model’
which was introduced before the appearance of quantum mechanics. In the
vector model it is assumed that the length of the vector $\mathbf{j}$ formed by adding
two angular momentum vectors $\mathbf{j}_1$ and $\mathbf{j}_2$ can change by unity in a
discontinuous manner. The length is maximum when these vectors are ‘parallel’:
$j = j_1 + j_2$, and minimum when they are ‘antiparallel’: $j = j_1 - j_2$.

In the case where three or more angular momenta are to be added, we 
apply the rule which has been derived, adding them in pairs. 
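The counting argument above can be reproduced mechanically. The sketch below is an illustration only (the function names are hypothetical): it lists the product states $(m_1, m_2)$ belonging to a given $m$, generates the values of $j$ from (52.1), and sums $2j + 1$ over them. Exact rational arithmetic is used so that half-integer values are handled without rounding.

```python
from fractions import Fraction


def projections(j):
    """All values m = -j, -j + 1, ..., j for a given (half-)integer j."""
    j = Fraction(j)
    return [-j + k for k in range(int(2 * j) + 1)]


def pairs_with_total_m(j1, j2, m):
    """Product states (m1, m2) with m1 + m2 = m."""
    return [(m1, m2) for m1 in projections(j1) for m2 in projections(j2)
            if m1 + m2 == m]


def allowed_total_j(j1, j2):
    """Values of j from (52.1): j1 + j2, j1 + j2 - 1, ..., |j1 - j2|."""
    j1, j2 = Fraction(j1), Fraction(j2)
    count = int(2 * min(j1, j2)) + 1
    return [j1 + j2 - k for k in range(count)]


j1, j2 = Fraction(3, 2), Fraction(1)   # example values chosen for illustration
total_states = sum(int(2 * j) + 1 for j in allowed_total_j(j1, j2))
print(allowed_total_j(j1, j2), total_states)
print(len(pairs_with_total_m(j1, j2, j1 + j2 - 1)))
```

For the example values the total number of states should equal $(2j_1 + 1)(2j_2 + 1) = 12$, and two product states should belong to $m = j_1 + j_2 - 1$.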

As well as the possible values of $j$ one can find the probability that the
total angular momentum of the system is equal to one or other possible value
of $j$ for given $j_1$ and $j_2$. For this, according to general rules (§21), it is
necessary to expand the wave function of the system describing the state with
given values of $j_1$, $j_2$, $m_1$, $m_2$ in terms of the wave functions $\psi_{jm}$ of the states
with given $j_1$, $j_2$, $j$, $m$. Since the initial system is divided into two
non-interacting sub-systems, its wave function with given $j_1$, $m_1$, $j_2$, $m_2$ can be written in
the form of the product of two functions which refer respectively to each of
the sub-systems, $\psi_{j_1 m_1 j_2 m_2} = \psi_{j_1 m_1}(1)\,\psi_{j_2 m_2}(2)$. The expansion has the form

$$\psi_{j_1 m_1}(1)\,\psi_{j_2 m_2}(2) = \sum_j C^{j}_{m_1 m_2}\,\psi_{j,\,m_1 + m_2}. \tag{52.2}$$

The squares of the moduli of the coefficients $C^{j}_{m_1 m_2}$ determine the
probabilities sought. We note that, since the transformation of wave functions from
one representation to another is carried out by unitary matrices, we can write
the expansion which is the converse of (52.2) in the form

$$\psi_{jm} = \sum_{m_2} \left(C^{j}_{m - m_2,\,m_2}\right)^{*} \psi_{j_1,\,m - m_2}(1)\,\psi_{j_2 m_2}(2). \tag{52.3}$$

The coefficients $C^{j}_{m_1 m_2}$ were calculated by Wigner by a group theory
method. The reader will find a sufficiently extensive table of these
coefficients in, for example, the book of Condon and Shortley†. We shall confine
ourselves to the consideration of the simplest case where one of the angular


+ E.Condon and H.Shortley, The theory of atomic spectra (University Press, Cam- 
bridge, 1951). 




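momenta is equal to one half, and the other is arbitrary. Thus we shall assume
that $j_2 = \tfrac{1}{2}$. For given $j_2$ the number $m_2$ runs over only two values, namely
$m_2 = \tfrac{1}{2}$ and $m_2 = -\tfrac{1}{2}$. We can correspondingly rewrite the expansion (52.3),
dropping all superfluous indices, in the form

$$\psi_{jm} = C_{1/2}\,\psi_{j_1,\,m - 1/2}(1)\,\psi_{1/2}(2) + C_{-1/2}\,\psi_{j_1,\,m + 1/2}(1)\,\psi_{-1/2}(2). \tag{52.4}$$

We act on the left-hand and right-hand sides of this expansion with the
operator $\hat{J}^2 = (\hat{\mathbf{J}}_1 + \hat{\mathbf{J}}_2)^2 = \hat{J}_1^2 + \hat{J}_2^2 + 2\hat{\mathbf{J}}_1 \cdot \hat{\mathbf{J}}_2$. The scalar product on the right is
conveniently transformed so that

$$\begin{aligned}
\hat{J}^2 &= \hat{J}_1^2 + \hat{J}_2^2 + (\hat{J}_{1x} + i\hat{J}_{1y})(\hat{J}_{2x} - i\hat{J}_{2y}) + (\hat{J}_{1x} - i\hat{J}_{1y})(\hat{J}_{2x} + i\hat{J}_{2y}) + 2\hat{J}_{1z}\hat{J}_{2z} \\
&= \hat{J}_1^2 + \hat{J}_2^2 + \hat{J}_{1+}\hat{J}_{2-} + \hat{J}_{1-}\hat{J}_{2+} + 2\hat{J}_{1z}\hat{J}_{2z}.
\end{aligned} \tag{52.5}$$

Acting on the left-hand side of (52.4) with this operator we obtain $\hbar^2 j(j + 1)\psi_{jm}$.
When acting on the right-hand side it is convenient to write the operator $\hat{J}^2$ in
the form (52.5). In this case it should be recalled that each of the operators
$\hat{\mathbf{J}}_1$, $\hat{\mathbf{J}}_2$ acts only on the wave function of the corresponding sub-system. It
follows from the matrix relations (51.14) that

$$\begin{aligned}
\hat{J}_{2+}\,\psi_{1/2}(2) &= 0, \\
\hat{J}_{2+}\,\psi_{-1/2}(2) &= \hbar\,\psi_{1/2}(2), \\
\hat{J}_{2-}\,\psi_{1/2}(2) &= \hbar\,\psi_{-1/2}(2), \\
\hat{J}_{1+}\,\psi_{j_1,\,m - 1/2}(1) &= \hbar\left[(j_1 - m + \tfrac{1}{2})(j_1 + m + \tfrac{1}{2})\right]^{1/2} \psi_{j_1,\,m + 1/2}(1), \\
\hat{J}_{1-}\,\psi_{j_1,\,m + 1/2}(1) &= \hbar\left[(j_1 - m + \tfrac{1}{2})(j_1 + m + \tfrac{1}{2})\right]^{1/2} \psi_{j_1,\,m - 1/2}(1).
\end{aligned}$$

Making use of these relations, we obtain the following equation (the common
factor $\hbar^2$ has been cancelled):

$$\begin{aligned}
j(j + 1)\,\psi_{jm} = {} & \left[C_{1/2}\left(j_1(j_1 + 1) + \tfrac{1}{4} + m\right) + C_{-1/2}\left\{(j_1 + m + \tfrac{1}{2})(j_1 - m + \tfrac{1}{2})\right\}^{1/2}\right] \psi_{j_1,\,m - 1/2}(1)\,\psi_{1/2}(2) \\
& + \left[C_{-1/2}\left(j_1(j_1 + 1) + \tfrac{1}{4} - m\right) + C_{1/2}\left\{(j_1 - m + \tfrac{1}{2})(j_1 + m + \tfrac{1}{2})\right\}^{1/2}\right] \psi_{j_1,\,m + 1/2}(1)\,\psi_{-1/2}(2).
\end{aligned}$$

Again substituting the expansion (52.4) for $\psi_{jm}$ into the left-hand side and
equating the coefficients of the same functions $\psi(1)\psi(2)$, we obtain two
equations with respect to $C_{1/2}$ and $C_{-1/2}$. However, of these two equations only
one will be independent. It gives

$$\frac{C_{1/2}}{C_{-1/2}} = \frac{j(j + 1) - j_1(j_1 + 1) + m - \tfrac{1}{4}}{\left[(j_1 + m + \tfrac{1}{2})(j_1 - m + \tfrac{1}{2})\right]^{1/2}}. \tag{52.6}$$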






We shall obtain the second relation if we take into account that the squares
of the moduli of these coefficients are equal to the corresponding
probabilities:

$$|C_{1/2}|^2 + |C_{-1/2}|^2 = 1. \tag{52.7}$$

The relations (52.6) and (52.7) determine the coefficients $C_{1/2}$ and $C_{-1/2}$ to
within the immaterial phase factor $e^{i\alpha}$ (we choose the phase in
correspondence with that taken in tables of the coefficients $C^{j}_{m_1 m_2}$; see the book of
Condon and Shortley). Since for $j_2 = \tfrac{1}{2}$ the total angular momentum $j$ can
take on only two values $j_1 + \tfrac{1}{2}$ and $j_1 - \tfrac{1}{2}$, we obtain the following values of
the coefficients $C^{j}_{m_1 m_2}$ (see table 1).


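Table 1
The coefficients $C^{j}_{m_1 m_2}$

$j$                  $m_2 = \tfrac{1}{2}$                                              $m_2 = -\tfrac{1}{2}$

$j_1 + \tfrac{1}{2}$     $\left(\dfrac{j_1 + m + \frac{1}{2}}{2j_1 + 1}\right)^{1/2}$      $\left(\dfrac{j_1 - m + \frac{1}{2}}{2j_1 + 1}\right)^{1/2}$

$j_1 - \tfrac{1}{2}$     $-\left(\dfrac{j_1 - m + \frac{1}{2}}{2j_1 + 1}\right)^{1/2}$     $\left(\dfrac{j_1 + m + \frac{1}{2}}{2j_1 + 1}\right)^{1/2}$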
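The entries of table 1 can be tested against the two conditions from which they were derived, the ratio (52.6) and the normalization (52.7). The following sketch is illustrative: the function names are hypothetical, and the sign convention for $j = j_1 - \tfrac12$ (minus sign on the $m_2 = \tfrac12$ entry) is assumed to follow the Condon and Shortley phase choice mentioned in the text.

```python
import math
from fractions import Fraction

HALF = Fraction(1, 2)


def table_coefficients(j1, m, upper):
    """(C_{1/2}, C_{-1/2}) for j = j1 + 1/2 (upper=True) or j = j1 - 1/2."""
    plus = math.sqrt((j1 + m + HALF) / (2 * j1 + 1))
    minus = math.sqrt((j1 - m + HALF) / (2 * j1 + 1))
    # Assumed phase convention: the minus sign sits on the m2 = +1/2 entry.
    return (plus, minus) if upper else (-minus, plus)


def ratio_52_6(j1, j, m):
    """Right-hand side of (52.6), the ratio C_{1/2} / C_{-1/2}."""
    numerator = j * (j + 1) - j1 * (j1 + 1) + m - Fraction(1, 4)
    return numerator / math.sqrt((j1 + m + HALF) * (j1 - m + HALF))


j1, m = Fraction(1), Fraction(1, 2)   # example values chosen for illustration
for upper in (True, False):
    j = j1 + HALF if upper else j1 - HALF
    c_plus, c_minus = table_coefficients(j1, m, upper)
    print(j, c_plus / c_minus, ratio_52_6(j1, j, m), c_plus ** 2 + c_minus ** 2)
```

The check is restricted to values of $m$ for which both product states exist, $|m| \le j_1 - \tfrac12$, since at $|m| = j_1 + \tfrac12$ one of the two coefficients vanishes and the ratio is not defined.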





Perturbation Theory 


§53. The theory of time-independent perturbations 


The Schrödinger equation is a linear differential equation in partial deri- 
vatives with variable coefficients. Its exact solution can only be found for 
simple problems, some of which were considered in preceding sections. 

In general the exact solution of the Schrödinger equation is associated with 
considerable mathematical difficulties. Hence a number of approximate 
methods of solving it have been devised. One such method is that of the 
quasi-classical approximation already considered. Another very important 
approximate method is the so-called perturbation theory. The term ‘perturba- 
tion’ and the idea of this method, which consists of a particular variant of the 
method of expansion in terms of a small parameter familiar in mathematics, 
were introduced into quantum mechanics by analogy with the perturbation 
method of classical mechanics. The latter played a particularly important role 
in solving problems of celestial mechanics. 

We shall discuss perturbation theory in a general form. Its applications to 
the solution of actual problems will be illustrated in what follows by numer- 
ous examples. 

Let us first of all consider the simplest case of a quantum-mechanical
system in which the Hamiltonian operator $\hat{H}$ does not depend explicitly on
the time.

We assume that the operator $\hat{H}$ can be written in the form

$$\hat{H} = \hat{H}_0 + \hat{H}', \tag{53.1}$$

where the operator $\hat{H}'$ can be considered to be small in comparison with the
operator $\hat{H}_0$ (the meaning of the word ‘small’ will be explained below). Then
the Schrödinger equation takes the form

$$(\hat{H}_0 + \hat{H}')\,\psi = E\psi. \tag{53.2}$$
We further assume that the solution of the equation

$$\hat{H}_0\,\psi^{(0)} = E^{(0)}\psi^{(0)} \tag{53.3}$$

is known. Then for the solution of eq. (53.2) use can be made of what is, in
essence, a method of successive approximation. In what follows the
Hamiltonian $\hat{H}_0$ and the wave function $\psi^{(0)}$ will be called unperturbed, while the
operator $\hat{H}'$ will be called the perturbation operator. The ‘smallness’ of the
operator $\hat{H}'$ means that under the action of a perturbation the state of the
system changes relatively little. Our problem is to find the solution of the
Schrödinger equation assuming that the wave function $\psi^{(0)}$ of the
unperturbed system is known. We shall consider the perturbations of states
belonging to the discrete spectrum of the operator $\hat{H}_0$. However, the operator $\hat{H}_0$
can have eigenvalues corresponding to a continuous spectrum as well as ones
belonging to the discrete spectrum.

We seek a solution of eq. (53.2) in the form of a series in terms of the
eigenfunctions of the operator $\hat{H}_0$:

$$\psi(x) = \sum_k c_k\,\psi_k^{(0)}. \tag{53.4}$$


If the operator $\hat{H}_0$ also possesses a continuous spectrum, then we have to add
the corresponding integral taken over the continuous spectrum to the sum
(53.4). Substituting the sum (53.4) into eq. (53.2) and taking into account
(53.3), we obtain

$$\sum_k c_k\,\hat{H}'\psi_k^{(0)} = \sum_k c_k\,(E - E_k^{(0)})\,\psi_k^{(0)}.$$

We multiply the left- and right-hand sides of the equation by $\psi_m^{(0)*}$ and
integrate it over the entire region of variation of the independent variables.
Making use of the orthogonality of the functions $\psi_k^{(0)}$, we find

$$c_m\,(E - E_m^{(0)}) = \sum_k H'_{mk}\,c_k, \qquad m = 1, 2, 3, \ldots, \tag{53.5}$$

where

$$H'_{mk} = \int \psi_m^{(0)*}\,\hat{H}'\,\psi_k^{(0)}\,dV \tag{53.6}$$

is the matrix element of the perturbation operator calculated using the
unperturbed wave functions. The system of equations (53.5) is exactly equivalent to
eq. (53.2). It represents the Schrödinger equation in the energy representation.
We now make use of our assumption of the smallness of the perturbation 
operator. The energy levels and wave functions in our problem will be close 
to those for the unperturbed system. Hence we shall look for them in the 
form of the following series: 


$$\begin{aligned}
E &= E^{(0)} + E^{(1)} + E^{(2)} + \ldots, \\
c_m &= c_m^{(0)} + c_m^{(1)} + c_m^{(2)} + \ldots
\end{aligned} \tag{53.7}$$

Here $E^{(0)}$ and $c^{(0)}$ are unperturbed values. The corrections $E^{(1)}$ and $c^{(1)}$ are of
the same order of small quantities as the perturbation; $E^{(2)}$ and $c^{(2)}$ are
quadratic in the perturbation and so on.

We find the correction to the $n$th energy level and correspondingly to the
$n$th eigenfunction of the unperturbed system, confining ourselves to terms up
to the second order of small quantities:

$$\begin{aligned}
E &= E_n^{(0)} + E_n^{(1)} + E_n^{(2)}, \\
c_m &= c_m^{(0)} + c_m^{(1)} + c_m^{(2)}.
\end{aligned} \tag{53.8}$$

In this section we shall assume that the $n$th energy level of the
unperturbed system is not degenerate. For the other levels this assumption is
unnecessary. In the zero order approximation the wave function is the same as the
function $\psi_n^{(0)}$. This gives

$$\psi = \sum_m c_m^{(0)}\,\psi_m^{(0)} = \psi_n^{(0)}, \qquad \text{i.e.} \quad c_m^{(0)} = \delta_{mn}. \tag{53.9}$$

Substituting (53.8) into eq. (53.5), we get

$$\left(\delta_{mn} + c_m^{(1)} + c_m^{(2)}\right)\left(E_n^{(0)} - E_m^{(0)} + E_n^{(1)} + E_n^{(2)}\right)
= \sum_k H'_{mk}\left(\delta_{kn} + c_k^{(1)} + c_k^{(2)}\right). \tag{53.10}$$

In eq. (53.10) one has to equate terms of the same order of small quantities.
For the terms of the first order we obtain the relation

$$c_m^{(1)}\left(E_n^{(0)} - E_m^{(0)}\right) + E_n^{(1)}\,\delta_{mn} = \sum_k H'_{mk}\,\delta_{kn} = H'_{mn}. \tag{53.11}$$







Setting $m = n$, we find

$$E_n^{(1)} = H'_{nn} = \int \psi_n^{(0)*}\,\hat{H}'\,\psi_n^{(0)}\,dV. \tag{53.12}$$

We see that the first order correction to the energy level is equal to the
mean value of the perturbation energy in the unperturbed state $\psi_n^{(0)}$. From
eq. (53.11) for $m \neq n$ we find the correction of first order to the wave
function

$$c_m^{(1)} = \frac{H'_{mn}}{E_n^{(0)} - E_m^{(0)}}. \tag{53.13}$$

We now write the equation for terms to the second order of small quantities:

$$c_m^{(2)}\left(E_n^{(0)} - E_m^{(0)}\right) + E_n^{(1)}\,c_m^{(1)} + E_n^{(2)}\,\delta_{mn} = \sum_k H'_{mk}\,c_k^{(1)}. \tag{53.14}$$

Setting $m \neq n$, we find from eq. (53.14) the correction of second order of
smallness to the unperturbed wave function

$$c_m^{(2)} = \frac{1}{E_n^{(0)} - E_m^{(0)}}\left(\sum_k H'_{mk}\,c_k^{(1)} - E_n^{(1)}\,c_m^{(1)}\right). \tag{53.15}$$

The value of the amplitudes $c_n^{(1)}$ and $c_n^{(2)}$ can be obtained from the
normalization condition which, taking into account (53.4), can be written in the form

$$\sum_m |c_m|^2 = 1. \tag{53.16}$$

Substituting expansion (53.8) into (53.16), we obtain

$$\sum_m \left|\delta_{mn} + c_m^{(1)} + c_m^{(2)}\right|^2 = 1. \tag{53.17}$$

We equate quantities of the same order on the left and on the right. Then we
have

$$\operatorname{Re} c_n^{(1)} = 0, \qquad 2\operatorname{Re} c_n^{(2)} + \sum_m |c_m^{(1)}|^2 = 0. \tag{53.18}$$


It follows from the relations (53.18) that the imaginary parts of the
amplitudes $c_n^{(1)}$ and $c_n^{(2)}$ are arbitrary quantities. The appearance of this arbitrariness
is associated with the fact that the wave function is determined to within the
phase factor $e^{i\alpha}$, where $\alpha$ can also be written in the form of a series. In
correspondence with this, without restricting the generality we can assume that

$$c_n^{(1)} = 0, \qquad c_n^{(2)} = -\frac{1}{2}\sum_k{}' \frac{|H'_{kn}|^2}{\left(E_n^{(0)} - E_k^{(0)}\right)^2}. \tag{53.19}$$

Here the prime on the sum indicates that the term with $k = n$ is excluded in
the summation.
From (53.15) we find

$$c_m^{(2)} = \sum_k{}' \frac{H'_{mk}H'_{kn}}{\left(E_n^{(0)} - E_k^{(0)}\right)\left(E_n^{(0)} - E_m^{(0)}\right)} - \frac{H'_{mn}H'_{nn}}{\left(E_n^{(0)} - E_m^{(0)}\right)^2}, \qquad m \neq n. \tag{53.20}$$

Setting $m = n$ in eq. (53.14), we find the second order correction to the
energy level of the system:

$$E_n^{(2)} = \sum_k{}' \frac{|H'_{nk}|^2}{E_n^{(0)} - E_k^{(0)}}. \tag{53.21}$$

The second order correction to the ground (lowest) energy level turns out to
be negative irrespective of the character of the perturbation, since for this
level every denominator in (53.21) is negative. As follows from (53.8),
(53.12) and (53.21), the energy of the system, to an accuracy within terms of
the second order of small quantities, is equal to

$$E = E_n^{(0)} + H'_{nn} + \sum_k{}' \frac{|H'_{nk}|^2}{E_n^{(0)} - E_k^{(0)}}. \tag{53.22}$$

In an analogous way we obtain the expression for the perturbed wave
function of the system

$$\psi = \psi_n^{(0)} + \sum_k{}' \frac{H'_{kn}}{E_n^{(0)} - E_k^{(0)}}\,\psi_k^{(0)}. \tag{53.23}$$

(We have written this formula only to within an accuracy of terms of the
first order of small quantities.)
It follows from expression (53.23) that the first order correction will
indeed be small if the following inequality is satisfied:

$$|H'_{kn}| \ll |E_n^{(0)} - E_k^{(0)}|. \tag{53.24}$$


Thus the perturbation theory method developed above is applicable if the 
matrix elements of the perturbation operator are small in comparison with 
the spacing between the corresponding energy levels of the unperturbed 
system. 
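The formulae (53.13) and (53.22) are easy to compare with an exact result in the simplest possible case, a system with two unperturbed levels, where the exact eigenvalues follow from a quadratic equation. The sketch below is an illustration under assumed numbers: the unperturbed energies `e0` and the perturbation matrix `h1` are hypothetical values chosen so that condition (53.24) holds, and the function names are not from the text.

```python
import math


def energy_to_second_order(e0, h1, n):
    """E_n^(0) + H'_nn + sum'_k |H'_nk|^2 / (E_n^(0) - E_k^(0)), eq. (53.22)."""
    second = sum(abs(h1[n][k]) ** 2 / (e0[n] - e0[k])
                 for k in range(len(e0)) if k != n)
    return e0[n] + h1[n][n] + second


def first_order_amplitudes(e0, h1, n):
    """c_m^(1) = H'_mn / (E_n^(0) - E_m^(0)) for m != n, eq. (53.13); c_n^(1) = 0."""
    return [0.0 if m == n else h1[m][n] / (e0[n] - e0[m]) for m in range(len(e0))]


def exact_two_level(e0, h1):
    """Exact eigenvalues of the 2x2 matrix diag(e0) + h1 (real symmetric h1)."""
    a, b, v = e0[0] + h1[0][0], e0[1] + h1[1][1], h1[0][1]
    mean, root = 0.5 * (a + b), 0.5 * math.sqrt((a - b) ** 2 + 4 * v * v)
    return mean - root, mean + root


e0 = [0.0, 1.0]                       # hypothetical unperturbed levels
h1 = [[0.02, 0.05], [0.05, -0.01]]    # hypothetical perturbation, |H'_12| small
approx = [energy_to_second_order(e0, h1, n) for n in range(2)]
exact = exact_two_level(e0, h1)
print(approx, exact)
print(first_order_amplitudes(e0, h1, 0))
```

The residual difference between `approx` and `exact` is of third order in the perturbation, which is the accuracy to be expected from (53.22).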





§54. Perturbation theory in the presence of degeneracy 


We now assume that the eigenvalues of the unperturbed operator $\hat{H}_0$ are
degenerate and that the multiplicity of the degeneracy of the $n$th level (with
energy $E_n^{(0)}$) is equal to $s$.

This means that the state of the unperturbed system with energy $E_n^{(0)}$ is
described by mutually orthogonal wave functions $\psi_{n1}^{(0)}, \ldots, \psi_{ns}^{(0)}$ or by arbitrary
linear combinations of them which can be chosen in such a way that the
wave functions are, as before, orthogonal. When a perturbation is imposed,
the eigenvalues of the operator $\hat{H}$ as a rule turn out to be non-degenerate or
in any case the multiplicity of the degeneracy decreases. This fact is closely
associated with the very nature of degeneracy. We have already pointed out
in §35 that degeneracy is always associated with a symmetry of the
Hamiltonian with respect to a definite class of transformations of the coordinates
of the system. The perturbation, as a rule, does not possess the same
symmetry. Hence the resulting Hamiltonian of the perturbed system will not have
the previous symmetry and its energy levels will not be degenerate. Thus the
perturbation removes the degeneracy. For example, in considering motion in
a centrally symmetric field we have seen that the $(2l+1)$-fold degeneracy of
the energy levels is associated with the symmetry (invariance) of the
Hamiltonian with respect to rotation of the system about the centre of force. If the
system is now placed in an external field, then the total Hamiltonian will no
longer possess spherical symmetry. The perturbation (in the given case, the
external field) removes the $(2l+1)$-fold degeneracy corresponding to the
components of the angular momentum.

On imposing the perturbation the degenerate energy level splits into $s$ close
levels. To each of these energy levels there corresponds a wave function which
is a linear combination of the functions $\psi^{(0)}$:

$$\psi = \sum_{m,r} c_{mr}\,\psi_{mr}^{(0)}. \tag{54.1}$$
m,r 


As before, we consider the perturbation to be small and, in the first approxi- 
mation of perturbation theory, we seek the nearby energy levels (they are 
often called sub-levels) into which the degenerate level splits. At the same 
time we seek the corresponding set of wave functions in the zero order 
approximation. That is, we have to find, in the zero order approximation, 
correct expressions for the amplitudes $c_{mr}$ in the sum (54.1) such that the
linear combination (54.1) will correspond to one of the sub-levels into which 
the initial energy level splits and that it will undergo only a small change when 
the perturbation is taken into account in the next approximation. 


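Let us first consider the case of two close levels. In this case formula (54.1)
gives $\psi = c_1\psi_1^{(0)} + c_2\psi_2^{(0)}$. Substituting this value into the Schrödinger equation
(53.2), we find

$$\begin{aligned}
-c_1\,(E - E^{(0)}) + H'_{12}\,c_2 + H'_{11}\,c_1 &= 0, \\
c_1 H'_{21} - c_2\,(E - E^{(0)}) + c_2 H'_{22} &= 0.
\end{aligned}$$

Setting $E = E^{(0)} + E^{(1)}$, we obtain the system of homogeneous equations

$$\begin{aligned}
(H'_{11} - E^{(1)})\,c_1 + H'_{12}\,c_2 &= 0, \\
H'_{21}\,c_1 + (H'_{22} - E^{(1)})\,c_2 &= 0.
\end{aligned}$$

The condition for this system to have a solution is that the determinant of
the coefficients be equal to zero,

$$\begin{vmatrix} H'_{11} - E^{(1)} & H'_{12} \\ H'_{21} & H'_{22} - E^{(1)} \end{vmatrix} = 0.$$

Hence

$$(H'_{11} - E^{(1)})(H'_{22} - E^{(1)}) = |H'_{12}|^2,$$

$$E^{(1)} = \tfrac{1}{2}(H'_{11} + H'_{22}) \pm \tfrac{1}{2}\left[(H'_{11} - H'_{22})^2 + 4|H'_{12}|^2\right]^{1/2}.$$

We see that the degenerate level splits into two levels corresponding to the
two different signs in front of the square root.

If the perturbation is small, so that $|H'_{12}|^2 \ll (H'_{11} - H'_{22})^2$, then we come
back to the case of two independent levels whose energies are equal to

$$E_1 = E^{(0)} + E^{(1)} = E^{(0)} + H'_{11} + \frac{|H'_{12}|^2}{H'_{11} - H'_{22}},$$

$$E_2 = E^{(0)} + H'_{22} - \frac{|H'_{12}|^2}{H'_{11} - H'_{22}}.$$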

But if the levels are lying so close to each other that $|H'_{12}|^2 \gg (H'_{11} - H'_{22})^2$,
then we obtain

$$E_1 = E^{(0)} + \tfrac{1}{2}(H'_{11} + H'_{22}) + |H'_{12}|\left[1 + \frac{(H'_{11} - H'_{22})^2}{8|H'_{12}|^2}\right],$$

$$E_2 = E^{(0)} + \tfrac{1}{2}(H'_{11} + H'_{22}) - |H'_{12}|\left[1 + \frac{(H'_{11} - H'_{22})^2}{8|H'_{12}|^2}\right].$$
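The two limiting forms can be compared with the exact roots of the two-level secular equation. The sketch below is illustrative only; the function names and the numerical matrix elements in the usage lines are hypothetical. The weak-coupling expressions are written for $H'_{11} > H'_{22}$, so that the first returned value corresponds to the upper sign of the square root.

```python
import math


def secular_roots(h11, h22, h12):
    """Exact first-order corrections E^(1) for two close levels (upper, lower)."""
    mean = 0.5 * (h11 + h22)
    root = 0.5 * math.sqrt((h11 - h22) ** 2 + 4 * abs(h12) ** 2)
    return mean + root, mean - root


def weak_coupling(h11, h22, h12):
    """Limit |H'_12|^2 << (H'_11 - H'_22)^2, assuming h11 > h22."""
    shift = abs(h12) ** 2 / (h11 - h22)
    return h11 + shift, h22 - shift


def strong_coupling(h11, h22, h12):
    """Limit |H'_12|^2 >> (H'_11 - H'_22)^2."""
    mean = 0.5 * (h11 + h22)
    split = abs(h12) * (1 + (h11 - h22) ** 2 / (8 * abs(h12) ** 2))
    return mean + split, mean - split


print(secular_roots(1.0, 0.0, 0.01), weak_coupling(1.0, 0.0, 0.01))
print(secular_roots(0.01, 0.0, 1.0), strong_coupling(0.01, 0.0, 1.0))
```

In both regimes the approximate and exact values should agree up to terms of higher order in the small ratio.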


Analogous results are also obtained in the general case of s-fold degeneracy.
Substituting (54.1) into the Schrödinger equation (53.2) we obtain analogously
to (53.5)

$$c_{m\beta}(E - E_m^{(0)}) = \sum_{k,\gamma} H'_{m\beta;k\gamma}\,c_{k\gamma}, \tag{54.2}$$

where

$$H'_{m\beta;k\gamma} = \int \psi_{m\beta}^{(0)*}\,\hat H'\,\psi_{k\gamma}^{(0)}\,dV.$$

If we are interested in the perturbation of the energy level $E_n$, then we have to
put $m = n$ and to equate terms of the first order of magnitude. But in the zero
order approximation the wave function $\psi$ is a superposition of the functions
$\psi_{n\beta}^{(0)}$, i.e. the $c_{k\gamma}^{(0)}$ are different from zero only for $k = n$. Writing the
energy $E$ in eq. (54.2) in the form $E = E_n^{(0)} + E^{(1)}$ we get

$$E^{(1)}c_\beta = \sum_{\gamma=1}^{s} H'_{\beta\gamma}\,c_\gamma \tag{54.3}$$

(we have dropped the fixed index $n$ in the notation).

The system of homogeneous equations (54.3) has a non-trivial solution 
only in the case where the determinant of the coefficients of the unknown 
quantities is equal to zero, i.e. under the condition 


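$$\begin{vmatrix} H'_{11} - E^{(1)} & H'_{12} & \dots & H'_{1s} \\ H'_{21} & H'_{22} - E^{(1)} & \dots & H'_{2s} \\ \dots & \dots & \dots & \dots \\ H'_{s1} & H'_{s2} & \dots & H'_{ss} - E^{(1)} \end{vmatrix} = 0. \tag{54.4}$$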


This equation is called the secular equation. The secular equation is an equation
of the sth order with respect to $E^{(1)}$ and has, consequently, s roots. Solving
it with respect to $E^{(1)}$, we find s values for this quantity. This means that
the nth energy level splits into s sublevels $E_n^{(0)} + E_1^{(1)}$, $E_n^{(0)} + E_2^{(1)}$, ...,
$E_n^{(0)} + E_s^{(1)}$. In particular cases certain roots of the secular equation turn out
to be equal to one another. In this case the perturbation only partially removes
the degeneracy in the system.

Substituting the values $E_\beta^{(1)}$ found into eq. (54.3), we can determine the
amplitudes $c_\gamma$ corresponding to a given correction to the energy $E_\beta^{(1)}$. By this
means we find, in the zeroth approximation, the correct wave functions
corresponding to the energy sub-levels into which the level $E_n^{(0)}$ splits. These
wave functions are slightly distorted under the action of the perturbation. 
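
The secular equation can also be solved numerically by locating the zeros of the determinant directly. The sketch below assumes a real symmetric 3×3 perturbation matrix with made-up elements; the helper names are hypothetical, and the root search (a scan for sign changes followed by bisection) is chosen only for transparency. It finds simple roots only: a double root, i.e. a degeneracy that the perturbation does not remove, produces no sign change and would need separate treatment.

```python
def determinant(matrix):
    """Determinant of a small real square matrix by Gaussian elimination with pivoting."""
    a = [row[:] for row in matrix]
    n = len(a)
    det = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-300:
            return 0.0
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det
        det *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return det

def secular_function(h, e1):
    """Left-hand side of the secular equation (54.4): det(H' - E1 * I)."""
    n = len(h)
    shifted = [[h[i][j] - (e1 if i == j else 0.0) for j in range(n)] for i in range(n)]
    return determinant(shifted)

def secular_roots(h, lo, hi, steps=4000, tol=1e-12):
    """Scan [lo, hi] for sign changes of the secular function and refine by bisection."""
    roots = []
    width = (hi - lo) / steps
    left, f_left = lo, secular_function(h, lo)
    for k in range(1, steps + 1):
        right = lo + k * width
        f_right = secular_function(h, right)
        if f_right == 0.0:
            roots.append(right)
        elif f_left * f_right < 0.0:
            a, b, fa = left, right, f_left
            while b - a > tol:
                mid = 0.5 * (a + b)
                fm = secular_function(h, mid)
                if fa * fm <= 0.0:
                    b = mid
                else:
                    a, fa = mid, fm
            roots.append(0.5 * (a + b))
        left, f_left = right, f_right
    return roots

# Hypothetical perturbation matrix for s = 3 (arbitrary energy units).
H_PRIME = [[1.0, 0.5, 0.2],
           [0.5, 2.0, 0.3],
           [0.2, 0.3, 3.0]]
ROOTS = secular_roots(H_PRIME, -1.0, 5.0)
print(ROOTS)
```

The sum of the roots should equal the trace of the matrix, which gives a quick consistency check on the three corrections $E^{(1)}$.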

The method discussed is also applicable in the case where the eigenvalues
of the operator $\hat H_0$ are not degenerate but are so closely spaced that the
inequality (53.24) is not satisfied*.

As an example of the application of the methods discussed in this and the 
preceding sections we shall consider the displacement of the lowest energy 
level of the hydrogen-like atom, and the splitting of the first excited level, 
caused by the finite size of the nucleus. 

In considering hydrogen-like atoms we assumed that the electron was in 
the Coulomb field of the nucleus. However, the difference between the 
correct field and a Coulomb field in the region of the nucleus itself was not 
taken into account. We now assume the nucleus to be a uniformly charged
sphere of radius $r_0$. Then the potential energy of the electron for $r < r_0$ has
the form

$$U(r) = -\frac{Ze^2}{r_0}\left(\frac{3}{2} - \frac{r^2}{2r_0^2}\right). \tag{54.5}$$

The difference between the potential energy of the electron and its value for
a pure Coulomb field is the perturbation Hamiltonian

$$\hat H' = \begin{cases} -\dfrac{Ze^2}{r_0}\left(\dfrac{3}{2} - \dfrac{r^2}{2r_0^2}\right) + \dfrac{Ze^2}{r}, & r < r_0, \\ 0, & r > r_0. \end{cases} \tag{54.6}$$

We define the correction to the ground energy level in the first approximation:

$$E^{(1)} = H'_{00} = \int \psi_0^*\,\hat H'\,\psi_0\,dV. \tag{54.7}$$

The wave function of the ground state (38.22) is

$$\psi_0 = 2(Z/a)^{3/2}\,e^{-Zr/a}\,(4\pi)^{-1/2}. \tag{54.8}$$


* For more details see V.A.Fok, Nachala kvantovoi mekhaniki (Principles of quantum 
mechanics) (KUBUCH, 1932) p. 92. 





Substituting (54.8) into (54.7), we obtain

$$E^{(1)} = \frac{4Z^3}{a^3}\int_0^{r_0} e^{-2Zr/a}\left[\frac{Ze^2}{r} - \frac{Ze^2}{r_0}\left(\frac{3}{2} - \frac{r^2}{2r_0^2}\right)\right] r^2\,dr. \tag{54.9}$$

Since the radius of the first Bohr orbit is $a \sim 10^{-8}$ cm and $r_0 \approx 10^{-12}$ cm,
the exponent of the exponential in (54.9) is very small, and the exponential
can be replaced by unity. Evaluating the integral in (54.9), we find

$$E^{(1)} = \frac{2}{5}\,\frac{Z^4e^2}{a}\left(\frac{r_0}{a}\right)^2 = \frac{4}{5}\,|E_0|\left(\frac{r_0}{a}\right)^2 Z^2, \tag{54.10}$$

where $|E_0| = Z^2e^2/2a$. Even for the heaviest atoms, $Z \sim 100$, the ratio $E^{(1)}/|E_0| \sim 10^{-4}$.
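
The replacement of the exponential by unity can be tested by integrating (54.9) numerically. The sketch below is an illustration under stated assumptions: energies are measured in units of $e^2/a$, lengths in units of $a$, and the values $Z = 100$ and $r_0/a = 10^{-4}$ follow the orders of magnitude quoted above; the function names are hypothetical.

```python
import math

def simpson(f, a, b, n=2000):
    """Composite Simpson rule with an even number n of sub-intervals."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0

def ground_state_shift(Z, rho0):
    """First-order shift (54.9) in units of e^2/a, with rho0 = r0/a."""
    def integrand(rho):
        # H'(r) * r^2 inside the nucleus, written so that rho = 0 causes no division by zero
        h_prime_r2 = Z * rho - (Z / rho0) * (1.5 - rho ** 2 / (2.0 * rho0 ** 2)) * rho ** 2
        return math.exp(-2.0 * Z * rho) * h_prime_r2
    return 4.0 * Z ** 3 * simpson(integrand, 0.0, rho0)

def ground_state_shift_approx(Z, rho0):
    """Closed form (54.10), obtained by replacing the exponential by unity."""
    return 0.4 * Z ** 4 * rho0 ** 2

Z, RHO0 = 100, 1.0e-4
EXACT = ground_state_shift(Z, RHO0)
APPROX = ground_state_shift_approx(Z, RHO0)
RATIO_TO_E0 = EXACT / (0.5 * Z ** 2)   # |E0| = Z^2 e^2 / (2a) in these units
print(EXACT, APPROX, RATIO_TO_E0)
```

For these parameters the two values should agree to within about one per cent, and the ratio to $|E_0|$ comes out of the order $10^{-4}$.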

Let us now consider the first excited level $n = 2$. As we have shown in §38
this level will be 4-fold degenerate (the states $\psi_{200}$, $\psi_{211}$, $\psi_{210}$, $\psi_{21-1}$). We
shall number these wave functions by the index $s = 1, 2, 3, 4$ respectively. It
is clear already from general considerations that the perturbation will partially
remove the degeneracy. Indeed, in the Coulomb field we have degeneracy
in the two quantum numbers $l$ and $m$. The degeneracy in the quantum
number $l$ is specific for the Coulomb field. However, the degeneracy in the
magnetic quantum number $m$ occurs in an arbitrary centrally symmetric field.
In view of the fact that when the perturbation is taken into account the field
is no longer strictly a Coulomb field, although it will remain a central field,
the degeneracy in the quantum number $l$ is removed.

Thus we can expect the level with $n = 2$ to split into 2 levels, with $n = 2$,
$l = 0$ and $n = 2$, $l = 1$. We shall show that a calculation does indeed lead to this
splitting.

The secular equation in this case will have the form 





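$$\begin{vmatrix} H'_{11} - E^{(1)} & H'_{12} & H'_{13} & H'_{14} \\ H'_{21} & H'_{22} - E^{(1)} & H'_{23} & H'_{24} \\ H'_{31} & H'_{32} & H'_{33} - E^{(1)} & H'_{34} \\ H'_{41} & H'_{42} & H'_{43} & H'_{44} - E^{(1)} \end{vmatrix} = 0. \tag{54.11}$$

The matrix elements are taken with respect to the functions $\psi_{nlm}$: $\psi_1 = \psi_{200}$,
$\psi_2 = \psi_{211}$, $\psi_3 = \psi_{210}$ and $\psi_4 = \psi_{21-1}$.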

In view of the fact that the perturbation operator $\hat H'$ (54.6) depends only
on the coordinate $r$, all non-diagonal matrix elements in (54.11) reduce to
zero because of the orthogonality of the spherical functions (30.18). Indeed,
integrating with respect to the angular variables, we obtain

$$\int Y^*_{l'm'}\,Y_{lm}\,\sin\vartheta\,d\vartheta\,d\varphi = \delta_{ll'}\,\delta_{mm'}.$$

Making use of (38.22)–(38.24), we obtain for the diagonal matrix elements
(since the integral with respect to angular variables is equal to unity):

$$H'_{11} = \frac{Z^3}{2a^3}\int_0^{r_0}\left(1 - \frac{Zr}{2a}\right)^2 e^{-Zr/a}\left(\frac{Ze^2}{r} - \frac{3Ze^2}{2r_0} + \frac{Ze^2r^2}{2r_0^3}\right) r^2\,dr, \tag{54.12}$$

$$H'_{22} = H'_{33} = H'_{44} = \frac{Z^5}{24a^5}\int_0^{r_0} e^{-Zr/a}\left(\frac{Ze^2}{r} - \frac{3Ze^2}{2r_0} + \frac{Ze^2r^2}{2r_0^3}\right) r^4\,dr. \tag{54.13}$$

Neglecting terms of the order $r_0/a$ in comparison with unity, we get

$$E_1^{(1)} = H'_{11} = \frac{Z^4e^2}{20a}\left(\frac{r_0}{a}\right)^2, \tag{54.14}$$

$$E_{2,3,4}^{(1)} = H'_{22} = H'_{33} = H'_{44} = \frac{Z^2e^2}{1120\,a}\left(\frac{Zr_0}{a}\right)^4. \tag{54.15}$$


We see that the original level is split into two sub-levels. The displacement
of each of them with respect to the position of the original level is given by
formulae (54.14) and (54.15). The value of the shift of the level $n = 2$, $l = 0$ is
smaller by about an order of magnitude than the shift of the level $n = 1$, $l = 0$.
The shift of the level $n = 2$, $l = 1$ is even smaller owing to the small factor
$10^{-3}(Zr_0/a)^4$. This is due to the fact that the electron in the state $n = 2$, $l = 1$
is, in the main, outside the region of the nucleus and that the distortion of
the Coulomb field in this region affects its state very little.
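
The numerical coefficients in (54.10), (54.14) and (54.15) follow from elementary integrals of powers of $x = r/r_0$ once the exponential and the factor $(1 - Zr/2a)^2$ are replaced by unity. A short sketch with exact rational arithmetic, in which the function name `moment` is a hypothetical helper:

```python
from fractions import Fraction

def moment(power):
    """Integral over 0 < x < 1 of x**power * (1/x - 3/2 + x**2/2), where x = r/r0.

    The bracket is H'(r) of (54.6) in units of Z e^2 / r0.
    """
    return (Fraction(1, power)
            - Fraction(3, 2) * Fraction(1, power + 1)
            + Fraction(1, 2) * Fraction(1, power + 3))

# Prefactors of the radial probability densities: 4 Z^3/a^3 (1s), Z^3/(2a^3) (2s), Z^5/(24a^5) (2p).
COEF_1S = 4 * moment(2)                 # multiplies Z^4 e^2 r0^2 / a^3
COEF_2S = Fraction(1, 2) * moment(2)    # multiplies Z^4 e^2 r0^2 / a^3
COEF_2P = Fraction(1, 24) * moment(4)   # multiplies Z^6 e^2 r0^4 / a^5
print(COEF_1S, COEF_2S, COEF_2P, COEF_1S / COEF_2S)
```

The last printed number is the ratio of the $n = 1$ and $n = 2$, $l = 0$ shifts, which is what the text refers to as "about an order of magnitude".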

Finally, we note that the corrections considered here turn out to be considerably
more substantial for mesic atoms. This is associated with the fact that
mesons are much heavier than electrons and hence are, in the main, much
closer to the nucleus (see §38). Thus for the μ-mesic atom the relative shift of the
level with $l = 0$ is larger by a factor of about $4 \times 10^4$ than for the ordinary
atom, and becomes an appreciable quantity.


§55. The theory of time-dependent perturbations 


Perturbations acting on a quantum-mechanical system very often have a
nonstationary character (i.e. depend on time). This means that the perturbation
operator $\hat H'$ is an explicit function of time, $\hat H'(t)$. Numerous examples of
such perturbations will be given below. We assume that the stationary states
of the unperturbed system are known, i.e. that the wave functions
$\psi_n^{(0)}(x, t) = \psi_n^{(0)}(x)\exp[-(i/\hbar)E_nt]$ of the unperturbed equation

$$i\hbar\,\frac{\partial\psi_n^{(0)}(x, t)}{\partial t} = \hat H_0\,\psi_n^{(0)}(x, t) \tag{55.1}$$

are known. We restrict ourselves first to the simple case where the states of
the unperturbed system belong to a discrete spectrum.

If the system is acted upon by a small perturbation described by the operator
$\hat H'(t)$, then the wave function of the perturbed system $\psi$ satisfies the
equation

$$i\hbar\,\frac{\partial\psi}{\partial t} = (\hat H_0 + \hat H'(t))\,\psi. \tag{55.2}$$


The method of approximate solution of this equation was worked out by 
Dirac and is often called the Dirac perturbation theory or the method of 
variation of constants. The state of a perturbed system depends on time and 
its energy is not a constant of the motion. Our problem now is not to find the 
stationary states of the perturbed system, because they do not exist, but the 
calculation of the time-dependent wave function of the system. Hence the 
perturbation theory method must be modified. The solution of eq. (55.2) in 
the method of variation of constants is written in the form of an expansion in 
terms of the eigenfunctions of the unperturbed problem 


$$\psi(x, t) = \sum_k c_k(t)\,\psi_k^{(0)}(x, t). \tag{55.3}$$


Since the wave functions $\psi_k^{(0)}(x, t)$ form a complete system of functions,
such an expansion is always possible. The coefficients $c_k(t)$ of the expansion
are functions of time only and not of the coordinates. Substituting the
expansion (55.3) into eq. (55.2), we obtain

$$i\hbar\sum_k\left(\frac{dc_k}{dt}\,\psi_k^{(0)}(x, t) + c_k\,\frac{\partial\psi_k^{(0)}(x, t)}{\partial t}\right) = \sum_k c_k\,(\hat H_0 + \hat H'(t))\,\psi_k^{(0)}(x, t). \tag{55.4}$$

We multiply eq. (55.4) from the left by $\psi_m^{(0)*}(x, t)$ and integrate over all
space. Then, taking into account eq. (55.1) and the orthogonality of the wave
functions of the unperturbed system $\psi_k^{(0)}(x, t)$, we have

$$i\hbar\,\frac{dc_m}{dt} = \sum_k H'_{mk}\exp(i\omega_{mk}t)\,c_k, \tag{55.5}$$


where $H'_{mk}$ is the matrix element of the perturbation operator

$$H'_{mk} = \int\psi_m^{(0)*}(x)\,\hat H'\,\psi_k^{(0)}(x)\,dV$$

and

$$\omega_{mk} = \frac{1}{\hbar}\,(E_m - E_k). \tag{55.6}$$


The system of equations (55.5) is exact. It is equivalent to the initial equation 
(55.2), since the whole set of coefficients c, completely determines the wave 
function Y. However, it is clear that the solution of the infinite system of 
equations (55.5) is no simpler a problem than the solution of the initial equa- 
tion (55.2). Hence for a simplification of the system of equations (55.5) we 
have to make use of the fact that the perturbation acting on the system is 
small. We assume initially that for $t < 0$ the system was in a state with the
wave function $\psi_n^{(0)}$. Then for $t < 0$ all coefficients in the expansion (55.3),
with the exception of the coefficient with the index $n$, are equal to zero, i.e.

$$c_k(0) = \delta_{kn}. \tag{55.7}$$

Beginning with time $t = 0$ the system undergoes the action of a small perturbation.
We assume that owing to the weakness of the perturbation the wave
function of the initial state, $\psi_n^{(0)}$, changes little with time. Correspondingly,
we seek the coefficients $c_k(t)$ at an instant of time $t > 0$ in the form

$$c_k(t) = c_k^{(0)}(t) + c_k^{(1)}(t) + c_k^{(2)}(t) + \dots, \tag{55.8}$$

where

$$c_k^{(0)}(t) = c_k(0) = \delta_{kn}.$$

The correction $c_k^{(1)}(t)$ is of the same order of smallness as the perturbation,
$c_k^{(2)}(t)$ is quadratic in the perturbation and so on. Substituting the
expansion (55.8) into eq. (55.5), we find

$$i\hbar\,\frac{dc_m^{(1)}}{dt} = \sum_k H'_{mk}\exp(i\omega_{mk}t)\,c_k^{(0)} = H'_{mn}\exp(i\omega_{mn}t). \tag{55.9}$$

Here all terms of the second and higher order of small quantities in the perturbation
have been dropped. Integrating (55.9), we get

$$c_m^{(1)}(t) = -\frac{i}{\hbar}\int_0^t H'_{mn}(t')\exp(i\omega_{mn}t')\,dt'. \tag{55.10}$$


In an analogous way one can find the corrections to $c_k^{(0)}$ of the second and
higher orders of small quantities. For example, for the correction of the
second order $c_m^{(2)}$ one easily obtains the expression

$$c_m^{(2)}(t) = -\frac{i}{\hbar}\sum_k\int_0^t H'_{mk}(t')\exp(i\omega_{mk}t')\,c_k^{(1)}(t')\,dt'. \tag{55.11}$$


If the perturbation is sufficiently small, then one can restrict oneself to a
small number of terms in the expansion. Thus, in principle, the wave function
at any instant of time $t > 0$ can be found with the desired degree of accuracy.
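
The quality of the first-order amplitude (55.10) can be judged by comparing it with a direct numerical integration of the exact system (55.5). The sketch below does this for a hypothetical two-level system with $H'_{11} = H'_{22} = 0$ and a constant real $H'_{12} = H'_{21} = v$ switched on at $t = 0$; the units with $\hbar = 1$, the parameter values and the function names are all assumptions made for illustration.

```python
import cmath

HBAR = 1.0  # units with hbar = 1 (an assumption of this sketch)

def rhs(t, c, v, omega21):
    """Right-hand side of (55.5) for two levels coupled by H'_12 = H'_21 = v."""
    c1, c2 = c
    dc1 = (-1j / HBAR) * v * cmath.exp(-1j * omega21 * t) * c2   # omega_12 = -omega_21
    dc2 = (-1j / HBAR) * v * cmath.exp(1j * omega21 * t) * c1
    return (dc1, dc2)

def integrate_exact(v, omega21, t_end, steps=4000):
    """Fourth-order Runge-Kutta integration of the exact system with c(0) = (1, 0)."""
    c = (1.0 + 0j, 0j)
    h = t_end / steps
    t = 0.0
    for _ in range(steps):
        k1 = rhs(t, c, v, omega21)
        k2 = rhs(t + h / 2, tuple(ci + h / 2 * ki for ci, ki in zip(c, k1)), v, omega21)
        k3 = rhs(t + h / 2, tuple(ci + h / 2 * ki for ci, ki in zip(c, k2)), v, omega21)
        k4 = rhs(t + h, tuple(ci + h * ki for ci, ki in zip(c, k3)), v, omega21)
        c = tuple(ci + h / 6 * (a + 2 * b + 2 * d + e)
                  for ci, a, b, d, e in zip(c, k1, k2, k3, k4))
        t += h
    return c

def first_order_amplitude(v, omega21, t):
    """c_2^(1)(t) from (55.10) for a perturbation that is constant after t = 0."""
    return -(v / HBAR) * (cmath.exp(1j * omega21 * t) - 1.0) / omega21

V, OMEGA21, T_END = 0.05, 1.0, 2.0
C_EXACT = integrate_exact(V, OMEGA21, T_END)
P_EXACT = abs(C_EXACT[1]) ** 2
P_FIRST = abs(first_order_amplitude(V, OMEGA21, T_END)) ** 2
print(P_EXACT, P_FIRST)
```

With $v/\hbar\omega_{21} = 0.05$ the two probabilities should differ by well under one per cent; increasing $v$ makes the discrepancy grow, which is the practical meaning of the requirement that the perturbation be small.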


§56. The transition of a system into new states under the action of 
perturbations 


We have found that if a system which for $t < 0$ is in a definite energy state
described by the wave function $\psi_n^{(0)}$ is acted upon by a perturbation $\hat H'(t)$, then
for $t > 0$ the system turns out to be in a new state with the wave function
(55.3). This means that for $t > 0$ the system can be found in any of its possible
stationary quantum states; the probability of finding the system in a
certain quantum state $m$ is defined according to the general rules of quantum
mechanics by the value of the quantity $|c_m(t)|^2$. Since at the initial instant $t = 0$
the system was in the nth stationary state, then, consequently, $|c_m(t)|^2$
defines the probability of the transition of the system from the nth state into
the mth state in time $t$, $W_{mn}(t) = |c_m(t)|^2 = |c_{mn}(t)|^2$. Here we have denoted
the initial state of the system by the second subscript.

Thus the perturbation turns out to give rise to the transition of the system 
from one quantum state into another. A characteristic property of this 
process, which has no analogy in classical physics, is the fact that a given per- 
turbation gives rise to the transition of the system from a stationary state 
with definite energy into a new state in which the energy has no clearly 
defined value. This is often understood in such a way that under the action 
of the perturbation the system goes over by a discontinuous process into one 
of the possible energy states. The state into which the system goes will be a 
matter of chance. However, such an idea is incorrect and contradicts the 
physical basis of quantum mechanics. As a matter of fact, the final state is 
described by a wave function $\psi$ and is hence a definite state (in the quantum-
mechanical sense).

The transition from the initial to the final state is not carried out discontinuously,
but proceeds in time. Indeed, as we shall see below, the transition
probability is determined by the character of the perturbation and by its
dependence on time.

Transitions from a discrete into a continuous spectrum are of the greatest 
interest, and we shall consider such transitions in what follows. 

To determine the transition probability it is evidently necessary to know
the dependence of the matrix element of the perturbation operator $H'_{\nu n}$ on
time. Here we characterize a state in the continuous spectrum by the index $\nu$.

Let us consider, first of all, the important case where the perturbation 
operator is a harmonic function of time. Then the matrix element of the 
perturbation operator (taken with respect to time-independent unperturbed 
wave functions) is also a harmonic function of time, i.e. 


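$$H'_{\nu n}(t) = H'_{\nu n}(0)\cos\omega t. \tag{56.1}$$

We shall assume that the frequency $\omega$ satisfies the relation

$$\hbar\omega > E_g - E_n^{(0)},$$

where $E_g$ denotes that energy value of the system with which the continuous
spectrum begins. Substituting (56.1) into (55.10), we find

$$c_{\nu n}^{(1)} = -\frac{1}{2\hbar}\,H'_{\nu n}(0)\left[\frac{\exp[i(\omega_{\nu n} + \omega)t] - 1}{\omega_{\nu n} + \omega} + \frac{\exp[i(\omega_{\nu n} - \omega)t] - 1}{\omega_{\nu n} - \omega}\right]. \tag{56.2}$$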


Here we denote the initial state of the system by the second index in $c_{\nu n}^{(1)}$.
Since the continuous spectrum lies in a range of energies higher than the
discrete spectrum, $\omega_{\nu n} > 0$. From the structure of expression (56.2) it
follows that for $\omega_{\nu n} \approx \omega$ the denominator of one of the terms is close to
zero. Transitions into states for which the condition $\omega_{\nu n} \approx \omega$ is not fulfilled
occur with a low probability, whereas, as will be seen from what follows, the
probability of the transition into states satisfying it increases linearly with time.


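Hence, dropping the first term in formula (56.2), we have

$$c_{\nu n}^{(1)} = -\frac{1}{2\hbar}\,H'_{\nu n}(0)\,\frac{\exp[i(\omega_{\nu n} - \omega)t] - 1}{\omega_{\nu n} - \omega}. \tag{56.3}$$

Correspondingly for the square of the modulus $|c_{\nu n}^{(1)}|^2$ we obtain

$$|c_{\nu n}^{(1)}|^2 = \frac{|H'_{\nu n}(0)|^2}{4\hbar^2}\,\frac{\sin^2[\tfrac{1}{2}(\omega_{\nu n} - \omega)t]}{[\tfrac{1}{2}(\omega_{\nu n} - \omega)]^2} = \frac{\pi t\,|H'_{\nu n}(0)|^2}{4\hbar^2}\,f(\alpha, t), \tag{56.4}$$

where

$$\alpha = \tfrac{1}{2}(\omega_{\nu n} - \omega) \quad\text{and}\quad f(\alpha, t) = \frac{\sin^2\alpha t}{\pi\alpha^2 t}.$$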


Usually in practice it is of interest to know the magnitude of $|c_{\nu n}^{(1)}|^2$ for
large values of time $t$ (we recall that the instant of switching on the perturbation
is taken as the zero of time $t = 0$). Therefore it is necessary to consider
the behaviour of the function $f(\alpha, t)$ when $t \to \infty$. It is easily seen that for
$\alpha \neq 0$ and $t \to \infty$, $f(\alpha, t) \to 0$. For $\alpha = 0$, $f(0, t) = t/\pi$ and increases indefinitely
with increasing time $t$. Finally, integrating $f(\alpha, t)$ over all values of $\alpha$, we find

$$\int_{-\infty}^{\infty} f(\alpha, t)\,d\alpha = \frac{1}{\pi}\int_{-\infty}^{\infty}\frac{\sin^2\alpha t}{\alpha^2 t}\,d\alpha = \frac{1}{\pi}\int_{-\infty}^{\infty}\frac{\sin^2 x}{x^2}\,dx = 1. \tag{56.5}$$
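
The three properties of $f(\alpha, t)$ just listed are easy to check numerically. The following sketch uses a simple midpoint rule; the value $t = 10$, the cutoff of the integration range and the function names are assumptions chosen for illustration, and the finite cutoff leaves out a small tail of the integral.

```python
import math

def f(alpha, t):
    """f(alpha, t) = sin^2(alpha t) / (pi alpha^2 t), with the limit t/pi at alpha = 0."""
    if abs(alpha * t) < 1e-8:
        return t / math.pi
    return math.sin(alpha * t) ** 2 / (math.pi * alpha ** 2 * t)

def integrate_f(t, lo, hi, n=200000):
    """Midpoint-rule estimate of the integral of f(alpha, t) over lo < alpha < hi."""
    h = (hi - lo) / n
    return h * sum(f(lo + (i + 0.5) * h, t) for i in range(n))

T = 10.0
TOTAL = integrate_f(T, -200.0, 200.0)                       # close to 1, cf. (56.5)
LOBE = integrate_f(T, -math.pi / T, math.pi / T, 20000)     # principal maximum only
print(f(0.0, T), T / math.pi, TOTAL, LOBE)
```

The second integral shows how much of the total weight lies between the first zeros $\alpha = \pm\pi/t$, which is the quantitative content of the statement that the function is concentrated in an interval of width of order $1/t$.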


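Comparing the above properties of the function $f(\alpha, t)$ with the properties of
the δ-function, we see that they are identical (see Appendix II). Thus
$\lim_{t\to\infty} f(\alpha, t)$ is one of the possible concrete forms of the δ-function, and we
can write

$$\lim_{t\to\infty} f(\alpha, t) = \lim_{t\to\infty}\frac{\sin^2\alpha t}{\pi\alpha^2 t} = \delta(\alpha) = \delta\!\left(\frac{\omega_{\nu n} - \omega}{2}\right).$$

Substituting this expression into formula (56.4) and making use of the known
properties of the δ-function (see Appendix III, Vol. 1), we obtain

$$|c_{\nu n}^{(1)}|^2 = \frac{\pi t}{4\hbar^2}\,|H'_{\nu n}(0)|^2\,\delta\!\left(\frac{\omega_{\nu n} - \omega}{2}\right) = \frac{\pi}{2\hbar}\,|H'_{\nu n}(0)|^2\,t\,\delta(E_\nu - E_n^{(0)} - \hbar\omega). \tag{56.6}$$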


Formulae (56.4) and (56.6) will be valid only under the condition that the
probability of the transition from the given nth state into any νth state is low,
i.e.

$$|c_{\nu n}^{(1)}|^2 \ll 1.$$


Only in this case is the initial assumption of the smallness of change of the 
wave function of the initial state fulfilled. Since the transition probability 
increases linearly with time, it is necessary, for perturbation theory to be 
applicable, that the time of action of the perturbation $t$ be not too large.
Therefore we shall find out what conclusions can be drawn concerning the
probability of a transition in a finite time interval $t$. For this we study formula
(56.4) without passing to the limit $t \to \infty$, i.e. we consider the behaviour of
the function $f(\alpha, t)$. The plot of this function against $\alpha$ for fixed $t$ is shown in
fig. V.17.





[Fig. V.17. The function $f(\alpha, t)$ plotted against $\alpha$; the axis is marked at multiples of $\pi/t$.]


From the form of the function $f(\alpha, t)$ it follows that, in the main, transitions
are realized into those states for which the quantity $\alpha$ lies within the
limits of the principal maximum, i.e. $\Delta\alpha \sim t^{-1}$. The spread of the values
of the parameter $\alpha$ determines the spread of the energy values of the final
state of the system

$$\Delta E_\nu \sim \hbar\,\Delta\alpha \sim \hbar/t. \tag{56.7}$$

Thus we arrive at the conclusion that in time $t$ the system can, under the
action of perturbation (56.1), make transitions into states with energy
$E_\nu = E_n^{(0)} + \hbar\omega \pm \Delta E_\nu$, where $\Delta E_\nu \sim \hbar/t$.

The uncertainty in the energy of the final state $\Delta E_\nu \to 0$ as $t \to \infty$. We
note that, proceeding from the uncertainty relation for time and energy (see
§34), just such a value of the uncertainty in the energy of the final state
$\Delta E_\nu \sim \hbar/t$ was to be expected.

From the requirement that the uncertainty in the energy of the final state
$\Delta E_\nu$ be small in comparison with the energy $\hbar\omega$ the following inequality
arises: $t \gg \omega^{-1}$. Consequently, $\Delta E_\nu \ll \hbar\omega$ if the time of action of the perturbation
is large as compared to the period of the perturbation.

The transition to the 5-function in formula (56.6) means that the time ¢ 
must be sufficiently large for the uncertainty in the energy of the final state 
to be disregarded, but that nevertheless the condition of applicability of 
perturbation theory is still fulfilled. 

Formula (56.6), containing the δ-function, has, of course, a meaning only
because integration with respect to the argument of the δ-function is subsequently
implied.

We note that the conditions of applicability of perturbation theory are
violated in considering transitions in a discrete spectrum in the so-called
resonance case, i.e. when $|\omega_{kn}| \approx \omega$. Under these conditions the corrections
to the wave function $\psi_n^{(0)}$ become large and the problem must be solved
exactly*.

The probability of the transition per unit time from a quantum state with
energy $E_n^{(0)}$ into a state of the continuous spectrum in the interval $d\nu$ is
defined by the formula

$$dW_{\nu n} = \frac{1}{t}\,|c_{\nu n}^{(1)}|^2\,d\nu = \frac{\pi}{2\hbar}\,|H'_{\nu n}(0)|^2\,\delta(E_\nu - E_n^{(0)} - \hbar\omega)\,d\nu. \tag{56.8}$$

In this case the wave functions of the continuous spectrum must be normalized
to the δ-function in the ν-space. Formula (56.8) shows that under the
action of a perturbation harmonically dependent on time the system may
carry out transitions only into states with energy $E_\nu = E_n^{(0)} + \hbar\omega$.

The transition probability is defined by the square of the matrix element
of the perturbation operator and depends, of course, on the choice of quantities
characterizing the state of the continuous spectrum. The energy of the
particle is often chosen as one of the parameters characterizing a state in the
continuous spectrum. Then, integrating with respect to other parameters, we
have

$$dW_{En} = \frac{\pi}{2\hbar}\,|H'_{En}(0)|^2\,\rho(E)\,\delta(E - E_n^{(0)} - \hbar\omega)\,dE, \tag{56.9}$$

where $\rho(E)\,dE$ is the number of states with an energy in the interval from $E$ to
$E + dE$, and the following notation is introduced:

$$dE\int|H'_{\nu n}|^2\,\frac{d\nu}{dE} = |H'_{En}|^2\,\rho(E)\,dE.$$

Integrating over energy we find the total probability of the transition per
unit time from a state with energy $E_n^{(0)}$ to a state of the continuous spectrum
under the action of a harmonic perturbation:

$$W_{En} = \frac{\pi}{2\hbar}\,|H'_{En}(0)|^2\,\rho(E), \tag{56.10}$$


* See L.D.Landau and E.M.Lifshitz, Quantum mechanics (Pergamon Press, Oxford, 
1965). 


where $E = E_n^{(0)} + \hbar\omega$. We note that if, as distinct from (56.1), we denote the
matrix element of the perturbation, introducing exponential functions, by

$$H'_{\nu n}(t) = H'_{\nu n}(0)\,(e^{i\omega t} + e^{-i\omega t}),$$

then the numerical coefficient in formulae (56.8)–(56.10) would evidently
change by a factor of 4. For example, formula (56.8) would then be rewritten
in the form

$$dW_{\nu n} = \frac{2\pi}{\hbar}\,|H'_{\nu n}(0)|^2\,\delta(E_\nu - E_n^{(0)} - \hbar\omega)\,d\nu \tag{56.8'}$$

and formulae (56.9) and (56.10) change analogously.

Another particularly important case is the transition caused by a time- 
independent perturbation. The expression for the transition probability can 
be obtained from formula (56.8) by setting the frequency in it to w = 0 and 
doubling the matrix element of the perturbation. This is associated with the 
fact that a term, which for w=0 is the same as the term retained, was 
dropped in the transition from (56.2) to (56.3). For the transition probabili- 
ty we have 


$$dW_{\nu\nu_0} = \frac{2\pi}{\hbar}\,|H'_{\nu\nu_0}|^2\,\delta(E_\nu - E_{\nu_0})\,d\nu. \tag{56.11}$$


A time-independent perturbation can give rise to transitions only to states
with the same energy. In other words, it can cause transitions only between
degenerate states. We have here denoted the initial state by the index $\nu_0$,
since transitions in the continuous spectrum are of the greatest interest in the
case of the action of a constant perturbation. Of course, all the above reasoning
associated with the transition to the δ-function is also valid in this case.

Integrating (56.11) over final state energies, we can write the transition 
probability in another form: 


$$dW_{\nu\nu_0} = \frac{2\pi}{\hbar}\int|H'_{\nu\nu_0}|^2\,\delta(E_\nu - E_{\nu_0})\,\frac{d\nu}{dE}\,dE = \frac{2\pi}{\hbar}\,|H'_{\nu\nu_0}|^2\,\frac{d\nu}{dE}. \tag{56.12}$$

We write the total transition probability, analogously to (56.10), in the form

$$W_{E\nu_0} = \frac{2\pi}{\hbar}\,|H'_{E\nu_0}|^2\,\rho(E), \tag{56.13}$$

where $E = E_{\nu_0}$. Let, for example, the final state be characterized by defining
the momentum of the particle, so that

$$d\nu = dp_x\,dp_y\,dp_z = p^2\,dp\,d\Omega = pm\,dE\,d\Omega,$$

where $E = p^2/2m$ is the energy of the final state of the particle, and $d\Omega$ is an
element of solid angle. Formula (56.12) is, in this case, rewritten in the form

$$dW_{p\nu_0} = \frac{2\pi}{\hbar}\,|H'_{p\nu_0}|^2\,mp\,d\Omega, \quad\text{where}\quad p = (2mE)^{1/2}. \tag{56.14}$$


Here the wave functions of the final state must be normalized to the δ-function
in momentum space.

In another method for the normalization of these functions, for example
normalization in a “box” (see (26.16) and (26.17)), the interval of final states
$d\nu'$ will have the form

$$d\nu' = dn_x\,dn_y\,dn_z = \frac{dp_x\,dp_y\,dp_z\,V}{(2\pi\hbar)^3}. \tag{56.15}$$

Of course the expression for the transition probability (56.14) will not
change in this case. Finally, we note that expressions (56.11)–(56.14)
depend on the method of normalizing the wave function of the initial state,
which also belongs to the continuous spectrum.

The matrix element of the perturbation operator very often turns out to
be equal to zero. In this case the transition probability reduces to zero. This
means that the corresponding transition is impossible in the first approximation
of perturbation theory. In the next higher approximation the probability
of the corresponding transition may turn out to be different from zero.

Let us find the probability of transition caused by a time-independent
perturbation in the second order approximation for such a case.

Formula (55.11) gives

$$c_{\nu\nu_0}^{(2)}(t) = -\frac{i}{\hbar}\sum_k\int_0^t H'_{\nu k}\,c_{k\nu_0}^{(1)}(t')\exp(i\omega_{\nu k}t')\,dt'. \tag{56.16}$$


The sum (or over the continuous spectrum, the integral) involved here is 
taken over intermediate or, as they are often called, virtual states, so that the 
transition itself can be treated as a transition via intermediate states. It should 
be stressed that the transition of the system via intermediate states is not a 
real physical process, but serves only as a way of dealing with the formulae. 
Hence, for example, in transitions into virtual states the energy of the system 
does not need to be conserved. Substituting the expression for $c_{k\nu_0}^{(1)}$ from
(56.2) ($\omega = 0$) into (56.16) and integrating, we obtain

$$c_{\nu\nu_0}^{(2)} = \frac{1}{\hbar^2}\sum_k\frac{H'_{\nu k}H'_{k\nu_0}}{\omega_{k\nu_0}}\left[\frac{\exp(i\omega_{\nu\nu_0}t) - 1}{\omega_{\nu\nu_0}} - \frac{\exp(i\omega_{\nu k}t) - 1}{\omega_{\nu k}}\right]. \tag{56.17}$$

Since, by assumption, transitions are absent in the first approximation of
perturbation theory, the matrix element of the perturbation operator $H'_{k\nu_0} = 0$
for transitions proceeding with energy conservation, $\omega_{k\nu_0} = 0$. In correspondence
with this those intermediate states for which $\omega_{k\nu_0} = 0$ give no contribution
to the amplitude (56.17). For transitions proceeding with energy
conservation* ($\omega_{\nu\nu_0} = 0$) the second term in the bracket of formula (56.17) is
not large. Indeed, it might give an appreciable contribution only for $\omega_{\nu k} = 0$.
But $\omega_{\nu\nu_0} = \omega_{\nu k} + \omega_{k\nu_0}$, and for $\omega_{\nu\nu_0} = 0$ and $\omega_{\nu k} = 0$ it turns out that also
$\omega_{k\nu_0}$ reduces to zero. For such transitions $H'_{k\nu_0} = 0$ and, consequently, they
can be disregarded. Proceeding from this, we can rewrite (56.17) as

$$c_{\nu\nu_0}^{(2)} = -\frac{\Lambda_{\nu\nu_0}}{\hbar}\,\frac{\exp(i\omega_{\nu\nu_0}t) - 1}{\omega_{\nu\nu_0}}, \tag{56.18}$$

where

$$\Lambda_{\nu\nu_0} = \sum_k\frac{H'_{\nu k}H'_{k\nu_0}}{E_{\nu_0} - E_k} \tag{56.19}$$

(integration is implied over the continuous spectrum).

We see that in the notation in the form of (56.18) the expression for the
amplitude $c_{\nu\nu_0}^{(2)}$ is the same as (56.2) ($\omega = 0$). Therefore the results obtained,
in particular the formula for the transition probability (56.8), are conserved
under the condition that the matrix element $H'_{\nu\nu_0}$ be replaced by the matrix
element $\Lambda_{\nu\nu_0}$.
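
As an illustration of (56.19), the effective matrix element can be accumulated as a sum over a discrete set of intermediate states. The sketch below uses three hypothetical intermediate states with made-up matrix elements and energies, in units with $\hbar = 1$; the function names are assumptions, and the prefactor computed at the end is the one appearing in front of the δ-function in (56.11) after the replacement of $H'_{\nu\nu_0}$ by $\Lambda_{\nu\nu_0}$.

```python
import math

HBAR = 1.0  # units with hbar = 1 (an assumption of this sketch)

def effective_matrix_element(h_final_k, h_k_initial, e_initial, e_intermediate):
    """Lambda of (56.19): sum over k of H'_{nu k} H'_{k nu0} / (E_{nu0} - E_k)."""
    total = 0j
    for h_fk, h_ki, e_k in zip(h_final_k, h_k_initial, e_intermediate):
        if e_k == e_initial:
            # Energy-conserving intermediate states drop out: H'_{k nu0} = 0 for them.
            continue
        total += h_fk * h_ki / (e_initial - e_k)
    return total

def rate_prefactor(lam):
    """Factor (2 pi / hbar) |Lambda|^2 multiplying the delta-function."""
    return 2.0 * math.pi / HBAR * abs(lam) ** 2

# Hypothetical intermediate states; all numbers are made up for illustration.
H_FINAL_K = [0.02, 0.01j, 0.03]       # H'_{nu k}
H_K_INITIAL = [0.05, 0.04, -0.01j]    # H'_{k nu0}
E_INTERMEDIATE = [2.0, 3.0, 5.0]      # E_k
E_INITIAL = 1.0                       # E_{nu0}

LAMBDA = effective_matrix_element(H_FINAL_K, H_K_INITIAL, E_INITIAL, E_INTERMEDIATE)
print(LAMBDA, rate_prefactor(LAMBDA))
```

Intermediate states far from the initial energy contribute with a small weight because of the energy denominator, even though energy need not be conserved in them.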


§57. The adiabatic theory of perturbations 


In certain cases the perturbation acting on a quantum system is associated 
with a slow, adiabatic change of the parameters on which the state of the 
system depends. 

In the case of an adiabatic change of certain of the parameters which 
characterize a system it turns out that it is possible to develop a special 
approximate method of calculation called the adiabatic theory of perturba- 
tions. We shall encounter this method below in studying the properties of 
molecules and solid bodies. In such systems there are particles of two kinds: 


* The possibility of transitions with non-conservation of energy is associated with the 
assumption of suddenness of switching on the perturbation (see (56.1)). For a more 
detailed discussion of this problem see, for example, L.Schiff, Quantum mechanics 
(McGraw-Hill, New York, 1949). 










light electrons moving with large velocities, and heavy nuclei performing 
relatively slow movements. We shall call the electrons of the system the fast 
sub-system, and the heavy nuclei the slow sub-system. Roughly speaking, the 
characteristic time needed for a change of state of the fast sub-system is very 
small in comparison with the corresponding time for the slow sub-system. The 
essence of the adiabatic theory of perturbations amounts to the fact that the 
motion of the fast sub-system is considered in the first order approximation 
for given coordinates of the slow sub-system. 

In other words, the motions of the fast and slow sub-systems are to a 
certain degree independent. 

Let us consider the motion of a system consisting of electrons and nuclei. 
We write the Schrödinger equation in the form 


$$\left[-\frac{\hbar^2}{2m}\sum_k\nabla_k^2 - \frac{\hbar^2}{2M}\sum_j\nabla_j^2 + U(r_k, R_j)\right]\Psi(r_k, R_j) = E\,\Psi(r_k, R_j). \tag{57.1}$$


Here $m$ and $M$ are the masses of the electrons and nuclei respectively. The
sum over $k$ is carried out with respect to the coordinates of the electrons,
while the sum over $j$ corresponds to the coordinates of the nuclei. $U$ is the
operator corresponding to the mutual interaction energy of the particles.

Further, we assume that it is possible to find the solution of the following 
Schrödinger equation: 


$$\left[-\frac{\hbar^2}{2m}\sum_k\nabla_k^2 + U(r_k, R_j)\right]\varphi_n(r_k, R_j) = E_n(R_j)\,\varphi_n(r_k, R_j). \tag{57.2}$$

Equation (57.2) has the following physical meaning. The nuclei are assumed
to be fixed at points $R_j$. Finding the solution of eq. (57.2) comes down to the
determination of the electron wave function $\varphi_n$ and the energy levels of the
electron system. As is seen from eq. (57.2), the energy levels $E_n(R_j)$ of the
electron sub-system depend on the coordinates of the nuclei (the heavy sub-
system) as parameters.

Geometrically the electron energy $E_n(R_j)$ forms a certain surface in the space
of the $R_j$. This surface is called the electron term.

We write the solution of eq. (57.1) in the form of an expansion in a series
in terms of the complete system of wave functions $\varphi_n$:

$$\Psi(r_k, R_j) = \sum_n a_n(R_j)\,\varphi_n(r_k, R_j). \tag{57.3}$$

We substitute (57.3) into (57.1), and then multiply eq. (57.1) by $\varphi_m^*$ and
integrate with respect to the coordinates of the electrons $dV = dV_1\,dV_2 \dots$


Taking into account the formula

$$\nabla_j^2(a_n\varphi_n) = \varphi_n\nabla_j^2a_n + a_n\nabla_j^2\varphi_n + 2\nabla_j\varphi_n\cdot\nabla_ja_n,$$

we find the following equation:

$$-\frac{\hbar^2}{2M}\sum_j\nabla_j^2a_m + E_m(R_j)\,a_m = E\,a_m + \frac{\hbar^2}{2M}\sum_j\sum_n\left[a_n\int\varphi_m^*\nabla_j^2\varphi_n\,dV + 2\int\varphi_m^*\nabla_j\varphi_n\cdot\nabla_ja_n\,dV\right]. \tag{57.4}$$

Here $\nabla_j$ is calculated with respect to the coordinates $R_j$ of the nuclei. We
rewrite eq. (57.4) in the form

$$\left[-\frac{\hbar^2}{2M}\sum_j\nabla_j^2 + E_m(R_j)\right]a_m(R_j) = E\,a_m(R_j) + \hat C a_n, \tag{57.5}$$

where the operator $\hat C$ is defined in the following way:

$$\hat C a_n = \frac{\hbar^2}{2M}\sum_j\sum_n\left(2\nabla_ja_n\cdot\int\varphi_m^*\nabla_j\varphi_n\,dV + a_n\int\varphi_m^*\nabla_j^2\varphi_n\,dV\right). \tag{57.6}$$

The operator $\hat C$ is called the non-adiabatic operator.

If one assumes that the operator $\hat C$ is small and neglects it in eq. (57.5)
then the equations for the functions $\varphi_m$ and $a_m$ assume the form

$$\left[-\frac{\hbar^2}{2m}\sum_k\nabla_k^2 + U(r_k, R_j)\right]\varphi_m = E_m(R_j)\,\varphi_m, \tag{57.7}$$

$$\left[-\frac{\hbar^2}{2M}\sum_j\nabla_j^2 + E_m(R_j)\right]a_m = E\,a_m. \tag{57.8}$$

Thus we obtain an important result in the zero order approximation with
respect to the operator $\hat C$. Equation (57.7) represents a Schrödinger equation.
The coordinates of the nuclei are involved in this equation as parameters. The
function $\varphi_m(r_k, R_j)$ describes the motion of the electrons for motionless
nuclei. Equation (57.8) contains only operators acting on the coordinates of
the nuclei. It can be considered as the Schrödinger equation for the heavy
sub-system, the nuclei. Then the energy $E_m(R_j)$ of the electron sub-system
plays the role of the potential energy of the nuclei.

The total wave function of the system in the zero order approximation
$\hat C = 0$ can be written in the form of a product of the wave functions $a_m$ and
$\varphi_m$, i.e. it has the same form as if the two sub-systems were quite independent:

$$\Psi = a_m(R_j)\,\varphi_m(r_k, R_j).$$

In the approximation described it can be said that the electron sub-system
follows the motion of the nuclei adiabatically in the sense that the
electron sub-system remains in the same quantum state $m$ when the position
$R_j$ of the nuclei is changed. However, its energy level $E_m$ changes in correspondence
with the motion of the nuclei.

The condition of the smallness of the operator $\hat C$ cannot be formulated in
general form. In every actual problem this condition must be considered
separately. Examples of such a consideration can be found in the books of
Pauli, and Born and Huang Kun*.


§58. Perturbation theory in integral form 


Perturbation theory can easily be developed within the framework of the
Feynman formalism**. For this it is convenient to use as a basis the integral
equation (29.5) for the Green’s function $K(r_2t_2; r_1t_1)$, which we shall denote
by $K(2, 1)$:

$$K(2, 1) = K_0(2, 1) - \frac{i}{\hbar}\int K_0(2, 3)\,\hat H'(3)\,K(3, 1)\,d^4x_3. \tag{58.1}$$

Here we have denoted the Green’s function of the unperturbed problem
$\hat H = \hat H_0$, $\hat H' = 0$ by $K_0(2, 1)$.

Making use of the smallness of the perturbation, we solve eq. (58.1) by a
method of successive approximations. In the zeroth approximation, i.e.
assuming $\hat H' = 0$, we have

$$K(2, 1) = K_0(2, 1). \tag{58.2}$$


* W.Pauli, Die allgemeinen Prinzipien der Wellenmechanik (General principles of 
wave mechanics), Handbuch der Physik V/1, 1958; M.Born and Huang Kun, Dynamical 
theory of crystal lattices (University Press, Oxford, 1954). 

** R.P.Feynman, Phys. Rev. 76 (1949) 740. See also S.Schweber, H.Bethe and F.
de Hoffmann, Mesons and fields (Row, Peterson and Co., Evanston, Ill. and White Plains,
N.Y., 1956).




We shall obtain the next approximation if we substitute into the integral
(58.1) the zeroth order approximation of the Green’s function $K(3, 1)$, i.e.

$$K^{(1)}(2,1) = -\frac{i}{\hbar}\int K_0(2,3)\,\hat{H}'(3)\,K_0(3,1)\,d^4x_3 . \qquad (58.3)$$


To obtain the correction to the Green’s function in the second order
approximation, we have to substitute into integral (58.1) the function $K(3, 1)$
with an accuracy to within terms of the first order of small quantities:

$$K^{(2)}(2,1) = \left(-\frac{i}{\hbar}\right)^2 \iint K_0(2,3)\,\hat{H}'(3)\,K_0(3,4)\,\hat{H}'(4)\,K_0(4,1)\,d^4x_3\,d^4x_4 . \qquad (58.4)$$


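The correction to any order can be obtained in an analogous way. Finally, we
have

$$K(2,1) = K_0(2,1) - \frac{i}{\hbar}\int K_0(2,3)\,\hat{H}'(3)\,K_0(3,1)\,d^4x_3 + \left(-\frac{i}{\hbar}\right)^2 \iint K_0(2,3)\,\hat{H}'(3)\,K_0(3,4)\,\hat{H}'(4)\,K_0(4,1)\,d^4x_3\,d^4x_4 + \ldots \qquad (58.5)$$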


Formula (58.5) can be interpreted in the following way. The zero order term
describes the motion of the unperturbed particle from point 1 to point 2. The
next term describes the motion of the free particle from point 1 to point 3.
At point 3 a perturbation acts. Thereupon the particle, again as a free particle,
moves from point 3 to point 2. The integration means that we sum the contributions
of all possible points 3. The term of second order smallness takes into
account the action of the perturbation at two points, 3 and 4, and so on. By
calculating the Green’s function K from eq. (58.5) to a given approximation,
we also know the wave function in this approximation. The convenience of
the integral equation (58.1) lies in the fact that it makes it possible to obtain
a very simple perturbation theory series. Examples of the use of the integral
form of perturbation theory will be considered in ch. 14.
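The successive approximations (58.2) to (58.5) can be illustrated numerically. The Python sketch below is only a finite-dimensional analogue and not a calculation from the text: it assumes that the integration over the intermediate point has been replaced by multiplication of small matrices, it sets ħ = 1, and the matrices `K0` and `H1` are made-up stand-ins for K_0 and Ĥ'.

```python
# Finite-dimensional analogue of eq. (58.1): K = K0 - i*K0*H1*K  (hbar = 1).
# K0 and H1 are illustrative 2x2 matrices, not taken from the text.

def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def add(a, b):
    """Element-wise sum of two matrices."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def scale(c, a):
    """Multiply every element of a matrix by the number c."""
    return [[c * x for x in row] for row in a]

def born_series(k0, h1, order):
    """Sum K0 + (-i) K0 H1 K0 + (-i)^2 K0 H1 K0 H1 K0 + ... up to `order`."""
    term = k0
    total = k0
    for _ in range(order):
        # Each new term inserts one more factor (-i) K0 H1 on the left.
        term = scale(-1j, matmul(matmul(k0, h1), term))
        total = add(total, term)
    return total

def residual(k, k0, h1):
    """Largest element of K - (K0 - i K0 H1 K); zero for an exact solution."""
    rhs = add(k0, scale(-1j, matmul(matmul(k0, h1), k)))
    return max(abs(x - y) for rk, rr in zip(k, rhs) for x, y in zip(rk, rr))

K0 = [[0.5, 0.1], [0.1, 0.4]]
H1 = [[0.2, 0.0], [0.0, -0.1]]

for order in (0, 1, 2, 6):
    print(order, residual(born_series(K0, H1, order), K0, H1))
```

For a weak perturbation the residual of the integral equation should fall rapidly with the order retained, which is the sense in which the series (58.5) solves (58.1).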













Spin and Identity of Particles 


§59. The spin of elementary particles 


Up to now we have assumed that the state of an individual microparticle is 
defined if its three space coordinates, or three momentum components, or in 
general three quantities forming a complete set are known. A number of ex- 
perimental results indicated that many microparticles, for example electrons, 
protons, neutrons, have a specific intrinsic degree of freedom. This intrinsic 
degree of freedom is associated with an intrinsic angular momentum of the 
particle which does not depend on its orbital motion. This angular momentum
of the particle is called the spin. The fact that the electron has a spin was 
established before the development of quantum mechanics. Attempts were 
made to interpret the spin as a manifestation of the rotation of a particle 
about its own axis (whence its name arose). However, this classical interpreta- 
tion turned out to be untenable. All attempts to obtain the correct value of 
the ratio of the angular momentum to the magnetic moment for a system of a 
distributed rotating charge failed. As for the model of a rigid rotating sphere, 
(for which any value of this ratio can be obtained), such a model, as was ex- 
plained in §13 of Part II, contradicts the general propositions of the theory 
of relativity. This contradiction was resolved in quantum mechanics. As we 
shall see below, this intrinsic degree of freedom and the spin associated with 
it have a specific quantum character. In the transition $\hbar \to 0$ to classical
mechanics the spin reduces to zero. Hence the spin has no classical analogue
and does not allow interpretations of a classical character. The hypothesis of 
the existence of spin was initially put forward in connection with the inter- 
pretation of the spectra of alkali metals. Subsequently a number of facts were 
established enabling one to state unambiguously that this hypothesis is 
correct. 

In the experiments of Stern and Gerlach the magnetic moment which is 
not associated with the orbital motion of electrons was observed directly. 
Namely, in these experiments it was established that when a beam of hydro- 
gen atoms in an S-state was passed through a non-uniform magnetic field, 
then this beam split into two beams. However, in the S-state there is no 
orbital angular momentum, and consequently no orbital magnetic moment, so 
that the beam should pass through the magnetic field without undergoing any 
deflection. 

The two-fold splitting is indicative of two possible orientations of the 
magnetic moment of the electron. The value of the spin magnetic moment 
can be determined from the magnitude of the splitting. 

Direct experiments carried out by Einstein and de Haas made it possible 
to determine the ratio of the intrinsic angular momentum to the magnetic 
moment. 

The spin of the electron (the intrinsic angular momentum) possesses the
general properties of a quantum-mechanical moment which were discussed in
§51. This was proved rigorously by the mathematical technique of group
theory. In particular, the eigenvalue of the operator of the square of the spin
moment $\hat{s}^2 = \hat{s}_x^2 + \hat{s}_y^2 + \hat{s}_z^2$ is expressed by the formula

$$s(s+1)\hbar^2 , \qquad (59.1)$$

where $s$ denotes the corresponding quantum number, the intrinsic or spin
quantum number of the particles. This quantum number is often called
briefly the spin of the particle.

The number of possible spin projections onto an arbitrarily oriented z- 
axis is equal to 2s + 1. The value of the intrinsic number s for each elementary 
particle must be determined experimentally. For the electron the existence 
and the value of the spin follows strictly from relativistic quantum mechanics, 
(Dirac’s theory) to which ch. 13 is devoted. 
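As a small illustration of formula (59.1) and of the count 2s + 1, the following Python sketch lists the allowed projections for a given spin quantum number. The function names are hypothetical, and all values are expressed in units of ħ (or ħ² for the square) using exact fractions.

```python
from fractions import Fraction

def spin_projections(s):
    """Return the 2s+1 allowed projections s_z, in units of hbar."""
    s = Fraction(s)
    if s < 0 or (2 * s).denominator != 1:
        raise ValueError("s must be a non-negative integer or half-integer")
    count = int(2 * s) + 1
    # Projections run from -s to +s in unit steps.
    return [-s + k for k in range(count)]

def spin_squared(s):
    """Eigenvalue s(s+1) of the squared spin, in units of hbar**2."""
    s = Fraction(s)
    return s * (s + 1)

for s in (Fraction(1, 2), 1, Fraction(3, 2)):
    print(s, spin_projections(s), spin_squared(s))
```

For s = 1/2 the list contains only the two projections discussed in the text, and the squared spin equals 3/4 in units of ħ².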

The spins of the elementary particles which are most often encountered
are the following: for the electron $s = \frac{1}{2}$, for the proton and neutron $s = \frac{1}{2}$,
for the $\pi$-meson $s = 0$, for the muon $s = \frac{1}{2}$. This means that the possible values
of the projections of the intrinsic angular momentum onto an axis arbitrarily
oriented in space, for example for the electron and other particles with spin $\frac{1}{2}$,
are

$$s_z = \pm\tfrac{1}{2}\hbar . \qquad (59.2)$$


From the experiment of Stern and Gerlach, the corresponding projections
of the intrinsic magnetic moment of the electron are equal in absolute value
to the Bohr magneton $\mu_0$:

$$\mu_z = \mp\frac{e\hbar}{2mc} = \mp\mu_0 . \qquad (59.3)$$

It is of great importance that the ratio of the intrinsic magnetic moment
to the intrinsic angular momentum is equal to $e/mc$,

$$\boldsymbol{\mu} = \frac{e}{mc}\,\mathbf{s} , \qquad (59.4)$$

whereas for the orbital motion this ratio is smaller by a factor of 2 (see §63).

In §118 we shall show that this value of the intrinsic magnetic moment 
can be derived theoretically from Dirac’s relativistic wave equation. 

The spin properties of elementary particles play a very important role in 
the realm of microphenomena as well as in the behaviour of macroscopic 
bodies. This is associated with the fact that the spin determines directly the 
statistical properties of systems of quantum particles. 


§60. Spin operators 


Although, as we shall see below, the existence of spin for the electron 
and all the properties associated with it can be established theoretically from 
the propositions of relativistic quantum mechanics, a number of the proper- 
ties of particles having spin can also be obtained without referring to rela- 
tivistic theory, on the basis of general quantum-mechanical considerations 
and a relatively small number of experimental data. Since such a semi-em- 
pirical theory of particles with spin has a rather simple character but still 
makes it possible to obtain important results, we shall dwell on it below. 

The wave function of a particle with spin depends not only on its three 
spatial coordinates but also on a fourth coordinate characterizing the intrinsic 
state of the particle. The value of the spin projection onto an arbitrarily 
oriented z-axis in space can be chosen as the fourth coordinate. Then the 
wave function can be written in the form 


$$\Psi = \Psi(x, s_z, t) . \qquad (60.1)$$


As distinct from the spatial coordinates $x$, the ‘spin coordinate’ $s_z$ takes
on only a discrete sequence of values. The number of possible values of $s_z$ is
determined by the properties of the given elementary particle. As was mentioned
above, the spin of most elementary particles is equal to one half. Since
in this case the spin projection can take on only two values, the wave function
(60.1) is conveniently written in the form of a column with two rows:

$$\Psi = \begin{pmatrix} \Psi(x, +\tfrac{1}{2}\hbar, t) \\ \Psi(x, -\tfrac{1}{2}\hbar, t) \end{pmatrix} = \begin{pmatrix} \Psi_1 \\ \Psi_2 \end{pmatrix} . \qquad (60.2)$$


We can then interpret $|\Psi_1|^2 dV$ as the probability that the electron at the
instant of time $t$ will be found in the volume element $dV$ and that it will have
a spin component along the z-axis equal to $\frac{1}{2}\hbar$. Correspondingly, $|\Psi_2|^2 dV$ is
the probability that for the electron found in the volume element $dV$ the spin
component along the z-axis is equal to $-\frac{1}{2}\hbar$. The wave function $\Psi(x, s_z, t)$ is
assumed to be normalized in such a way that

$$\sum_{s_z} \int |\Psi(x, s_z, t)|^2\, dV = 1 ,$$


where the sum is taken over all possible values of the spin projection $s_z$. If the
probability of one or another spin projection does not depend on the coordinates
of the particle, then the wave function (60.2) can be rewritten in the
form

$$\Psi = \psi(x, t)\,\varphi , \qquad (60.3)$$

where $\psi(x, t)$ is the ordinary (coordinate) wave function, $\varphi = \begin{pmatrix} c_1 \\ c_2 \end{pmatrix}$ is the spin
wave function, and $c_1$ and $c_2$ are numbers. $|c_1|^2$ and $|c_2|^2$ give the probabilities
that the spin projection $s_z$ is equal respectively to $+\frac{1}{2}\hbar$ and $-\frac{1}{2}\hbar$.

By virtue of the normalization condition for the wave function, we have

$$|c_1|^2 + |c_2|^2 = 1 . \qquad (60.4)$$


Having defined the concept of the spin wave function, we have to introduce
the spin operators acting on it. In general the action of an operator on
the spin function $\varphi = \begin{pmatrix} c_1 \\ c_2 \end{pmatrix}$ amounts to replacement of the components $c_1$
and $c_2$ by some linear combination of them:

$$\begin{pmatrix} c_1 \\ c_2 \end{pmatrix} \to \begin{pmatrix} a_{11}c_1 + a_{12}c_2 \\ a_{21}c_1 + a_{22}c_2 \end{pmatrix} . \qquad (60.5)$$


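Corresponding with this the spin operator can be written in the form of a
matrix

$$\hat{a} = \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix} . \qquad (60.6)$$

The action of such an operator on the wave function is given by formula
(60.5), i.e.

$$\hat{a}\varphi = \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix} \begin{pmatrix} c_1 \\ c_2 \end{pmatrix} = \begin{pmatrix} a_{11}c_1 + a_{12}c_2 \\ a_{21}c_1 + a_{22}c_2 \end{pmatrix} . \qquad (60.7)$$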


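If the division of the wave function into a coordinate component and a spin
component is not allowed, then formula (60.7) is rewritten in the form

$$\hat{a}\Psi = \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix} \begin{pmatrix} \Psi_1 \\ \Psi_2 \end{pmatrix} = \begin{pmatrix} a_{11}\Psi_1 + a_{12}\Psi_2 \\ a_{21}\Psi_1 + a_{22}\Psi_2 \end{pmatrix} . \qquad (60.8)$$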


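The mean value of the operator $\hat{a}$ taken in the state $\Psi$ is determined according
to the general formula (44.8):

$$\bar{a}(x,t) = \Psi_1^* a_{11}\Psi_1 + \Psi_1^* a_{12}\Psi_2 + \Psi_2^* a_{21}\Psi_1 + \Psi_2^* a_{22}\Psi_2 . \qquad (60.9)$$

This equation can be rewritten in matrix form

$$\bar{a}(x,t) = \Psi^\dagger \hat{a}\,\Psi , \qquad (60.10)$$

where $\Psi^\dagger$ is the matrix consisting of one row with the elements $\Psi_1^*$ and $\Psi_2^*$:

$$\Psi^\dagger = (\Psi_1^*\ \ \Psi_2^*) . \qquad (60.11)$$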


Relation (60.10) determines the mean value of the quantity $a$ at the instant
of time $t$ at the given point of space $x$. If this expression is averaged
over all positions of the particle, then we obtain

$$\bar{a}(t) = \int \Psi^\dagger \hat{a}\,\Psi\, dV . \qquad (60.12)$$
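The matrix form (60.10) of the mean value is easy to check on a concrete case. The Python sketch below assumes a state of the separable form (60.3), so that only the spin part matters, and it uses a made-up Hermitian matrix `A` and made-up coefficients `c1`, `c2`; none of these numbers come from the text.

```python
import math

def mean_value(op, spinor):
    """Return phi^dagger * op * phi for a 2x2 matrix and a 2-component spinor."""
    total = 0j
    for i in range(2):
        for k in range(2):
            # Term c_i^* a_ik c_k of the double sum in (60.9).
            total += spinor[i].conjugate() * op[i][k] * spinor[k]
    return total

# Hypothetical Hermitian operator (a_12 is the complex conjugate of a_21).
A = [[1.0 + 0j, 2.0 - 1.0j],
     [2.0 + 1.0j, -1.0 + 0j]]

# Hypothetical spin function satisfying the normalization (60.4).
c1 = complex(math.cos(0.3), 0.0)
c2 = complex(0.0, math.sin(0.3))
phi = [c1, c2]

norm = abs(c1) ** 2 + abs(c2) ** 2
mean = mean_value(A, phi)
print(norm, mean)
```

Because `A` is Hermitian, the imaginary part of the mean value vanishes up to rounding, as one expects for a physical quantity.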


We now introduce the operators corresponding to the spin components $\hat{s}_x$,
$\hat{s}_y$, $\hat{s}_z$. In §51 it was shown that the form of these operators and all the
properties of the spin can be obtained if the commutation relations

$$\hat{s}_x\hat{s}_y - \hat{s}_y\hat{s}_x = i\hbar\hat{s}_z , \quad \hat{s}_y\hat{s}_z - \hat{s}_z\hat{s}_y = i\hbar\hat{s}_x , \quad \hat{s}_z\hat{s}_x - \hat{s}_x\hat{s}_z = i\hbar\hat{s}_y \qquad (60.13)$$


are taken as the basis. The fact that the spin component operators must 
satisfy the same commutation relations as the operators of the components 
of the orbital angular momentum is, of course, not accidental. In §30 it was 
shown that the operator corresponding to the component of orbital angular 
momentum along any axis is associated with the operator corresponding to an 
infinitesimal rotation about this axis. The commutation relations (30.3) and 
(30.3') are a consequence of this fact, i.e. a consequence of the commutation 
relations between the infinitesimal rotation operators. In the next section we 
shall show that the spin component operators are also associated with rota- 
tion operators which act, however, not on the coordinate function but on the 
spin function. The commutation relations (60.13) are a consequence of the 
commutation relations between the infinitesimal rotation operators about the 
x-axis, y-axis and z-axis. The above considerations are rigorously substantiated
by the methods of group theory*.

By analogy with (30.4), it follows from the relations (60.13) that

$$\hat{s}^2\hat{s}_x - \hat{s}_x\hat{s}^2 = 0 , \quad \hat{s}^2\hat{s}_y - \hat{s}_y\hat{s}^2 = 0 , \quad \hat{s}^2\hat{s}_z - \hat{s}_z\hat{s}^2 = 0 . \qquad (60.14)$$


Thus the square of the total spin and one of its projections onto an arbi- 
trary axis can be measured simultaneously. The other two projections have no 
simultaneously sharp values. 

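For $s = \frac{1}{2}$ the matrices corresponding to the total spin, $\hat{s}^2 = \hat{s}_x^2 + \hat{s}_y^2 + \hat{s}_z^2$,
and its projection on the z-axis have, in their own representation, the form

$$\hat{s}^2 = \tfrac{3}{4}\hbar^2 \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} , \quad \hat{s}_z = \tfrac{1}{2}\hbar \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} = \tfrac{1}{2}\hbar\sigma_z \qquad (60.15)$$

(the diagonal matrix elements are equal to the eigenvalues of the corresponding
operators). According to the general formula (51.16), the matrices $\hat{s}_x$, $\hat{s}_y$
in this representation are written as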


*W.Pauli, Die allgemeinen Prinzipien der Wellenmechanik (General principles of wave 
mechanics), Handbuch der Physik V/1, (Springer, Berlin, 1958). 








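$$\hat{s}_x = \tfrac{1}{2}\hbar \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} = \tfrac{1}{2}\hbar\sigma_x , \quad \hat{s}_y = \tfrac{1}{2}\hbar \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix} = \tfrac{1}{2}\hbar\sigma_y . \qquad (60.16)$$

The matrices $\sigma_x$, $\sigma_y$, $\sigma_z$, which differ from the matrices $\hat{s}_x$, $\hat{s}_y$, $\hat{s}_z$ by a constant
factor $\hbar/2$, are called the Pauli matrices. They satisfy the following commutation
relations:

$$\sigma_x\sigma_y = -\sigma_y\sigma_x = i\sigma_z , \quad \sigma_y\sigma_z = -\sigma_z\sigma_y = i\sigma_x , \quad \sigma_z\sigma_x = -\sigma_x\sigma_z = i\sigma_y , \qquad (60.17)$$

$$\sigma_x^2 = \sigma_y^2 = \sigma_z^2 = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} = 1 .$$

An arbitrary matrix of the second rank can be expressed in terms of the
matrices $\sigma_x$, $\sigma_y$, $\sigma_z$ and the unit matrix.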
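Both statements, the relations (60.17) and the expansion of an arbitrary 2×2 matrix in the Pauli matrices and the unit matrix, can be verified directly. The Python sketch below is one possible check; the helper names and the sample matrix `M` are illustrative assumptions, and the expansion coefficients are obtained from the standard trace formula a_k = Tr(Mσ_k)/2, which is not stated in the text.

```python
# Pauli matrices as nested lists (row-major), plus the 2x2 unit matrix.
I2 = [[1, 0], [0, 1]]
SX = [[0, 1], [1, 0]]
SY = [[0, -1j], [1j, 0]]
SZ = [[1, 0], [0, -1]]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def scale(c, a):
    return [[c * x for x in row] for row in a]

def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def close(a, b, tol=1e-12):
    return all(abs(x - y) < tol for ra, rb in zip(a, b) for x, y in zip(ra, rb))

def products_ok():
    """Check sigma_a sigma_b = -sigma_b sigma_a = i sigma_c and sigma_a^2 = 1."""
    cyclic = [(SX, SY, SZ), (SY, SZ, SX), (SZ, SX, SY)]
    for a, b, c in cyclic:
        if not close(matmul(a, b), scale(1j, c)):
            return False
        if not close(matmul(b, a), scale(-1j, c)):
            return False
        if not close(matmul(a, a), I2):
            return False
    return True

def pauli_coefficients(m):
    """Return [a0, ax, ay, az] such that m = a0*1 + ax*SX + ay*SY + az*SZ."""
    coeffs = []
    for s in (I2, SX, SY, SZ):
        p = matmul(m, s)
        coeffs.append((p[0][0] + p[1][1]) / 2)  # half the trace of m*s
    return coeffs

def rebuild(coeffs):
    """Reassemble the matrix from its four expansion coefficients."""
    total = [[0, 0], [0, 0]]
    for c, s in zip(coeffs, (I2, SX, SY, SZ)):
        total = add(total, scale(c, s))
    return total

M = [[1 + 2j, 3], [-1j, 4]]   # arbitrary illustrative matrix
print(products_ok(), close(rebuild(pauli_coefficients(M)), M))
```

The four coefficients are in general complex; they are all real only when the matrix being expanded is Hermitian.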

As well as the similarity between the orbital and intrinsic angular momenta
there is also a basic difference between them. The orbital angular momentum
is characterized by the quantum number $l$ which can take on any integer
values irrespective of the nature of the particle, whereas the spin quantum
number $s$ takes on a limited number of values, for example $s = \frac{1}{2}$ for most
elementary particles. Every kind of elementary particle has its own characteristic
value of the spin. If the transition to classical mechanics is made by
assuming $\hbar \to 0$, then, as was explained in §41, one must pass simultaneously
to the limit of large quantum numbers. Hence, although according to formula
(30.15) $\mathbf{l}^2 = \hbar^2 l(l+1)$, it still does not follow that $\mathbf{l}^2 \to 0$ as $\hbar \to 0$, because
simultaneously with $\hbar \to 0$ one has to assume that $l \to \infty$. In the case of the
intrinsic angular momentum the situation is different. Since $s$ takes on only
a limited number of values, the transition to classical mechanics always leads
to the spin value $s = 0$. We see that in classical mechanics there is no quantity
which could serve as the classical analogue of the spin. The spin is a purely
quantum concept characterizing the specific properties of microparticles.


§61. The eigenfunctions of the operators of the components of the spin of a 
particle. The rotation matrix 


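Let us find the eigenfunctions and eigenvalues of the operators $\hat{s}_x$, $\hat{s}_y$, $\hat{s}_z$.
The equation for the eigenfunctions $\begin{pmatrix} c_1 \\ c_2 \end{pmatrix}$ and the eigenvalues $s_x$ of the
operator $\hat{s}_x$ has the form

$$\hat{s}_x \begin{pmatrix} c_1 \\ c_2 \end{pmatrix} = s_x \begin{pmatrix} c_1 \\ c_2 \end{pmatrix} .$$

Taking into account (60.16) and carrying out the multiplication, we obtain

$$\begin{pmatrix} \tfrac{1}{2}\hbar c_2 \\ \tfrac{1}{2}\hbar c_1 \end{pmatrix} = \begin{pmatrix} s_x c_1 \\ s_x c_2 \end{pmatrix} .$$

We evaluate this equality:

$$\tfrac{1}{2}\hbar c_2 = s_x c_1 , \quad \tfrac{1}{2}\hbar c_1 = s_x c_2 . \qquad (61.1)$$

Hence, upon multiplying the two equalities together, we find $s_x = \pm\frac{1}{2}\hbar$.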

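The eigenvalues of the spin component operator, as was to be expected,
turn out to be equal to $\pm\frac{1}{2}\hbar$. We determine the form of the eigenfunctions
corresponding to these eigenvalues.

For $s_x = +\frac{1}{2}\hbar$ we have from (61.1) $c_1 = c_2$. Taking into account the normalization
condition (60.4), we finally get

$$\varphi_{s_x=+\hbar/2} = \frac{e^{i\alpha_1}}{\sqrt{2}} \begin{pmatrix} 1 \\ 1 \end{pmatrix} , \qquad (61.2)$$

where $\alpha_1$ is an arbitrary phase.
Correspondingly, for $s_x = -\frac{1}{2}\hbar$, $c_1 = -c_2$, and the spin wave function is
written in the form

$$\varphi_{s_x=-\hbar/2} = \frac{e^{i\alpha_2}}{\sqrt{2}} \begin{pmatrix} 1 \\ -1 \end{pmatrix} . \qquad (61.3)$$

Naturally, the eigenvalues of the operators $\hat{s}_y$ and $\hat{s}_z$ are also equal to $\pm\frac{1}{2}\hbar$.
We also find their eigenfunctions in an analogous way:

$$\varphi_{s_y=\hbar/2} = \frac{e^{i\alpha_3}}{\sqrt{2}} \begin{pmatrix} 1 \\ i \end{pmatrix} , \quad \varphi_{s_y=-\hbar/2} = \frac{e^{i\alpha_4}}{\sqrt{2}} \begin{pmatrix} 1 \\ -i \end{pmatrix} , \qquad (61.4)$$

$$\varphi_{s_z=\hbar/2} \equiv \varphi_{1/2} = e^{i\alpha_5} \begin{pmatrix} 1 \\ 0 \end{pmatrix} , \quad \varphi_{s_z=-\hbar/2} \equiv \varphi_{-1/2} = e^{i\alpha_6} \begin{pmatrix} 0 \\ 1 \end{pmatrix} . \qquad (61.5)$$

The arbitrary phases $\alpha_i$ can, in particular, be set equal to zero.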
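The eigenfunctions of the three spin component operators can be checked by direct substitution. The Python sketch below does this with all phases set to zero and with the spin matrices taken in units of ħ; the helper names are illustrative and not part of the text.

```python
import math

HALF = 0.5  # spin matrices are written in units of hbar

SX = [[0, HALF], [HALF, 0]]
SY = [[0, -HALF * 1j], [HALF * 1j, 0]]
SZ = [[HALF, 0], [0, -HALF]]

def apply(op, phi):
    """Multiply a 2x2 matrix by a two-component column."""
    return [op[0][0] * phi[0] + op[0][1] * phi[1],
            op[1][0] * phi[0] + op[1][1] * phi[1]]

def is_eigenvector(op, phi, value, tol=1e-12):
    """True if op*phi equals value*phi component by component."""
    out = apply(op, phi)
    return all(abs(o - value * p) < tol for o, p in zip(out, phi))

r = 1 / math.sqrt(2)
CASES = [
    (SX, [r, r], +HALF), (SX, [r, -r], -HALF),
    (SY, [r, 1j * r], +HALF), (SY, [r, -1j * r], -HALF),
    (SZ, [1, 0], +HALF), (SZ, [0, 1], -HALF),
]

for op, phi, value in CASES:
    print(value, is_eigenvector(op, phi, value))
```

Swapping the two columns listed for one operator makes the check fail, which confirms that the sign of the second component is what distinguishes the two projections.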
We now consider a certain rotation of the system of coordinates $x, y, z \to
x', y', z'$. Then the spin wave functions also change, $\varphi \to \varphi'$. Indeed, the transition
from the coordinate system $x, y, z$ to the system $x', y', z'$ means a corresponding
transition from one representation to another. Such a transition, as
we know (see §46), is carried out by means of a certain unitary matrix $\hat{T}$, so
that $\varphi' = \hat{T}\varphi$. In the given case it is natural to call the unitary matrix $\hat{T}$ the
rotation matrix. Let us define this matrix. We first consider a rotation about
the z-axis through an angle $\eta$. According to (46.15), the operators of the spin
components in the new representation have the form

$$\hat{s}'_x = \hat{T}_z\hat{s}_x\hat{T}_z^{-1} , \quad \hat{s}'_y = \hat{T}_z\hat{s}_y\hat{T}_z^{-1} , \quad \hat{s}'_z = \hat{T}_z\hat{s}_z\hat{T}_z^{-1} . \qquad (61.6)$$


Here $\hat{s}'_x$, $\hat{s}'_y$, $\hat{s}'_z$ are the operators of the spin projections onto the old coordinate
axes $x$, $y$, $z$ but taken in the new representation connected with the
system $x'$, $y'$, $z'$. Since we consider a rotation about the z-axis, the z-axis and
the z'-axis coincide and, consequently, the operators of the spin projections
onto these axes are the same in the two representations (expressed by the
diagonal matrix (60.15)). However, the conditions for the operator of the
spin projection onto the z-axis to be chosen in the form of the diagonal
matrix (60.15) are still insufficient for the definition of the operators of spin
projections along other directions. The operator of the spin projection along
an arbitrary direction will be known if we also give the operators $\hat{s}_x$ and $\hat{s}_y$ in
the form of matrices, i.e. if we give the system of $x, y, z$ coordinates with
which we connect the representation. We can, in particular, choose the representation
connected with the system $x'$, $y'$, $z'$. In this representation
(which we denote by primes) the operators of spin projections onto the x'-axis,
y'-axis and z'-axis, $\hat{s}'_{x'}$, $\hat{s}'_{y'}$, $\hat{s}'_{z'}$, have the form (60.16), (60.15).

Because the operators $\hat{s}_x$, $\hat{s}_y$, $\hat{s}_z$ correspond to the projections of the spin
moment, when the coordinate system rotates they must transform as the
projections of an angular momentum, i.e. as the components of an axial
vector. Since we consider the rotation through an angle $\eta$ about the z-axis,
the operators of spin projections onto primed and non-primed coordinate
axes are connected with each other by the relations

$$\hat{s}_x = \hat{s}_{x'}\cos\eta - \hat{s}_{y'}\sin\eta , \quad \hat{s}_y = \hat{s}_{x'}\sin\eta + \hat{s}_{y'}\cos\eta , \quad \hat{s}_z = \hat{s}_{z'} . \qquad (61.7)$$


The equalities (61.7) hold in any representation. In particular, in the representation
connected with the rotated system of coordinates we have

$$\hat{s}'_x = \hat{s}'_{x'}\cos\eta - \hat{s}'_{y'}\sin\eta = \hat{s}_x\cos\eta - \hat{s}_y\sin\eta ,$$

$$\hat{s}'_y = \hat{s}'_{x'}\sin\eta + \hat{s}'_{y'}\cos\eta = \hat{s}_x\sin\eta + \hat{s}_y\cos\eta , \qquad (61.8)$$

$$\hat{s}'_z = \hat{s}'_{z'} = \hat{s}_z .$$

Indeed, the two systems of coordinates, the primed and the non-primed, are
completely equivalent. As we have already noted, the operators of the spin
projections onto the axes of the primed coordinate system, taken in the
representation which is just connected with this rotated system of coordinates,
have the ordinary form (60.15) and (60.16), i.e. are the same as the
operators of the spin projections onto the axes of the non-primed system of
coordinates, taken in the representation connected with the non-primed
system of coordinates:

$$\hat{s}'_{x'} = \hat{s}_x , \quad \hat{s}'_{y'} = \hat{s}_y , \quad \hat{s}'_{z'} = \hat{s}_z .$$


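Considering the equalities (61.6) and (61.8) together, we find the matrix
$\hat{T}_z$. First of all, it follows from (61.6) and (61.8) that $\hat{s}_z = \hat{T}_z\hat{s}_z\hat{T}_z^{-1}$, i.e. the
matrix $\hat{T}_z$ commutes with $\hat{s}_z$. Since the matrix $\hat{s}_z$ is diagonal, the matrix $\hat{T}_z$ is
also diagonal (see §47). Consequently, the matrix $\hat{T}_z$ has the form

$$\hat{T}_z = \begin{pmatrix} a_1 & 0 \\ 0 & a_2 \end{pmatrix} .$$

The condition for the unitarity of the matrix $\hat{T}_z$, i.e. $\hat{T}_z^\dagger\hat{T}_z = \hat{T}_z\hat{T}_z^\dagger = 1$, leads to
the equality

$$\begin{pmatrix} |a_1|^2 & 0 \\ 0 & |a_2|^2 \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} ,$$

or $|a_1|^2 = 1$ and $|a_2|^2 = 1$. Consequently, the matrix $\hat{T}_z$ has the form

$$\hat{T}_z = \begin{pmatrix} e^{i\alpha_1} & 0 \\ 0 & e^{i\alpha_2} \end{pmatrix} , \qquad (61.9)$$

where $\alpha_1$ and $\alpha_2$ are real. We rewrite eq. (61.6) in the form

$$\hat{s}'_x\hat{T}_z = \hat{T}_z\hat{s}_x , \quad \hat{s}'_y\hat{T}_z = \hat{T}_z\hat{s}_y .$$

Substituting the values of $\hat{s}'_x$ and $\hat{s}'_y$ from (61.8) into these expressions and
equating the corresponding matrix elements, we find that the equalities are
satisfied under the condition $\alpha_1 - \alpha_2 = \eta$.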








Thus for the two phases $\alpha_1$ and $\alpha_2$ we have obtained one condition connecting
them. This fact is not accidental. The point is that the matrix $\hat{T}_z$ can
contain an arbitrary phase factor which will not affect the results because the
wave functions themselves are defined to within an arbitrary phase factor. In
accordance with this we write the matrix $\hat{T}_z$ in the form

$$\hat{T}_z(\eta) = \begin{pmatrix} e^{\frac{i}{2}\eta} & 0 \\ 0 & e^{-\frac{i}{2}\eta} \end{pmatrix} . \qquad (61.10)$$

We can express the matrix $\hat{T}_z$ also in terms of the matrix $\sigma_z$:

$$\hat{T}_z(\eta) = \cos\tfrac{1}{2}\eta + i\sigma_z\sin\tfrac{1}{2}\eta , \qquad (61.11)$$

or in a somewhat different form

$$\hat{T}_z = \exp\left(\tfrac{i}{2}\eta\sigma_z\right) . \qquad (61.12)$$

The above expressions should be understood as

$$\hat{T}_z = 1 + \frac{i}{2}\eta\sigma_z + \frac{1}{2!}\left(\frac{i}{2}\eta\sigma_z\right)^2 + \ldots + \frac{1}{n!}\left(\frac{i}{2}\eta\sigma_z\right)^n + \ldots$$

Since $\sigma_z^2 = 1$, $\sigma_z^3 = \sigma_z$, and so on, the series is easily evaluated, and we again
arrive at formula (61.11). If the angle of rotation is small, then the rotation
matrix (61.12) has the form

$$\hat{T}_z = 1 + \frac{i}{\hbar}\,\eta\,\hat{s}_z , \qquad (61.13)$$

where we have introduced the matrix $\hat{s}_z = \frac{1}{2}\hbar\sigma_z$ instead of $\sigma_z$. We have obtained
an expression which has the same structure as the rotation operator
obtained in §30 which acts on a function depending on spatial coordinates.
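The properties of the rotation matrix about the z-axis can be checked numerically. The Python sketch below assumes the diagonal form diag(e^{iη/2}, e^{-iη/2}) for (61.10) and compares it with a partial sum of the exponential series; it also checks the transformation law of the x-component from (61.6) and (61.8), and the change of sign under a rotation through 2π. The helper names and the test angle are illustrative.

```python
import cmath
import math

I2 = [[1, 0], [0, 1]]
SX = [[0, 1], [1, 0]]
SY = [[0, -1j], [1j, 0]]
SZ = [[1, 0], [0, -1]]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def scale(c, a):
    return [[c * x for x in row] for row in a]

def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def max_diff(a, b):
    """Largest absolute difference between corresponding elements."""
    return max(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))

def rotation_z(eta):
    """Closed form assumed for (61.10)."""
    return [[cmath.exp(0.5j * eta), 0], [0, cmath.exp(-0.5j * eta)]]

def rotation_z_series(eta, terms=30):
    """Partial sum of exp(i*eta*sigma_z/2), term by term."""
    generator = scale(0.5j * eta, SZ)
    term = I2
    total = I2
    for n in range(1, terms):
        term = scale(1.0 / n, matmul(term, generator))  # generator^n / n!
        total = add(total, term)
    return total

eta = 0.7  # arbitrary test angle in radians
t = rotation_z(eta)
t_inv = rotation_z(-eta)

# T s_x T^{-1} should equal s_x cos(eta) - s_y sin(eta).
rotated_sx = matmul(matmul(t, SX), t_inv)
expected_sx = add(scale(math.cos(eta), SX), scale(-math.sin(eta), SY))

print(max_diff(t, rotation_z_series(eta)))
print(max_diff(rotated_sx, expected_sx))
print(max_diff(rotation_z(2 * math.pi), scale(-1, I2)))
```

All three printed differences should be at the level of rounding error; the last one shows that a full turn multiplies a two-component spin function by -1.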

Of course, a relation of the type (61.11) is also valid for a rotation about
any other axis, since all directions are equivalent. Thus, for example, we
write the matrix $\hat{T}_x$ of a rotation about the x-axis through a certain angle $\vartheta$:

$$\hat{T}_x(\vartheta) = \cos\tfrac{1}{2}\vartheta + i\sigma_x\sin\tfrac{1}{2}\vartheta . \qquad (61.13')$$


The arbitrary rotation of one coordinate system with respect to another can
be characterized by three Euler angles $\vartheta$, $\phi$, $\psi$ (fig. V.18). The angle $\psi$ is the
angle between the axis $Ox$ and the straight line $ON$ forming the intersection
of the planes $xOy$ and $x'Oy'$, $\vartheta$ is the angle between the axes $Oz$ and $Oz'$ and,
finally, $\phi$ is the angle between $ON$ and the axis $Ox'$. The total rotation can be
resolved into three consecutive rotations: I about the z-axis through the angle
$\psi$; II about the new position of the axis $x$ ($ON$) through the angle $\vartheta$, and III
about the new position of the axis $Oz$ through the angle $\phi$. The rotation
matrix $\hat{T}$ will be equal to the product of three matrices $\hat{T} = \hat{T}_z(\phi)\hat{T}_x(\vartheta)\hat{T}_z(\psi)$.

Fig. V.18

Making use of (61.10) and (61.13') and multiplying the matrices, we find

$$\hat{T} = \begin{pmatrix} e^{\frac{i}{2}(\phi+\psi)}\cos\frac{1}{2}\vartheta & i e^{\frac{i}{2}(\phi-\psi)}\sin\frac{1}{2}\vartheta \\ i e^{-\frac{i}{2}(\phi-\psi)}\sin\frac{1}{2}\vartheta & e^{-\frac{i}{2}(\phi+\psi)}\cos\frac{1}{2}\vartheta \end{pmatrix} . \qquad (61.14)$$

This matrix was first obtained by Pauli. The two-component wave function
$\begin{pmatrix} c_1 \\ c_2 \end{pmatrix}$ which transforms, when the system of coordinates rotates, according
to the law

$$\begin{pmatrix} c'_1 \\ c'_2 \end{pmatrix} = \hat{T} \begin{pmatrix} c_1 \\ c_2 \end{pmatrix}$$

is called a spinor.
The spinor components which we have introduced are usually denoted as $\varphi^1$, $\varphi^2$.

We note that for any given spinor $\varphi = \begin{pmatrix} \varphi^1 \\ \varphi^2 \end{pmatrix}$ one can always find a matrix
$\hat{T}(\phi, \vartheta, \psi)$ such that $\varphi'^1 = 1$, $\varphi'^2 = 0$, i.e. it is always possible to determine the
direction, characterized by the angles $(\phi, \vartheta, \psi)$, along which the spin of the
particle is oriented.
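The product of the three elementary rotations can be multiplied out numerically and compared with a closed form. The Python sketch below rests on two assumptions that should be read as such: the ordering T = T_z(φ) T_x(ϑ) T_z(ψ) of the factors, and the explicit matrix elements written in `euler_closed_form`, which follow from that ordering with T_z = diag(e^{iη/2}, e^{-iη/2}) and T_x = cos(ϑ/2) + iσ_x sin(ϑ/2). The function names and test angles are illustrative.

```python
import cmath
import math

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def rotation_z(angle):
    """Rotation about the z-axis, diagonal in the s_z representation."""
    return [[cmath.exp(0.5j * angle), 0], [0, cmath.exp(-0.5j * angle)]]

def rotation_x(theta):
    """Rotation about the x-axis: cos(theta/2) + i sigma_x sin(theta/2)."""
    c, s = math.cos(theta / 2), math.sin(theta / 2)
    return [[c, 1j * s], [1j * s, c]]

def euler_product(phi, theta, psi):
    """Assumed ordering: rotation psi first, then theta, then phi."""
    return matmul(matmul(rotation_z(phi), rotation_x(theta)), rotation_z(psi))

def euler_closed_form(phi, theta, psi):
    """Closed form obtained by multiplying the three factors by hand."""
    c, s = math.cos(theta / 2), math.sin(theta / 2)
    return [[cmath.exp(0.5j * (phi + psi)) * c, 1j * cmath.exp(0.5j * (phi - psi)) * s],
            [1j * cmath.exp(-0.5j * (phi - psi)) * s, cmath.exp(-0.5j * (phi + psi)) * c]]

def max_diff(a, b):
    return max(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))

phi, theta, psi = 0.4, 1.1, -0.8  # arbitrary test angles in radians
T = euler_product(phi, theta, psi)
(alpha, beta), (gamma, delta) = T

print(max_diff(T, euler_closed_form(phi, theta, psi)))
print(abs(alpha - delta.conjugate()), abs(beta + gamma.conjugate()))
print(abs(alpha * delta - beta * gamma - 1))
```

The last two lines print quantities that vanish for any unitary 2×2 matrix with unit determinant, so they hold independently of the assumed ordering of the angles.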








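We rewrite the matrix $\hat{T}$ in the form

$$\hat{T} = \begin{pmatrix} \alpha & \beta \\ \gamma & \delta \end{pmatrix} ,$$

so that

$$\varphi'^1 = \alpha\varphi^1 + \beta\varphi^2 , \quad \varphi'^2 = \gamma\varphi^1 + \delta\varphi^2 , \qquad (61.15)$$

where, as follows from (61.14),

$$\alpha = \delta^* , \quad \beta = -\gamma^* , \quad \alpha\delta - \beta\gamma = 1 . \qquad (61.16)$$

The transformation (61.15) is usually called a bilinear transformation.
The bilinear transformation leaves certain bilinear forms invariant. Indeed,
making use of (61.15) and (61.16) we easily obtain for two arbitrary spinors
$\varphi$ and $\eta$

$$\varphi'^1\eta'^2 - \varphi'^2\eta'^1 = (\alpha\delta - \beta\gamma)(\varphi^1\eta^2 - \varphi^2\eta^1) = \varphi^1\eta^2 - \varphi^2\eta^1 , \qquad (61.17)$$

$$\varphi^{1*}\varphi^1 + \varphi^{2*}\varphi^2 = \text{const} .$$

The last relation expresses the conservation of normalization under the
rotation of a coordinate system.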

By means of the spinor components $\eta^1$, $\eta^2$ and $\zeta^1$, $\zeta^2$ one can construct
quantities which transform under a rotation of the coordinate system as the
components of a vector, i.e. according to the law

$$a'_i = \sum_{k=1}^{3} \alpha_{ik} a_k ,$$

where $\alpha_{ik}$ are the cosines of the angles between the old and new coordinate
axes. By a direct check, making use of (61.14) and (61.15), one can verify
that the vector components are determined by the relations

$$a_z = \eta^1\zeta^2 + \eta^2\zeta^1 , \quad a_x = \eta^2\zeta^2 - \eta^1\zeta^1 , \quad a_y = -i(\eta^1\zeta^1 + \eta^2\zeta^2) . \qquad (61.18)$$

Correspondingly for the square of the vector we have

$$a^2 = a_x^2 + a_y^2 + a_z^2 = (\eta^1\zeta^2 - \eta^2\zeta^1)^2 . \qquad (61.19)$$

We have, as was to be expected, obtained a scalar quantity.
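The vector character of these bilinear combinations can be tested numerically. The Python sketch below assumes the assignment a_x = η²ζ² - η¹ζ¹, a_y = -i(η¹ζ¹ + η²ζ²), a_z = η¹ζ² + η²ζ¹ and the rotation matrices T_z = diag(e^{iη/2}, e^{-iη/2}), T_x = cos(ϑ/2) + iσ_x sin(ϑ/2); the labelling of the components is a reconstruction and should be treated as an assumption. The two spinors are arbitrary made-up numbers.

```python
import cmath
import math

def rotation_z(angle):
    return [[cmath.exp(0.5j * angle), 0], [0, cmath.exp(-0.5j * angle)]]

def rotation_x(angle):
    c, s = math.cos(angle / 2), math.sin(angle / 2)
    return [[c, 1j * s], [1j * s, c]]

def apply(t, spinor):
    """Transform a two-component spinor with the 2x2 matrix t."""
    return [t[0][0] * spinor[0] + t[0][1] * spinor[1],
            t[1][0] * spinor[0] + t[1][1] * spinor[1]]

def vector(eta_sp, zeta_sp):
    """Assumed vector components (a_x, a_y, a_z) built from two spinors."""
    e1, e2 = eta_sp
    z1, z2 = zeta_sp
    a_x = e2 * z2 - e1 * z1
    a_y = -1j * (e1 * z1 + e2 * z2)
    a_z = e1 * z2 + e2 * z1
    return (a_x, a_y, a_z)

def distance(u, v):
    return max(abs(x - y) for x, y in zip(u, v))

eta_sp = [0.3 + 0.4j, -0.2 + 0.8j]   # arbitrary illustrative spinors
zeta_sp = [1.1 - 0.5j, 0.6 + 0.2j]
angle = 0.9
c, s = math.cos(angle), math.sin(angle)

a = vector(eta_sp, zeta_sp)

# Rotation of the axes about z: a_x and a_y mix, a_z is unchanged.
tz = rotation_z(angle)
a_z_rot = vector(apply(tz, eta_sp), apply(tz, zeta_sp))
expected_z_rot = (a[0] * c + a[1] * s, -a[0] * s + a[1] * c, a[2])

# Rotation of the axes about x: a_y and a_z mix, a_x is unchanged.
tx = rotation_x(angle)
a_x_rot = vector(apply(tx, eta_sp), apply(tx, zeta_sp))
expected_x_rot = (a[0], a[1] * c + a[2] * s, -a[1] * s + a[2] * c)

square = sum(x * x for x in a)
invariant = (eta_sp[0] * zeta_sp[1] - eta_sp[1] * zeta_sp[0]) ** 2

print(distance(a_z_rot, expected_z_rot), distance(a_x_rot, expected_x_rot))
print(abs(square - invariant))
```

With these assumptions the transformed components agree with the usual rotation of vector components, and the square of the vector reproduces the invariant combination of (61.17).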

The components of a tensor of arbitrary rank $B_{ikl\ldots}$ can be defined as the
product $a_i b_k c_l \ldots$ of the corresponding components of vectors. By means of
formulae (61.18) we can identify the components $B_{ikl\ldots}$ with the products of
the components of spinors.

Finally, we shall define the law of transformation of a spinor under the
inversion of the coordinate system, i.e. under the transformation $x \to -x$,
$y \to -y$, $z \to -z$.

We denote the inversion operator by $\hat{I}$, so that

$$\varphi' = \hat{I}\varphi . \qquad (61.20)$$


The transformation (61.20) can now be considered as a transition to
another representation. The corresponding transformation of the operators
$\hat{s}_x$, $\hat{s}_y$, $\hat{s}_z$ is analogous to (61.6). On the other hand, we can consider the
operators $\hat{s}_x$, $\hat{s}_y$, $\hat{s}_z$ as the components of an axial vector (as the components
of an orbital angular momentum). Consequently, these operators must not
change sign under reflection. Based on this, we obtain, in analogy with formula
(61.6)

$$\hat{s}_x = \hat{I}\hat{s}_x\hat{I}^{-1} , \quad \hat{s}_y = \hat{I}\hat{s}_y\hat{I}^{-1} , \quad \hat{s}_z = \hat{I}\hat{s}_z\hat{I}^{-1} . \qquad (61.21)$$

When the matrix $\hat{I}$ is applied twice we come back to the initial state and,
consequently, we must obtain the spinor $\varphi$. Furthermore, we can obtain the
spinor $-\varphi$ if we consider a double reflection as a rotation through the angle
$2\pi$. As is seen from formulae (61.10) and (61.13'), under such a rotation the
spinor changes sign. Correspondingly we have

$$\hat{I}^2 = \pm\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} . \qquad (61.22)$$


We see, consequently, that the matrix $\hat{I}$ must commute with the matrices
$\hat{s}_x$, $\hat{s}_y$, $\hat{s}_z$, and that its square must give the unit matrix multiplied by $\pm 1$.
These requirements will be fulfilled under the condition that

$$\hat{I} = \pm\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} \quad \text{or} \quad \hat{I} = \pm i\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} . \qquad (61.23)$$

Consequently, under reflection the spinor can transform in the following
way:

$$\varphi' = \pm\varphi \qquad (61.24)$$

or

$$\varphi' = \pm i\varphi . \qquad (61.25)$$

If the law of transformation is determined by the upper sign in formulae
(61.24) and (61.25), then $\varphi$ is sometimes called a polar spinor; if the law of
transformation is determined by the lower sign, then $\varphi$ is called a pseudospinor.


§62. The total angular momentum 


The total angular momentum of a particle is made up of the orbital angular
momentum and the spin angular momentum. According to the rules for the
addition of vector operators we have for the total angular momentum operator
$\hat{\mathbf{j}}$

$$\hat{\mathbf{j}} = \hat{\mathbf{l}} + \hat{\mathbf{s}} . \qquad (62.1)$$

The orbital and intrinsic angular momentum operators act on different
variables. The first acts on space variables, while the second acts only on spin
variables. Hence the two operators commute with each other. It follows
directly from this that the components of the total angular momentum
operator satisfy the same commutation rules as the components of the
orbital and intrinsic angular momenta. These commutation relations also
follow from the connection between the total angular momentum operator
and the rotation operator (see below):

$$\hat{j}_x\hat{j}_y - \hat{j}_y\hat{j}_x = i\hbar\hat{j}_z , \quad \hat{j}_y\hat{j}_z - \hat{j}_z\hat{j}_y = i\hbar\hat{j}_x , \quad \hat{j}_z\hat{j}_x - \hat{j}_x\hat{j}_z = i\hbar\hat{j}_y , \qquad (62.2)$$

and also

$$\hat{\mathbf{j}}^2\hat{j}_x - \hat{j}_x\hat{\mathbf{j}}^2 = 0 , \quad \hat{\mathbf{j}}^2\hat{j}_y - \hat{j}_y\hat{\mathbf{j}}^2 = 0 , \quad \hat{\mathbf{j}}^2\hat{j}_z - \hat{j}_z\hat{\mathbf{j}}^2 = 0 . \qquad (62.3)$$


It follows from the relations (62.2) (see §51) that the eigenvalues of the
operator $\hat{\mathbf{j}}^2$ have the form

$$\mathbf{j}^2 = \hbar^2 j(j+1) . \qquad (62.4)$$

The value of the quantum number $j$ determines the value of the total
angular momentum. According to the rules of addition of angular momenta
in quantum mechanics (52.1) it follows that for a given $l$ and $s$ the number $j$
runs over the sequence of values

$$j = |l-s|,\ |l-s|+1,\ \ldots,\ l+s-1,\ l+s .$$


The quantum number j takes on integer values if the spin has integer 
values, and half-integer values if the spin is half-integer. 
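The sequence of allowed values of j is simple to enumerate. The Python sketch below is a minimal illustration using exact fractions; the function name `allowed_j` is hypothetical.

```python
from fractions import Fraction

def allowed_j(l, s):
    """Return the values |l-s|, |l-s|+1, ..., l+s of the quantum number j."""
    l, s = Fraction(l), Fraction(s)
    j = abs(l - s)
    values = []
    while j <= l + s:
        values.append(j)
        j += 1
    return values

print(allowed_j(1, Fraction(1, 2)))   # a p-state of a spin-1/2 particle
print(allowed_j(0, Fraction(1, 2)))   # an s-state of a spin-1/2 particle
print(allowed_j(2, 1))                # integer spin gives integer j
```

For half-integer spin every value in the list is half-integer, and for integer spin every value is an integer, in agreement with the statement above.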

We shall show that for a particle moving in free space or in a centrally
symmetric field it is the total angular momentum which is the constant of the
motion. For proof we introduce the rotation operator $\hat{R}$ taking into account
the change in the spin coordinates as well as the spatial coordinates of the wave
function. We consider a rotation of the coordinate system about the z-axis
through a small angle $\delta\varphi$. Proceeding from the results obtained in §30 and
§61, it is easy to determine the change in the total wave function under such
a rotation:

$$\Psi' = \hat{R}_z\Psi = \hat{T}_z\left[1 + \frac{i}{\hbar}\,\delta\varphi\,\hat{l}_z\right]\Psi = \left(1 + \frac{i}{\hbar}\,\delta\varphi\,\hat{j}_z\right)\Psi .$$

Of course, an analogous relation also holds for a rotation about any other
axis. Consequently we see that it is simply the total angular momentum
operator which is connected with the rotation operator. But the operation of
rotation, by virtue of the isotropy of space, must not change the Hamiltonian
of a closed system (or a system in a centrally symmetric field). Mathematically
this is shown by the fact that the operator $\hat{R}_z$ and, consequently, the operator
$\hat{j}_z$ will commute with the Hamiltonian $\hat{H}$ of the particle. Thus the total
angular momentum conservation law is a consequence of the isotropy of
space. For the intrinsic and orbital angular momenta separately the conservation
laws hold only approximately, to the extent that we neglect the spin-orbit
interaction.

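If we have a system of non-interacting particles, then the total angular
momentum of the entire system $\hat{\mathbf{J}}$ is made up of the angular momenta $\hat{\mathbf{j}}_k$ of
the individual particles according to the rules of addition of angular momenta
in quantum mechanics

$$\hat{\mathbf{J}} = \sum_k \hat{\mathbf{j}}_k . \qquad (62.5)$$

One can also introduce the total orbital angular momentum operator $\hat{\mathbf{L}}$

$$\hat{\mathbf{L}} = \sum_k \hat{\mathbf{l}}_k \qquad (62.6)$$

and the total spin operator of the system $\hat{\mathbf{S}}$

$$\hat{\mathbf{S}} = \sum_k \hat{\mathbf{s}}_k . \qquad (62.7)$$

Since $\hat{\mathbf{j}}_k = \hat{\mathbf{l}}_k + \hat{\mathbf{s}}_k$, then, obviously, we have

$$\hat{\mathbf{J}} = \hat{\mathbf{L}} + \hat{\mathbf{S}} . \qquad (62.8)$$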


The operators referring to different particles commute with each other
because they act on different variables. Hence for the operators of the components
of the total angular momenta $\hat{\mathbf{J}}$, $\hat{\mathbf{L}}$, $\hat{\mathbf{S}}$ there are the same commutation
relations as for the operators referring to individual particles. For
example, for the operators $\hat{J}_x$, $\hat{J}_y$ of the components of the total angular
momentum we have

$$\hat{J}_x\hat{J}_y - \hat{J}_y\hat{J}_x = \sum_{k,i}(\hat{j}_{kx}\hat{j}_{iy} - \hat{j}_{iy}\hat{j}_{kx}) = \sum_k(\hat{j}_{kx}\hat{j}_{ky} - \hat{j}_{ky}\hat{j}_{kx}) = i\hbar\sum_k \hat{j}_{kz} = i\hbar\hat{J}_z .$$

An analogous result is also obtained for other projections of the operator $\hat{\mathbf{J}}$,
as well as for the operators $\hat{\mathbf{L}}$ and $\hat{\mathbf{S}}$.

For given eigenvalues of the operators referring to individual particles the
eigenvalues of the operators $\hat{\mathbf{J}}^2$, $\hat{\mathbf{L}}^2$, $\hat{\mathbf{S}}^2$ and of the operators of the components
of the angular momenta are determined according to the rules of
addition of angular momenta in quantum mechanics. The total angular momentum
$\mathbf{J}$ is a conserved quantity for a closed system of particles.


§63. The Pauli equation. The probability current density vector 


In ch. 13, devoted to relativistic quantum mechanics, we shall establish 
the exact relativistic equation of quantum mechanics and show that the 
Schrödinger equation is obtained from it in the limit when $v/c \to 0$.

In this case it turns out that if terms of higher orders of small quantities 
are taken into account, then additional terms will arise in the Hamiltonian, 
describing a number of important properties of quantum systems. 




In particular, the existence of spin, as well as the fact that the electron has
a magnetic moment, follows from the relativistic wave equation by expanding
in powers of $v/c$ and preserving terms of the first order of small quantities.
Deferring the proof of this statement to §118, we introduce the intrinsic
magnetic moment operator in correspondence with (59.4) by the relation

$$\hat{\boldsymbol\mu}_s = \frac{e}{mc}\,\hat{\mathbf s} . \tag{63.1}$$


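Then the Hamiltonian for an electron in an electromagnetic field assumes the
form

$$\hat H = \frac{1}{2m}\Bigl(\hat{\mathbf p} - \frac{e}{c}\mathbf A\Bigr)^2 + U
          - (\hat{\boldsymbol\mu}_s\cdot\boldsymbol{\mathcal H})
        = \frac{1}{2m}\Bigl(\hat{\mathbf p} - \frac{e}{c}\mathbf A\Bigr)^2 + U
          - \frac{e}{mc}\,(\hat{\mathbf s}\cdot\boldsymbol{\mathcal H}) , \tag{63.2}$$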


where $\boldsymbol{\mathcal H}$ is the magnetic field strength. Since the
Hamiltonian depends on the spin, the wave function of the electron also
depends on the spin variable, i.e. $\psi = \psi(x,y,z,t,s_z)$. The equation for
the wave function $\psi(x,y,z,t,s_z)$ in a magnetic field, first introduced by
Pauli and called the Pauli equation, is of the form

$$i\hbar\frac{\partial\psi}{\partial t}
  = \frac{1}{2m}\Bigl(\hat{\mathbf p} - \frac{e}{c}\mathbf A\Bigr)^2\psi
  - \frac{e}{mc}\,(\hat{\mathbf s}\cdot\boldsymbol{\mathcal H})\,\psi + U\psi . \tag{63.3}$$
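Let us find the probability current density vector. For this we write the
equation for the function $\psi^\dagger$

$$-i\hbar\frac{\partial\psi^\dagger}{\partial t}
  = \frac{1}{2m}\Bigl(\hat{\mathbf p}^{*} - \frac{e}{c}\mathbf A\Bigr)^2\psi^\dagger
  - \frac{e}{mc}\,\bigl((\hat{\mathbf s}\cdot\boldsymbol{\mathcal H})\psi\bigr)^\dagger
  + U\psi^\dagger . \tag{63.3'}$$

We multiply (63.3) from the left by $\psi^\dagger$ and (63.3') from the right
by $\psi$ and subtract one from the other.
Taking into account that
$\hat{\mathbf p}\cdot\mathbf A - \mathbf A\cdot\hat{\mathbf p} = (\hbar/i)\,\nabla\cdot\mathbf A$,
we obtain after simple calculation

$$\frac{\partial(\psi^\dagger\psi)}{\partial t}
  = -\frac{i\hbar}{2m}\,\nabla\cdot\bigl[(\nabla\psi^\dagger)\psi - \psi^\dagger\nabla\psi\bigr]
  + \frac{e}{mc}\,\nabla\cdot(\mathbf A\,\psi^\dagger\psi)
  + \frac{ie}{\hbar mc}\bigl[\psi^\dagger(\hat{\mathbf s}\cdot\boldsymbol{\mathcal H})\psi
  - \bigl((\hat{\mathbf s}\cdot\boldsymbol{\mathcal H})\psi\bigr)^\dagger\psi\bigr] . \tag{63.4}$$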


It follows from rule (45.20) that

$$(\hat{\mathbf s}\psi)^\dagger = \psi^\dagger\hat{\mathbf s}^\dagger \tag{63.5}$$

and, furthermore, $\hat{\mathbf s}^\dagger = \hat{\mathbf s}$ by virtue of the
hermitian property of the spin operator. Thus the term in square brackets
reduces to zero and we obtain an expression for the probability current
density vector. (It should be borne in mind that it contains two-component
wave functions.) Multiplying it by the charge $e$, we obtain the electric
current density vector

$$\mathbf j = \frac{i\hbar e}{2m}\bigl[(\nabla\psi^\dagger)\psi - \psi^\dagger\nabla\psi\bigr]
            - \frac{e^2}{mc}\,\mathbf A\,\psi^\dagger\psi . \tag{63.6}$$


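The expression (63.4) determines the probability current density vector
$\mathbf j$ only to within a term $\nabla\times\mathbf B$, where $\mathbf B$ is
an arbitrary vector. It can be shown that
$\mathbf B = (e/m)\,\psi^\dagger\hat{\mathbf s}\psi$, so that the total
expression for the electric current density will have the form*

$$\mathbf j = -\frac{i\hbar e}{2m}\bigl[\psi^\dagger\nabla\psi - (\nabla\psi^\dagger)\psi\bigr]
            - \frac{e^2}{mc}\,\mathbf A\,\psi^\dagger\psi
            + \frac{e}{m}\,\nabla\times(\psi^\dagger\hat{\mathbf s}\psi) . \tag{63.7}$$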


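Let us consider the case where the particle moves in a constant magnetic
field $\boldsymbol{\mathcal H}$ and the electric field is absent. Then the
vector potential $\mathbf A$ can be chosen in the form

$$\mathbf A = \tfrac{1}{2}\,(\boldsymbol{\mathcal H}\times\mathbf r) . \tag{63.8}$$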


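We transform the Pauli eq. (63.3), taking into account that the following
relation holds for the vector potential (63.8)

$$\hat{\mathbf p}\cdot\mathbf A = \mathbf A\cdot\hat{\mathbf p} .$$

Furthermore we assume that the magnetic field $\boldsymbol{\mathcal H}$ is
relatively weak, and in correspondence with this we drop terms in the Pauli
equation which are quadratic in $\boldsymbol{\mathcal H}$. We then have

$$i\hbar\frac{\partial\psi}{\partial t}
  = \frac{\hat{\mathbf p}^2}{2m}\,\psi
  - \frac{e}{2mc}\,\bigl((\boldsymbol{\mathcal H}\times\mathbf r)\cdot\hat{\mathbf p}\bigr)\psi
  - \frac{e}{mc}\,(\hat{\mathbf s}\cdot\boldsymbol{\mathcal H})\,\psi .$$

Since

$$(\boldsymbol{\mathcal H}\times\mathbf r)\cdot\hat{\mathbf p}
  = \boldsymbol{\mathcal H}\cdot(\mathbf r\times\hat{\mathbf p})
  = \boldsymbol{\mathcal H}\cdot\hat{\mathbf l} ,$$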


where $\hat{\mathbf l}$ is the orbital angular momentum, then

$$i\hbar\frac{\partial\psi}{\partial t}
  = \frac{\hat{\mathbf p}^2}{2m}\,\psi
  - (\hat{\boldsymbol\mu}_l\cdot\boldsymbol{\mathcal H})\,\psi
  - (\hat{\boldsymbol\mu}_s\cdot\boldsymbol{\mathcal H})\,\psi . \tag{63.9}$$

It is natural to call the operator $\hat{\boldsymbol\mu}_l$,

$$\hat{\boldsymbol\mu}_l = \frac{e}{2mc}\,\hat{\mathbf l} ,$$

the orbital magnetic moment operator (as distinct from
$\hat{\boldsymbol\mu}_s$, the spin magnetic moment operator).

We see that the ratio of the orbital magnetic moment $\boldsymbol\mu_l$ to the
orbital angular momentum $\mathbf l$ is, as in classical physics, equal to
$e/2mc$. For spin moments this ratio is larger by a factor of two.

* See L.D.Landau and E.M.Lifshitz, Quantum mechanics (Pergamon Press, Oxford,
1965).

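Eqs. (63.3) and (63.9) are naturally generalized to the case of a system of
particles. Thus for a system of particles (of charge $e$ and mass $m$) placed
in a constant magnetic field $\boldsymbol{\mathcal H}$ the Pauli equation
(63.9) has the form

$$i\hbar\frac{\partial\psi}{\partial t}
  = \Bigl(\sum_k \frac{\hat{\mathbf p}_k^2}{2m}
  - \frac{e}{2mc}\,(\hat{\mathbf L}\cdot\boldsymbol{\mathcal H})
  - \frac{e}{mc}\,(\hat{\mathbf S}\cdot\boldsymbol{\mathcal H})
  + U_{\rm int}\Bigr)\psi , \tag{63.10}$$

where $\hat{\mathbf L} = \sum_k \hat{\mathbf l}_k$ is the operator of the total
orbital angular momentum of the system,
$\hat{\mathbf S} = \sum_k \hat{\mathbf s}_k$ is the operator of the total spin
angular momentum of the system (the summation is carried out over all
particles of the system), and $U_{\rm int}$ is the potential energy of
interaction of the particles with each other.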

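The term which is quadratic in the magnetic field is also easily taken into
account. In this case the Pauli equation has the form

$$i\hbar\frac{\partial\psi}{\partial t}
  = \Bigl(\sum_k \frac{\hat{\mathbf p}_k^2}{2m}
  - \frac{e}{2mc}\,\bigl((\hat{\mathbf L} + 2\hat{\mathbf S})\cdot\boldsymbol{\mathcal H}\bigr)
  + \frac{e^2}{8mc^2}\sum_k [\boldsymbol{\mathcal H}\times\mathbf r_k]^2
  + U_{\rm int}\Bigr)\psi . \tag{63.11}$$

We shall see (see §75) that in certain cases the quadratic term plays an
important role.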
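
As a brief sketch of where the linear and quadratic terms come from, the kinetic term of (63.2) can be expanded for a single particle in the gauge (63.8). The intermediate steps are a reconstruction that uses only the relations $\hat{\mathbf p}\cdot\mathbf A = \mathbf A\cdot\hat{\mathbf p}$ and $(\boldsymbol{\mathcal H}\times\mathbf r)\cdot\hat{\mathbf p} = \boldsymbol{\mathcal H}\cdot\hat{\mathbf l}$ quoted above; they are not copied from the original text.

```latex
\begin{aligned}
\frac{1}{2m}\Bigl(\hat{\mathbf p}-\frac{e}{c}\mathbf A\Bigr)^2
 &= \frac{\hat{\mathbf p}^2}{2m}
   -\frac{e}{2mc}\bigl(\hat{\mathbf p}\cdot\mathbf A+\mathbf A\cdot\hat{\mathbf p}\bigr)
   +\frac{e^2}{2mc^2}\,\mathbf A^2 \\
 &= \frac{\hat{\mathbf p}^2}{2m}
   -\frac{e}{mc}\,\mathbf A\cdot\hat{\mathbf p}
   +\frac{e^2}{2mc^2}\,\mathbf A^2 \\
 &= \frac{\hat{\mathbf p}^2}{2m}
   -\frac{e}{2mc}\,(\boldsymbol{\mathcal H}\cdot\hat{\mathbf l})
   +\frac{e^2}{8mc^2}\,[\boldsymbol{\mathcal H}\times\mathbf r]^2 .
\end{aligned}
```

The middle term is the orbital part of (63.9), and the last term, summed over particles, is the quadratic contribution in (63.11).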

If the magnetic field is absent and the potential energy of interaction
$U_{\rm int}$ does not depend on the spin of the particles, for example in the case of
the Coulomb interaction of charged particles, then the Hamiltonian of the 
system does not contain any spin variables. In this case the wave function can 
be written in the form of a product of the coordinate wave function and the 
spin wave function. The particles can be in an arbitrary spin state, while the 
coordinate function satisfies the ordinary Schrödinger equation. 







§64. The identity of particles. The principle of identity of particles.
Symmetric and antisymmetric states


We now pass on to the construction of the wave function of a system of 
particles of one kind, for example a system of electrons, or protons, or 
photons. 

In such systems new important specific properties having no analogue in 
classical mechanics appear. These properties will become clear from the com- 
parison of the process of collision of two particles — macroscopic and micro- 
scopic. 

In classical mechanics the properties of every particle are characterized by 
one quantity — its mass. If the masses of two particles are the same, then the 
particles can be assumed to be the same. The state of each of the particles at 
the instant of time t = 0 is defined by initial conditions.

Moving in definite trajectories the particles collide elastically at a certain 
point of space and diverge along separate trajectories. 

If the initial conditions are given, then the trajectories of each particle are 
completely defined and one can follow the motion of each particle. Hence 
particles in classical mechanics, even if they are the same, conserve their 
individuality. One can always establish which one of the colliding particles 
one has at a given point of space. 

In the case of two microparticles the situation in a collision is quite dif- 
ferent. At the instant of collision let the two particles be at definite points of 
space. According to the uncertainty relation, their momenta do not have 
sharp values. After the collision we can fix ‘the trajectories’ of the particles, 
for example two tracks in a cloud chamber. However, it is clear that if the 
two colliding particles are of the same kind, for example two electrons or two 
protons, then it is in principle impossible to establish which one of these 
particles is associated with a given track. 

As a second example let us consider a system consisting of two hydrogen 
atoms. 

If the atoms are at sufficiently large distances from each other, so that the 
electron clouds do not overlap, then each of the electrons is in effect 
localized by its nucleus. As the atoms approach, an overlapping of the elec- 
tron clouds occurs. This means that in the region of overlap there is a certain 
probability of finding both electrons. 

As a result of a measurement let an electron be observed in this region. It 
is clear that there is no way which would make it possible to establish which 
one of the electrons, that which earlier belonged to nucleus no. 1 or that 
which earlier belonged to nucleus no. 2, is observed. 







The examples given above show that the ‘sameness’ of quantum particles 
has a much more profound nature than the ‘sameness’ of classical particles. 
Quantum particles are not simply the same, but are completely identical.

If we were able to change the initial state of the system by replacing each 
electron by the other, no physical changes would occur in the system, and 
this replacement could not be observed by any physical experiment. 

It should also be stressed that in the examples given we have somewhat 
idealized the situation. If, for example, the two colliding particles have de- 
finite values of their momenta, they have no definite values of their coor- 
dinates. Hence one cannot indicate the region of collision. 

Thus we arrive at the principle of identity of particles, which can be 
formulated in the following way: in a system of identical particles only 
states which do not change under the exchange of any two identical particles 
can be realized. 

The identity of microparticles leads to very important and profound con- 
sequences. We recall that we have already encountered this property of micro- 
particles in statistical physics. We shall now discuss the effect of the identity 
of particles on their collective properties in a more consistent way. 

We consider a system consisting of N identical particles. The wave function
of this system $\psi$ will have the form

$$\psi(\xi_1,\xi_2,\ldots,\xi_i,\ldots,\xi_k,\ldots,\xi_N,t) .$$

Here $\xi_i$ is understood to be the whole set of the coordinates and spin
variables characterizing the ith particle.

If two particles are exchanged, i.e. if the coordinates and the spin of the 
ith particle are replaced by the corresponding values for the kth particle and 
vice versa, then by virtue of the principle of identity the state of the system 
cannot change. Consequently, the wave function of the system can change 
only by an immaterial phase factor. 

After the exchange of two particles the wave function can be expressed in
terms of the initial wave function by the relation

$$\psi(\xi_1,\ldots,\xi_k,\ldots,\xi_i,\ldots,\xi_N,t)
  = e^{i\alpha}\,\psi(\xi_1,\ldots,\xi_i,\ldots,\xi_k,\ldots,\xi_N,t) , \tag{64.1}$$

where $\alpha$ is a real quantity. If the operation of exchange of the ith and
kth particles is carried out once more, then the system will come back to its
initial state.

On the other hand, repeating operation (64.1), we can write that

$$\psi(\xi_1,\ldots,\xi_i,\ldots,\xi_k,\ldots,\xi_N,t)
  = e^{i\alpha}\,\psi(\xi_1,\ldots,\xi_k,\ldots,\xi_i,\ldots,\xi_N,t)
  = e^{2i\alpha}\,\psi(\xi_1,\ldots,\xi_i,\ldots,\xi_k,\ldots,\xi_N,t) .$$






Hence it follows that $e^{2i\alpha} = 1$ and

$$e^{i\alpha} = \pm 1 .$$


Thus when two identical particles are exchanged the wave function of the
system can either remain unchanged, or change its sign. Wave functions of the
first type are called symmetric, while those of the second type are called
antisymmetric. We introduce the particle exchange operator $\hat P_{ik}$,
which is important in what follows. By definition, when the operator
$\hat P_{ik}$ acts on the wave function of a system of particles
$\psi(\xi_1,\xi_2,\ldots,\xi_i,\ldots,\xi_k,\ldots,\xi_N,t)$ it transforms it
into the new function

$$\hat P_{ik}\,\psi(\xi_1,\ldots,\xi_i,\ldots,\xi_k,\ldots,\xi_N,t)
  = \psi(\xi_1,\ldots,\xi_k,\ldots,\xi_i,\ldots,\xi_N,t) . \tag{64.2}$$

There corresponds to this transformation a transition of the ith particle into
the state which had previously been occupied by the kth particle, and a
transition of the kth particle into the state which had previously been
occupied by the ith particle.

Comparing (64.2) with (64.1) we see that the eigenvalues of the operator
$\hat P_{ik}$ are equal to $e^{i\alpha} = \pm 1$. The symmetric and
antisymmetric functions are the eigenfunctions of the operator $\hat P_{ik}$
corresponding to the eigenvalues $+1$ and $-1$ respectively.
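
The eigenvalue statement can be illustrated on a small model. The sketch below assumes two particles, each with a hypothetical discrete set of `d = 3` one-particle states, so that the exchange operator becomes a $d^2 \times d^2$ matrix; the function name `exchange_operator` is illustrative only.

```python
import numpy as np


def exchange_operator(d):
    """Matrix of P_12 on the product space: P |a, b> = |b, a>."""
    p = np.zeros((d * d, d * d))
    for a in range(d):
        for b in range(d):
            p[b * d + a, a * d + b] = 1.0
    return p


d = 3
P = exchange_operator(d)

# Exchanging twice restores the original state, so P^2 = 1 ...
squares_to_identity = np.allclose(P @ P, np.eye(d * d))

# ... and therefore the only possible eigenvalues are +1 and -1.
spectrum = np.linalg.eigvalsh(P)
distinct = np.unique(np.round(spectrum, 12))
n_symmetric = int(np.sum(np.isclose(spectrum, 1.0)))
n_antisymmetric = int(np.sum(np.isclose(spectrum, -1.0)))

print(squares_to_identity, distinct, n_symmetric, n_antisymmetric)
```

For `d = 3` the symmetric subspace has dimension $d(d+1)/2 = 6$ and the antisymmetric one $d(d-1)/2 = 3$, which is what the eigenvalue counts report.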

We shall show by means of the exchange operator that the properties of 
symmetry are conserved in time. This means that if at the initial instant of 
time the system was in either the symmetric or antisymmetric state, then no 
subsequent action whatever will change the character of its symmetry. In 
other words, the system will always remain either in the symmetric state or 
in the antisymmetric state. For the proof of this statement it is necessary to 
show that the operator $\hat P_{ik}$ commutes with the Hamiltonian operator. The
exchange of two identical particles corresponds only to an exchange of terms 
in the sum forming the Hamiltonian of the system. This is easily seen in the 
example of a system consisting of two identical particles. In this case the 
Hamiltonian can be written in the form 


$$\hat H = -\frac{\hbar^2}{2m}\nabla_1^2 - \frac{\hbar^2}{2m}\nabla_2^2
          + U(\xi_1,t) + U(\xi_2,t) + U_{12}(\xi_1,\xi_2,t) . \tag{64.3}$$

Here $U_{12}(\xi_1,\xi_2,t)$ is the energy of interaction of the particles and
$U$ corresponds to the interaction with the external field and has, obviously,
the same form for the two identical particles. When the particles are
exchanged, we have for the new Hamiltonian

$$\hat H' = -\frac{\hbar^2}{2m}\nabla_2^2 - \frac{\hbar^2}{2m}\nabla_1^2
           + U(\xi_2,t) + U(\xi_1,t) + U_{12}(\xi_2,\xi_1,t) . \tag{64.4}$$




It is clear that this is the same Hamiltonian as before the exchange. The
result obtained is easily applied to the case of a system of N particles. We see
that the exchange of particles does not change the Hamiltonian. Hence we
obtain

$$\hat H\hat P_{ik} - \hat P_{ik}\hat H = 0 . \tag{64.5}$$

Consequently, the symmetry properties of the system are a constant of motion
and are conserved in time.
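
The conservation of symmetry can also be seen in a small numerical experiment. The following is a sketch under stated assumptions: two particles with three hypothetical one-particle states each, a randomly generated one-particle operator `h` (the same for both particles), a random interaction made symmetric in the two particles, and units with $\hbar = 1$. None of these model choices come from the text.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 3  # assumption: three one-particle states per particle


def exchange_matrix(dim):
    """Columns are the exchanged basis vectors: psi(a, b) -> psi(b, a)."""
    basis = np.eye(dim * dim)
    return np.array([e.reshape(dim, dim).T.reshape(-1) for e in basis]).T


def random_hermitian(n):
    m = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    return (m + m.conj().T) / 2


P = exchange_matrix(d)
h = random_hermitian(d)            # identical one-particle part for both particles
w = random_hermitian(d * d)
W = (w + P @ w @ P) / 2            # interaction unchanged by the exchange
H = np.kron(h, np.eye(d)) + np.kron(np.eye(d), h) + W

commutator = H @ P - P @ H         # analogue of (64.5)

# Start from an antisymmetric state and evolve it with exp(-i H t).
psi0 = rng.normal(size=d * d) + 1j * rng.normal(size=d * d)
psi0 = psi0 - P @ psi0
psi0 /= np.linalg.norm(psi0)

energies, vectors = np.linalg.eigh(H)
t = 0.7
U = vectors @ np.diag(np.exp(-1j * energies * t)) @ vectors.conj().T
psi_t = U @ psi0

print(np.allclose(commutator, 0), np.allclose(P @ psi_t, -psi_t))
```

Because the Hamiltonian commutes with the exchange operator, the evolved state is still an eigenfunction of the exchange with eigenvalue $-1$.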

Thus it is natural to think that the symmetry is determined by the properties
of the elementary particles which make up the system. Pauli managed to show
that particles having an integer spin are described by symmetric functions,
whereas particles having a half-integer spin are described by antisymmetric
functions. The first particles are called Bose-Einstein particles, or bosons,
while the latter are called Fermi-Dirac particles, or fermions. Examples of
the first group of particles are light quanta (see ch. 12) and π-mesons. The
second group of particles includes neutrons, protons, positrons, electrons,
neutrinos and muons (all having spin 1/2).

To elucidate the problem of the symmetry properties of a system con- 
sisting of identical complex particles, it is necessary to determine the total 
spin of the complex particle. As in the case of elementary particles, when the 
complex particle has an integer spin the wave function is symmetric under 
the exchange of the complex particles, and is antisymmetric when the com- 
plex particle has a half-integer spin. 

As an example let us consider a system consisting of α-particles. For the
determination of the symmetry properties of the wave function of the system
it is necessary to calculate the total spin of the α-particle. The α-particle
consists of two neutrons and two protons. Since the spins of the particles
constituting it are equal to $\tfrac{1}{2}\hbar$ and the number of the
particles is even, the total spin of the α-particle is equal to an integer
multiple of $\hbar$. Thus the wave function of a system of α-particles is a
symmetric wave function.


§65. Wave functions for systems of fermions and bosons. The Pauli principle 


Let us consider a system consisting of N non-interacting identical particles.
The Schrödinger equation for the stationary states of such a system has the
form

$$\sum_{i=1}^{N}\Bigl[-\frac{\hbar^2}{2m}\nabla_i^2 + U(\xi_i)\Bigr]
  \psi(\xi_1,\xi_2,\ldots,\xi_N) = E\,\psi(\xi_1,\xi_2,\ldots,\xi_N) . \tag{65.1}$$




In §14 it was shown that the solution of this equation is the function

$$\psi = \psi_{k_1}(\xi_1)\,\psi_{k_2}(\xi_2)\cdots\psi_{k_N}(\xi_N) . \tag{65.2}$$

Here $k_1, k_2, \ldots$ are the quantum numbers of the states available to the
particles. Each $k_i$ represents the complete set of quantum numbers which
characterize the state of an individual particle. The function $\psi_{k_i}$ is
the solution of the Schrödinger equation for one particle

$$-\frac{\hbar^2}{2m}\nabla_i^2\,\psi_{k_i}(\xi_i) + U(\xi_i)\,\psi_{k_i}(\xi_i)
  = E_{k_i}\,\psi_{k_i}(\xi_i) .$$

However, function (65.2) does not satisfy the requirements of symmetry. In
general it is neither a symmetric nor an antisymmetric function. Since eq.
(65.1) is linear, a superposition of solutions of the type (65.2) will also
be a solution. To obtain a wave function possessing the required symmetry
one has to take the corresponding superposition of wave functions.
For simplicity we consider a system consisting of only two non-interacting
particles. It is obvious that the functions

$$\Psi_1(\xi_1,\xi_2) = \psi_1(\xi_1)\,\psi_2(\xi_2) , \qquad
  \Psi_2(\xi_1,\xi_2) = \psi_1(\xi_2)\,\psi_2(\xi_1) ,$$

where the subscripts 1 and 2 of the wave functions $\psi_1(\xi_1)$,
$\psi_2(\xi_1)$ and $\psi_1(\xi_2)$, $\psi_2(\xi_2)$ denote two different
states of a particle, are both solutions of eq. (65.1). The wave functions
$\Psi_1(\xi_1,\xi_2)$, $\Psi_2(\xi_1,\xi_2)$ correspond to one and the same
energy of the system. From these functions one can make up two symmetrized
combinations corresponding to the same energy:

$$\psi_s = C_1\bigl[\psi_1(\xi_1)\psi_2(\xi_2) + \psi_2(\xi_1)\psi_1(\xi_2)\bigr] ,$$

$$\psi_a = C_2\bigl[\psi_1(\xi_1)\psi_2(\xi_2) - \psi_2(\xi_1)\psi_1(\xi_2)\bigr] .$$

The first wave function is symmetric with respect to exchange of the
particles, while the second is antisymmetric. The constants $C_1$ and $C_2$
can be determined from the normalization condition. If the functions
$\psi_1(\xi)$ and $\psi_2(\xi)$ are normalized to unity, while $\psi_s$ (and
$\psi_a$) is normalized by the condition $\int|\psi_s|^2\,dV_1 dV_2 = 1$, a
simple calculation gives in both cases

$$C_1 = C_2 = \frac{1}{\sqrt 2} .$$

Hence the normalized and symmetrized functions can be written in the form

$$\psi_s = \frac{1}{\sqrt 2}\bigl[\psi_1(\xi_1)\psi_2(\xi_2) + \psi_2(\xi_1)\psi_1(\xi_2)\bigr] , \tag{65.3}$$

$$\psi_a = \frac{1}{\sqrt 2}\bigl[\psi_1(\xi_1)\psi_2(\xi_2) - \psi_2(\xi_1)\psi_1(\xi_2)\bigr] . \tag{65.4}$$




It is now easy to generalize formulae (65.3) and (65.4) to the case of an
arbitrary number of non-interacting particles. Namely, for a system described
by symmetric functions one can write that

$$\psi_s = \Bigl(\frac{n_1!\,n_2!\cdots}{N!}\Bigr)^{1/2}
  \sum_P \psi_{k_1}(\xi_1)\,\psi_{k_2}(\xi_2)\cdots\psi_{k_N}(\xi_N) . \tag{65.5}$$

Here $n_i$ is the number of indices $k$ which have the same value $i$. Thus
the $n_i$ show the number of particles in a given state $\psi_i$. Evidently
$\sum_i n_i = N$.

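The wave functions $\psi_{k_i}(\xi_i)$ are normalized to unity, and all
functions $\psi_{k_i}(\xi)$ with different $k_i$ are orthogonal to each other.
Therefore in the normalization condition $\int|\psi_s|^2\,dV = 1$ only terms
of the type $|\psi_{k_i}(\xi)|^2\,d\xi$ contribute. Thus

$$\int|\psi_s|^2\,dV
  = \frac{n_1!\,n_2!\cdots}{N!}\sum_P\int|\psi_{k_1}(\xi_1)|^2\cdots|\psi_{k_N}(\xi_N)|^2\,
    d\xi_1\cdots d\xi_N
  = \frac{n_1!\,n_2!\cdots}{N!}\cdot\frac{N!}{n_1!\,n_2!\cdots} = 1 .$$

Here $N!/(n_1!\,n_2!\cdots)$ is equal to the number of rearrangements of
different indices $k_i$. Similarly for particles with antisymmetric functions

$$\psi_a = \frac{1}{\sqrt{N!}}
  \begin{vmatrix}
  \psi_{k_1}(\xi_1) & \psi_{k_1}(\xi_2) & \cdots & \psi_{k_1}(\xi_N) \\
  \psi_{k_2}(\xi_1) & \psi_{k_2}(\xi_2) & \cdots & \psi_{k_2}(\xi_N) \\
  \vdots & \vdots & & \vdots \\
  \psi_{k_N}(\xi_1) & \psi_{k_N}(\xi_2) & \cdots & \psi_{k_N}(\xi_N)
  \end{vmatrix} . \tag{65.6}$$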


The symmetrized normalized wave functions $\psi_s$ and $\psi_a$ describe the
state of systems of N non-interacting bosons and N non-interacting fermions
respectively.

Let us now consider the change in the wave function of the system when
there is an interaction between the identical particles. We assume that the
interaction depends on time. The exact wave function can be written in the
form of one of the superpositions

$$\psi = \sum_i c_i\,\psi_s^{(i)} , \qquad \psi = \sum_k c_k\,\psi_a^{(k)} .$$


The coefficients c; and c, represent the time-dependent amplitudes of the 
probability of the corresponding ith and kth symmetric and antisymmetric 
states. 

The interaction gives rise to transitions in the system. As follows directly 
from the symmetry conservation law described in the preceding section, the 



















system will go over into states with the same symmetry under any external 
action. 

Thus the wave function describing a system of interacting particles is 
expressed in terms of the wave functions of a system of non-interacting 
particles with definite symmetry. 

The wave functions (65.5) and (65.6) found above make it possible to
obtain a number of very important results.

Let us first of all consider a system of fermions. We assume that two
particles in the system are in one and the same quantum state, i.e. that
$k_1 = k_2$. This means that the two particles have the same complete set of
quantum numbers, for example one and the same value of the quantum
numbers $n$, $l$, $m$, $s_z$ for motion in a centrally symmetric field, or
$p_x$, $p_y$, $p_z$, $s_z$ for free motion with a definite momentum.

Then in the determinant of (65.6) two rows turn out to be the same, and 
the wave function reduces identically to zero. This proves the following state- 
ment: In a system of identical fermions one cannot simultaneously have two 
or more particles in one and the same quantum state. This is the well-known 
Pauli principle, established by Pauli before the appearance of quantum 
mechanics on the basis of an analysis of experimental data. 
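
The determinant form (65.6) and the vanishing of the wave function for coinciding states can be tried out numerically. This is a minimal sketch, assuming spinless one-dimensional "particle in a box" functions on the interval [0, 1] as stand-ins for the one-particle states $\psi_k$; the functions `orbital` and `slater` are hypothetical helpers, not part of the text.

```python
import math

import numpy as np


def orbital(k, x):
    """Assumed one-particle state: sqrt(2) * sin(k*pi*x) on the interval [0, 1]."""
    return math.sqrt(2.0) * np.sin(k * math.pi * x)


def slater(ks, xs):
    """Antisymmetric function of the form (65.6): det[psi_{k_i}(x_j)] / sqrt(N!)."""
    n = len(ks)
    matrix = np.array([[orbital(k, x) for x in xs] for k in ks])
    return np.linalg.det(matrix) / math.sqrt(math.factorial(n))


xs = [0.13, 0.42, 0.77]

value = slater([1, 2, 3], xs)
swapped = slater([1, 2, 3], [xs[1], xs[0], xs[2]])   # exchange particles 1 and 2
pauli = slater([1, 1, 3], xs)                        # two particles in the same state

print(value, swapped, pauli)
```

Exchanging two coordinates swaps two columns of the determinant and reverses its sign, while two equal quantum numbers give two equal rows, so that the function vanishes identically.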

The Pauli principle is often conveniently formulated in terms of the
quasi-classical approximation: ‘No more than one particle with given spin
orientation can be found in each cell of phase space of volume
$(2\pi\hbar)^3$’.

As we have seen in statistical physics, the Pauli principle determines the 
statistical behaviour of systems made up of identical particles with half- 
integer spin. The Pauli principle is of no less importance in understanding the 
regularities of the structure of many-electron atoms and of complex nuclei to 
which the next chapter is devoted. 

For what follows we consider the following problem. Let a system consist
of N identical particles (bosons). At a given instant of time each of the bosons
is in one and the same state with the wave function $\psi(\xi)$, which is
normalized in the following way:

$$\int\psi^{*}(\xi)\,\psi(\xi)\,dV = N .$$

Let us determine the mean energy of the system in this state. We write the
Hamiltonian of such a system of particles in the form

$$\hat H = \sum_{i=1}^{N}\hat H_i(\xi_i)
         + \frac{1}{2}\sum_{i\neq j}\hat W_{ij}(\xi_i,\xi_j) , \tag{65.7}$$


where $\hat H_i$ is the energy operator of the ith boson, and $\hat W_{ij}$ is
the operator corresponding to the energy of interaction of the ith and jth
bosons. The wave function of the system of bosons, normalized to unity, at
this instant of time has the form

$$\psi(\xi_1,\xi_2,\ldots,\xi_N) = N^{-N/2}\,\psi(\xi_1)\,\psi(\xi_2)\cdots\psi(\xi_N) .$$


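The mean energy of the system in this state is equal to

$$\bar H = \int\psi^{*}(\xi_1,\xi_2,\ldots,\xi_N)\,\hat H\,
  \psi(\xi_1,\xi_2,\ldots,\xi_N)\,dV_1\,dV_2\cdots dV_N .$$

Taking into account the identity of bosons and assuming $N \gg 1$, we obtain

$$\bar H = \int\psi^{*}(\xi_1)\,\hat H_1\,\psi(\xi_1)\,dV_1
  + \frac{1}{2}\int\psi^{*}(\xi_1)\,\psi^{*}(\xi_2)\,\hat W_{12}(\xi_1,\xi_2)\,
    \psi(\xi_1)\,\psi(\xi_2)\,dV_1\,dV_2 . \tag{65.8}$$

If the particles do not interact, then $\hat W = 0$, and the mean value of the
energy has the form

$$\bar H = \int\psi^{*}(\xi_1)\,\hat H_1\,\psi(\xi_1)\,dV_1 . \tag{65.9}$$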


We shall need (65.8) and (65.9) for what follows. 


§66. The wave function of a system of two identical particles with spin 1/2


Bearing in mind further applications, let us consider the wave function of a
system consisting of two particles with spin $\tfrac{1}{2}$, for example two
electrons or two protons, in more detail.

The total wave function $\psi_a(\mathbf r_1,s_{1z},\mathbf r_2,s_{2z})$
depends on the spatial and spin coordinates of the two particles and is
antisymmetric in these variables. Assuming that there is no external magnetic
field and that the interaction between the particles does not depend on their
spins, we write the total wave function in the form of a product of wave
functions which depend only on spatial variables and spin variables

$$\psi_a(\mathbf r_1,s_{1z},\mathbf r_2,s_{2z})
  = \Phi(\mathbf r_1,\mathbf r_2)\,\varphi(s_{1z},s_{2z}) . \tag{66.1}$$


We write the total spin function of the system $\varphi$ in the form of a
product of the eigenfunctions of the operators corresponding to the square of
the spin and its z-component for each of the particles, i.e. the functions
$\varphi_{1/2}(1)$, $\varphi_{-1/2}(1)$, $\varphi_{1/2}(2)$,
$\varphi_{-1/2}(2)$, where the subscript denotes the spin projection on the
z-axis, and the number in parentheses corresponds to the particle.

The function $\varphi$ can be written in the most general form as follows:

$$\varphi(1,2) = c_1\,\varphi_{1/2}(1)\,\varphi_{1/2}(2)
  + c_2\,\varphi_{1/2}(1)\,\varphi_{-1/2}(2)
  + c_3\,\varphi_{-1/2}(1)\,\varphi_{1/2}(2)
  + c_4\,\varphi_{-1/2}(1)\,\varphi_{-1/2}(2) , \tag{66.2}$$

where $c_1$, $c_2$, $c_3$ and $c_4$ are arbitrary amplitudes.

We determine the spin wave functions describing the states of the system
with given total spin and given z-component of spin.

Since spins are added according to the general rules of addition of angular
momenta (see §52), the total spin of a system of two particles has two
possible values, $S = 1$ and $S = 0$. Its z-component correspondingly has the
values $S_z = 1$, 0 and $-1$ for $S = 1$, and $S_z = 0$ for $S = 0$ (in units
of $\hbar$).

The functions $\varphi$ describing a state with given $S$ and $S_z$ satisfy
the equations

$$\hat{\mathbf S}^2\varphi = \hbar^2 S(S+1)\,\varphi , \qquad
  \hat S_z\varphi = \hbar S_z\,\varphi , \tag{66.3}$$

where $\hat{\mathbf S} = \hat{\mathbf s}_1 + \hat{\mathbf s}_2$ is the total
spin operator of the system. The coefficients $c_1$, $c_2$, $c_3$ and $c_4$ in
the spin function of eq. (66.2) must be chosen in a way such that both
equations (66.3) for $\varphi$ are automatically satisfied.


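One can easily convince oneself by a direct check that the spin function of
a system corresponding to all the conditions mentioned above can be written
in the form

$$\begin{aligned}
\varphi^{1}_{1} &= \varphi_{1/2}(1)\,\varphi_{1/2}(2) , & S=1,\ S_z=1 , \\
\varphi^{1}_{0} &= \frac{1}{\sqrt 2}\bigl[\varphi_{1/2}(1)\,\varphi_{-1/2}(2)
  + \varphi_{-1/2}(1)\,\varphi_{1/2}(2)\bigr] , & S=1,\ S_z=0 , \\
\varphi^{1}_{-1} &= \varphi_{-1/2}(1)\,\varphi_{-1/2}(2) , & S=1,\ S_z=-1 ,
\end{aligned} \tag{66.4}$$

$$\varphi^{0}_{0} = \frac{1}{\sqrt 2}\bigl[\varphi_{1/2}(1)\,\varphi_{-1/2}(2)
  - \varphi_{-1/2}(1)\,\varphi_{1/2}(2)\bigr] , \qquad S=0,\ S_z=0 , \tag{66.5}$$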


where the superscript indicates the total spin of the two particles, and the
subscript indicates the z-component of the spin. The expressions (66.4) and
(66.5) follow from formula (52.3) and table 1 at the end of ch. 6 of
coefficients C for $j_2 = \tfrac{1}{2}$.

This result can also be obtained from the relations (61.18). We note that
the spin functions (66.4) do not change when the two particles are exchanged,
i.e. under the exchange 1 → 2, 2 → 1. Consequently, these functions are
symmetric in the spins of the particles. The spin function (66.5) changes sign
under such an exchange and is antisymmetric in the spins.

The spin functions (66.4) form a spin triplet. The whole set of the three 
components of a triplet is equivalent to the three-component spin function of 
a particle with spin one. The spin function (66.5) describing the state with 
spin zero forms a spin singlet. 
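
The "direct check" mentioned above can be carried out numerically. The sketch below assumes units with $\hbar = 1$, the Pauli-matrix representation of the spin operators, and the basis ordering $|\uparrow\uparrow\rangle, |\uparrow\downarrow\rangle, |\downarrow\uparrow\rangle, |\downarrow\downarrow\rangle$; all variable and function names are illustrative.

```python
import numpy as np

HBAR = 1.0  # assumption: units in which hbar = 1

SX = 0.5 * HBAR * np.array([[0, 1], [1, 0]], dtype=complex)
SY = 0.5 * HBAR * np.array([[0, -1j], [1j, 0]], dtype=complex)
SZ = 0.5 * HBAR * np.array([[1, 0], [0, -1]], dtype=complex)
I2 = np.eye(2)

up = np.array([1.0, 0.0])     # phi_{+1/2}
down = np.array([0.0, 1.0])   # phi_{-1/2}


def total(op):
    """Two-particle operator op_1 + op_2."""
    return np.kron(op, I2) + np.kron(I2, op)


S_SQUARED = sum(total(s) @ total(s) for s in (SX, SY, SZ))
S_Z = total(SZ)

# Exchange of the two spins in the basis |uu>, |ud>, |du>, |dd>
EXCHANGE = np.array([[1, 0, 0, 0],
                     [0, 0, 1, 0],
                     [0, 1, 0, 0],
                     [0, 0, 0, 1]], dtype=float)

# Functions (66.4), keyed by S_z, and function (66.5)
triplet = {
    1: np.kron(up, up),
    0: (np.kron(up, down) + np.kron(down, up)) / np.sqrt(2),
    -1: np.kron(down, down),
}
singlet = (np.kron(up, down) - np.kron(down, up)) / np.sqrt(2)


def is_eigenstate(op, state, value):
    return bool(np.allclose(op @ state, value * state))


for m, chi in triplet.items():
    print(m,
          is_eigenstate(S_SQUARED, chi, 2 * HBAR**2),   # S(S+1) = 2 for S = 1
          is_eigenstate(S_Z, chi, m * HBAR),
          is_eigenstate(EXCHANGE, chi, +1))             # symmetric in the spins
print(is_eigenstate(S_SQUARED, singlet, 0.0),
      is_eigenstate(EXCHANGE, singlet, -1))             # antisymmetric in the spins
```

Each triplet component satisfies both equations (66.3) with $S = 1$ and is unchanged by the exchange, while the singlet has $S = 0$ and changes sign.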




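Let us determine the eigenvalues of the scalar product
$\hat{\mathbf s}_1\cdot\hat{\mathbf s}_2$ in the singlet and triplet states.
We shall have to deal with this product in what follows. Since

$$\hat{\mathbf S}^2 = (\hat{\mathbf s}_1 + \hat{\mathbf s}_2)^2
  = \hat{\mathbf s}_1^2 + \hat{\mathbf s}_2^2
  + 2\,\hat{\mathbf s}_1\cdot\hat{\mathbf s}_2 ,$$

then

$$\hat{\mathbf s}_1\cdot\hat{\mathbf s}_2
  = \tfrac{1}{2}\bigl(\hat{\mathbf S}^2 - \hat{\mathbf s}_1^2 - \hat{\mathbf s}_2^2\bigr) . \tag{66.6}$$

Substituting the eigenvalues of the operators $\hat{\mathbf S}^2$,
$\hat{\mathbf s}_1^2$ and $\hat{\mathbf s}_2^2$ into the right-hand side, we
have

$$(\hat{\mathbf s}_1\cdot\hat{\mathbf s}_2)\,\varphi^{S}
  = \tfrac{1}{2}\hbar^2\bigl(S(S+1) - \tfrac{3}{2}\bigr)\varphi^{S} . \tag{66.7}$$

For the triplet state $S = 1$, and thus

$$(\hat{\mathbf s}_1\cdot\hat{\mathbf s}_2)\,\varphi^{1}
  = \tfrac{1}{4}\hbar^2\,\varphi^{1} . \tag{66.8}$$

Correspondingly for the singlet state $S = 0$ and

$$(\hat{\mathbf s}_1\cdot\hat{\mathbf s}_2)\,\varphi^{0}
  = -\tfrac{3}{4}\hbar^2\,\varphi^{0} . \tag{66.9}$$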
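
The eigenvalues (66.8) and (66.9) can be confirmed by diagonalizing the scalar product directly. This is a short sketch, again assuming $\hbar = 1$ and the Pauli-matrix representation; the variable names are illustrative.

```python
import numpy as np

HBAR = 1.0  # assumption: units in which hbar = 1

PAULI = [
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
]
s = [0.5 * HBAR * p for p in PAULI]
I2 = np.eye(2)

# Scalar product s_1 . s_2 on the two-particle spin space
s1_dot_s2 = sum(np.kron(a, a) for a in s)

# Right-hand side of (66.6): (S^2 - s_1^2 - s_2^2) / 2
S = [np.kron(a, I2) + np.kron(I2, a) for a in s]
S_squared = sum(a @ a for a in S)
s1_squared = sum(np.kron(a @ a, I2) for a in s)
s2_squared = sum(np.kron(I2, a @ a) for a in s)
identity_holds = bool(np.allclose(s1_dot_s2,
                                  0.5 * (S_squared - s1_squared - s2_squared)))

eigenvalues = np.sort(np.linalg.eigvalsh(s1_dot_s2))
print(identity_holds, eigenvalues)  # one value -3/4 (singlet), three values 1/4 (triplet)
```

The single eigenvalue $-\tfrac{3}{4}\hbar^2$ belongs to the singlet and the threefold eigenvalue $\tfrac{1}{4}\hbar^2$ to the triplet.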


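We now consider the function of spatial variables
$\Phi(\mathbf r_1,\mathbf r_2)$. Since the total wave function (66.1) is
antisymmetric, the coordinate wave function will be antisymmetric in the
state $S = 1$ and symmetric in the state $S = 0$. If the particles do not
interact and are in states $\psi_n$ and $\psi_m$, then the coordinate function
has the form

$$\Phi_a(\mathbf r_1,\mathbf r_2) = \frac{1}{\sqrt 2}\bigl[\psi_n(\mathbf r_1)\,\psi_m(\mathbf r_2)
  - \psi_m(\mathbf r_1)\,\psi_n(\mathbf r_2)\bigr] , \qquad S = 1 , \tag{66.10}$$

$$\Phi_s(\mathbf r_1,\mathbf r_2) = \frac{1}{\sqrt 2}\bigl[\psi_n(\mathbf r_1)\,\psi_m(\mathbf r_2)
  + \psi_m(\mathbf r_1)\,\psi_n(\mathbf r_2)\bigr] , \qquad S = 0 . \tag{66.11}$$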


As was shown in §14, it is convenient in the general case to use the
coordinates $\mathbf R = \tfrac{1}{2}(\mathbf r_1 + \mathbf r_2)$ and
$\mathbf r = \mathbf r_1 - \mathbf r_2$, describing respectively the motion of
the centre of mass of the system and the relative motion of the particles

$$\Phi(\mathbf r_1,\mathbf r_2) = \psi_0(\mathbf R)\,\psi(\mathbf r) . \tag{66.12}$$


Let us find the consequences of the requirement of symmetry or antisymmetry
of the wave function (66.12). We note first of all that if the potential
of interaction of the particles depends only on the distance between them,
$U(|\mathbf r_1 - \mathbf r_2|)$, then the orbital angular momentum associated
with the relative motion of the particles is conserved in such a system (see
§35). We now exchange the coordinates of the particles:
$\mathbf r_1 \to \mathbf r_2$, $\mathbf r_2 \to \mathbf r_1$. The radius vector




of the centre of mass $\mathbf R$ does not change under this exchange.
Consequently, the wave function $\psi_0(\mathbf R)$ does not change. The
radius vector of the relative motion $\mathbf r$ changes sign,
$\mathbf r \to -\mathbf r$. As we have explained in §33, if the orbital
angular momentum associated with the relative motion of the particles is
given and is determined by the quantum number $l$, then the law of
transformation of the function $\psi(\mathbf r)$ under the replacement of
$\mathbf r$ by $-\mathbf r$ will be

$$\psi(-\mathbf r) = (-1)^{l}\,\psi(\mathbf r) . \tag{66.13}$$

We see that in this case the coordinate wave function (66.12) transforms
under the exchange of the particles $\mathbf r_1 \to \mathbf r_2$,
$\mathbf r_2 \to \mathbf r_1$ according to the law

$$\Phi(\mathbf r_1,\mathbf r_2) \to (-1)^{l}\,\Phi(\mathbf r_1,\mathbf r_2) . \tag{66.14}$$

It follows immediately from (66.14) that if the system is in the triplet
state $S = 1$, then the quantum number $l$ can take on only odd values. On the
contrary, the number $l$ can take on only even values if the system is in the
singlet state $S = 0$.
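
The selection rule can be restated as a small check. The sketch below makes two assumptions that go beyond the text: it tests the parity rule (66.13) only for the $m = 0$ angular functions, which are proportional to Legendre polynomials $P_l(\cos\theta)$ (inversion replaces $\cos\theta$ by $-\cos\theta$), and it encodes the symmetry of the spin part as a sign. The helper names are hypothetical.

```python
import numpy as np
from numpy.polynomial import legendre


def legendre_p(l, x):
    """Legendre polynomial P_l(x); the m = 0 angular function is proportional to it."""
    return legendre.legval(x, [0.0] * l + [1.0])


cos_theta = np.linspace(-0.9, 0.9, 7)

# Check psi(-r) = (-1)^l psi(r) for the m = 0 angular functions, l = 0..5
parity_ok = all(
    np.allclose(legendre_p(l, -cos_theta), (-1) ** l * legendre_p(l, cos_theta))
    for l in range(6)
)


def allowed_l(total_spin, l_max=5):
    """Values of l for which (spin part) x (coordinate part) is antisymmetric."""
    spin_sign = +1 if total_spin == 1 else -1   # triplet symmetric, singlet antisymmetric
    return [l for l in range(l_max + 1) if spin_sign * (-1) ** l == -1]


print(parity_ok, allowed_l(1), allowed_l(0))
```

For two identical fermions this reproduces the statement above: odd $l$ for the triplet and even $l$ for the singlet.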


§67. Exchange interaction and the concept of the chemical and strong nuclear 
interactions 


The identity of quantum particles leads to a fundamental change in the 
concept of the interaction between particles. 

We dwell first of all on a simple example which allows us to understand 
the essence of these changes. We assume that two identical particles with half- 
integer spin do not interact with each other in the classical sense of the word. 
This means that there are no terms in the Hamiltonian of the system describ- 
ing an interaction between the particles. 

Let one of the particles be in a given cell of phase space with the linear
dimension of the cell $\sim d$. The obvious relation

$$(\Delta q_x\,\Delta p_x)^3 \sim d^3(\Delta p_x)^3 \sim \hbar^3$$

holds, where $\Delta q_x \sim d$ is the uncertainty in the coordinate of the
particle, and $\Delta p_x \sim \hbar/d$ is the uncertainty in its momentum.

According to the Pauli principle, the second particle cannot be in the same
cell of phase space. Hence it must either be at a distance larger than $d$
from the first particle, or have a momentum $\mathbf p_2$ such that
$|\mathbf p_2 - \mathbf p_1|$ exceeds $\Delta p_x$, i.e.
$|\mathbf p_2 - \mathbf p_1| > \hbar/d$. Only under this condition can it
approach the first particle to within a distance smaller than $d$ and be in a
different cell of phase space.







We see that particles with parallel spins cannot approach each other unless 
they possess a sufficiently large difference of momentum. Such behaviour of 
the particles is equivalent to the appearance of a repulsive force between 
them. If the spins of the particles are antiparallel this reasoning is not valid, 
since the Pauli principle does not forbid such particles to be in the same cell 
of phase space. 

Thus from the Pauli principle, which imposes restrictions upon the states 
of particles, it follows that there exists an interaction between the particles 
which depends on the orientation of their spins. 

The interaction between bosons cannot be illustrated by such an obvious 
example. Nevertheless, it is clear that the requirement of the symmetrization 
of the wave function corresponds to a definite dependence of the energy of a 
system of particles on its total spin, i.e. it leads to an interaction between the 
particles. 

We now assume that there is a certain weak interaction between two
particles with spin $\tfrac{1}{2}$, described by the operator
$\hat H'(r_{12})$, where $r_{12}$ is the distance between the particles. For
clarity, and keeping in mind further applications, we assume that $\hat H'$
represents the Coulomb repulsion of two charges,
$\hat H'(r_{12}) = e^2/r_{12}$. Then the mean energy of interaction in the
first approximation is equal to

$$E^{(1)} = \sum\int\psi_0^{*}\,\hat H'\,\psi_0\,dV_1\,dV_2 . \tag{67.1}$$

Here $\psi_0$ is the normalized function of the unperturbed state, and the
summation is carried out over all values of the spin variables.

Since in the zero order approximation particles are assumed to be non- 
interacting, then the spin wave functions and the coordinate wave functions 
are separable. The latter can be written in terms of the symmetrized or anti- 
symmetrized products (66.10) and (66.11). 

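Substituting the value of the operator $\hat H'$ and the wave functions into
(67.1), we have

$$\begin{aligned}
E^{(1)} &= \frac{1}{2}\int\frac{e^2}{r_{12}}\,
  \bigl|\psi_{n_1}(1)\,\psi_{n_2}(2) \pm \psi_{n_1}(2)\,\psi_{n_2}(1)\bigr|^2\,dV_1\,dV_2 \\
 &= \int\frac{e^2}{r_{12}}\,|\psi_{n_1}(1)|^2\,|\psi_{n_2}(2)|^2\,dV_1\,dV_2
  \pm \int\frac{e^2}{r_{12}}\,\psi^{*}_{n_1}(1)\,\psi^{*}_{n_2}(2)\,
      \psi_{n_1}(2)\,\psi_{n_2}(1)\,dV_1\,dV_2 ,
\end{aligned}$$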








where the numerals 1 and 2 denote respectively the coordinates of the first
and second electrons, and $r_{12}$ is the distance between them. The signs $+$
and $-$ refer to the states of the particles which are respectively symmetric
and antisymmetric with respect to exchange of the particles. In this formula
the summation has been carried out over the spin variables, which gives unity.
Furthermore, we have made use of the obvious equality

$$\int\psi^{*}_{n_1}(1)\,\psi^{*}_{n_2}(2)\,\frac{e^2}{r_{12}}\,
    \psi_{n_1}(2)\,\psi_{n_2}(1)\,dV_1\,dV_2
  = \int\psi^{*}_{n_1}(2)\,\psi^{*}_{n_2}(1)\,\frac{e^2}{r_{12}}\,
    \psi_{n_1}(1)\,\psi_{n_2}(2)\,dV_1\,dV_2 .$$

In this equality one integral goes over into the other when the integration
variables 1 and 2 are interchanged.
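Introducing the notation

$$C = \int|\psi_{n_1}(1)|^2\,\frac{e^2}{r_{12}}\,|\psi_{n_2}(2)|^2\,dV_1\,dV_2
    = \int|\psi_{n_1}(2)|^2\,\frac{e^2}{r_{12}}\,|\psi_{n_2}(1)|^2\,dV_1\,dV_2 , \tag{67.2}$$

$$A = \int\psi^{*}_{n_1}(1)\,\psi^{*}_{n_2}(2)\,\frac{e^2}{r_{12}}\,
        \psi_{n_1}(2)\,\psi_{n_2}(1)\,dV_1\,dV_2
    = \int\psi^{*}_{n_1}(2)\,\psi^{*}_{n_2}(1)\,\frac{e^2}{r_{12}}\,
        \psi_{n_1}(1)\,\psi_{n_2}(2)\,dV_1\,dV_2 , \tag{67.3}$$

we write the energy of interaction (67.1) in the form

$$E^{(1)}_{\uparrow\downarrow} = C + A , \tag{67.4}$$

$$E^{(1)}_{\uparrow\uparrow} = C - A . \tag{67.5}$$

The sign $\uparrow\downarrow$ denotes antiparallel spins (spin singlet), while
the sign $\uparrow\uparrow$ denotes parallel spins (spin triplet).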
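
The structure of (67.4) and (67.5) can be illustrated with a toy calculation. The sketch below is not the Coulomb problem of the text: it assumes a one-dimensional grid on [0, 1], two "particle in a box" functions as the states $n_1$ and $n_2$, and a softened repulsion $1/\sqrt{(x_1 - x_2)^2 + a^2}$ in place of $e^2/r_{12}$ (in units with $e = 1$). All names and parameter values are illustrative.

```python
import numpy as np

n = 200
x = np.arange(1, n + 1) / (n + 1)   # interior grid points of [0, 1]
dx = 1.0 / (n + 1)


def orbital(k):
    """Assumed one-particle state sqrt(2) * sin(k*pi*x), orthonormal on this grid."""
    return np.sqrt(2.0) * np.sin(k * np.pi * x)


psi1, psi2 = orbital(1), orbital(2)

soft = 0.05                          # assumed softening length of the model repulsion
V = 1.0 / np.sqrt((x[:, None] - x[None, :]) ** 2 + soft ** 2)

# Analogues of the integrals (67.2) and (67.3) for real orbitals
C = np.einsum("i,ij,j->", psi1 ** 2, V, psi2 ** 2) * dx ** 2
A = np.einsum("i,ij,j->", psi1 * psi2, V, psi1 * psi2) * dx ** 2


def mean_interaction(sign):
    """Mean repulsion in the (anti)symmetrized coordinate function (66.11)/(66.10)."""
    phi = (np.outer(psi1, psi2) + sign * np.outer(psi2, psi1)) / np.sqrt(2.0)
    return np.sum(phi * V * phi) * dx ** 2


E_singlet = mean_interaction(+1)     # symmetric coordinate function, S = 0
E_triplet = mean_interaction(-1)     # antisymmetric coordinate function, S = 1

print(C, A, E_singlet - (C + A), E_triplet - (C - A))
```

In this model the exchange integral is positive, so the triplet combination, in which the two particles avoid each other, has the lower mean repulsion.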

It is clear from the derivation that the general form of formulae (67.4) and 
(67.5) obtained above is not specific, that is, it does not refer only to the 
case of the Coulomb interaction, but could be obtained for any interaction 
which depends on the coordinates of the particles. 
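
The structure $C \pm A$ can be checked numerically on a toy model. The sketch below is only an illustration under stated assumptions: the particles move in one dimension in a box of unit length, the orbitals are the two lowest box states, and the Coulomb law is replaced by a softened interaction $1/\sqrt{(x_1-x_2)^2+a^2}$ (none of these choices comes from the text). It evaluates $C$ and $A$ on a grid and compares them with the mean interaction energy computed directly in the symmetrized and antisymmetrized coordinate states.

```python
import math


def orbital(n, x, box=1.0):
    """Hypothetical one-particle state: n-th level of a 1D box of length `box`."""
    return math.sqrt(2.0 / box) * math.sin(n * math.pi * x / box)


def soft_coulomb(x1, x2, a=0.1):
    """Assumed model interaction; the softening length `a` avoids the 1D singularity."""
    return 1.0 / math.sqrt((x1 - x2) ** 2 + a * a)


def coulomb_and_exchange(n1=1, n2=2, points=80, box=1.0):
    """Return (C, A, E_sym, E_anti) from a midpoint-rule double integral."""
    h = box / points
    xs = [(i + 0.5) * h for i in range(points)]
    c = exch = e_sym = e_anti = 0.0
    for x1 in xs:
        for x2 in xs:
            v = soft_coulomb(x1, x2)
            direct = orbital(n1, x1, box) * orbital(n2, x2, box)
            swapped = orbital(n1, x2, box) * orbital(n2, x1, box)
            c += direct * direct * v          # integrand of (67.2)
            exch += direct * swapped * v      # integrand of (67.3), real orbitals
            e_sym += 0.5 * (direct + swapped) ** 2 * v
            e_anti += 0.5 * (direct - swapped) ** 2 * v
    w = h * h
    return c * w, exch * w, e_sym * w, e_anti * w


if __name__ == "__main__":
    C, A, E_sym, E_anti = coulomb_and_exchange()
    print("C =", C, " A =", A)
    print("E_sym - (C + A) =", E_sym - (C + A))
    print("E_anti - (C - A) =", E_anti - (C - A))
```

In this model the two differences printed at the end vanish to rounding accuracy, which mirrors the statement that the form of (67.4) and (67.5) does not depend on the particular law of interaction.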

It is interesting to compare this result with an analogous calculation for 
two different kinds of particles. We would then write for the non-symmetrized 
wave function $\psi = \psi_{n_1}(1)\psi_{n_2}(2)$ and would correspondingly obtain

$$E^{(1)} = \int |\psi_{n_1}(1)|^2 \, |\psi_{n_2}(2)|^2 \, \frac{e^2}{r_{12}} \, dV_1 \, dV_2. \tag{67.6}$$


Formula (67.6) has a simple meaning. It represents the mean value of the
energy of the Coulomb repulsion of two particles. The position of one of the
particles in the state $n_1$ is characterized by the probability density
$|\psi_{n_1}(1)|^2$, while the position of the other particle is characterized by
$|\psi_{n_2}(2)|^2$.

In formulae (67.4) and (67.5) the integral $C$ has a structure analogous to
(67.6) and is often called the Coulomb integral. However, strictly speaking, it
does not allow such an interpretation, since it cannot be indicated which one
of the identical particles is in the state $n_1$ and which one is in the state $n_2$.

The integral A, usually called the exchange integral (in German Austausch, 
which means exchange) has no classical analogues. Calculations carried out 
for actual systems show that the integrals C and A are always positive. It 
follows directly from formulae (67.4) and (67.5) that the correction to the 
mean energy, determined by the interaction of the particles, depends on the 
orientation of their spins. 

First of all we stress that it would be incorrect to assume that the inter- 
action is made up of two parts, a classical part and an exchange part, as is 
often done for the sake of clarity. The contribution to the energy determined 
by the Coulomb integral C is called the classical part of the interaction, while 
the corresponding contribution of the exchange integral A is called the ex- 
change part. In reality it is impossible to divide the interaction into the two 
parts, since the quantity A itself does not allow a classical interpretation. 

The most characteristic part of the exchange interaction is expressed by
the integral $A$ (67.3). This integral can be treated as the matrix element
corresponding to the transition of the first particle from the state $n_2$ into
the state $n_1$, and to the transition of the second particle from the state $n_1$
to the state $n_2$. Indeed, we introduce the operator $P_{12}$ defined by formula
(64.2), which carries out the exchange of particles, so that

$$P_{12}\,\psi_{n_1}(1)\psi_{n_2}(2) = \psi_{n_1}(2)\psi_{n_2}(1),$$

$$P_{12}\,\psi_{n_1}(2)\psi_{n_2}(1) = \psi_{n_1}(1)\psi_{n_2}(2).$$

Thus the operator $P_{12}$ is the exchange operator of the first and second
particles. By means of this operator the integral $A$ can be written as

$$A = \int \psi_{n_1}^{*}(1)\psi_{n_2}^{*}(2) \, H_{12} \, \psi_{n_1}(2)\psi_{n_2}(1) \, dV_1 \, dV_2 =$$

$$= \int \psi_{n_1}^{*}(1)\psi_{n_2}^{*}(2) \, H_{12} P_{12} \, \psi_{n_1}(1)\psi_{n_2}(2) \, dV_1 \, dV_2. \tag{67.7}$$













Thus the exchange interaction corresponds to the replacement of the
operator $H_{12}$ by the operator $H_{12}P_{12}$. The total energy of interaction can be
written by means of the exchange operator in the form

$$E' = C + A = \int \psi_{n_1}^{*}(1)\psi_{n_2}^{*}(2) \, (H_{12} + H_{12}P_{12}) \, \psi_{n_1}(1)\psi_{n_2}(2) \, dV_1 \, dV_2. \tag{67.7'}$$

We see that the identity of quantum-mechanical particles essentially
changes their interaction. If particles of different kinds possess an arbitrary
interaction characterized by the operator $H_{12}$, then in the case of identical
particles the operator of this interaction must be replaced by $H_{12} + H_{12}P_{12}$.
This inference does not depend on the nature of the interaction, i.e. on the
character of the operator $H_{12}$. Thus the electrical interaction of two identical
particles (for example, two positrons) differs from the electrical interaction
of different particles (for example, a positron and a proton).

Thus the fact that one has identical particles whose state is characterized 
by symmetrized wave functions leads to a very important general conse- 
quence: the state of the system turns out to be dependent on the total spin 
of the system. 

This fact is a quantitative expression of the qualitative considerations 
presented in the beginning of this section. 

The dependence of the energy of a system of particles on the total spin is 
equivalent to the statement of the existence of an interaction between the 
particles. This interaction is called the exchange interaction. 

The exchange interaction has a specific quantum character. This follows 
formally from the fact that in the classical limit the spin of the system reduces 
to zero (see §60). Hence in the transition to the classical limit any difference 
between states with different spin, in particular the difference in their 
energies, vanishes. 

It should be stressed that although up to now we have dealt with particles 
with half-integer spin, the quantitative conclusion is equally applicable to 
particles with integer spin, i.e. bosons. In a system of two bosons having spin 
zero not all states obtained as a result of the formal solution of the
corresponding Schrödinger equation are realized. Only states for which the wave
function is symmetric in the particles correspond to physical states of a sys- 
tem and to definite values of its energy. In the case of two bosons with spin 1 
the energy of the system also turns out to be dependent on the total spin. 

The results obtained for a system consisting of two particles, fermions or 
bosons, are directly applicable to the general case of a system with an 
arbitrary number of identical particles. 

Coming back to the example of the interaction of two electrons, we shall 







show that exchange forces allow the following obvious, although not rigorous,
interpretation. We assume that at the instant of time $t = 0$ the first electron
was in the state $n_1$ and the second in the state $n_2$. It should be stressed once
more that in reality such a formulation refers to the instant of time $t = 0$ and
that the reasoning to come serves only to clarify the effect of the exchange
interaction. Then the initial wave function has the form





$$\psi(0) = \frac{1}{\sqrt{2}} \left[\psi_s(0) + \psi_a(0)\right] = \psi_{n_1}(1)\psi_{n_2}(2).$$

The states described by the symmetric $\psi_s$ and antisymmetric $\psi_a$ wave
functions are stationary states with energies respectively

$$E_s = E + C + A, \qquad E_a = E + C - A.$$

Hence the dependence of the wave functions $\psi_s$ and $\psi_a$ on time is given by
the formulae

$$\psi_s = \psi_s(0) \, e^{-(i/\hbar)(E+C+A)t}, \qquad \psi_a = \psi_a(0) \, e^{-(i/\hbar)(E+C-A)t}.$$


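The total wave function $\psi(t)$ for $t > 0$ is their superposition and,
consequently, does not describe a stationary state:

$$\psi(t) = \frac{1}{\sqrt{2}} \left[\psi_s(t) + \psi_a(t)\right] =$$

$$= \frac{1}{2} \left\{ \left[\psi_{n_1}(1)\psi_{n_2}(2) + \psi_{n_1}(2)\psi_{n_2}(1)\right] e^{-(i/\hbar)(E+C+A)t} + \left[\psi_{n_1}(1)\psi_{n_2}(2) - \psi_{n_1}(2)\psi_{n_2}(1)\right] e^{-(i/\hbar)(E+C-A)t} \right\} =$$

$$= \left\{ \psi_{n_1}(1)\psi_{n_2}(2) \cos\frac{At}{\hbar} - i \, \psi_{n_1}(2)\psi_{n_2}(1) \sin\frac{At}{\hbar} \right\} e^{-(i/\hbar)(E+C)t}. \tag{67.8}$$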


Formula (67.8) shows that if at the instant of time $t = 0$ electron 1 was in the
state $n_1$ and electron 2 in the state $n_2$, then after the lapse of a time interval

$$\tau = \frac{\pi\hbar}{2A} \tag{67.9}$$

the electrons exchange states. The wave function

$$-i \, \psi_{n_1}(2)\psi_{n_2}(1) \, e^{-(i/\hbar)(E+C)t}$$

corresponds to the first electron being in the state $n_2$ and the second electron
in the state $n_1$. After the lapse of a time interval $2\tau$ they come back into the
initial states and so on. Thus the electrons exchange states with a period $\tau$.
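
The oscillation described by (67.8) and (67.9) can be illustrated by a few lines of code. The following is a minimal sketch, assuming SI units and an illustrative exchange integral of 1 eV (a value chosen only for the example, not taken from the text). It returns the weights $\cos^2(At/\hbar)$ and $\sin^2(At/\hbar)$ of the initial and exchanged configurations.

```python
import math

HBAR = 1.054571817e-34          # reduced Planck constant, J s
EV = 1.602176634e-19            # one electron-volt in joules


def exchange_time(a_joule):
    """Time tau = pi*hbar/(2A) of formula (67.9) for an exchange integral A."""
    return math.pi * HBAR / (2.0 * a_joule)


def occupation_probabilities(t, a_joule):
    """Weights of the initial and of the exchanged configuration in (67.8)."""
    phase = a_joule * t / HBAR
    return math.cos(phase) ** 2, math.sin(phase) ** 2


if __name__ == "__main__":
    A = 1.0 * EV                # assumed, illustrative value of the exchange integral
    tau = exchange_time(A)
    print("tau =", tau, "s")
    for factor in (0.0, 0.5, 1.0, 2.0):
        print(factor, occupation_probabilities(factor * tau, A))
```

For an exchange integral of the order of an electron-volt the exchange time comes out of the order of $10^{-15}$ s; at $t = \tau$ the whole weight is in the exchanged configuration, and at $t = 2\tau$ it has returned to the initial one.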

Such an exchange of states is often presented in a concrete way as follows. 
One of the electrons of a system, for example an atom, is emitted and is then 
absorbed by another atom. The latter, in its turn, emits an electron which is 
absorbed by the first atom. In the process of ‘emission’ and ‘capture’ of the 
electrons a change in the momentum of the corresponding atoms occurs. The 
change in the momentum of the atoms means that a certain interaction be- 
tween them exists. This schematic and obvious consideration of the exchange 
interaction justifies the term ‘exchange’. However, it should not be taken 
literally. 

This is seen particularly clearly from the following reasoning. Let the states
$n_1$ and $n_2$ correspond to the bound states of the electrons in two atoms. If we
tried to understand the process described above literally, in the classical sense,
then a contradiction would arise. Indeed, the electrons would not manage to
exchange states or be ‘emitted’ and ‘captured’ by the atoms, because for this
they would need to obtain from outside a certain energy in excess of their
binding energy in the atoms. In reality each of the two atoms between which
there is an exchange interaction is not in a state with definite energy. The
uncertainty in the energy of the system $\Delta E$ is of the order of magnitude of
$\Delta E \sim A$. It makes no sense to speak of the constancy of the energy during the
time interval $\tau$, the exchange time, which in order of magnitude is given by
$\Delta E \, \Delta t \sim A\tau \sim \hbar$. During the time $\tau$ the system is not in a state with
definite energy and momentum. The two electrons are in a state with the wave
function $\psi(t)$.

In this connection it is clear that it would be inadmissible to indicate the 
direction of the momentum of recoil of the atoms in the ‘transfer’ of the 
electrons and to try, proceeding from this, to determine the sign of the 
energy of interaction. Thus when speaking of the exchange of particles it 
should be kept in mind that this exchange has a virtual and not a real 
character. The word virtual means that only the initial and final states of the 
system have a direct meaning. 

In order that the exchange integral $A$ may have values different from zero
the wave functions $\psi_{n_1}$ and $\psi_{n_2}$ must overlap sufficiently, i.e. must both be
different from zero in one and the same region of space. If, on the contrary,
the wave functions $\psi_{n_1}$ and $\psi_{n_2}$ are different from zero only in different
regions of space, then the exchange integral reduces to zero. If, in particular,
$\psi_{n_1}$ and $\psi_{n_2}$ are the wave functions describing the bound states of electrons
in different atoms, then the exchange interaction is possible only when the
atoms are in direct contact. Further, let the wave functions correspond to two
bound states in an atom. For example, $n_1$ is the normal state, and $n_2$ is one
of the excited states. Then the value of the exchange integral decreases
extremely rapidly in the transition to higher excited states, when the states $n_1$
and $n_2$ possess substantially different energies. Finally, when one speaks of
the interaction of free particles which are described by plane waves, then the
exchange integral differs from zero only for particles having similar values
of the momenta. If, for example, the momenta of the particles differ
appreciably, and the energy of interaction changes relatively slowly with the
coordinates, then the integrand of $A$ contains the product of a smoothly varying
function and a rapidly oscillating function. The entire integral is then small.
Thus an appreciable exchange interaction can take place only for identical
particles which are in similar states, i.e. which are localized in a small region
of space or have similar values of the energy and momentum.

The following important consequence results from this property of the
exchange interaction: the exchange interaction possesses the property of
saturation, so that in a system of a large number $N$ of identical particles the
total energy of the exchange interaction is proportional to the number of
particles $N$. Indeed, two particles connected by the exchange interaction, for
example two electrons with antiparallel spins, cannot themselves interact with
a third particle.

If the ordinary energy of a pair interaction is proportional to the number
of pairs, i.e. to $\frac{1}{2}N(N-1)$, then not all pairs are involved in the exchange
interaction but only those which contain particles in ‘similar’ states (in the
sense specified above). Hence the total number of particles connected by the
exchange interaction is equal to the number of pairs made up of particles in
similar states. It is obvious that this number of pairs is equal to $\frac{1}{2}N$.

In conclusion we point out the following fact. The derivation of the 
formulae for the energy of interaction was carried out with the assumption 
that the interaction operator does not contain any quantities which depend 
on the spin of the particles. However, one can also arrive at the same results 
in the case where the interaction operator contains spin operators. 

The exchange interaction between identical particles plays a very important 
role in nature. 

It is sufficient to point out that an exchange character is possessed by 
the forces to which homopolar chemical binding is due, the interaction which 
is responsible for the formation of crystals, the phenomenon of ferromag- 
netism and, finally, the interaction between particles in atomic nuclei, i.e. 
nuclear forces. We shall come back to the problem of chemical binding in 
§79, and in the meantime we shall dwell briefly on the problem of nuclear 
forces. 




Up to now it has not been possible to construct a consistent theory of 
nuclear forces. The development of this theory is one of the major tasks of 
contemporary theoretical physics. At present the theory of nuclear forces has 
a semi-empirical character and is based on a number of experimental data. 
The totality of the available data has made it possible to establish the follow- 
ing properties of the nuclear interaction: 

1. Experiments on the scattering of neutrons on protons show that very
strong attractive forces exist between nuclear particles at distances from
$1 \times 10^{-13}$ cm to $2 \times 10^{-13}$ cm. These forces decrease very rapidly with
increasing distance and are not appreciable at distances larger than
$2 \times 10^{-13}$ cm. At very small distances, smaller than $1 \times 10^{-13}$ cm, the
attraction is replaced by a repulsion.

2. Nuclear forces turn out to be independent of the charge of the particles,
i.e. the nuclear forces acting between two protons, a neutron and a proton
and two neutrons are the same. The charge independence of nuclear forces
follows from direct experiments on the scattering of fast neutrons and
protons on protons, as well as from an analysis of the properties of the so-called
mirror nuclei. Mirror nuclei are nuclei differing from each other by the
interchange of neutrons and protons (nuclei with atomic numbers $Z$ and $A - Z$,
where $A$ is the mass number, are mirror nuclei).

The identity of neutrons and protons in nuclear interactions points to a 
profound symmetry existing between these particles. The inequality of the 
masses and the presence of an electric charge on the proton are facts of re- 
latively minor importance. Hence, according to the modern point of view, 
the proton and neutron should be considered as different charge states of one 
particle — the nucleon. 

The nucleon has spin $\frac{1}{2}$ and in a given charge state obeys the Pauli exclusion
principle. The nuclear interaction between nucleons is called the strong
interaction (see §112 and §130).

3. The nucleon can be in two different charge states, the proton state and 
the neutron state, between which transitions are possible. 

In free motion the proton state, which has a smaller mass and energy, is
more stable. Hence the free neutron decays according to the scheme

$$n \to p + e^{-} + \bar{\nu},$$

where $\bar{\nu}$ is the antineutrino.
In atomic nuclei, where there is a nuclear interaction between the particles,
the transformation of neutrons and protons into each other occurs (see
below).
4. The presence of a charge on the proton entails two consequences: (1)
the proton state and the neutron state are different states of the nucleon; (2)
besides the nuclear interaction there are the forces of Coulomb repulsion
between two protons. These forces become important in the case of heavy
nuclei, determining their instability.

5. The nuclear interaction depends not only on the distance but also on 
the mutual orientation of the spins of the interacting particles, as well as on 
the orientation of the spins with respect to the axis joining the two nucleons. 

The dependence of the nuclear interaction on the orientation of the spins 
follows directly from experiments on the scattering of very slow neutrons on 
orthohydrogen and parahydrogen. 

The existence of the dependence on the orientation of the spins with
respect to the axis follows from an analysis of the properties of the deuteron,
in particular from its possession of a quadrupole moment.

6. The nuclear interaction has exchange character. This fundamental con- 
clusion follows, first of all, from the very fact of the stability of nuclei. 

If the nuclear (strong) interaction depended only on the distance between
the particles, then the potential energy of a system with mass number $A$
would be proportional to $A^2$, i.e. to the number of pairs which attract each
other. However, the kinetic energy of a gas of Fermi particles confined in a
given volume increases with the number of particles, according to (79.4) of
Part III, as $A^{5/3}$.

Thus for a sufficiently large mass number the potential energy would turn 
out to be larger than the kinetic energy, the nucleus would have to contract 
and the particles would have to merge with each other. The volume of the 
nucleus would be a constant quantity which does not depend on A, and its 
binding energy would be proportional to $A^2$. In reality the data on scattering
show that the volume of the nucleus increases in proportion to A, and that the 
binding energy is also proportional to A. This means that nuclear forces 
possess the property of saturation. Saturation, as we have seen above, is a 
characteristic property of exchange forces. 

We shall dwell in somewhat more detail on the description of the modern 
concepts of the nature of nuclear forces. 

The following simple picture of nuclear forces results from the assumption 
that the proton and neutron are different states of one particle and from the 
exchange character of the nuclear interaction: between two nucleons at very
small distances there is a virtual exchange of a certain particle which is called 
the ‘carrier’ of the interaction. This exchange is in principle similar to the 
virtual exchange of the electron which was considered in detail above in the 
example of the exchange interaction. 







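It turned out (see below) that the particle responsible for the
nucleon-nucleon interaction is the $\pi$-meson.
Three types of exchange are possible:

$$p \rightleftharpoons n + \pi^{+},$$
$$n \rightleftharpoons p + \pi^{-},$$
$$p \rightleftharpoons p + \pi^{0},$$
$$n \rightleftharpoons n + \pi^{0}.$$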

In the first two virtual processes the nucleon goes over from the proton state
into the neutron state and vice versa; in the last virtual process the charge
state of the nucleon does not change. The process of exchange of the charged
meson can clearly be interpreted in the same way as the exchange of
electrons: each of the nucleons spends a part of its time in the charged state
and a part of its time in the neutral state. The exchange of $\pi$-mesons gives rise
to the attraction between nucleons. We have emphasized the virtual character
of the exchange, since an energy not less than $m_\pi c^2$, where $m_\pi$ is the mass of
the $\pi$-meson, would be necessary for the production of real $\pi$-mesons.

All $\pi$-mesons, positive, negative and neutral, should be considered as
different charge states of one particle.

Further, it turns out that the masses of $\pi$-mesons cannot be arbitrary.
They can be connected with the range of nuclear forces. Since nuclear forces
do not depend on the electric charge and have a purely quantum nature, the
range of the forces can depend only on the mass of the carrier particles $m_\pi$
and the universal constants $\hbar$ and $c$.

From the above three quantities one can construct only one constant with
the dimensionality of length: the Compton wavelength of the meson

$$\lambda_\pi = \frac{\hbar}{m_\pi c}. \tag{67.10}$$

On the basis of the following reasoning one can ascribe an obvious meaning
to expression (67.10). In the virtual exchange of the meson the energy of
each of the nucleons must have an uncertainty $\Delta E \sim m_\pi c^2$. The exchange
time $\tau$ must have an order of magnitude of $\tau \sim \hbar/m_\pi c^2$. If it is assumed that
the meson moves with a velocity $\sim c$, then during the time $\tau$ it traverses the
path

$$R \approx c\tau \approx \frac{\hbar}{m_\pi c}.$$

This distance is just the range of nuclear forces. Giving the range of nuclear
forces, one can find the mass of the carrier particles

$$m_\pi \approx \frac{\hbar}{Rc} \approx 300 \, m_{el},$$


where $m_{el}$ is the mass of the electron. This is of the same order of magnitude
as the mass of the $\pi$-meson. We note that $\pi$-mesons were discovered
experimentally after having been introduced in the theory as the hypothetical
particles responsible for the strong nuclear interaction. The most convincing
proof that $\pi$-mesons are the carriers of nuclear forces is the experimentally
established extremely strong interaction of $\pi$-mesons with nucleons.
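
The order-of-magnitude estimate of the carrier mass is easily reproduced numerically. The sketch below assumes SI values of the constants and takes the range of nuclear forces as $1.4 \times 10^{-13}$ cm, an illustrative value inside the interval of $1 \times 10^{-13}$ to $2 \times 10^{-13}$ cm quoted above; the function names are ours.

```python
HBAR = 1.054571817e-34        # reduced Planck constant, J s
C_LIGHT = 2.99792458e8        # velocity of light, m/s
M_ELECTRON = 9.1093837015e-31  # electron mass, kg


def carrier_mass_in_electron_masses(range_m):
    """Mass m = hbar/(R c) of the carrier particle, in units of the electron mass."""
    return HBAR / (range_m * C_LIGHT) / M_ELECTRON


def compton_wavelength(mass_kg):
    """Length hbar/(m c) of formula (67.10) for a particle of the given mass."""
    return HBAR / (mass_kg * C_LIGHT)


if __name__ == "__main__":
    for range_cm in (1.0e-13, 1.4e-13, 2.0e-13):   # assumed values of the range
        mass = carrier_mass_in_electron_masses(range_cm * 1.0e-2)
        print("R =", range_cm, "cm  ->  m =", round(mass), "electron masses")
```

Over the quoted interval of ranges the estimate varies between roughly 200 and 400 electron masses, so that only the order of magnitude of the result is significant.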

If the energy of a system of nucleons exceeds $m_\pi c^2$, then $\pi$-mesons can
really be produced. The appearance of $\pi$-mesons was observed in collisions
of fast nucleons (see §136) as well as in the action of $\gamma$-rays on nucleons. The
reaction


$$\gamma + p \to n + \pi^{+},$$


representing the elementary act of the nuclear photoeffect was observed. 
Later, in particular in §112, we shall come back to the problem of nuclear 
forces. 








Applications of Quantum Mechanics 
to Atomic and Nuclear Systems 


§68. The helium atom 


The hydrogen atom, which was considered in detail in ch. 4, is the simplest 
one-electron system. Passing to the study of many-electron systems it is 
natural to investigate, first of all, the properties of the helium atom, in which 
two electrons revolve around the nucleus. We assume that the nucleus has an 
infinitely large mass. Hence, assuming the nucleus to be at rest, we write the 
Hamiltonian of the system of two electrons in the form 


$$\left( -\frac{\hbar^2}{2m}\nabla_1^2 - \frac{\hbar^2}{2m}\nabla_2^2 - \frac{2e^2}{r_1} - \frac{2e^2}{r_2} + \frac{e^2}{r_{12}} \right) \psi = E\psi. \tag{68.1}$$

Here $\mathbf{r}_1$ and $\mathbf{r}_2$ are the radius vectors of the first and second electrons, and
$r_{12}$ is the distance between them. The third and fourth terms of (68.1)
express the potential energy of the electrons in the field of the nucleus, and the
last term expresses the energy of the Coulomb interaction between the
electrons.

It should be noted that a number of approximations are associated with
the representation of the Hamiltonian in such a form. Electrons possess a
magnetic moment whose interaction is of a more complex character than the
Coulomb interaction. Further, magnetic moments (the spin magnetic moment
and the orbital magnetic moment) also interact with each other. However, we
shall not study these effects, which have the character of small corrections, in
detail.

Since the Hamiltonian of the system does not contain any spin operators, 
the solution of eq. (68.1) must be sought in the form of a product of two 
functions, one of which depends only on the coordinates and the other only 
on the spin 


$$\psi = \Phi(\mathbf{r}_1, \mathbf{r}_2) \, \chi(s_{z_1}, s_{z_2}). \tag{68.2}$$


In §66 it was shown that the spin function of two electrons is symmetric 
with respect to the exchange of two particles if the total spin of the system 
is equal to one, and that it is antisymmetric if the total spin is equal to zero. 
Thus it is seen that the states of the helium atom are divided into two groups. 
States with zero spin are called parastates, and those with unity spin are 
called orthostates. If the Hamiltonian (68.1) described the system exactly, 
then the three orthostates differing in the z-component of the spin would 
have the same energy. However, the weak interaction between the spin mag- 
netic moment and the orbital magnetic moment removes the degeneracy and 
three close sub-levels arise. Thus the energy spectrum of helium consists of a 
set of singlet and triplet levels. 

From general considerations it is easy to establish the group of states to
which the ground state of helium belongs. As is known, the ground state of
helium is described by a wave function having no nodes (see §10). It is
obvious that this function cannot be an antisymmetric coordinate function,
since the latter reduces to zero for $\mathbf{r}_1 = \mathbf{r}_2$.

Indeed, if $\Phi(\mathbf{r}_1, \mathbf{r}_2)$ is an antisymmetric function of the two variables $\mathbf{r}_1$
and $\mathbf{r}_2$, then it satisfies the relation

$$\Phi(\mathbf{r}_1, \mathbf{r}_2) = -\Phi(\mathbf{r}_2, \mathbf{r}_1). \tag{68.3}$$

For $\mathbf{r}_1 = \mathbf{r}_2 = \mathbf{r}$ we have $\Phi(\mathbf{r}, \mathbf{r}) = 0$.

Thus we see that in the normal state the wave function is symmetric in the 
coordinates and, consequently, is antisymmetric in the spins. 

The normal state of helium is a parastate. 

In eq. (68.1) the variables are not separable and it is impossible to obtain 
the exact solution. Therefore a number of approximate methods have been 
devised for its solution. The application of perturbation theory makes it 
possible to obtain the wave functions and, in a rather rough approximation, 
the energy of the ground state of helium. 

We shall assume that the interaction between the electrons is the perturba- 
tion in eq. (68.1). 

Then in the zero order approximation (68.1) can be written in the form

$$\left( -\frac{\hbar^2}{2m}\nabla_1^2 - \frac{\hbar^2}{2m}\nabla_2^2 - \frac{2e^2}{r_1} - \frac{2e^2}{r_2} \right) \Phi_0 = E_0 \Phi_0. \tag{68.4}$$

This equation can be solved by the method of separation of variables. For
the normal state of the helium atom we have

$$\Phi_0 = \psi_1(\mathbf{r}_1)\psi_1(\mathbf{r}_2), \qquad E_0 = 2E_1,$$

where $E_1$ and $\psi_1$ denote respectively the energy and the wave function of
the normal state of a hydrogen-like atom with charge $Z = 2$. As we have
pointed out before, the function $\Phi_0$ is symmetric in the coordinates of the
electrons.
The energy level of the ground state in the first approximation of
perturbation theory is given by the formula

$$E = E_0 + E^{(1)},$$

where the quantity $E^{(1)}$ in accordance with formula (53.12) is determined by
the matrix element

$$E^{(1)} = \int |\psi_1(\mathbf{r}_1)|^2 \, |\psi_1(\mathbf{r}_2)|^2 \, \frac{e^2}{r_{12}} \, dV_1 \, dV_2. \tag{68.5}$$


In calculating higher energy levels of the helium atom the solution of the
unperturbed wave equation can be written in the form

$$\Phi_0 = \psi_n(\mathbf{r}_1)\psi_m(\mathbf{r}_2), \qquad E_0 = E_n + E_m. \tag{68.6}$$

Here $\psi_n$ and $\psi_m$ are the wave functions of the hydrogen-like atom in the $n$th
and $m$th states respectively. (We shall not take the degeneracies in the orbital
and magnetic quantum numbers further into account in obtaining a qualitative
picture.)
It is easily understood that the function

$$\Phi_0 = \psi_n(\mathbf{r}_2)\psi_m(\mathbf{r}_1), \qquad E_0 = E_n + E_m \tag{68.7}$$

will also be a solution of the unperturbed equation.
Thus a two-fold degeneracy arises. The two solutions (68.6) and (68.7)
differ from each other by the exchange of electrons. For further calculations
we shall need to have recourse to perturbation theory in the presence of
degeneracy (see §54).
The correction to the energy $E_0$ in the first approximation is determined
in this case from the condition that the determinant below be zero (see §54)

$$\begin{vmatrix} H'_{11} - E^{(1)} & H'_{12} \\ H'_{21} & H'_{22} - E^{(1)} \end{vmatrix} = 0.$$




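The quantities $H'_{11}$, $H'_{12}$, $H'_{21}$ and $H'_{22}$ represent the matrix elements

$$H'_{11} = \int \psi_n^{*}(\mathbf{r}_1)\psi_m^{*}(\mathbf{r}_2) \, \frac{e^2}{r_{12}} \, \psi_n(\mathbf{r}_1)\psi_m(\mathbf{r}_2) \, dV_1 \, dV_2,$$

$$H'_{22} = \int \psi_n^{*}(\mathbf{r}_2)\psi_m^{*}(\mathbf{r}_1) \, \frac{e^2}{r_{12}} \, \psi_n(\mathbf{r}_2)\psi_m(\mathbf{r}_1) \, dV_1 \, dV_2, \tag{68.8}$$

$$H'_{12} = \int \psi_n^{*}(\mathbf{r}_1)\psi_m^{*}(\mathbf{r}_2) \, \frac{e^2}{r_{12}} \, \psi_n(\mathbf{r}_2)\psi_m(\mathbf{r}_1) \, dV_1 \, dV_2,$$

$$H'_{21} = \int \psi_n^{*}(\mathbf{r}_2)\psi_m^{*}(\mathbf{r}_1) \, \frac{e^2}{r_{12}} \, \psi_n(\mathbf{r}_1)\psi_m(\mathbf{r}_2) \, dV_1 \, dV_2.$$

It is easily seen that $H'_{11} = H'_{22}$, $H'_{12} = H'_{21}$. Indeed, if $\mathbf{r}_1$ is replaced by $\mathbf{r}_2$
and $\mathbf{r}_2$ by $\mathbf{r}_1$ in the expression for the matrix element $H'_{11}$, then we obtain
exactly the expression $H'_{22}$. An analogous statement also holds for the matrix
elements $H'_{12}$ and $H'_{21}$. Evaluating the determinant, we get

$$(H'_{11} - E^{(1)})^2 - (H'_{12})^2 = 0.$$

Hence we find two values for the correction to the energy:

$$E_s^{(1)} = H'_{11} + H'_{12}, \tag{68.9}$$

$$E_a^{(1)} = H'_{11} - H'_{12}. \tag{68.10}$$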


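The expressions (68.9) and (68.10) are the same as the general formula
found in §67. To the two values of the energy (68.9) and (68.10) there
correspond two wave functions of the type
$\Phi = a\,\psi_n(\mathbf{r}_1)\psi_m(\mathbf{r}_2) + b\,\psi_n(\mathbf{r}_2)\psi_m(\mathbf{r}_1)$.
The coefficients $a$ and $b$ are determined from the equations

$$(H'_{11} - E^{(1)})\,a + H'_{12}\,b = 0, \qquad H'_{21}\,a + (H'_{22} - E^{(1)})\,b = 0.$$

For the value $E_s^{(1)}$ we obtain $a = b$, while in the case of $E_a^{(1)}$ we have
$a = -b$. Thus under the action of the perturbation the degeneracy is removed
and we obtain two different states

$$\Phi_s = a_1 \left[\psi_n(\mathbf{r}_1)\psi_m(\mathbf{r}_2) + \psi_n(\mathbf{r}_2)\psi_m(\mathbf{r}_1)\right],$$
$$\Phi_a = a_2 \left[\psi_n(\mathbf{r}_1)\psi_m(\mathbf{r}_2) - \psi_n(\mathbf{r}_2)\psi_m(\mathbf{r}_1)\right], \tag{68.11}$$

where $a_1$ and $a_2$ are the normalization constants.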

As is seen from formulae (68.11), the states $\Phi_s$ and $\Phi_a$ are respectively
symmetric and antisymmetric in the coordinates of the electrons. In
correspondence with the aforesaid, the function $\Phi_s$ describes the state of the
helium atom with spin zero, while the wave function $\Phi_a$ corresponds to the
state with total spin equal to one.

The wave function $\Phi_s$ corresponds to a parastate with zero spin and
higher energy $E_s^{(1)}$. Correspondingly, $\Phi_a$ corresponds to an orthostate with
unit spin and lower energy $E_a^{(1)}$.

The matrix elements $H'_{11} = H'_{22}$, as is seen from their definition, represent
the Coulomb integral, while $H'_{12} = H'_{21}$ represent the exchange integral. All
that was said in §67 applies to the system of two electrons in helium. For
example, if electron 1 is in the ground state, while electron 2 is in an
excited state, then after a lapse of time $\tau = \pi\hbar/2|H'_{12}|$ they exchange states.

The relations (68.9) and (68.10) still do not give a complete picture of 
the levels of the atom. Indeed, in the calculation carried out above we have 
not taken into account the degeneracy in the quantum number $l$ of the levels
of hydrogen-like atoms. The interaction between the electrons removes this 
degeneracy, and the levels turn out to be dependent not only on the principal 
quantum numbers but also on the orbital angular momentum. 

We note that the method of calculation given above does not give high 
accuracy. The energy of the ground level of the helium atom obtained ac- 
cording to the above theory differs by about 20% from the value found ex- 
perimentally. This strong disagreement is due to the choice of the perturba- 
tion, which is not sufficiently small. 
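
The size of the discrepancy can be made concrete with a short calculation. The sketch below works in atomic units (hartree) and rests on two inputs that are not derived in this section and are therefore assumptions of the example: the standard closed-form value $\frac{5}{8}Z$ hartree of the integral (68.5) for two electrons in the hydrogen-like ground state, and quoted reference values of the measured total energy ($-79.005$ eV) and first ionization energy ($24.587$ eV) of helium.

```python
HARTREE_EV = 27.211386          # one hartree in electron-volts

# Quoted reference values (assumptions of this example, not taken from the text)
HELIUM_TOTAL_ENERGY_EV = -79.005
HELIUM_IONIZATION_EV = 24.587


def helium_first_order(z=2):
    """Zero-order energy, first-order correction and their sum, in hartree."""
    e0 = -float(z * z)          # two electrons, each with energy -z**2/2
    e1 = 5.0 * z / 8.0          # assumed closed form of the integral (68.5)
    return e0, e1, e0 + e1


def ionization_energy_estimate(z=2):
    """Energy of the one-electron ion minus the first-order atomic energy."""
    ion_energy = -0.5 * z * z
    return ion_energy - helium_first_order(z)[2]


if __name__ == "__main__":
    e0, e1, total = helium_first_order()
    total_ev = total * HARTREE_EV
    ion_ev = ionization_energy_estimate() * HARTREE_EV
    print("total energy:", total_ev, "eV, reference", HELIUM_TOTAL_ENERGY_EV)
    print("ionization energy:", ion_ev, "eV, reference", HELIUM_IONIZATION_EV)
    print("relative error, total:",
          abs(total_ev - HELIUM_TOTAL_ENERGY_EV) / abs(HELIUM_TOTAL_ENERGY_EV))
    print("relative error, ionization:",
          abs(ion_ev - HELIUM_IONIZATION_EV) / HELIUM_IONIZATION_EV)
```

Under these assumptions the first-order total energy lies within several per cent of the reference value, while the ionization energy, being a small difference of two large quantities, is off by nearly 20%. The correction is also far from small: it amounts to almost a third of the zero-order energy.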


§69. The variational principle 


We have already seen that the two-electron problem, the helium atom, 
cannot be solved accurately and calls for the use of approximate methods. 

This applies even more to complex atoms containing many electrons. 
Despite the complexity of many-electron atoms, effective approximate 
methods of solution allow one to get a very detailed idea of their properties. 
The effective approximate methods are to a large degree connected with the 
extremal properties of the Schrödinger equation. Namely, it turns out that
the Schrödinger equation can be obtained from a variational principle. 

We introduce the functional 


$$J = \int \varphi^{*} H \varphi \, dV, \tag{69.1}$$

where the restriction

$$\int \varphi^{*} \varphi \, dV = 1 \tag{69.2}$$

is imposed upon the function $\varphi$. In other respects $\varphi$ remains an arbitrary
complex function having the same dimensionality as the eigenfunctions of
the operator $H$. The minimum value of the functional $J$ under the condition
(69.2) can be found by Lagrange’s method. In varying the complex function
it is possible to vary $\varphi$ and $\varphi^{*}$ independently. For concreteness, let us vary
$\varphi^{*}$. The variation of $\varphi$ leads to the same result.

Clearly we have

$$\delta \int \varphi^{*} H \varphi \, dV - E_0 \, \delta \int \varphi^{*} \varphi \, dV = 0,$$

where $E_0$ is the Lagrange multiplier. Hence we have

$$\int \delta\varphi^{*} \, (H - E_0) \, \varphi \, dV = 0, \tag{69.3}$$

or in view of the arbitrariness of $\delta\varphi^{*}$,

$$(H - E_0)\,\varphi = 0. \tag{69.4}$$

Thus, if $\varphi = \psi_0$, where $\psi_0$ is the normalized solution of the Schrödinger
equation corresponding to the eigenvalue $E_0$ of the operator $H$, then the
functional $J$ is equal to

$$J(\psi_0) = \int \psi_0^{*} H \psi_0 \, dV = E_0. \tag{69.5}$$


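We shall show that $E_0$ is the minimum eigenvalue of $H$, i.e. the energy of the
ground state. Let $\varphi = \psi_0 + \sum_k c_k \psi_k$. Then for $J$ we find

$$J = \int \Big(\psi_0 + \sum_k c_k \psi_k\Big)^{*} H \Big(\psi_0 + \sum_k c_k \psi_k\Big) dV = E_0 + \sum_k |c_k|^2 E_k > E_0. \tag{69.6}$$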
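
The inequality (69.6) becomes transparent if the normalization (69.2) of the trial function is kept track of explicitly. The following short derivation is a sketch under the assumption that the eigenfunctions $\psi_k$ of $H$ form a complete orthonormal set, with $E_0$ the lowest eigenvalue; the coefficient $c_0$ of the ground state is introduced here only for this purpose.

$$\varphi = \sum_{k \geq 0} c_k \psi_k, \qquad \int \varphi^{*}\varphi \, dV = \sum_{k \geq 0} |c_k|^2 = 1,$$

$$J(\varphi) = \int \varphi^{*} H \varphi \, dV = \sum_{k \geq 0} |c_k|^2 E_k = E_0 + \sum_{k \geq 1} |c_k|^2 \, (E_k - E_0) \geq E_0.$$

Since $E_k \geq E_0$ for all $k$, every term of the last sum is non-negative, and for a non-degenerate ground state the equality is attained only when all $c_k$ with $k \geq 1$ vanish, i.e. when $\varphi$ coincides with $\psi_0$ up to a phase factor.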


The wave functions of the excited stationary states $\psi_n$ must satisfy not only
condition (69.2) but also the condition of orthogonality

$$\int \psi_0^{*} \psi_n \, dV = 0. \tag{69.7}$$

They correspond to extrema but not the minimum of $J(\psi_n)$.

The variational properties of the Schrödinger equation are widely used for 
obtaining approximate solutions of it. Defining the form of a trial function 
on the basis of physical considerations or experimental data one seeks the 
minimum value of the integral $J(\psi)$.

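Let us consider as an example the harmonic oscillator. Choosing as the
trial function the normalized function $\psi = (2a/\pi)^{1/4} e^{-a x^2}$ we have

$$ J = \Bigl( \frac{2a}{\pi} \Bigr)^{1/2} \int e^{-a x^2} \Bigl( -\frac{\hbar^2}{2m} \frac{d^2}{dx^2} + \frac{m\omega^2 x^2}{2} \Bigr) e^{-a x^2} \, dx = \frac{\hbar^2 a}{2m} + \frac{m\omega^2}{8a} . $$

The condition of minimum gives $a = m\omega/2\hbar$. Hence

$$ J_{\min} = E_0 = \tfrac{1}{2} \hbar\omega $$

and

$$ \psi_{\min} = \psi_0 = (m\omega/\pi\hbar)^{1/4} \exp(-m\omega x^2/2\hbar) . $$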


The appropriate choice of the trial function $\psi$ led us to the exact values of
$E_0$ and $\psi_0$. Had we chosen another trial function, we would have
obtained different, although similar, values of $E_0$ and $\psi_0$. A shortcoming of
the variational method is that the size of its error cannot be predicted in advance.

Other examples of the use of the variational principle will be given in the 
next section. 
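
The oscillator example can be checked numerically. The following is a minimal sketch, assuming units in which $m = \omega = \hbar = 1$ and using the closed form of $J(a)$ for the Gaussian trial function; the function names and the golden-section search are illustrative choices, not part of the text.

```python
import math

def trial_energy(a, m=1.0, omega=1.0, hbar=1.0):
    """J(a) = hbar^2 a / (2m) + m omega^2 / (8a) for the Gaussian trial function."""
    return hbar**2 * a / (2.0 * m) + m * omega**2 / (8.0 * a)

def minimize_golden(f, lo, hi, tol=1e-10):
    """Golden-section search for the minimum of a unimodal function on [lo, hi]."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    c = hi - inv_phi * (hi - lo)
    d = lo + inv_phi * (hi - lo)
    while abs(hi - lo) > tol:
        if f(c) < f(d):
            hi = d          # minimum lies in [lo, d]
        else:
            lo = c          # minimum lies in [c, hi]
        c = hi - inv_phi * (hi - lo)
        d = lo + inv_phi * (hi - lo)
    return 0.5 * (lo + hi)

a_best = minimize_golden(trial_energy, 0.01, 10.0)
e_best = trial_energy(a_best)
print(a_best, e_best)
```

In these units the search should return a value of $a$ close to $1/2$ and an energy close to $1/2$, in agreement with $a = m\omega/2\hbar$ and $E_0 = \hbar\omega/2$.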


§70. The self-consistent field method (Hartree—Fock method) 


For the calculation of many-electron systems wide use is made of the self- 
consistent field method which we have already encountered (see §41 of 
Part IV). The idea of the method (often called the Hartree—Fock method) is 
as follows. In the zero order approximation all the electrons are assumed to 
move independently of each other in the field of the nucleus. By means of 
the wave functions of the zero order approximation one finds the charge 
density and the mean electrostatic field produced by all the electrons. 

In the next approximation each of the electrons is assumed to move in the 
field of the nucleus and the field produced by all the other electrons. The 
solution of the Schrodinger equation in this field gives the wave function in 
the first order approximation. Introducing the correction into the charge 
distribution and field distribution and solving the Schrodinger equation in 
the new field, one can find the correction of the second order approximation 
and so on. 

To obtain the Schrödinger equation in the self-consistent field method we 
shall make use of the variational principle. To abbreviate the notation we 
shall carry out the calculations for the example of a two-electron system (the 
helium atom), confining ourselves to the calculation of the ground state. 
Therefore we shall not take into account the requirement of symmetrization 
of the wave function of the system of electrons. This will be done somewhat 
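later. In the zero order approximation both electrons are described by identical
real wave functions $\psi$, and the wave function of the atom has the form

$$ \Psi = \psi_1 \psi_2 , \tag{70.1} $$

where $\psi_1 \equiv \psi(\mathbf{r}_1)$ and $\psi_2 \equiv \psi(\mathbf{r}_2)$.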


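The variational principle reads

$$ \delta \int \Psi (\hat{H} - E) \Psi \, dV = \delta \int \psi_1 \psi_2 (\hat{H} - E) \psi_1 \psi_2 \, dV_1 dV_2 = 4 \int \delta\psi_1 \, \psi_2 (\hat{H} - E) \psi_1 \psi_2 \, dV_1 dV_2 = 0 . \tag{70.2} $$

Hence

$$ \int \psi_2 (\hat{H} - E) \psi_1 \psi_2 \, dV_2 = 0 . \tag{70.3} $$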
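Substituting the value of $\hat{H}$ from (68.1) into (70.3), we obtain

$$ \Bigl[ -\frac{\hbar^2}{2m} \nabla_1^2 - \frac{2e^2}{r_1} + \int \psi_2^2 \frac{e^2}{r_{12}} \, dV_2 \Bigr] \psi_1 = E_1 \psi_1 . \tag{70.4} $$

Here the additional term in the potential energy has the simple meaning

$$ G(r_1) = \int \psi_2^2 \frac{e^2}{r_{12}} \, dV_2 = e \int \frac{\rho(r_2)}{r_{12}} \, dV_2 , \tag{70.5} $$

where $\rho(r_2)$ is the charge density produced by the second electron. The same
equation is obtained for $\psi_2$. The total energy of the atom turns out not to be
equal to twice the value of $E_1$ but is given by the formula

$$ E = 2E_1 - \int \frac{\rho_1 \rho_2}{r_{12}} \, dV_1 dV_2 . $$

Indeed, by definition

$$ E = \int \Psi^{*} \hat{H} \Psi \, dV = \int \Psi \Bigl[ \hat{H}_1 + \hat{H}_2 + \frac{e^2}{r_{12}} - G(r_1) - G(r_2) \Bigr] \Psi \, dV = 2E_1 - C . \tag{70.6} $$

Here

$$ E_1 = E_2 = \int \psi_1 \hat{H}_1 \psi_1 \, dV_1 , $$

where the operator $\hat{H}_1$ is equal to

$$ \hat{H}_1 = -\frac{\hbar^2}{2m} \nabla_1^2 - \frac{2e^2}{r_1} + G(r_1) . $$

The quantity $C$ is equal to

$$ C = \int \frac{\rho_1 \rho_2}{r_{12}} \, dV_1 dV_2 . $$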





It is obvious that $C$ represents the mean energy of the electrostatic interaction
between the electrons. To obtain the correct value of the energy $E$ it is
necessary to subtract $C$ from $2E_1$, since this quantity is involved in the
Hamiltonian of each of the electrons. In the case of a system of $N$ electrons
an analogous derivation gives for the $i$th electron in the $n_i$th quantum state

$$ \Bigl[ -\frac{\hbar^2}{2m} \nabla_i^2 + U(r_i) + \sum_{k \neq i} e^2 \int \frac{|\psi_{n_k}|^2 \, dV_k}{|\mathbf{r}_i - \mathbf{r}_k|} \Bigr] \psi_{n_i} = E_{n_i} \psi_{n_i} . \tag{70.7} $$

The structure of the general equation does not differ from that of equation
(70.4). The complexity of the Schrödinger equations in the self-consistent
field approximation is associated with the fact that the equation for $\psi_i$
involves the wave functions of all the other electrons. Therefore, even in the
simplest case of a two-electron system, one has to solve eq. (70.4) either by
numerical or by approximate methods; for example the variational method.
In this latter case it is natural to choose as the trial functions the hydrogen-like
functions for a certain effective nuclear charge. The value of this charge
is found from the condition of minimum of the integral (70.2). These calculations,
as well as a summary of the numerical solutions, can be found in
the book of Bethe and Salpeter*.

So far we have not taken into account the symmetry of the wave function.
It is clear, however, that from the theoretical point of view the symmetrization
of the wave function must be carried out from the very beginning of the
calculation. For example, if no account is taken of the symmetry of the
wave function no difference appears in the energy of orthohelium and parahelium.

The self-consistent field method taking account of the requirements of
symmetry of the wave function is called the Hartree-Fock method. In the
simplest case of two electrons all the preceding calculations can easily be
carried out for the symmetrized wave function

$$ \Psi(1,2) = \frac{1}{\sqrt{2}} \bigl[ \psi_1(1) \psi_2(2) \pm \psi_2(1) \psi_1(2) \bigr] . $$


Substituting this expression into (70.2) we have to vary the wave functions
$\psi_1$ and $\psi_2$ independently of each other.

Then instead of (70.3) we obtain

$$ \sum_{i=1}^{2} \int dV_1 \int dV_2 \, \delta\psi_i(1) \, \psi_k(2) (\hat{H} - E) \Psi(1,2) = 0 . \tag{70.8} $$


* H.A.Bethe and E.E.Salpeter, Quantum mechanics of one and two electron systems, 
Handbuch der Physik, vol. 35 (Springer, Berlin, 1957). 




In this case in (70.8) $i = 1$ for $k = 2$ and $i = 2$ for $k = 1$. In view of the
arbitrariness of $\delta\psi_1$ and $\delta\psi_2$ we arrive at two equations. On substituting the
total operator $\hat{H}$ from (68.1), these equations assume the form

$$ \Bigl[ -\frac{\hbar^2}{2m} \nabla_1^2 - \frac{2e^2}{r_1} - E + H_{22} + G_{22} \Bigr] \psi_1(r_1) \pm [ H_{12} + G_{12} ] \, \psi_2(r_1) = 0 , $$

$$ \Bigl[ -\frac{\hbar^2}{2m} \nabla_1^2 - \frac{2e^2}{r_1} - E + H_{11} + G_{11} \Bigr] \psi_2(r_1) \pm [ H_{12} + G_{12} ] \, \psi_1(r_1) = 0 , \tag{70.9} $$

where

$$ G_{ik}(r_1) = \int \psi_i \frac{e^2}{r_{12}} \psi_k \, dV_2 , \qquad H_{ik} = \int \psi_i \Bigl( -\frac{\hbar^2}{2m} \nabla^2 - \frac{2e^2}{r} \Bigr) \psi_k \, dV . $$

By taking into account the symmetry of the wave function the number of
unknown wave functions is doubled, and a system of simultaneous equations
is obtained. The main difference between the Hartree-Fock equations and
the Hartree equations consists in the appearance of exchange integrals, i.e.
terms of the form $G_{12}$.

In the general case of many-electron atoms the wave function of the sys- 
tem of electrons which is to be substituted into the equation of the varia- 
tional principle must be written in the form (65.6). We shall not give the 
cumbersome equations which are then obtained. Although in solving the 
Hartree—Fock equations numerically one has to carry out very laborious 
calculations, it is possible to find with a high degree of accuracy the energy 
of the ground and excited states, and the distribution of the charge and of 
the field for helium, as well as for a number of other atoms and ions. 
Naturally the number of numerical calculations necessary in integrating the
Hartree—Fock equations increases rapidly with increasing number of electrons. 
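
The variational treatment of helium with hydrogen-like trial functions, mentioned above in connection with eq. (70.4), can be illustrated with a short sketch. It assumes both electrons occupy a 1s orbital with an effective charge `z_eff`, works in atomic units (energies in units of $e^2/a$), and uses the standard textbook expressions for the three mean-energy terms; these expressions and all names in the code are assumptions brought in for illustration and are not taken from this section.

```python
def helium_trial_energy(z_eff, z_nuc=2.0):
    """Mean energy (atomic units) of two 1s electrons with effective charge z_eff.

    kinetic energy:            z_eff**2
    attraction to the nucleus: -2 * z_nuc * z_eff
    electron repulsion:        (5/8) * z_eff
    """
    return z_eff**2 - 2.0 * z_nuc * z_eff + 0.625 * z_eff

def optimal_charge(z_nuc=2.0):
    """Minimum of the quadratic above, from dE/dz_eff = 0."""
    return z_nuc - 5.0 / 16.0

z_best = optimal_charge()
e_best = helium_trial_energy(z_best)
print(z_best, e_best)
```

Under these assumptions the minimum lies at an effective charge of 27/16, so each electron partly screens the nucleus from the other, and the energy at the minimum is lower than the value obtained with the unscreened charge $Z = 2$.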


§71. The statistical model of the atom 


For heavy atoms, when the calculation of the many-electron system ac- 
cording to the Hartree—Fock method becomes very time-consuming, a statis- 
tical method is widely adopted. 

Let a system of a large number of electrons move in a spherically symmetric
field $\varphi(r)$. By virtue of the Pauli principle a large fraction of these
electrons will be in states with large quantum numbers. If the potential $\varphi(r)$
changes sufficiently slowly in space, then the electrons can be considered in
the quasi-classical approximation. If, furthermore, the interaction between
the electrons is sufficiently weak, then the whole set of electrons can be
considered to be an ideal Fermi gas at absolute zero.

In a degenerate Fermi gas (see §79 of Part III) the electrons occupy
quantum states in pairs, so that there is a phase-space cell of volume $(2\pi\hbar)^3$
per pair. In this case, all cells in momentum space with a momentum lying
in the interval $0 < p < p_{\max}$ are filled. The value $p_{\max}$ is easily expressed in
terms of the electron gas density $n$ (i.e. in terms of the mean number of
electrons per unit volume). The number of electrons per unit volume with a
given value of the momentum is evidently equal to

$$ dn = 2 \, \frac{4\pi p^2 \, dp}{(2\pi\hbar)^3} . $$

Integrating from $p = 0$ to $p = p_{\max}$ we have

$$ \frac{8\pi}{3} \, p_{\max}^3 = (2\pi\hbar)^3 n . \tag{71.1} $$

This formula allows the charge density to be expressed in terms of the
momentum

$$ \rho = en = \frac{8\pi e}{3 (2\pi\hbar)^3} \, p_{\max}^3 . \tag{71.2} $$

On the other hand, $p_{\max}$ can be related to the potential by means of the
following simple reasoning. The energy of an electron bound in an atom, $E$,
is always negative, i.e.

$$ E = \frac{p^2}{2m} + e\varphi(r) < 0 . $$

We assume that the potential $\varphi(r)$ reduces to zero outside the atom. Hence
for the maximum momentum compatible with the requirement $E = 0$ we
find

$$ p_{\max} = [-2me\varphi(r)]^{1/2} . \tag{71.3} $$

Hence the electron charge density is connected with the potential by the
relation

$$ \rho = \frac{8\pi e (-2me)^{3/2} \varphi^{3/2}}{3 (2\pi\hbar)^3} . \tag{71.4} $$











In the self-consistent field approximation one can write for the potential of
the electrostatic field $\varphi(r)$ the Poisson equation

$$ \nabla^2 \varphi = -4\pi\rho , $$

or, taking into account the spherical symmetry of the atom,

$$ \frac{1}{r} \frac{d^2 (r\varphi)}{dr^2} = -\frac{32\pi^2 e (-2me)^{3/2} \varphi^{3/2}}{3 (2\pi\hbar)^3} = -\frac{4e (-2me)^{3/2} \varphi^{3/2}}{3\pi\hbar^3} . \tag{71.5} $$





The equation obtained is called the Thomas-Fermi equation. To obtain the
distribution of the potential $\varphi(r)$ it is necessary to supplement this equation
with boundary conditions. Let us first consider the case of neutral atoms.
Then one of the boundary conditions is $\varphi \to 0$ as $r \to \infty$. The second condition
follows from the requirement that near the nucleus, where its charge is not
screened by electrons, the field be a purely Coulomb field, i.e. that

$$ \varphi \to \frac{Z|e|}{r} \quad \text{as } r \to 0 . \tag{71.6} $$


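To obtain the solution of eq. (71.5) with boundary conditions (71.3) and
(71.6) it is convenient to pass to dimensionless quantities, defining them by
the relations

$$ \chi = \frac{r\varphi}{Z|e|} , \qquad x = \frac{r}{d} , $$

where $d$ is a constant quantity with the dimensionality of length. For $\chi$ we
find the equation

$$ \frac{d^2\chi}{dx^2} = \frac{4 |e|^3 (2m)^{3/2} Z^{1/2} d^{3/2}}{3\pi\hbar^3} \, \frac{\chi^{3/2}}{x^{1/2}} . $$

Setting $d$ equal to

$$ d = \frac{1}{2} \Bigl( \frac{9\pi^2}{16} \Bigr)^{1/3} \frac{\hbar^2}{m e^2 Z^{1/3}} = 0.88 \, \frac{a}{Z^{1/3}} , \tag{71.7} $$

where $a$ is the radius of the Bohr orbit, we arrive at the equation

$$ \frac{d^2\chi}{dx^2} = \frac{\chi^{3/2}}{x^{1/2}} . \tag{71.8} $$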





In this case it is obvious that

$$ \chi \to 1 \ \text{ as } x \to 0 , \qquad \chi \to 0 \ \text{ as } x \to \infty . \tag{71.9} $$

The integration of eq. (71.8) with boundary conditions (71.9) has been 
carried out numerically. Since the boundary value problem does not depend 
on the atomic number, the integration of this system allows one to find the 
universal distribution of the dimensionless potential in an atom. 
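
One way such a numerical integration can be organized is a shooting method on the unknown initial slope $\chi'(0)$. The sketch below is only an illustration under stated assumptions: it substitutes $t = \sqrt{x}$ to remove the $x^{-1/2}$ singularity at the origin, integrates with a fixed-step fourth-order Runge-Kutta scheme, and bisects on the slope. The function names, the step size and the integration range are arbitrary choices and do not describe how the published tables were computed.

```python
def shoot(slope, t_max=6.0, h=0.005):
    """Integrate chi'' = chi**1.5 / sqrt(x) outward from x = 0 with chi(0) = 1.

    With t = sqrt(x), u = chi and v = d(chi)/dx the equation becomes the
    regular system  du/dt = 2 t v,  dv/dt = 2 u**1.5.
    Returns -1 if chi turns negative (slope too steep),
            +1 if chi starts to grow (slope too shallow),
             0 if neither happens before t_max.
    """
    def rhs(t, u, v):
        # max() guards against a fractional power of a slightly negative u
        return 2.0 * t * v, 2.0 * max(u, 0.0) ** 1.5

    t, u, v = 0.0, 1.0, slope
    while t < t_max:
        k1u, k1v = rhs(t, u, v)
        k2u, k2v = rhs(t + h / 2, u + h / 2 * k1u, v + h / 2 * k1v)
        k3u, k3v = rhs(t + h / 2, u + h / 2 * k2u, v + h / 2 * k2v)
        k4u, k4v = rhs(t + h, u + h * k3u, v + h * k3v)
        u += h / 6 * (k1u + 2 * k2u + 2 * k3u + k4u)
        v += h / 6 * (k1v + 2 * k2v + 2 * k3v + k4v)
        t += h
        if u < 0.0:
            return -1
        if v > 0.0:
            return +1
    return 0

def initial_slope(lo=-2.0, hi=-1.0, iterations=50):
    """Bisect on chi'(0) between a too-steep and a too-shallow starting slope."""
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if shoot(mid) < 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

print(initial_slope())
```

Because the boundary value problem (71.8), (71.9) contains no parameters, the slope found in this way is a pure number, and the same curve $\chi(x)$ serves for every neutral atom.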


Fig. V.19 (plot of $\chi(x)$ against $x$)


The behaviour of the function $\chi(x)$ for an atom is shown in fig. V.19 by
the dotted line*. Since the function $\chi(x)$ for $x \to \infty$ only reduces to zero
asymptotically, the potential, and also the electron density, nowhere reduces
to zero. This means that in the approximation considered a finite value of the
atomic radius cannot be found.

In fig. V.20 the curve of the radial electron density $D = 4\pi r^2 \rho(r)$ for the
argon atom according to Thomas-Fermi (solid line) is compared with the
result of the Hartree-Fock method (dotted line).

Fig. V.20 illustrates in an obvious way the merits and shortcomings of the
Thomas-Fermi method. It does not give all the details of the behaviour of the
electron density inside the atom, but it makes it possible to establish its
general trend sufficiently accurately.


* The tabulated values of the function $\chi(x)$ can be found in the following books:
L.D.Landau and E.M.Lifshitz, Quantum mechanics (Pergamon Press, Oxford, 1965) and 
P.Gombas, Die statistische Theorie des Atoms und Ihre Anwendungen (Springer Verlag, 
Wien, 1949). 







In the outer parts of the atom, at a large distance from the nucleus, the 
electron density as calculated by the Thomas—Fermi method is overestimated. 

The fact that the Thomas—Fermi method gives poor results for the peri- 
pheral regions of the atom follows from the conditions of its applicability 
(see below). The numerical calculation of the behaviour of the electron 
density with distance from the nucleus shows that one half of the total electron
charge is contained in a sphere of radius $R \approx 1.33 \, a Z^{-1/3}$.

Therefore, qualitatively, the quantity R can be considered to be the effec- 
tive radius of the atom. It decreases with increasing Z. 





Fig. V.20 


The total energy of all electrons in the atom is of the order of magnitude
of the mean electrostatic energy of one electron $Ze^2/R \sim Z^{4/3} e^2/a$ multiplied
by their total number $Z$, i.e. is of the order of $e^2 Z^{7/3}/a$. These mean values,
as well as all quantities referring to the properties of the inner regions of 
atoms, for example the structure of X-ray levels, are in good agreement with 
experimental data. 

On the contrary, quantities which depend on the properties of the peri- 
pheral electrons, for example the ionization potentials of the atoms, cannot 
be determined satisfactorily by the Thomas—Fermi method. At the periphery 
of the atom the electron density is insufficiently large for the electrons to be 
considered as a degenerate electron gas. 

The main merit of the Thomas-Fermi method is its simplicity. As an example
we present an important result which also follows from calculations
by the Hartree-Fock method, but which in that case requires very cumbersome
calculations. The question is that of finding those values of the atomic
number $Z$ for which states with a given value of the orbital angular momentum
begin to be occupied. If the electron moves with angular momentum $l$
in the self-consistent field $\varphi(r)$, then its effective potential energy can be
given by formula (35.18). In the quasi-classical approximation $l(l+1)$ can be
replaced by $(l+\tfrac{1}{2})^2$. We then have

$$ U_{\text{eff}} = -|e| \varphi(r) + \frac{\hbar^2}{2m} \frac{(l+\tfrac{1}{2})^2}{r^2} , $$

where $\varphi(r)$ is the potential found from the Thomas-Fermi equation. Since the
total energy $E$ is always negative, the total potential energy must be essentially
negative, $U_{\text{eff}} < 0$, or

$$ |e| \varphi(r) \, r^2 > \frac{\hbar^2}{2m} (l+\tfrac{1}{2})^2 . \tag{71.10} $$

Passing to the dimensionless quantities $\chi$ and $x$, we have instead of (71.10)

$$ Z^{2/3} x \chi(x) > \Bigl( \frac{4}{3\pi} \Bigr)^{2/3} (l+\tfrac{1}{2})^2 . \tag{71.11} $$


From fig. V.19 it is seen that the quantity $x\chi(x)$ is bounded and has a gently
sloping maximum. For large $x$ the potential $\chi(x)$ decreases more rapidly than
$x^{-1}$; for $x \to 0$, $x\chi(x)$ is also equal to zero. Hence inequality (71.11) for given
$l$ is fulfilled only for a sufficiently large value of $Z$. This means that the curve
$U_{\text{eff}}$ lies entirely above the abscissa for a sufficiently small $Z$ and goes below
the axis for a sufficiently large $Z$. There cannot be states with $U_{\text{eff}} > 0$.
Hence the limit of realizable states is determined by the condition of the
curve $U_{\text{eff}}$ being tangent to the abscissa, i.e. by the fulfillment of the conditions

$$ U_{\text{eff}} = 0 , \qquad \frac{dU_{\text{eff}}}{dr} = 0 , \tag{71.12} $$

or

$$ Z^{2/3} x \chi(x) = \Bigl( \frac{4}{3\pi} \Bigr)^{2/3} (l+\tfrac{1}{2})^2 , $$

$$ Z^{2/3} x \, [ \chi(x) - x \chi'(x) ] = 2 \Bigl( \frac{4}{3\pi} \Bigr)^{2/3} (l+\tfrac{1}{2})^2 . $$

To each value of $l$ there corresponds a certain critical value of nuclear charge
$Z_{\text{crit}}$ for which the conditions (71.12) are fulfilled.

It is easy to eliminate $\chi'$ and $x$ from these equations, after which we find
the relation between $l$ and $Z_{\text{crit}}$

$$ Z_{\text{crit}} = 0.155 \, (2l+1)^3 . \tag{71.13} $$





Setting $l = 1, 2, 3, \ldots$ in this formula and rounding off the result to the
closest integer, we find the values $Z_{\text{crit}}$ for which the states with the above
angular momenta begin to be occupied. These values are respectively

$$ Z_{\text{crit}} = 5, \ 21, \ 58, \ 124 . \tag{71.14} $$


In §73 it will be shown that this result is of great importance for understand- 
ing the properties of complex atoms. 
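
Formula (71.13) is simple enough to evaluate directly. The sketch below does so for $l = 1$ to $4$; the function name `z_critical` is illustrative. With the coefficient 0.155 the rounded results come out somewhat below the values listed in (71.14). The listed values are reproduced when the coefficient is taken as 0.17, which is treated here as an empirically adjusted coefficient; that second coefficient is an assumption of this sketch and is not stated in the text.

```python
def z_critical(l, coefficient=0.155):
    """Critical nuclear charge at which states with orbital momentum l first appear."""
    return coefficient * (2 * l + 1) ** 3

for l in (1, 2, 3, 4):
    from_formula = round(z_critical(l))                    # coefficient of (71.13)
    adjusted = round(z_critical(l, coefficient=0.17))      # assumed empirical coefficient
    print(l, from_formula, adjusted)
```

The two columns differ by roughly ten per cent, which gives an idea of the accuracy of the statistical model for this quantity.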

Another important application of the Thomas-Fermi method is the study
of the properties of positive ions. In this case it is to be expected that because
of the predominance of the nuclear charge the electron shell will be compressed
and the electron density will decrease so rapidly that one can introduce
a finite radius of the electron shell, $R^*$. Outside the ion, for $r > R^*$,
an electric field with the potential

$$ \varphi = \frac{Z|e|(1-\sigma)}{r} , \qquad r > R^* , $$

must exist, where the quantity $\sigma$ = |charge of the shell|/charge of the nucleus
is called the degree of ionization.

For $r = R^*$ the potential is equal to

$$ \varphi_0 = \frac{Z|e|(1-\sigma)}{R^*} . $$

Correspondingly, the energy of an electron at the surface of the ion is equal
to $e\varphi_0$.

The condition that the electron be bound in the ion assumes the form

$$ E = \frac{p^2}{2m} + e\varphi < e\varphi_0 $$

instead of $E < 0$ for the neutral atom. Correspondingly, the maximum momentum
$p_{\max}$ is equal to

$$ p_{\max} = [2me(\varphi_0 - \varphi)]^{1/2} $$

and eq. (71.5) for the ion assumes the form

$$ \frac{1}{r} \frac{d^2 (r\varphi)}{dr^2} = -\frac{4e \, [2me(\varphi_0 - \varphi)]^{3/2}}{3\pi\hbar^3} . $$

Its integration taking into account the boundary condition at the surface
of the ion $\varphi = \varphi_0$ and condition (71.6) may be carried out numerically, as was
done for the atom. The curve $\chi(x)$ for the rubidium ion is shown in fig. V.19
by the solid line. The curve $\chi(x)$ intersects the abscissa at the point $x^* = R^*/d$,
where $R^*$ is determined by the condition

$$ 4\pi \int_0^{R^*} \rho \, r^2 \, dr = -Z|e|\sigma . $$

Fig. V.21 shows the radial electron density distribution for the ion Rb$^+$
calculated by the Thomas—Fermi method and according to Hartree (dotted 
line). We see that in the peripheral part of the ion the agreement between 
the curves is better than for the atom. 





Fig. V.21 


The limits of applicability of the Thomas-Fermi method are closely related
to those of the quasi-classical approximation.
Upon substituting the expressions

$$ U = e\varphi \approx -\frac{Ze^2}{r} , \qquad p \sim p_{\max} = (-2me\varphi)^{1/2} $$

into formula (39.23) the criterion of applicability of the Thomas-Fermi
method is given by the condition

$$ r \gg \frac{\hbar^2}{Ze^2 m} = \frac{a}{Z} . $$

At large distances $r \sim a$ the quasi-classical approximation is invalid. Thus the
Thomas-Fermi method is useful for $r$ in the interval

$$ a/Z \ll r \ll a . \tag{71.15} $$




§72. The quantum numbers characterizing the states of electrons in atoms 

We now turn to a discussion of the properties of many-electron atoms. 
It is obvious that in such an atom (a system consisting of a nucleus and several 
electrons) the laws of conservation of total energy, total angular momentum, 
and the component of the angular momentum along an arbitrary axis must be 
fulfilled. By analogy with the theory of the hydrogen atom one can introduce 
quantum numbers defining the values of the conserved quantities. At first 
sight it seems that the quantum numbers must characterize the system as a 
whole, since, generally speaking, neither the energy nor the angular momen- 
tum of an individual electron is conserved. However, the self-consistent field 
method allows one to consider electrons as independent particles (the wave 
function of the system is the product of the wave functions of individual 
particles) in an external field. Each of the electrons moves in the self-con- 
sistent spherically symmetric field of the nucleus and of other electrons. 
Since for motion in a spherically symmetric field the energy, the angular 
momentum and a component of it along an arbitrary axis are conserved, then 
not only the atom as a whole but also an individual electron can be 
characterized by quantum numbers $n$, $l$, $m$. The self-consistent field of the
atom is not a Coulomb field, hence the energy levels will depend on $l$ as well
as on n. The energy of the electron does not depend on the orientation of 
the angular momentum in space and, consequently, cannot depend on the 
quantum number m. 

Thus we see that in order to characterize the state of an atom it is 
necessary to indicate the state of each atomic electron. States with angular
momentum $l = 0, 1, 2, 3, \ldots$ are denoted respectively by s, p, d, f and so on.
The principal quantum number is indicated in the form of a number standing
in front of the letter. For example, the notation 5f indicates that in the
given state the electron is characterized by the quantum number $n = 5$ and
has orbital angular momentum $l = 3$. If several electrons are in a state with
the same numbers $n$ and $l$, then for simplicity their number is indicated in
the form of a superscript. For example, the normal state of nitrogen is
characterized by $1s^2 2s^2 2p^3$. This means that two electrons have the quantum
numbers $n = 1$, $l = 0$; two other electrons are in the state $n = 2$, $l = 0$ and,
finally, three electrons are in the 2p-state. However, this information is in- 
sufficient for the complete description of the state of the atom, because it 
does not tell us how the orbital and spin angular momenta of the individual 
electrons are combined and what the total angular momentum of the atom is. 

We have already mentioned that the total angular momentum of the atom
is conserved in time and that because of this it can characterize stationary
states. Furthermore, it can be assumed (neglecting the weak spin-orbit
interaction) that the total spin and the total orbital angular momentum of the
system are separately conserved. It is just these three quantities which are
chosen to characterize the system as a whole. The symbols for the states of
atoms with different total orbital angular momenta $L$ are introduced by analogy
with the symbols for individual electrons. Namely, for $L = 0$, $L = 1$, $L = 2$,
$L = 3$ and so on the states are called respectively S-state, P-state, D-state,
F-state and so on. The value of the total angular momentum $J$ is indicated in
the form of a subscript on the right of the symbol of the orbital angular
momentum. For example, $P_{3/2}$ means that the atom is in a state with orbital
angular momentum $L = 1$ and total angular momentum $J = 3/2$. Usually a
quantity equal to $2S+1$, where $S$ is the total spin of the atom, is also indicated.
The value of $2S+1$ is indicated in the form of a superscript on the left of $L$.
The quantity $2S+1$ for $L > S$ shows the number of close levels of the atom
constituting its fine structure. Indeed, from the rule of addition of angular
momenta it follows that if $L > S$, then only $2S+1$ different states may arise
when the orbital angular momentum and the spin angular momentum are
combined to obtain the total angular momentum of the system.

It turns out that these 2S+1 states have close but different energies. In 
other words, 2S+1 levels form a multiplet. The difference in the energies of 
the components of the multiplet is associated with the so-called spin—orbit 
interaction. This is the interaction dependent on the mutual orientation of 
the orbital and spin angular momentum vectors. 

In §118 it will be shown that the relativistic equation for the electron 
(the Dirac equation) enables one to calculate the spin—orbit interaction. 

If, however, one allows for the very fact of the existence of this interaction,
i.e. if one assumes that the orientations of the orbital and spin angular
momentum vectors are not independent, then the law of interaction can be
established from very general considerations. The spin-orbit interaction
operator must be a scalar made up of the vectors $\mathbf{L}$ and $\mathbf{S}$. The only scalar
combination is the quantity $\mathbf{L} \cdot \mathbf{S}$. For the mean energy of the spin-orbit
interaction we then obtain

$$ E_{S\text{-}L} = A \, \overline{\mathbf{L} \cdot \mathbf{S}} , $$

where the coefficient $A$ can be either positive or negative. The mean value
of $\mathbf{L} \cdot \mathbf{S}$ will be calculated in §74. Formula (74.4) gives

$$ E_{S\text{-}L} = A' \, [ J(J+1) - L(L+1) - S(S+1) ] . $$


For different levels belonging to a given multiplet the quantities $L$ and $S$ do
not change. Hence for the multiplet splitting we obtain

$$ \Delta E_{S\text{-}L} = A' J(J+1) . $$

The spacing between the neighbouring components of a multiplet is

$$ \Delta E = E_{S\text{-}L}^{(J)} - E_{S\text{-}L}^{(J-1)} = 2A' J . $$

If $A' > 0$, then the lowest level in the multiplet is the level with the lowest
possible value of $J$ (i.e. $J = |L - S|$). These are the so-called normal multiplets.
This case is realized for those atoms whose open shell is less than half filled.
Otherwise it turns out that $A' < 0$. These are the so-called inverted multiplets,
in which the lowest level has the largest total angular momentum $J$ (i.e.
$J = L + S$).

The absolute value of multiplet splitting is proportional to Z? and rapidly 
increases in going to heavy atoms 
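
The level structure of a multiplet implied by the formula quoted from (74.4) can be tabulated with a few lines of code. The sketch below is illustrative only: it takes $A' = 1$ as an arbitrary unit, uses exact fractions so that half-integer spins are handled without rounding, and all function names are hypothetical.

```python
from fractions import Fraction

def allowed_j(l_total, s_total):
    """Values of J from |L - S| to L + S in unit steps."""
    j = abs(l_total - s_total)
    values = []
    while j <= l_total + s_total:
        values.append(j)
        j += 1
    return values

def spin_orbit_shift(j, l_total, s_total, a_prime=1):
    """E = A' [J(J+1) - L(L+1) - S(S+1)]."""
    return a_prime * (j * (j + 1) - l_total * (l_total + 1) - s_total * (s_total + 1))

def intervals(l_total, s_total, a_prime=1):
    """Pairs (J, E_J - E_{J-1}) for neighbouring levels of the multiplet."""
    js = allowed_j(l_total, s_total)
    return [
        (js[i],
         spin_orbit_shift(js[i], l_total, s_total, a_prime)
         - spin_orbit_shift(js[i - 1], l_total, s_total, a_prime))
        for i in range(1, len(js))
    ]

print(intervals(1, 1))                 # a 3P multiplet: J = 0, 1, 2
print(intervals(2, Fraction(1, 2)))    # a 2D multiplet: J = 3/2, 5/2
```

Each interval is proportional to the larger of the two $J$ values involved, so with $A' > 0$ the levels rise with $J$ and with $A' < 0$ the order is reversed.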

It is often important to know the total number of possible states of an
atom when the quantum numbers $n$ and $l$ of each electron are given. For this
purpose it is convenient to use the concept of equivalent electrons which was
first introduced by Pauli.

Equivalent electrons are those electrons which have the same quantum
numbers $n$ and $l$.

Where the electrons are not equivalent the calculation of the possible 
terms is extremely simple. 

Let us consider as an example two electrons in states with $n = 3$, $l = 2$
and $n = 2$, $l = 1$. On the basis of the rule of addition of angular momenta the
orbital angular momentum of this system can take on the values $L = 1, 2, 3$,
while the total spin of the system can assume two values $S = 0, 1$. Thus we
have the terms $^1P$, $^3P$, $^1D$, $^3D$, $^1F$, $^3F$. However for equivalent electrons one
has to take into account the Pauli principle in calculating possible terms, and
this makes the calculations somewhat more complicated. We consider first
the following simple example. Let two electrons be in a state $n_1 = n_2$ and
$l_1 = 0$, $l_2 = 0$. In this case the components of the angular momenta in an
arbitrary direction are also equal to zero, i.e. $m_1 = 0$, $m_2 = 0$. To satisfy the
Pauli principle, $s_{z1}$ and $s_{z2}$ must have opposite signs. Consequently, we may
have, for example, $s_{z1} = \tfrac{1}{2}$, $s_{z2} = -\tfrac{1}{2}$. But in correspondence with the
principle of identity $s_{z1} = -\tfrac{1}{2}$, $s_{z2} = \tfrac{1}{2}$ also represents the same state.

The states $s_{z1} = \tfrac{1}{2}$ and $s_{z2} = \tfrac{1}{2}$ are forbidden by the Pauli principle. Hence
only the term $^1S$ can be realized. The term $^3S$ is forbidden. This calculation
shows that He, as well as Be, Mg and Ca and analogous elements cannot have
a triplet ground level. We note here that historically Pauli arrived at the
exclusion principle by investigating atomic spectra. The exclusion principle
was discovered as a result of the necessity to account for the absence of
certain terms.

For what follows we shall need one more example. Let a system of two
electrons have the quantum numbers $n_1 = n_2$; $l_1 = l_2 = 1$. In this case each
electron can be in the following states:

1) $m = 1$, $s_z = \tfrac{1}{2}$;    2) $m = 0$, $s_z = \tfrac{1}{2}$;    3) $m = -1$, $s_z = \tfrac{1}{2}$;
4) $m = 1$, $s_z = -\tfrac{1}{2}$;   5) $m = 0$, $s_z = -\tfrac{1}{2}$;   6) $m = -1$, $s_z = -\tfrac{1}{2}$.

In calculating the possible states of the whole system one can combine only
different states of individual electrons. This must be done so as not to violate
the Pauli principle. According to the rule of addition of angular momenta we
obtain the following possible states with $M_z = m_1 + m_2$; $S_z = s_{z1} + s_{z2}$:

1) $M_z = 2$, $S_z = 0$;    2) $M_z = 1$, $S_z = 1$;    3) $M_z = 1$, $S_z = 0$;
4) $M_z = 0$, $S_z = 0$;    5) $M_z = 0$, $S_z = 0$;    6) $M_z = 0$, $S_z = 1$;
7) $M_z = 1$, $S_z = 0$;    8) $M_z = 0$, $S_z = 0$.

We have not written down analogous states having negative values of the
components $M_z$ and $S_z$.

In analyzing the results one has to begin with the state with the largest
component $M_z$. In the case given we have the state with $M_z = 2$, $S_z = 0$.
Hence we conclude that there must be a $^1D$ term (to which there correspond
also states with $M_z = 1$, $S_z = 0$; $M_z = 0$, $S_z = 0$).

After eliminating the states numbered 1), 3) and 4) from the table we
again choose the state with the largest component $M_z$. In this case $M_z = 1$,
$S_z = 1$. The term $^3P$ corresponds to this state and also to the states denoted
by 5), 6) and 7). Finally, only the state $M_z = 0$, $S_z = 0$, corresponding to
the term $^1S$, remains in the table.

Thus we see that a system of two equivalent electrons with $l_1 = 1$, $l_2 = 1$
can be in the states $^1D$, $^3P$ and $^1S$. In computing states in more complex
cases one has to proceed in an analogous way. 
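
The bookkeeping described above can be automated. The following sketch enumerates all distributions of a given number of equivalent electrons over the one-electron states $(m, s_z)$ allowed by the Pauli principle, and then repeatedly removes the term belonging to the largest remaining $M_z$ (and, for that $M_z$, the largest $S_z$). Unlike the table in the text it keeps the negative projections as well. The function name and the letter table are illustrative choices.

```python
from collections import Counter
from fractions import Fraction
from itertools import combinations

LETTERS = "SPDFGHIK"  # spectroscopic letters for L = 0, 1, 2, ...

def terms_for_equivalent_electrons(l, count):
    """Terms (as strings such as '3P') for `count` equivalent electrons with orbital momentum l."""
    half = Fraction(1, 2)
    one_electron = [(m, s) for m in range(-l, l + 1) for s in (half, -half)]

    # Pauli principle: each one-electron state is used at most once, and the
    # order of the electrons is irrelevant, hence combinations.
    table = Counter()
    for combo in combinations(one_electron, count):
        m_total = sum(m for m, _ in combo)
        s_total = sum(s for _, s in combo)
        table[(m_total, s_total)] += 1

    terms = []
    while any(n > 0 for n in table.values()):
        big_m = max(m for (m, s), n in table.items() if n > 0)
        big_s = max(s for (m, s), n in table.items() if n > 0 and m == big_m)
        # Remove the (2L+1)(2S+1) states that belong to this term.
        for m in range(-big_m, big_m + 1):
            s = -big_s
            while s <= big_s:
                table[(m, s)] -= 1
                s += 1
        terms.append(f"{int(2 * big_s + 1)}{LETTERS[big_m]}")
    return terms

print(terms_for_equivalent_electrons(1, 2))   # two equivalent p electrons
print(terms_for_equivalent_electrons(1, 3))   # three equivalent p electrons
```

For two equivalent p electrons the procedure yields the three terms found above; the same function can be applied to other configurations such as $p^3$ or $d^2$.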

We now discuss some general regularities in the ordering of the energy 
levels of an atom. If the electrons are in states with definite numbers n and / 
(in such cases one says that the electron configuration is given), then to such 
a distribution there may correspond several different energy levels differing 
in the total orbital angular momentum as well as in the total spin of the 
system. 

Taking into account the multiplet splitting, the state of the atom turns out 
to depend on the quantities J, L and S. 


The state of the atom as a whole, represented by the atomic term, is
determined by these quantities. The symbol for the term is $^{2S+1}L_J$. For example,
the normal term of the nitrogen atom ($L = 0$, $S = \tfrac{3}{2}$, $J = \tfrac{3}{2}$) is written
as $^4S_{3/2}$.

The ordering of terms of different multiplicity was obtained from calcula- 
tions carried out by the Hartree—Fock method (although historically it was 
established much earlier by Hund). 

It turns out that of all the terms of a given configuration the lowest energy 
is possessed by the term with the largest value of the total spin S. 

For a given $S$ the term with the largest value of $L$ has the lowest energy.

In the example just given of the terms $^1D$, $^1S$ and $^3P$ the order of the
terms in increasing energy will be $^3P$, $^1D$ and $^1S$. As to the ordering of levels
within a given multiplet, there are two cases. The first of these, called the
multiplet with normal structure, is characterized by an increase in the energy
of the levels with increasing total angular momentum $J$. In the second case,
where the energy of the levels decreases with increasing $J$, the structure of the
multiplet is said to be inverted. It turns out that if the number of equivalent
electrons in the atom or ion is lower than one half of the number of electrons
in the closed shell, then the multiplets have a normal structure. In atoms or
ions in which the number of equivalent electrons is larger than or equal to one
half of that number the multiplets are inverted. For example, in the oxygen atom
of the eight electrons four are in the 2p-state (structure $2p^4$) and are equivalent.
Hence in the case of the oxygen atom the multiplets are inverted. In the
oxygen ion O$^{2+}$ there are 2 electrons in the 2p-state (configuration $2p^2$), and
the multiplets have a normal structure.

We now introduce the concept of an atomic shell. This is the set of all the
electron states with the same values of the quantum numbers $n$ and $l$. If all
states with the quantum numbers $n$ and $l$ are occupied, then the corresponding
shell is closed. It is known that for given $n$ and $l$ there are in all $2l+1$
different states differing in the quantum number $m$. If the spin is also taken
into account, then the total number of electrons necessary to fill the shell will
be equal to $2(2l+1)$. If the shell is closed, then the total spin of the system, as
well as the components of the orbital angular momentum, must be equal to
zero. In this case $S = 0$, $L = 0$, $J = 0$. This can be shown by taking into account
the Pauli principle and recalling that in a completely filled shell all possible
states with positive as well as negative projections of the angular momentum
onto the z-axis are occupied. The term $^1S_0$ corresponds to a closed shell.

We stress that our preceding considerations were based on the assumption 
that the orbital angular momenta of the electrons were combined into a total 
orbital angular momentum of the system, and that the spin angular momenta
were combined into a total spin angular momentum of the system. Such an
assumption corresponds to the statement that the interaction between the 
spin and orbital motions of electrons is much weaker than the interaction 
between the spins. Then one can speak of approximate conservation of the 
total orbital angular momentum and of the total spin of the system. This type 
of interaction is called normal or Russell—Saunders coupling. On the basis of 
the assumption of normal coupling it turns out to be possible to systematize 
the lowest energy levels of most atoms. Deviations from normal coupling are 
observed in the seventh and eighth groups of the periodic system of the 
elements. 

In principle another limiting form of coupling, usually called jj coupling,
is also possible. In jj coupling the orbital angular momentum and the spin 
angular momentum of each electron are added to give the total angular 
momentum j of that electron (in this case the orbital angular momentum of 
an individual electron is not conserved). In their turn the total angular mo- 
menta of the individual electrons are added together to give the total angular 
momentum of the atom, J. Such coupling is not encountered in atoms in its 
pure form. 

Let us consider some examples of different modifications of the basic 
forms of coupling in atoms. If an electron is in a highly excited state and, 
consequently, is sufficiently far from the nucleus and other electrons of the 
atom, then the behaviour of this electron can be considered to be independent 
of the rest of the atom. In this case the total angular momentum of the indi- 
vidual electron can be considered to be conserved independently of the total 
angular momentum of the rest of the atom. 

Consider another example. For atoms with a large charge Z the inner 
electrons interact strongly with the nuclear charge and relatively weakly with 
the outer atomic electrons. Hence one can assume approximately that the 
inner electrons do not interact with the other electrons (the total angular 
momentum of such electrons is conserved). In this case one can speak of jj 
coupling. We note that such electrons must be characterized not by the
quantum numbers $n$ and $l$ but by the quantum numbers $n$ and $j$.

For use in what follows we shall now show that the electric dipole
moment of an atom

$$ \mathbf{d} = \sum_{i=1}^{N} \int e\,\mathbf{r}_i\, |\psi(\mathbf{r}_1, \mathbf{r}_2, \ldots, \mathbf{r}_N)|^2 \, dV_1\, dV_2 \ldots dV_N \tag{72.1} $$

in a stationary state with definite parity is equal to zero. Indeed, since the
parity operator commutes with the Hamiltonian, the wave function $\psi$ is an
eigenfunction of the parity operator. In other words, it satisfies the relation

$$ \psi(\mathbf{r}_1, \mathbf{r}_2, \ldots, \mathbf{r}_N) = \psi(-\mathbf{r}_1, -\mathbf{r}_2, \ldots, -\mathbf{r}_N) $$

or

$$ \psi(\mathbf{r}_1, \mathbf{r}_2, \ldots, \mathbf{r}_N) = -\psi(-\mathbf{r}_1, -\mathbf{r}_2, \ldots, -\mathbf{r}_N) . $$

In both cases the function $|\psi|^2$ is an even function. It is now obvious that
the integrand in (72.1) is odd and, consequently, the dipole moment of an atom
is equal to zero.
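
The parity argument can be illustrated numerically. The following is only a minimal one-dimensional sketch, not part of the derivation: the three trial functions and the helper `dipole_1d` are hypothetical choices made for the illustration. For an even or an odd wave function the integral of $x|\psi|^2$ vanishes, while for a function of mixed parity it does not.

```python
import math

def dipole_1d(psi, half_width=10.0, n=4000):
    """Midpoint-rule estimate of the integral of x*|psi(x)|^2 on a symmetric interval."""
    h = 2.0 * half_width / n
    total = 0.0
    for k in range(n):
        x = -half_width + (k + 0.5) * h  # grid points are symmetric about x = 0
        total += x * abs(psi(x)) ** 2 * h
    return total

def even_state(x):
    # psi(x) = psi(-x): definite (even) parity
    return math.exp(-x * x / 2.0)

def odd_state(x):
    # psi(x) = -psi(-x): definite (odd) parity
    return x * math.exp(-x * x / 2.0)

def mixed_state(x):
    # superposition of even and odd parts: no definite parity
    return (1.0 + x) * math.exp(-x * x / 2.0)

for name, psi in [("even", even_state), ("odd", odd_state), ("mixed", mixed_state)]:
    print(name, dipole_1d(psi))
```

The trial functions are not normalized, which does not affect the conclusion: only the state without definite parity gives a non-vanishing integral.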


§73. The periodic system of the elements 


In its time the theoretical construction of the Mendeleyev periodic system, 
carried out by Bohr in 1922, was one of the most effective results obtained by 
means of the quantum theory. 

The construction of the periodic system of the elements is based on three 
assumptions: 

(1) The structure of atoms is determined by the atomic number Z (the charge 
of the nucleus). 

(2) As the atomic number and the number of electrons in the atom increase 
the electrons fill the states with the lowest possible energy. 

(3) The occupation of energy states is limited by the Pauli principle. 

In §72 we have defined the term atomic shell. A closed shell contains
$2(2l+1)$ electrons. The energy of an atom depends only on the quantum
numbers $n$ and $l$. Thus all $2(2l+1)$ electrons in a shell have the same energy
(if we do not take into account the weak spin-orbit interaction).

The set of sub-shells (that is, of the groups of states with given $n$ and $l$
defined in §72) with fixed principal quantum number $n$ is called a
shell. The number of electrons filling a shell is equal to

$$ 2 \sum_{l=0}^{n-1} (2l+1) = 2n^2 . $$


Each shell is denoted by letters taken from the classification adopted in X-ray
spectroscopy, as follows:

n                                            1    2    3    4    5
Symbol of the shell                          K    L    M    N    O
Possible number of electrons in the shell    2    8   18   32   50

In contrast to the hydrogen atom, in other atomic systems the energy of
states is defined by both the principal quantum number $n$ and the orbital
number $l$. The dependence of the energy on $n$ is, generally speaking, stronger
than the dependence on $l$. This means that for all values of $l$ states with a
given value of $n$ lie lower than states with the quantum number $n + 1$. The
sequence of energy states is


1s, 2s, 2p, 3s, 3p, ...


However, in going to d-states and particularly to f-states the situation
changes. For large values of the angular momentum the dependence of the
energy on the orbital quantum number $l$ turns out to be most important.

The effective potential energy of the electrons arises from the Coulomb 
field of the nucleus screened by electrons, and the centrifugal force. The 
screened potential of the nucleus decreases at large distances substantially 
more slowly than the Coulomb potential and still more slowly than the 
centrifugal potential. 

Comparison between the total effective energy of electrons with small
angular momenta (s- and p-states) and large angular momenta (d- and f-states)
shows that there is an essential difference between them. Namely, the curve
$U_{\rm eff}$ for $l = 2$ and 3 lies higher than for states with $l = 0$ and 1.

Because of this the minimum of the energy for d- and f-states lies closer
to the nucleus than for s- and p-states. This means that on the average
d-electrons and particularly f-electrons move closer to the nucleus, in deeper
parts of the shell, than s-electrons and p-electrons; d-electrons and
f-electrons are often said to be penetrating. This general property of states
with large angular momenta leads to the fact that 3d-electrons on the average
move closer to the nucleus than 4s-electrons.

On the other hand, the energy in the screened field increases with increasing
angular momentum. Experiment and calculations by the Hartree-Fock
method show that the energy of the 4s-state lies below the energy of the
3d-state. Hence the order of filling states lying above 3p turns out to be as
follows:


4s, 3d, 4p, 5s, 4d, 5p, 6s, 4f, 5d, 6p, 7s, 6d, 5f.
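
As an aside, the greater part of this sequence is reproduced by the empirical "$n + l$" ordering rule, which is not stated in the text and is used here purely as an assumption: states are arranged by increasing $n + l$, and for equal $n + l$ by increasing $n$. A minimal sketch:

```python
ORBITAL_LETTERS = "spdfghik"  # spectroscopic letters for l = 0, 1, 2, ...

def filling_order(max_n):
    """Order the (n, l) states by the empirical n+l rule (an assumption, not from the text)."""
    states = [(n, l) for n in range(1, max_n + 1) for l in range(n)]
    # primary key: n + l; secondary key: n
    states.sort(key=lambda state: (state[0] + state[1], state[0]))
    return [f"{n}{ORBITAL_LETTERS[l]}" for n, l in states]

order = filling_order(7)
start = order.index("4s")
print(order[start:start + 11])
```

The rule agrees with the sequence given above from 4s up to 7s. For the last two entries it places 5f before 6d, whereas the list above has 6d before 5f, so the rule should be regarded only as a mnemonic.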


Elements in which the 3d-shell and particularly the 4f-shell and the 5f-shell
are partly filled possess special properties. Since the motion in a non-Coulomb
field penetrates closer to the nucleus for states with large angular momenta,
the addition of electrons to the d-shell and particularly to the f-shell does not
change those properties of atoms which depend on the peripheral electrons.

Let us analyze in more detail the order in which states are occupied. This 
will allow us to find out which properties of atoms should display a periodic 
trend and which a monotonic trend with increasing atomic number Z. 

The first element of the periodic system is hydrogen. Its normal term is
$^2S_{1/2}$.

In the next element (helium) the K-shell having two electrons is filled. It
is easy to find that, in accordance with the rules discussed in the preceding
section, the normal term of helium is $^1S_0$.

The building of the L-shell begins in the next element (lithium). The
third electron of lithium goes into a 2s-state.

In calculating the normal term one need not take into account the electrons
of the filled shell: their spin, orbital and total angular momenta are
equal to zero.

The normal term of lithium is defined by the single electron of the L-shell.
Lithium is in the state $^2S_{1/2}$.

In beryllium the fourth electron fills the 2s-shell. The normal term of
beryllium is $^1S_0$, as for helium.

The fifth electron of boron goes into the 2p-state. Thus the boron atom
has the following distribution of electrons: $1s^2 2s^2 2p$. Since the 1s and 2s
shells are filled, the normal term of the boron atom is easily found. It is
$^2P_{1/2}$.

In the case of carbon, six electrons are distributed as follows: $1s^2 2s^2 2p^2$.
In order to determine the normal term of carbon, we turn to the example
discussed in §72 of the determination of terms for two equivalent p-electrons.
Making use of the Hund rule, we see that the normal term of carbon is $^3P$. In
this case the atom contains less than one half of all possible equivalent
p-electrons. Hence the multiplet structure of the lower level corresponds to a
minimum $J$, in the given case $J = 0$. Thus, finally, for the normal term we have
the symbol $^3P_0$.
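
The determination of the normal term for a single partly filled sub-shell can be sketched as follows. This is a simplified illustration under stated assumptions, not a general term calculation: it assumes Russell-Saunders coupling, fills the states with $m = l, l-1, \ldots$ first with one spin direction and then with the other (the Hund rule), and chooses $J = |L - S|$ for a sub-shell less than half filled and $J = L + S$ otherwise, as in §72. The function name and the output format are chosen here for convenience.

```python
from fractions import Fraction

TERM_LETTERS = "SPDFGHIKLMN"  # letters for L = 0, 1, 2, ...

def hund_ground_term(l, k):
    """Normal term of k equivalent electrons with orbital number l, e.g. '3P0'."""
    capacity = 2 * (2 * l + 1)
    if not 0 <= k <= capacity:
        raise ValueError("k must lie between 0 and 2(2l+1)")
    m_values = list(range(l, -l - 1, -1))   # m = l, l-1, ..., -l
    n_up = min(k, 2 * l + 1)                 # electrons with spin projection +1/2
    n_down = k - n_up                        # remaining electrons, spin projection -1/2
    S = Fraction(n_up - n_down, 2)
    L = abs(sum(m_values[:n_up]) + sum(m_values[:n_down]))
    # normal multiplet (minimum J) below half filling, inverted (maximum J) otherwise
    J = abs(L - S) if 2 * k < capacity else L + S
    return f"{int(2 * S + 1)}{TERM_LETTERS[L]}{J}"

for name, l, k in [("Li", 0, 1), ("B", 1, 1), ("C", 1, 2), ("O", 1, 4), ("Ne", 1, 6)]:
    print(name, hund_ground_term(l, k))
```

For boron and carbon the sketch returns the terms found above, and for a closed sub-shell it returns $^1S_0$.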

The order of further filling the terms of normal states is shown in table 2. 
We see that in neon the L-shell is complete and that, like helium, it has no 
electrons in unfilled shells and sub-shells. 

It is natural to identify filled shells with the periods of the Mendeleyev
system of the elements. Each period begins to be filled by one electron in an 
s-state and is completed when a filled shell is formed. 

[Table 2. The distribution of electrons in the periodic system of elements.
Columns: element; atomic number; configuration of inner shells; number of
electrons in the outer sub-shells; normal term; ionization potential (eV).]

The first period of the periodic system of the elements contains elements
in which the K-shell is filled. It comprises two elements ($n=1$, $l=0$). The second
period contains elements with the L-shell filled. It comprises 8 elements ($n=2$,
$l=0,1$) from lithium to neon. The 3s-states and 3p-states of the M-shell are
filled from sodium to argon. Up to now periods ending with the noble gases
He, Ne, Ar have been formed in the Mendeleyev periodic system. In the next
period, beginning with potassium and ending with krypton, there is a
departure from the simple rules of successive filling. Namely, as we have seen in
§71, electrons with angular momentum $l = 2$, i.e. in the d-state, must begin at
the element with $Z = 21$. Hence in Sc the additional twenty-first electron
does not go into a 4p-state but into a 3d-state, which proceeds to fill from Sc
to Ni. It is interesting to note that in Cr the tendency to fill a 3d-state is so
strong that one of the 4s-electrons goes over into a 3d-state. The filling of
4s-states and 4p-states of the fourth period of the system of elements, which
ends with krypton, begins again after Ni.

The further simple building of the N-shell, i.e. the fifth period containing
18 elements up to Xe, proceeds after krypton. In the sixth period, containing
32 elements, the filling of 6s-, 4f- and 6p-states proceeds. Here again there is
a more complex order of filling. In Ce ($Z = 58$) the electrons begin to fill the
4f-states. We note that according to a calculation in the Thomas-Fermi
approximation the electrons with $l = 3$ should appear beginning with the
element $Z = 55$.

The building of the seventh period for elements existing in nature remains
incomplete. The filling of the deep 5f-state begins with Pa. Up to now there
are artificially produced elements from Np ($Z = 93$) to kourchatovium (Ku,
$Z = 104$). In all these elements, called the actinides, the filling of the 5f-states
proceeds.

As the atomic number Z increases all the properties of atoms determined 
by the inner electrons display monotonic changes. As an example we can cite 
the characteristic X-ray spectra. Characteristic X-rays arise when a vacancy in 
one of the inner shells is filled. The X-ray spectrum evidently has a character 
similar to that of hydrogen, but with the Rydberg constant multiplied by $Z^2$.
The frequency of emission lines increases in proportion to $Z^2$ (the Moseley
law).
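
The $Z^2$ scaling can be made explicit with a short sketch based on the hydrogen-like formula. The value of the Rydberg energy used below (13.6057 eV) and the function name are inputs chosen for this illustration; screening of the nuclear charge by the remaining electrons, which shifts the lines of real atoms, is deliberately ignored.

```python
RYDBERG_EV = 13.6057  # Rydberg energy in eV (assumed value for this illustration)

def hydrogen_like_line_energy(Z, n_lower, n_upper):
    """Photon energy (eV) for the transition n_upper -> n_lower in a hydrogen-like ion."""
    if n_upper <= n_lower:
        raise ValueError("n_upper must exceed n_lower")
    return RYDBERG_EV * Z ** 2 * (1.0 / n_lower ** 2 - 1.0 / n_upper ** 2)

# the same transition (n = 2 -> 1) for several nuclear charges: energy grows as Z^2
for Z in (1, 10, 29):
    print(Z, hydrogen_like_line_energy(Z, 1, 2))
```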

On the other hand, all the properties of atoms determined by the peri- 
pheral electrons have a periodic behaviour with increasing Z. For example, the 
ionization potential of the atom (see table 2) is one such property. It has its 
smallest value for the first element and reaches its largest value for the last 
element of a period. 

Another property displaying periodicity is the atomic volume. The largest 
volume is possessed by the alkali metal atoms, in which there is one electron 
outside a filled shell. 







From the point of view of chemistry the most important characteristic of 
an atom is its valence. Chemical binding only involves unpaired electrons. 
Electrons which are in filled states and have a total spin equal to zero do not 
take part in the chemical interaction. 

Hence it follows that the chemical interaction and its qualitative character- 
istic (the valence) are determined exclusively by the number of unpaired 
electrons which are in unfilled states. The numerical value of the valence of an 
atom in a given state is equal to r = 2S, where S is the spin of the atom in that 
state. 

We particularly stress that the valence of an atom is related to its state 
because the atom, when making a transition from one state into another, 
may change its valence. 

In §79 we shall discuss this problem in somewhat more detail. Here we
shall confine ourselves to the following remark: if the first excited term lies
close to the normal term, then the atom may be involved in chemical binding
in the excited state. From the above it follows that the elements of the first
group of the periodic system, with which all 7 periods begin, in the $^2S_{1/2}$
state have the valence $r = 1$.

The elements of the second group which are in the $^1S_0$ state have zero
valence. They would be chemically inert if the excited term did not lie close
to the normal state. In the excited state the two outer electrons have the
configuration $sp$, so that the atom has a total spin 1 and a valence $r = 2$.

In the atoms of the third group three electrons are outside filled shells.
Their configuration $s^2p$ corresponds to a total spin $S = 1/2$ and to a valence
one. However, those of these atoms which have a small excitation energy may
make a transition into the state with the configuration $sp^2$ and spin $S = 3/2$.
In the excited state their valence is $r = 3$. The elements of the first three
groups are regarded chemically as metals. From the chemical point of view
metals are characterized by the ability to lose electrons when forming ionic
chemical compounds.

The elements of the fourth group enter into a chemical bond in the
normal and excited states with the configurations $s^2p^2$ and $sp^3$. The
corresponding values of the spin and valence are equal respectively to
$S = 1$, $r = 2$ and $S = 2$, $r = 4$.

In the atoms of the fifth group the normal configuration is $s^2p^3$, with spin
$S = 3/2$ and valence $r = 3$. The excited state corresponds to the configuration
$sp^3s'$, where $s'$ denotes the s-state of the next shell, and to the transition
of the fifth electron into the s-state of the next shell (i.e. to a transition
with an increase in the principal quantum number by one). The spin and
valence in the excited state are equal respectively to $S = 5/2$ and $r = 5$.

In the sixth group the atoms in the normal state have the configuration
$s^2p^4$ with spin $S = 1$. Their valence is $r = 2$. When excited, one of the
electrons makes a transition from the p-state to the s-state of the next shell.
In this excited state $r = 4$.

In addition to this type of excitation, excitations with a transition of two
electrons into the next shell, one from the s-state and the second from the
p-state, are often brought about. In this excited state the atom has the valence
$r = 6$.

In the atoms of the seventh group the normal configuration is $s^2p^5$, the
spin is $S = 1/2$ and the valence is $r = 1$. However, transitions of one, two and
three electrons into the next shell are possible. Hence also the valences $r = 3$,
$r = 5$ and $r = 7$ are realized.

The elements of the fourth, fifth, sixth and seventh groups which stand at 
the beginning of the groups are non-metals. In compounds of ionic type they 
gain electrons (are oxidizers), having the tendency to form a filled state. 

The elements of the transition groups — iron, palladium and platinum — as 
well as the lanthanides (rare earths) and actinides, have special chemical 
properties. 

The completion of deep d-states and f-states takes place in the atoms of
the group of iron, the lanthanides and the actinides. The d-electrons and
f-electrons usually do not take part in valence bonds and the valence of the
atoms is determined by the electrons in the outer states. However, this is not a strict
law, since in certain cases of chemical compound formation, electrons from 
inner states make transitions into outer states and contribute to the valence. 
This is particularly clearly displayed in the case of some actinides. Hence the 
chemical properties of elements of the groups with special properties are 
rather complex. 

Thus we see that not only is the theoretical substantiation of the distribu- 
tion of atoms in the periodic system of the elements possible, but further a 
relatively detailed prediction of their chemical properties can be given. 


§74. The Zeeman effect 


We have seen in §31 of Part I that the full theory of the Zeeman effect 
could not be constructed on the basis of classical electrodynamics. After 
analyzing a vast amount of experimental data, Landé introduced a parameter 
which quantitatively determines the characteristics of the Zeeman effect. This 
parameter is called the Landé g-factor. 

The quantum theory of the Zeeman effect makes it possible to find the
value of the Landé g-factor and the form of the Zeeman splitting without
any new assumptions. Let us consider how the positions of the energy levels
of an atom change if the atom is placed in a constant external magnetic field.
The wave function $\psi$ for the stationary states of an atom can, as usual, be
written in the form

$$ \psi(\mathbf{r}_i, t) = \psi(\mathbf{r}_i)\, e^{-iEt/\hbar} . \tag{74.1} $$


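On substituting (74.1) into the Pauli equation (63.11) the latter transforms
into the form

$$ \Bigl[ \sum_i \frac{\hat{\mathbf{p}}_i^2}{2m} + \frac{|e|}{2mc} (\hat{\mathbf{L}} + 2\hat{\mathbf{S}}) \cdot \boldsymbol{\mathcal{H}} + \frac{e^2}{8mc^2} \sum_i [\boldsymbol{\mathcal{H}} \times \mathbf{r}_i]^2 + U \Bigr] \psi = E\psi , \tag{74.2} $$

where $U$ takes into account the interaction of the electrons with each other
and with the nucleus. We assume that the external magnetic field strength is
sufficiently small that the term containing the square of the field can be
dropped in (74.2).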

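We introduce the quantity $\hat{H}'$ given by

$$ \hat{H}' = \frac{|e|}{2mc} (\hat{\mathbf{L}} + 2\hat{\mathbf{S}}) \cdot \boldsymbol{\mathcal{H}} = \frac{|e|}{2mc} (\hat{\mathbf{J}} + \hat{\mathbf{S}}) \cdot \boldsymbol{\mathcal{H}} , \tag{74.3} $$

where $\hat{\mathbf{J}}$ is the total angular momentum operator; $\hat{H}'$ is the small
perturbation acting on the atom.
The Hamiltonian $\hat{H}$ can then be written in the form

$$ \hat{H} = \hat{H}_0 + \hat{H}' , \qquad \hat{H}_0 = \sum_i \frac{\hat{\mathbf{p}}_i^2}{2m} + U . \tag{74.4} $$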


In the unperturbed state the atom is characterized by a definite total
angular momentum of the system $J$ and by a definite z-component of the
total angular momentum, $M_z$. Clearly, one has to apply perturbation theory
for degenerate states. Indeed, the energy of the unperturbed state does not
depend on the value of the total angular momentum component $M_z$. Since
the perturbation operator represents the projection of a certain vector onto
the z-axis and is brought to diagonal form simultaneously with the operator
of the z-component of the total angular momentum, one needs to calculate
only the diagonal elements of the perturbation operator

$$ (\hat{H}')_{JM;JM} = \frac{|e|\mathcal{H}}{2mc} (\hat{J}_z + \hat{S}_z)_{JM;JM} . \tag{74.5} $$

The diagonal matrix element is taken with respect to the quantum numbers
of the total angular momentum $J$ and the z-component of the angular
momentum $M_z = M$.

The diagonal matrix element of the operator $\hat{J}_z$ is equal to

$$ (\hat{J}_z)_{JM;JM} = \hbar M . \tag{74.6} $$

Consequently, we have to determine the expression

$$ (\hat{S}_z)_{JM;JM} = \bar{S}_z . $$
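The calculation of this matrix element by means of the commutation rules is
a good example of the practical use of the matrix method. We note that the
value of $\bar{S}_z$ is usually found from obvious but not quite rigorous
considerations associated with the precession of the vector $\mathbf{S}$ with respect to
the vector $\mathbf{J}$*. We note at first that by definition the operator $\hat{\mathbf{J}}$ is equal
to $\hat{\mathbf{J}} = \hat{\mathbf{L}} + \hat{\mathbf{S}}$. The operators $\hat{\mathbf{L}}$ and $\hat{\mathbf{S}}$ commute with each other, since
they act on different variables.

* See L.D.Landau and E.M.Lifshitz, Quantum mechanics (Pergamon Press, Oxford,
1965).

Knowing the commutation rules for $\hat{L}_x, \hat{L}_y, \hat{L}_z, \hat{S}_x, \hat{S}_y, \hat{S}_z$ we easily find
the following relations (the braces denote the commutator,
$\{\hat{A}, \hat{B}\} = \hat{A}\hat{B} - \hat{B}\hat{A}$):

$$ \{\hat{J}_x, \hat{S}_x\} = 0, \qquad \{\hat{J}_x, \hat{S}_y\} = i\hbar \hat{S}_z, \qquad \{\hat{J}_x, \hat{S}_z\} = -i\hbar \hat{S}_y . $$

Other commutation rules can be obtained by cyclic permutation. Then we
obtain

$$ \{\hat{J}_y, \hat{S}_y\} = 0, \qquad \{\hat{J}_y, \hat{S}_z\} = i\hbar \hat{S}_x, \qquad \{\hat{J}_y, \hat{S}_x\} = -i\hbar \hat{S}_z, $$
$$ \{\hat{J}_z, \hat{S}_z\} = 0, \qquad \{\hat{J}_z, \hat{S}_x\} = i\hbar \hat{S}_y, \qquad \{\hat{J}_z, \hat{S}_y\} = -i\hbar \hat{S}_x . \tag{74.7} $$

From these rules there results the relation

$$ (\hat{J}_x + i\hat{J}_y)(\hat{S}_x + i\hat{S}_y) - (\hat{S}_x + i\hat{S}_y)(\hat{J}_x + i\hat{J}_y) = 0 . \tag{74.8} $$

We calculate the following matrix element of the right-hand and left-hand
sides of this relation:

$$ [(\hat{J}_x + i\hat{J}_y)(\hat{S}_x + i\hat{S}_y)]_{J,M+1;\,J,M-1} - [(\hat{S}_x + i\hat{S}_y)(\hat{J}_x + i\hat{J}_y)]_{J,M+1;\,J,M-1} . $$

In correspondence with formula (51.14) the matrix element of the operator
$\hat{J}_x + i\hat{J}_y$ differs from zero only in the case of the transition
$J, M \to J, M - 1$. Hence

$$ (\hat{J}_x + i\hat{J}_y)_{JM;\,J,M-1} = \hbar\,[(J+M)(J-M+1)]^{1/2} . \tag{74.9} $$

Then making use of the rule of multiplication for matrices (45.6), we obtain

$$ (\hat{J}_x + i\hat{J}_y)_{M+1;M} (\hat{S}_x + i\hat{S}_y)_{M;M-1} - (\hat{S}_x + i\hat{S}_y)_{M+1;M} (\hat{J}_x + i\hat{J}_y)_{M;M-1} = 0 . \tag{74.10} $$

In this formula we have dropped the suffix $J$. Using relation (74.9) we find

$$ \frac{(\hat{S}_x + i\hat{S}_y)_{M+1;M}}{[(J+M+1)(J-M)]^{1/2}} = \frac{(\hat{S}_x + i\hat{S}_y)_{M;M-1}}{[(J+M)(J-M+1)]^{1/2}} . $$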
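Analogously one can obtain that this ratio is also equal to

$$ \frac{(\hat{S}_x + i\hat{S}_y)_{M+2;M+1}}{[(J+M+2)(J-M-1)]^{1/2}} . $$

Hence we see that the common value of these ratios, which we denote by $A$,
does not depend on $M$ and, consequently, we have

$$ (\hat{S}_x + i\hat{S}_y)_{M;M-1} = A\,[(J+M)(J-M+1)]^{1/2} . \tag{74.11} $$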

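The matrix elements of the operator $\hat{S}_z$ which we need can be found by
making use of the following formula:

$$ (\hat{J}_x - i\hat{J}_y)(\hat{S}_x + i\hat{S}_y) - (\hat{S}_x + i\hat{S}_y)(\hat{J}_x - i\hat{J}_y) = -2\hbar \hat{S}_z . \tag{74.12} $$

Calculating the diagonal matrix element of relation (74.12), we obtain as
a result

$$ (\hat{J}_x - i\hat{J}_y)_{M;M+1} (\hat{S}_x + i\hat{S}_y)_{M+1;M} - (\hat{S}_x + i\hat{S}_y)_{M;M-1} (\hat{J}_x - i\hat{J}_y)_{M-1;M} = -2\hbar (\hat{S}_z)_{M;M} . $$

Making use of (74.11) and knowing that

$$ (\hat{J}_x - i\hat{J}_y)_{M;M+1} = \hbar\,[(J+M+1)(J-M)]^{1/2} , $$

we can easily determine the diagonal element $(\hat{S}_z)_{M;M}$, which is equal to

$$ -2(\hat{S}_z)_{M;M} = (J+M+1)(J-M)A - (J+M)(J-M+1)A = -2AM , $$
$$ (\hat{S}_z)_{M;M} = AM . \tag{74.13} $$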
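We now turn to finding the quantity $A$. From the relation

$$ \hat{\mathbf{J}}^2 = (\hat{\mathbf{L}} + \hat{\mathbf{S}})^2 = \hat{\mathbf{L}}^2 + 2(\hat{\mathbf{L}} \cdot \hat{\mathbf{S}}) + \hat{\mathbf{S}}^2 = \hat{\mathbf{L}}^2 + 2(\hat{\mathbf{J}} \cdot \hat{\mathbf{S}}) - \hat{\mathbf{S}}^2 $$

it immediately follows that

$$ \hat{\mathbf{J}} \cdot \hat{\mathbf{S}} = \tfrac{1}{2} (\hat{\mathbf{J}}^2 + \hat{\mathbf{S}}^2 - \hat{\mathbf{L}}^2) . $$

In the case of Russell-Saunders coupling the diagonal matrix element of the
scalar product $\hat{\mathbf{J}} \cdot \hat{\mathbf{S}}$ is equal to

$$ (\hat{\mathbf{J}} \cdot \hat{\mathbf{S}})_{M;M} = \frac{\hbar^2}{2} [J(J+1) - L(L+1) + S(S+1)] . \tag{74.14} $$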
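On the other hand, the matrix element can be found if the scalar expression
$\hat{\mathbf{J}} \cdot \hat{\mathbf{S}}$ is transformed to the form

$$ \hat{\mathbf{J}} \cdot \hat{\mathbf{S}} = \tfrac{1}{2} (\hat{S}_x + i\hat{S}_y)(\hat{J}_x - i\hat{J}_y) + \tfrac{1}{2} (\hat{S}_x - i\hat{S}_y)(\hat{J}_x + i\hat{J}_y) + \hat{J}_z \hat{S}_z . \tag{74.15} $$

We find the diagonal matrix element of the right-hand and left-hand sides
of relation (74.15). We then have

$$ (\hat{\mathbf{J}} \cdot \hat{\mathbf{S}})_{M;M} = \tfrac{1}{2} (\hat{S}_x + i\hat{S}_y)_{M;M-1} (\hat{J}_x - i\hat{J}_y)_{M-1;M} + \tfrac{1}{2} (\hat{S}_x - i\hat{S}_y)_{M;M+1} (\hat{J}_x + i\hat{J}_y)_{M+1;M} + \hbar M (\hat{S}_z)_{M;M} . \tag{74.16} $$

We now need to find the matrix element $(\hat{S}_x - i\hat{S}_y)_{M;M+1}$. This is easily
done by means of relation (74.11). We note beforehand that the constant $A$ is
real. This is seen from formula (74.13), in which all the quantities determining
the quantity $A$ are real.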
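Carrying out the complex conjugation of the right-hand and left-hand sides
of the relation

$$ (\hat{S}_x + i\hat{S}_y)_{M+1;M} = A\,[(J+M+1)(J-M)]^{1/2} , $$

we obtain

$$ (\hat{S}_x)^*_{M+1;M} - i(\hat{S}_y)^*_{M+1;M} = A\,[(J+M+1)(J-M)]^{1/2} . $$

Making use of the hermiticity of the operators, we find

$$ (\hat{S}_x)^*_{M+1;M} - i(\hat{S}_y)^*_{M+1;M} = (\hat{S}_x)_{M;M+1} - i(\hat{S}_y)_{M;M+1} = (\hat{S}_x - i\hat{S}_y)_{M;M+1} = A\,[(J+M+1)(J-M)]^{1/2} . \tag{74.17} $$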
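Substituting the values of $(\hat{\mathbf{J}} \cdot \hat{\mathbf{S}})_{M;M}$ and $(\hat{S}_x + i\hat{S}_y)_{M;M-1}$ into relation
(74.16) in correspondence with formulae (74.14) and (74.11), and also using
(74.17) and (74.13), we find the quantity $A$. Indeed, the right-hand side of
(74.16) reduces to
$\hbar A\,[\tfrac{1}{2}(J+M)(J-M+1) + \tfrac{1}{2}(J+M+1)(J-M) + M^2] = \hbar A\,J(J+1)$, so that

$$ A = \hbar\, \frac{J(J+1) + S(S+1) - L(L+1)}{2J(J+1)} . \tag{74.18} $$

Using the value found for $A$ and also formula (74.13), we obtain the diagonal
matrix element $(\hat{S}_z)_{JM;JM}$ which has the form

$$ (\hat{S}_z)_{JM;JM} = \hbar M\, \frac{J(J+1) + S(S+1) - L(L+1)}{2J(J+1)} . \tag{74.19} $$

The correction to the energy levels of the atom due to the magnetic field
$\mathcal{H}$ is given by the expression

$$ \Delta E = \frac{|e|\hbar \mathcal{H} M}{2mc} \Bigl( 1 + \frac{J(J+1) + S(S+1) - L(L+1)}{2J(J+1)} \Bigr) = \frac{|e|\hbar \mathcal{H}}{2mc}\, g M . \tag{74.20} $$

The factor $g$ is called the Landé g-factor.
For singlet levels $J = L$, $S = 0$ we have

$$ \Delta E = \frac{|e|\hbar \mathcal{H}}{2mc}\, M . \tag{74.21} $$

Before turning to the discussion of formulae (74.20) and (74.21) we shall
establish the limits of applicability of the above derivation.
The unperturbed Pauli equation

$$ \hat{H}_0 \psi = E \psi ; \qquad \hat{H}_0 = \sum_i \frac{\hat{\mathbf{p}}_i^2}{2m} + U $$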


defines the energy levels of the atom including also its multiplet structure. 
Thus for the theory described above to be applicable it is necessary that the 
matrix element of the perturbation (74.3) be smaller than the spacing be- 
tween the levels corresponding to the fine structure of the atom. 

It should also be pointed out that in the calculations it was assumed that 
Russell—Saunders coupling holds in the atom, i.e. that the quantity J is con- 
served in time as well as L and S. 

Let us now turn to the discussion of the formulae obtained, which define 
the Zeeman effect. 

From formula (74.20) it is seen that each component of the multiplet
splits into $2J+1$ levels. Indeed, for given $J$ the component of the total angular
momentum $M$ can take on $2J+1$ different values. This is in accordance with
the result of §54: perturbation removes the degeneracy. As to the
distribution of the newly arising terms, the following can be stated.

If $J$ is an integer, then in a magnetic field a level corresponding to the
value $M = 0$ arises at the place of the unperturbed level.

Of the remaining $2J$ levels $J$ levels are distributed above and $J$ levels below
at equal distances from the level with $M = 0$.

If $J$ is a half integer, then the levels are also distributed symmetrically with
respect to the old position of the unperturbed level, and the closest levels are
at the distance $|e|\hbar\mathcal{H}g/4mc$ from the initial position.
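
Formula (74.20) lends itself to a direct numerical illustration. The sketch below evaluates the Landé factor and the level shifts in units of $|e|\hbar\mathcal{H}/2mc$ with exact rational arithmetic; the function names are chosen here for illustration, and Russell-Saunders coupling in a weak field is assumed, as in the derivation above.

```python
from fractions import Fraction

def lande_g(L, S, J):
    """Lande g-factor: g = 1 + [J(J+1) + S(S+1) - L(L+1)] / [2J(J+1)]."""
    L, S, J = Fraction(L), Fraction(S), Fraction(J)
    if J == 0:
        raise ValueError("g is not defined for J = 0 (the level does not split)")
    return 1 + (J * (J + 1) + S * (S + 1) - L * (L + 1)) / (2 * J * (J + 1))

def zeeman_shifts(L, S, J):
    """Shifts g*M for M = -J, ..., J, in units of |e| hbar H / 2mc."""
    J = Fraction(J)
    g = lande_g(L, S, J)
    return [g * (-J + k) for k in range(int(2 * J) + 1)]

half = Fraction(1, 2)
print(lande_g(1, half, 3 * half))        # term 2P3/2
print(zeeman_shifts(1, half, 3 * half))  # 2J + 1 = 4 equally spaced levels
print(lande_g(2, 0, 2))                  # singlet level, J = L
```

For a singlet level the factor reduces to $g = 1$, in agreement with (74.21), and for half-integer $J$ the levels closest to the unperturbed position lie at $\pm g/2$ in these units.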

We also note that if there is jj coupling, the character of the Zeeman ef- 
fect is much modified. This coupling is seldom encountered in the pure form, 
and we shall not carry out the corresponding calculations here. 


§75. The Paschen—Back effect and the diamagnetism of atoms 


In strong magnetic fields the character of the Zeeman effect changes.
Namely, as the magnetic field strength increases the spacing between the
levels into which each component of the multiplet splits increases. In very
strong fields the splitting of a level is so large that the spacings between
the components of the multiplet arising in the field turn out to be large in
comparison with the natural multiplet spacing. We recall that the latter
arises from the spin-orbit interaction. In this case formula (74.20) is no
longer applicable, and the character of the spectrum changes. This change in
the spectrum in a strong magnetic field is called the Paschen-Back effect.

We shall carry out the calculation for the case where the splitting due to a
magnetic field is large in comparison with the natural multiplet spacing. This
means that the energy acquired in the magnetic field is large compared to the
spin-orbit interaction. Then the term taking the spin-orbit interaction into
account can be left out of the unperturbed Hamiltonian $\hat{H}_0$ in formula (74.4).
Therefore the unperturbed states of the atom can be characterized by the
total angular momentum $J$ as well as by the component $L_z$ of the orbital
angular momentum and the component $S_z$ of the spin angular momentum.

The perturbation operator has, as before, the form

$$ \hat{H}' = \frac{|e|}{2mc} (\hat{J}_z + \hat{S}_z)\, \mathcal{H} = \frac{|e|}{2mc} (\hat{L}_z + 2\hat{S}_z)\, \mathcal{H} . \tag{75.1} $$

The correction to the energy is equal to the mean value of the operator $\hat{H}'$
over states with definite components of the orbital and spin angular momenta,
i.e.

$$ \Delta E = \frac{|e|\mathcal{H}}{2mc}\, \overline{(\hat{L}_z + 2\hat{S}_z)} = \frac{|e|\hbar\mathcal{H}}{2mc} (L_z + 2S_z) , \tag{75.2} $$

where in the last expression $L_z$ and $S_z$ are measured in units of $\hbar$.
Formula (75.2) defines the fine structure of the spectrum in strong magnetic
fields.
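
The pattern of levels given by (75.2) can be listed explicitly. In the following sketch (the function names are illustrative) the shifts are expressed in units of $|e|\hbar\mathcal{H}/2mc$ and the projections of the orbital and spin angular momenta run independently over their allowed values.

```python
from fractions import Fraction

def projections(j):
    """Allowed projections -j, -j+1, ..., j of an angular momentum j."""
    j = Fraction(j)
    return [-j + k for k in range(int(2 * j) + 1)]

def paschen_back_shifts(L, S):
    """Shifts L_z + 2 S_z in units of |e| hbar H / 2mc, one entry per state."""
    return sorted(ml + 2 * ms for ml in projections(L) for ms in projections(S))

# states with L = 1, S = 1/2: six states, five distinct levels
print(paschen_back_shifts(1, Fraction(1, 2)))
```

For $L = 1$, $S = 1/2$ the six states give five equally spaced levels, the central one being doubly degenerate in this approximation.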

Let us now consider the effect of the neglected quadratic term in the
magnetic field in formula (74.2). Taking this quantity into account is
particularly important for terms with $L = S = 0$. In this case no splitting of
levels occurs on account of the term linear in $\mathcal{H}$. This can be seen from the
general formula (74.20). In this case the correction due to the quadratic
term cannot be disregarded. As the perturbation operator, in correspondence
with formula (74.2), one has to take

$$ \hat{H}'' = \frac{e^2}{8mc^2} \sum_i [\boldsymbol{\mathcal{H}} \times \mathbf{r}_i]^2 . \tag{75.3} $$

The sum over i corresponds to summing over all the electrons of the atom. 







The correction to the energy levels due to the operator $\hat{H}''$ is again defined
by the diagonal matrix element

$$ \Delta E = \frac{e^2}{8mc^2} \sum_i \overline{[\boldsymbol{\mathcal{H}} \times \mathbf{r}_i]^2} = \frac{e^2}{8mc^2} \sum_i \overline{(\mathcal{H} r_i \sin\theta_i)^2} . $$

In calculating $\overline{[\boldsymbol{\mathcal{H}} \times \mathbf{r}_i]^2}$ it should be recalled that the wave function of
the system $L = 0$, $S = 0$ is spherically symmetric, hence

$$ \overline{\sin^2\theta} = 1 - \overline{\cos^2\theta} = \tfrac{2}{3} . $$

Thus for the shift of the levels we obtain

$$ \Delta E = \frac{e^2 \mathcal{H}^2}{12mc^2} \sum_i \overline{r_i^2} . \tag{75.4} $$

Since the magnetic moment of the atom can be calculated by means of the
formula $M = -\partial \Delta E / \partial \mathcal{H}$ (see (18.1) of Part IV), we obtain

$$ M = \chi \mathcal{H} ; \qquad \chi = -\frac{e^2}{6mc^2} \sum_i \overline{r_i^2} . \tag{75.5} $$

Thus atoms possess diamagnetic susceptibility. Since the diamagnetic
susceptibility is in the main determined by the mean square distance of the
electrons from the nucleus, $\chi$ is particularly large for many-electron atoms.
For such atoms good results are obtained by the Thomas-Fermi method.
Hence the diamagnetic susceptibilities are often calculated by this method.
On the other hand, measurements of $\chi$ represent one of the best ways of
finding the effective size of atoms. We stress that all atoms and ions have a
diamagnetic susceptibility. However, in certain ions the paramagnetic
susceptibility, associated with the magnetic moment, exceeds the diamagnetic
susceptibility.
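
To give an idea of the order of magnitude, formula (75.5) can be evaluated for a single electron in the hydrogen 1s state, for which the mean square radius is $3a_0^2$ ($a_0$ being the Bohr radius). This is only an illustrative sketch: the rounded Gaussian-unit constants and the choice of the 1s state are inputs assumed here, and the function name is hypothetical.

```python
# Rounded constants in Gaussian (CGS) units, assumed for this illustration
E_CHARGE = 4.8032e-10      # electron charge, esu
ELECTRON_MASS = 9.1094e-28  # electron mass, g
LIGHT_SPEED = 2.9979e10     # speed of light, cm/s
BOHR_RADIUS = 5.2918e-9     # Bohr radius a0, cm

def diamagnetic_susceptibility(mean_square_radii):
    """chi = -(e^2 / 6mc^2) * sum of mean square radii, per atom, in cm^3."""
    prefactor = E_CHARGE ** 2 / (6.0 * ELECTRON_MASS * LIGHT_SPEED ** 2)
    return -prefactor * sum(mean_square_radii)

# one electron in the hydrogen 1s state: mean square radius 3 * a0^2
chi = diamagnetic_susceptibility([3.0 * BOHR_RADIUS ** 2])
print(chi)  # negative, of the order of 1e-30 cm^3 per atom
```

The susceptibility is negative and additive over the electrons, so that it grows in magnitude with the number of electrons and with their mean square distance from the nucleus.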


§76. Deuteron theory 


The deuteron, consisting of a proton and a neutron, plays the same role
in nuclear theory as the hydrogen atom in atomic theory.

The nuclear interaction between a proton and a neutron may depend on
their separation $r$ and the relative orientation of the spins, $\mathbf{s}_1$ and $\mathbf{s}_2$, of
the two particles. The explicit form of the potential energy of the nuclear
interaction is at present unknown. Hence one has to confine oneself to writing
the most general expression for the potential energy operator depending on
$\mathbf{r}$, $\mathbf{s}_1$ and $\mathbf{s}_2$. This interaction operator must not change under rotation of
the coordinate system. Furthermore, as shown by experiment, the parity
conservation law holds for nuclear forces (see §33). This means that the
interaction operator must not change under reflection of coordinates (the
interaction operator must commute with the parity operator). Thus we have to
make up all possible scalars of the three vectors $\mathbf{r}$, $\mathbf{s}_1$ and $\mathbf{s}_2$. The following
scalars do not change under rotation of the coordinate system: $\mathbf{s}_1 \cdot \mathbf{s}_2$,
$\mathbf{s}_1 \cdot \mathbf{r}$ and $\mathbf{s}_2 \cdot \mathbf{r}$.

The products $\mathbf{s}_1 \cdot \mathbf{r}$ and $\mathbf{s}_2 \cdot \mathbf{r}$ cannot be involved separately in the
potential energy, because the spin vector is an axial vector and the product
$\mathbf{s} \cdot \mathbf{r}$ is a pseudoscalar which changes sign under reflection of coordinates.
The product $(\mathbf{s}_1 \cdot \mathbf{r})(\mathbf{s}_2 \cdot \mathbf{r})$ does not change sign under the reflection and,
consequently, can be involved in the potential energy. Spin operators in higher
powers are not involved in the operator of the interaction energy $U$, because
the higher powers of the spin operators may be reduced by means of formula
(60.17) to linear combinations of $\mathbf{s}$.

Thus the expression for the potential energy has the form 


$$U = U_1(r) + U_2(r)(\mathbf{s}_1\cdot\mathbf{s}_2) + U_3(r)(\mathbf{s}_1\cdot\mathbf{r})(\mathbf{s}_2\cdot\mathbf{r}), \tag{76.1}$$


where $U_1$, $U_2$, $U_3$ are certain functions depending on the distance between
the particles. Besides the operator (76.1), representing a potential energy of
the ordinary type, the interaction between a proton and a neutron may also
have the character of an exchange force. According to the results of §67,
$U_{\rm exch}$ can be written by means of the exchange operator $\hat{P}_{12}$ in the form


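$$U_{\rm exch} = \hat{P}_{12}\left[U_4(r) + U_5(r)(\mathbf{s}_1\cdot\mathbf{s}_2) + U_6(r)(\mathbf{s}_1\cdot\mathbf{r})(\mathbf{s}_2\cdot\mathbf{r})\right]. \tag{76.2}$$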


Here $U_4$, $U_5$ and $U_6$ are functions of the distance between the particles
independent of their spins. For generality it is assumed that the form of
these functions is different from the form of the functions $U_1$, $U_2$, $U_3$
involved in the potential energy of the ordinary interaction. The total
interaction energy is equal to the sum of expressions (76.1) and (76.2). Available
data on the stable states of the deuteron, the study of neutron-proton
scattering etc. do not, as yet, allow one to determine the form of these functions.
Moreover, there are no grounds for considering any of these functions to be
small in comparison with the others. Thus even the simplest nuclear system
turns out to be immeasurably more complex than atomic systems.

Experimental data already make it possible to carry out a classification
of the states of the deuteron. As is easily seen, the Hamiltonian of a system
of two nucleons (a proton and a neutron) with the interaction energy written
above leads to two conservation laws: the total angular momentum
conservation law and the parity conservation law.

The states of the deuteron are denoted by the same symbols as the states
of atoms. States with the orbital angular momentum L = 0, 1, 2, ... are
denoted respectively by S, P, D and so on. The multiplicity 2S+1 of the
term is denoted by a superscript on the left (S is the total spin of the
deuteron). The subscript on the right denotes the total angular momentum J
of the deuteron. For example, in the state $^3P_0$ the total spin is equal to one,
L = 1, and the total angular momentum is equal to zero.

Let us discuss the possible states of the system taking into account the
fact that the spins of the neutron and proton are equal to 1/2. Formal
application of the rule of addition of angular momenta leads to the following
possible states of the system:


$^1S_0$, $^1P_1$, $^1D_2$ (singlets),
$^3S_1$, $^3P_0$, $^3P_1$, $^3P_2$, $^3D_1$, $^3D_2$, $^3D_3$ (triplets).


The S- and D-states are even, whereas the P-state is odd. We have not 
written down states with L > 2. The states realized in nature can be deduced 
only from experimental data. Experiment shows that the ground state of the 
deuteron is an even state with J = 1*. 

Further, making use of the rules of addition of angular momenta, we shall 
establish the possible states of a system with total angular momentum J = 1. 

The total spin of a system consisting of a neutron and a proton can be
equal either to zero or to one. If the spin is equal to zero, then only one state
with L = 1 leading to the total angular momentum J = 1 is possible. For a spin
equal to one the orbital angular momentum can take on three values: L = 0,
1, 2. Consequently, four states are in all possible: $^1P_1$, $^3S_1$, $^3P_1$, $^3D_1$; the
states $^1P_1$ and $^3P_1$ cannot be realized, because they are odd.

Further, it is easily seen that superpositions of states such as $^3S_1 + {}^3P_1$ or
$^3S_1 + {}^1P_1$ are impossible, since S- and P-states have different parities and the
wave function corresponding to their superposition is not an eigenfunction of
the parity operator.
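
The counting of states above can be reproduced mechanically. The following minimal sketch enumerates the terms allowed by the addition of angular momenta for two spin-1/2 particles with L ≤ 2, then selects those with J = 1 and even parity. The function names and the restriction to L ≤ 2 are choices made here for illustration.

```python
def coupled_states(max_L=2):
    """All (S, L, J) allowed by angular momentum addition for two spin-1/2 particles."""
    states = []
    for S in (0, 1):                              # total spin: singlet or triplet
        for L in range(max_L + 1):                # orbital angular momentum
            for J in range(abs(L - S), L + S + 1):
                states.append((S, L, J))
    return states


def term_symbol(S, L, J):
    """Spectroscopic notation: multiplicity, orbital letter, total J."""
    return f"{2 * S + 1}{'SPD'[L]}{J}"


def parity(L):
    """Spatial parity of a state with orbital angular momentum L."""
    return (-1) ** L


all_states = coupled_states()
j1_states = [s for s in all_states if s[2] == 1]
even_j1_states = [s for s in j1_states if parity(s[1]) == +1]

print([term_symbol(*s) for s in all_states])
print([term_symbol(*s) for s in j1_states])
print([term_symbol(*s) for s in even_j1_states])
```

The last list contains only the two even J = 1 terms, the triplet S-term and the triplet D-term, which are the only candidates for the ground state of the deuteron.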

Thus the deuteron can be either in the state $^3S_1$ or in the state $^3D_1$ or in
a state which is a superposition of these two states. The S-state is spherically
symmetric. If the deuteron were in this state, then its quadrupole moment
would be equal to zero. Experiment shows, however, that the quadrupole
moment of the deuteron is different from zero, although it is small. This
means that the normal state of the deuteron represents the superposition of
the spherically symmetric $^3S_1$-state and the asymmetric $^3D_1$-state.


*See L.D.Landau and Ya.Smorodinskii, Lectures on nuclear theory (Consultants
Bureau, New York, 1958); A.I.Akhiezer and I.Ya.Pomeranchuk, Nekotorye voprosy
teorii yadra (Some problems of nuclear theory) (Gostekhizdat, Moscow, 1950) §2 and
§5.

Knowing the experimental value of the quadrupole moment of the
deuteron, one can estimate the contribution given by the $^3D_1$-state to the wave
function of the deuteron. This contribution turns out to be small. Thus it can
be assumed that the deuteron is a spherically symmetric system with a small
admixture of asymmetry brought about by the D-state.

We shall consider further a rough model of the deuteron in which we
assume that the potential energy of the interaction between the neutron and
the proton depends only on the distance between them. In other words, we
shall retain only the first term $U_1(r) = U(r)$ in formula (76.1). We shall
disregard the asymmetry of the deuteron, assuming it to be in the ground state.
The equation for the relative motion of the neutron and proton can be
written, in correspondence with formula (14.11), in the form

$$\left[-\frac{\hbar^2}{2\mu}\nabla^2 + U(r)\right]\psi_0 = \varepsilon\psi_0. \tag{76.3}$$

In this case the reduced mass $\mu$ of the system is given by

$$\frac{1}{\mu} = \frac{1}{m_p} + \frac{1}{m_n},$$

where $m_p$ is the mass of the proton, and $m_n$ is the mass of the neutron.
Since $m_n \approx m_p$, we obtain

$$\frac{1}{\mu} \approx \frac{2}{m_p}.$$


As to the potential energy U(r), we confine ourselves only to the general
assumption that it tends to zero rapidly for $r > r_0$, where $r_0$ is the range of nuclear
forces. We cannot give the concrete form of U(r) for $r < r_0$, since we do not
know the law of interaction of nuclear forces.

If the function $\psi_0$ is sought in the form

$$\psi_0(r) = \frac{\chi(r)}{r}, \tag{76.4}$$

then making use of formula (35.16) with l = 0 we obtain the equation for the
function $\chi(r)$

$$\left[-\frac{\hbar^2}{m_p}\frac{d^2}{dr^2} + U(r)\right]\chi = \varepsilon\chi. \tag{76.5}$$

For $r > r_0$ eq. (76.5) is written in the form

$$-\frac{\hbar^2}{m_p}\frac{d^2\chi}{dr^2} = \varepsilon\chi. \tag{76.6}$$

We seek the solution decreasing at infinity in the form

$$\chi = Ce^{-\alpha r}. \tag{76.7}$$

Substituting (76.7) into (76.6) we obtain the relation for $\alpha$

$$-\frac{\hbar^2}{m_p}\alpha^2 = \varepsilon, \qquad \alpha = \left(\frac{m_p|\varepsilon|}{\hbar^2}\right)^{1/2}. \tag{76.8}$$

Then for the wave function we have

$$\psi_0 = C\,\frac{e^{-\alpha r}}{r}. \tag{76.9}$$





As a characteristic of the size of the deuteron one can choose the quantity
$r_1 = \alpha^{-1}$, i.e. the distance over which the wave function $\chi$ decreases by a factor
e. The distance $r_1$ is easily determined from relation (76.8), since the binding
energy of the deuteron is well known from experimental data to be $|\varepsilon| =
2.19$ MeV. Substituting the values of $\hbar$ and $m_p$ into formula (76.8), we
obtain $r_1 = 4.3\times10^{-13}$ cm. Consequently, the wave function $\psi_0$ of the deuteron
differs from zero in a range considerably larger than the range of nuclear
forces ($r_0 \approx 2\times10^{-13}$ cm). Thus we see that the neutron and proton can be
observed with a high probability at distances from each other which
substantially exceed the size of the sphere of action of nuclear forces.
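
The numerical estimate of the deuteron size can be checked with a few lines of arithmetic. The sketch below evaluates $\alpha = (m_p|\varepsilon|)^{1/2}/\hbar$ in the convenient units MeV and fm. The values of $\hbar c$ and $m_pc^2$ and the function names are assumptions introduced here; the binding energy and the range of nuclear forces are the figures quoted in the text.

```python
import math

HBAR_C_MEV_FM = 197.327      # hbar * c in MeV fm (assumed standard value)
M_P_C2_MEV = 938.272         # proton rest energy in MeV (assumed standard value)
BINDING_ENERGY_MEV = 2.19    # |epsilon|, as quoted in the text
R0_CM = 2.0e-13              # range of nuclear forces, as quoted in the text


def decay_constant_per_fm(binding_mev):
    """alpha = sqrt(m_p |eps|) / hbar, written as sqrt(m_p c^2 |eps|) / (hbar c)."""
    return math.sqrt(M_P_C2_MEV * binding_mev) / HBAR_C_MEV_FM


def deuteron_size_cm(binding_mev):
    """r_1 = 1 / alpha, converted from fm to cm (1 fm = 1e-13 cm)."""
    return 1.0e-13 / decay_constant_per_fm(binding_mev)


r1_cm = deuteron_size_cm(BINDING_ENERGY_MEV)
print(f"alpha = {decay_constant_per_fm(BINDING_ENERGY_MEV):.4f} fm^-1")
print(f"r_1   = {r1_cm:.2e} cm")
print(f"r_1 / r_0 = {r1_cm / R0_CM:.2f}")
```

The ratio $r_1/r_0$ comes out somewhat larger than two, which is the quantitative content of the statement that the deuteron extends well beyond the range of nuclear forces.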

The dependence of the wave function $\psi_0$ on the distance cannot be
determined in the region $r < r_0$, since the potential energy in this region is
unknown.

However, from the general theory of motion in a spherically symmetric
field it follows that for $r \to 0$ the function $\chi$ is proportional to $r^{l+1}$ (see §35)
and, consequently, in the S-state is proportional to r. Thus at small distances
the function $\chi$ tends to zero.

The constant C contained in the $\psi$-function can be found from the
normalization condition. For $r < r_0$ we also take the wave function $\psi_0$ in
the form (76.9), which we thus assume to be valid over all space. This does not
introduce a substantial error, since a large part of the normalization integral
refers to the region $r > r_0$. Substituting (76.9) into the normalization
condition, we find

$$C = \left(\frac{\alpha}{2\pi}\right)^{1/2}. \tag{76.10}$$


Let us now establish the general relation between the width of the well
$r_0$ and its depth. For this we integrate eq. (76.5) in the range from zero to
$r = r_0$. As a result of the integration we obtain

$$\chi'_{r=r_0} = \chi'_{r=0} + \frac{m_p}{\hbar^2}\int_0^{r_0}U(r)\,\chi\,dr + \frac{m_p|\varepsilon|}{\hbar^2}\int_0^{r_0}\chi\,dr. \tag{76.11}$$

As can be seen from fig. V.22, the value of the derivative $|\chi'|$ taken at the
point $r = r_0$ is considerably smaller than that of the derivative $|\chi'_{r=0}|$.
Furthermore, one can disregard the binding energy in comparison with the potential
energy of interaction, i.e. one can assume that $|\varepsilon| \ll |U(r)|$ for $r < r_0$. At
small distances $\chi = Nr$, where N is a constant. Then (76.11) transforms into
the form

$$-N = \frac{m_p}{\hbar^2}\,N\int_0^{r_0}U(r)\,r\,dr \tag{76.12}$$

or

$$\int_0^{r_0}U(r)\,r\,dr = -\frac{\hbar^2}{m_p}. \tag{76.13}$$


Replacing the integral in (76.13) by $U_0r_0^2$, where $U_0$ is a mean energy of
interaction, i.e. the mean depth of the well, we obtain in order of magnitude

$$U_0 \sim -\frac{\hbar^2}{m_pr_0^2} \sim -40\ \text{MeV}.$$

Fig. V.22. The function $\chi(r)$ plotted against r; the point $r_0$ is marked on the r-axis.

§77. Nuclear-shell theory 


In contrast to atoms, in which the interaction between electrons is of a 
secondary character and takes place against the background of the principal 
interaction (the attraction towards the nucleus) there is no single centre of 
interaction in atomic nuclei. 

On the contrary, all the nuclear particles (nucleons) interact intensively
with each other through strong nuclear forces. Hence for a long time it
seemed that it made no sense to distinguish between the states of the
individual particles in the nucleus, and that one could speak only of the state of
the system as a whole.

It turned out, however, that a number of the observed properties of 
atomic nuclei pointed to the conservation of the individuality of nucleons in 
nuclei. Apparently, the conservation of individuality of particles in nuclei is 
associated with the fact that the nuclear forces decrease very rapidly with 
increasing distance, and that the kinetic energy of the nucleons in nuclei is 
very large. 

Proceeding from the assumption that each of the nucleons in a nucleus 
moves in the self-consistent field produced by all the other nucleons, it was 
possible to account for a number of important properties of nuclei. 

The self-consistent field of most nuclei is spherically symmetric. Of course, 
the precise law of the potential distribution inside the nucleus is unknown. 
It turns out, however, that the character of the distribution of levels depends 
relatively little on the model of the potential field adopted, provided it gives 
correctly the basic feature of the field of the nucleus as a whole: the sharp 
increase of the potential at its surface r= R. The simplest nuclear model is a 
spherical potential well of infinite depth. In this model the self-consistent 
potential field U in which an individual nucleon moves has the form 


$$U = \begin{cases} 0 & \text{for } r < R, \\ \infty & \text{for } r > R. \end{cases}$$


Already this very simplified model allows one to get a general idea of the 
properties and distribution of levels. The wave function satisfies the equation 


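$$\frac{d^2\psi}{dr^2} + \frac{2}{r}\frac{d\psi}{dr} + \left[k^2 - \frac{l(l+1)}{r^2}\right]\psi = 0, \tag{77.1}$$

where $k^2 = 2mE/\hbar^2$. The solution of this equation is expression (36.9). The
boundary condition at the surface of the nucleus

$$\psi = 0 \quad \text{for } r = R \tag{77.2}$$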





leads to the condition

$$J_{l+1/2}(kR) = 0. \tag{77.3}$$

For a given value of the orbital angular momentum the energy levels, which
are the roots of the transcendental equation (77.3), are classified by means of
the principal quantum number n. To the smallest root of this transcendental
equation there corresponds a wave function having no nodes for r < R. This
level is classified as a level with n = 1. The next root of (77.3) corresponds to
a wave function having one node for r < R. The corresponding state is
denoted by n = 2 and so on. For a given n, l can take on any value. The energy
of the nucleon increases with increasing orbital angular momentum.
The ordering of levels is given by the sequence


1s, 1p, 1d, 2s, 1f, 2p, 1g, 2d, ... .
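
The ordering of levels in the infinitely deep spherical well can be verified numerically. The roots of $J_{l+1/2}(kR) = 0$ coincide with the zeros of the spherical Bessel function $j_l(kR)$, and the energy grows as $(kR)^2$. The following is a minimal sketch that locates these zeros with the standard library only. The function names, the step size and the cut-off `x_max` are choices made here for illustration, and the upward recurrence is used only where it is numerically safe (argument larger than l).

```python
import math

ORBITAL_LETTERS = "spdfgh"


def spherical_bessel(l, x):
    """j_l(x) by upward recurrence; adequate here because every call has x > l."""
    prev = math.sin(x) / x
    if l == 0:
        return prev
    curr = math.sin(x) / x ** 2 - math.cos(x) / x
    for n in range(1, l):
        prev, curr = curr, (2 * n + 1) / x * curr - prev
    return curr


def bessel_zeros(l, x_max, step=0.01):
    """Zeros of j_l in (l + 0.5, x_max), found by a sign scan plus bisection."""
    zeros = []
    x = l + 0.5                       # the first zero of j_l lies above l + 1/2
    f_left = spherical_bessel(l, x)
    while x < x_max:
        x_right = x + step
        f_right = spherical_bessel(l, x_right)
        if f_left * f_right < 0:      # sign change brackets a root
            a, b, fa = x, x_right, f_left
            for _ in range(60):
                mid = 0.5 * (a + b)
                fm = spherical_bessel(l, mid)
                if fa * fm <= 0:
                    b = mid
                else:
                    a, fa = mid, fm
            zeros.append(0.5 * (a + b))
        x, f_left = x_right, f_right
    return zeros


def level_ordering(x_max=9.2, l_max=5):
    """Levels (kR, label) sorted by kR, i.e. by energy E = (hbar k)^2 / 2m."""
    levels = []
    for l in range(l_max + 1):
        for n, z in enumerate(bessel_zeros(l, x_max), start=1):
            levels.append((z, f"{n}{ORBITAL_LETTERS[l]}"))
    return sorted(levels)


for kR, label in level_ordering():
    print(f"{label}: kR = {kR:.4f}")
```

With the cut-off chosen here the sketch lists the eight lowest levels, which can be compared with the sequence given above.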


It turned out that in atomic nuclei an important role is played by the
spin-orbit interaction, which has so far not been taken into account at all.
The special role of the spin-orbit interaction is associated with the fact that,
owing to the rapid decrease of nuclear forces with increasing distance between
the particles, the energy of pair interaction is on the average small. The
considerable magnitude of the spin-orbit interaction leads to the establishment
of jj coupling in nuclei. The spin and orbital angular momenta of each nucleon
are added up into the total angular momentum j. The energy of a nucleon
turns out to depend on its spin. In nuclei there exists the inverted structure
of levels, in which levels with large j lie below those with smaller j. The states
of nucleons are denoted by the symbol $nl_j$; for example $1s_{1/2}$ or $2p_{3/2}$. Since
nucleons obey the Pauli principle, 2j+1 neutrons and 2j+1 protons can be in
each state with given values of n, l and j. Because of this there is a situation
in nuclei very similar to that in atoms: the states of the nucleons can be
divided into groups or shells. When each shell is filled a closed configuration
arises, possessing the greatest stability, in the given case the largest energy
of binding of the nucleon in the nucleus. The energy distribution of the states
and the number of nucleons in these states is given by the table


States                                            Number of nucleons

1s1/2                                              2
1p3/2, 1p1/2                                       6
1d5/2, 1d3/2, 2s1/2, 1f7/2                        20
2p3/2, 1f5/2, 2p1/2, 1g9/2                        22
2d5/2, 1g7/2, 1h11/2, 2d3/2, 3s1/2                32
2f7/2, 1h9/2, 1i13/2, 2f5/2, 3p3/2, 3p1/2         44


From this table it is seen that in the closed shells there are consecutively 2, 8,
28, 50, 82 and 126 nucleons.
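
The numbers just quoted follow from the table by simple counting: each state $nl_j$ holds 2j+1 nucleons of a given kind, and the closed-shell totals are the running sums over the groups. A minimal sketch of this bookkeeping is given below. The grouping of states into shells is the one read from the table above (several labels there had to be restored from context, which is an assumption), and the helper names are introduced here for illustration.

```python
from fractions import Fraction
from itertools import accumulate

# Shell grouping as read from the table in the text (labels restored from context).
SHELLS = [
    ["1s1/2"],
    ["1p3/2", "1p1/2"],
    ["1d5/2", "1d3/2", "2s1/2", "1f7/2"],
    ["2p3/2", "1f5/2", "2p1/2", "1g9/2"],
    ["2d5/2", "1g7/2", "1h11/2", "2d3/2", "3s1/2"],
    ["2f7/2", "1h9/2", "1i13/2", "2f5/2", "3p3/2", "3p1/2"],
]


def occupancy(label):
    """Number of identical nucleons in the state n l_j, equal to 2j + 1."""
    j = Fraction(label[2:])          # characters after 'n' and the orbital letter
    return int(2 * j + 1)


shell_sizes = [sum(occupancy(state) for state in shell) for shell in SHELLS]
closed_shell_totals = list(accumulate(shell_sizes))

print("nucleons per shell:  ", shell_sizes)
print("closed-shell totals: ", closed_shell_totals)
```

The per-shell counts reproduce the right-hand column of the table, and the running sums reproduce the closed-shell numbers quoted in the text.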

Correspondingly, nuclei with the total number of nucleons given by these 
numbers, which are called magic numbers, possess particular stability. 

Nuclei in which both the number of protons and the number of neutrons
are magic numbers are particularly stable. Such nuclei are, for example, $^4$He
and $^{16}$O. They are often said to be doubly magic. This very simple scheme
makes it possible to account not only for a particular stability and natural 
abundance of certain isotopes but also a number of other properties of 
atomic nuclei, for example their magnetic moments. However, in a number of 
cases it turns out to be inadequate. Thus, for example, nuclei with open shells 
display deviations from spherical form. This is exhibited in the presence of 
rotation of the nucleus as a whole. The experimental proof of rotation of the 
nucleus is the presence in nuclear spectra of a structure similar to that of the 
spectra of diatomic molecules. 

We cannot dwell on details here, but the reader may find these in the 
specialized literature*. 


* See, for example, P.E.Nemirovskii, Contemporary models of the atomic nucleus 
(Pergamon Press, Oxford, 1963). 





10 





The Theory of Diatomic Molecules 


§78. The adiabatic approximation and the classification of electron terms 


We now turn to the study of the properties of a more complex system, i.e. 
a molecule. We shall confine ourselves to the consideration of the simplest 
diatomic molecule. The basic properties of molecular systems will be illus- 
trated in this simple example. 

We have seen in the preceding chapter that the calculation for atoms is 
carried out by approximate methods. It is natural that approximate methods 
of calculation should also be widely used in the theory of molecules. The 
Hamiltonian of the diatomic molecule is of the form 


$$\hat{H}_{\rm mol} = -\frac{\hbar^2}{2M_1}\nabla_{R_1}^2 - \frac{\hbar^2}{2M_2}\nabla_{R_2}^2 - \frac{\hbar^2}{2m}\sum_k\nabla_k^2 + U(\mathbf{r}_k, \mathbf{R}_i). \tag{78.1}$$

Here $M_1$ and $M_2$ are the masses of the nuclei, m is the mass of an electron, $\mathbf{r}_k$
are the coordinates of the electrons, and $\mathbf{R}_i$ are the coordinates of the nuclei.
The potential energy $U(\mathbf{r}_k, \mathbf{R}_i)$ involves the interaction of the electrons with
the nuclei, the interaction of the electrons with each other and the interaction
of the nuclei with each other. The summation over k is carried out over all the
electrons of the molecule. The velocities of the nuclei, which have masses
larger than the electron mass by a factor of several thousand, are substantially 
lower than the velocities of the electrons. Correspondingly, in the theory of 
molecules use is made of the adiabatic approximation (cf. §57). The wave 
function of the system is written in the form 


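$$\psi_{\rm mol} = \Phi_n(\mathbf{R}_i)\,\psi_n(\mathbf{r}_k, \mathbf{R}_i).$$

Using formulae (57.7) and (57.8), we can write the equations for the
functions $\Phi$ and $\psi$:

$$\left[-\frac{\hbar^2}{2m}\sum_k\nabla_k^2 + U(\mathbf{r}_k, \mathbf{R}_i)\right]\psi_n = E_n(\mathbf{R}_i)\,\psi_n; \qquad \hat{H}_{\rm el}\psi_n = E_n\psi_n, \tag{78.2}$$

$$\left[-\frac{\hbar^2}{2M_1}\nabla_{R_1}^2 - \frac{\hbar^2}{2M_2}\nabla_{R_2}^2 + E_n(\mathbf{R}_i)\right]\Phi_n(\mathbf{R}_i) = E\,\Phi_n(\mathbf{R}_i). \tag{78.3}$$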


Eq. (78.2) describes the motion of electrons for nuclei at rest. The quantity
$E_n(\mathbf{R}_i)$ defines the energy levels of the system for motionless nuclei which are
at a fixed distance from each other. The energy $E_n$ for a fixed distance
between the nuclei is called the electron term.

The Schrödinger equation (78.3) describes the motion of the nuclei. The
quantity $E_n(\mathbf{R}_i)$ in it is the potential energy of the nuclei. From eq. (78.3) we
see that the total energy of the nuclei E depends on the state of the electron
part of the system, i.e. on the electron term.

The number of electrons in a molecule is always greater than one. Hence
even the solution of the approximate equation (78.2) is associated with great
mathematical difficulties which are insuperable in the case of many-electron
molecules. (An exception to this is the ion $\mathrm{H}_2^+$, for which a precise solution
of the Schrödinger equation for the electron part of the wave function has
been obtained.) We are forced, without trying to solve eq. (78.2), to find the
most general properties of a system of electrons moving in the field of two
nuclei.

For this purpose we find, as usual, quantities which commute with the
Hamiltonian $\hat{H}_{\rm el}$; in other words we find quantities which simultaneously have
definite values in stationary states of the system. In contrast to the atomic
field, which possesses spherical symmetry, the field of a diatomic molecule
has cylindrical symmetry. The symmetry axis is the straight line joining the
two nuclei, which we shall choose to be the z-axis in what follows. The
potential energy of the interaction of the electrons with the nuclei, as well as
that of the interaction of the electrons with each other, does not change
under a rotation through the angle $\varphi$ about the z-axis. Hence the
Hamiltonian of the system of electrons

$$\hat{H}_{\rm el} = -\frac{\hbar^2}{2m}\sum_k\nabla_k^2 + U$$

does not depend on the angle $\varphi$.

Thus we arrive at the conclusion that the component of the total angular
momentum of the electrons along the axis of the molecule is conserved.
Disregarding the weak spin-orbit interaction, it can be assumed that the
component of the orbital angular momentum of the electrons in the
z-direction is also conserved. The states of the electrons are classified according to
the eigenvalues of the operator $\hat{L}_z$. The eigenvalues of the z-component of the
orbital angular momentum of the electrons are denoted by the letter $\Lambda$. States
with $\Lambda$ = 0, 1, 2 are called $\Sigma$-, $\Pi$- and $\Delta$-states (in analogy with the S-, P- and
D-states of atoms). The total spin of the electrons S is also conserved for a
system of electrons in a molecule.

All the reasoning about the total spin which was presented in the theory of
the atom applies also to the molecule.

As in the case of an atom, the multiplicity 2S+1 of an electron term of a
molecule is indicated in the form of a superscript on the left of the quantum
number $\Lambda$, i.e. in the form $^{2S+1}\Lambda$.

Further, we shall show that the Hamiltonian $\hat{H}_{\rm el}$ does not change under the
reflection of the coordinates of the electrons in any plane passing through the
nuclei of the molecule (i.e. through the z-axis). In other words, we shall show
that the Hamiltonian commutes with the reflection operator. This can easily
be seen if, for example, the plane in which the reflection takes place is chosen
in such a way that it passes through the z-axis and y-axis. In this case the
reflection corresponds to the replacement of all coordinates $x_i \to -x_i$. But,
since the interaction depends only on the distances between the particles,
which contain the x-coordinates only in combinations such as $(x_i - x_k)^2$,
it becomes evident that the Hamiltonian commutes with the
reflection operator. Hence it follows that the operator $\hat{H}_{\rm el}$ and the reflection
operator in a plane passing through the axis of the molecule have common
eigenfunctions. Therefore stationary states can be characterized, in addition
to the eigenvalues $\Lambda$, by the eigenvalues $P_i$ of the reflection operator. The
latter, as is easily seen, takes on two values $P_i = \pm1$. However, things are
complicated by the fact that the angular momentum component operator $\hat{L}_z$
does not commute with the reflection operator. Indeed, the operator of the
z-component of the angular momentum has the form

$$\hat{L}_z = -i\hbar x\frac{\partial}{\partial y} + i\hbar y\frac{\partial}{\partial x}.$$

The x-coordinate will change sign under reflection in the zy-plane, whereas
the y-coordinate will not. Hence it follows directly that the operators $\hat{P}_i$ and
$\hat{L}_z$ do not commute. Therefore molecular terms cannot be simultaneously
characterized by means of the quantities $\Lambda$ and $P_i$, except for terms for which
$L_z = 0$. In this last case, states with parity $P_i = 1$ and $P_i = -1$, which are
denoted by $\Sigma^+$ and $\Sigma^-$, are possible.

The wave functions corresponding to these states respectively do not change
and change sign under the action of the operator $\hat{P}_i$, corresponding to
reflection in a plane passing through the nuclei of the molecule.

Let us now consider the particular case of a molecule with identical nuclei.
If the origin is chosen to be at the point which lies on the z-axis halfway
between the nuclei, then it is easily seen that the operator of inversion of the
electron coordinates (corresponding to the replacement of all coordinates of
the electrons by the inverse coordinates $\mathbf{r}_i \to -\mathbf{r}_i$) commutes with the
Hamiltonian $\hat{H}_{\rm el}$. Since at the same time the operator $\hat{P}_i$ of the reflection of
electron coordinates in a plane passing through the nuclei of the molecule
commutes with $\hat{H}_{\rm el}$, the state with $L_z = 0$ can be characterized by three
eigenvalues: $\Lambda = 0$, $P_i = \pm1$ and the eigenvalue of the inversion operator (see §33).
The latter has two values $\pm1$ denoted by the letters g (even state) and u (odd
state). These indices are written as subscripts on the right. For example,
$^1\Sigma_g^+$ corresponds to a term whose wave function is even and does not change
sign under the action of the operator of reflection in a plane passing through
the z-axis; the z-component of the angular momentum is equal to zero; the
term is singlet. Further, we know that the inversion operator commutes with
the operator $\hat{L}_z$. Hence the states $\Pi$ and $\Delta$ also can be both even and odd. In
other words, the states $\Pi_u$, $\Pi_g$, $\Delta_u$, $\Delta_g$ and so on are possible.

Let us dwell on the problem of degeneracy of the electron terms. If $\Lambda$ is
defined, then this means that the absolute value of the z-component of the
angular momentum is defined. Since the energy of the system cannot depend
on the orientation of the angular momentum component with respect to the
z-axis, i.e. is the same for $L_z = +\Lambda$ and $L_z = -\Lambda$, we arrive at the conclusion
that each term with $L_z \neq 0$ is two-fold degenerate. Finally, we point out that
the energy of the electron terms of the molecule is of the same order as the
energy of atomic terms.


§79. The hydrogen molecule. Ideas of the theory of chemical binding 
The only molecule for which one can obtain a reasonably accurate solution 


of the equation for the electron term is the hydrogen molecule. This calcula- 
tion is of great theoretical importance. 




If the energy is measured with respect to the energy of the separated 
motionless atoms, then the energy values of the electron terms of the stable 
molecule have negative values. The energy (negative) of the molecule is a 
measure of the chemical binding of the constituent atoms. Thus the calcula- 
tion of the electron terms of the molecule represents at the same time a 
quantitative theory of chemical binding between atoms. 

The establishment of the nature of chemical binding is one of the funda- 
mental results of quantum mechanics. 

Before the appearance of quantum mechanics there were no substantiated 
concepts of the nature of chemical binding, in particular of the nature of 
homopolar molecules. We recall that homopolar molecules are molecules 
made up of neutral atoms. For example, molecules containing identical atoms 
are of this type. We shall try to account for certain characteristic features of 
the theory of chemical binding in the example of the hydrogen molecule. 

The Schrödinger equation for the electron terms of the hydrogen molecule
is of the form

$$\left[-\frac{\hbar^2}{2m}(\nabla_1^2 + \nabla_2^2) + \frac{e^2}{R} - \frac{e^2}{r_{a_1}} - \frac{e^2}{r_{a_2}} - \frac{e^2}{r_{b_1}} - \frac{e^2}{r_{b_2}} + \frac{e^2}{r_{12}}\right]\psi = E\psi. \tag{79.1}$$

Here R is the distance between the nuclei of the hydrogen atoms; the
quantities $r_{a_1}$, $r_{a_2}$, $r_{b_1}$, $r_{b_2}$ are respectively the distances between nucleus a and the
first electron, nucleus a and the second electron, nucleus b and the first
electron, and nucleus b and the second electron; and $r_{12}$ is the distance between
the electrons.

If the atoms forming the molecule are placed an infinitely large distance
apart, then it can be said that one of the electrons, for example electron 1, will be
bound to nucleus a, and the other (electron 2) to nucleus b. By virtue of the
identity of electrons, such a statement makes no sense when the atoms are
brought together.

An exact solution of the Schrödinger equation (79.1) involves great
mathematical difficulties. Hence a number of approximate calculations have
been carried out. We shall make use of perturbation theory, which allows the
basic properties of the system to be elucidated relatively simply. The question
as to the degree of accuracy of the calculation will be discussed later. We
choose as the wave function of zero order approximation the wave function
of the system with infinitely distant nuclei. For an infinite distance between
the nuclei ($R \to \infty$) the wave function of the two electrons and two nuclei has
the form

$$\psi_1^0 = \psi_a(r_{a_1})\,\psi_b(r_{b_2}), \tag{79.2}$$








where $\psi_a(r_{a_1})$ and $\psi_b(r_{b_2})$ are the wave functions of the hydrogen atom in
which respectively the first electron is near nucleus a and the second electron
is near nucleus b.

Evidently, these functions satisfy the equations

$$\left(-\frac{\hbar^2}{2m}\nabla_1^2 - \frac{e^2}{r_{a_1}}\right)\psi_a(r_{a_1}) = E_0\,\psi_a(r_{a_1}),$$
$$\left(-\frac{\hbar^2}{2m}\nabla_2^2 - \frac{e^2}{r_{b_2}}\right)\psi_b(r_{b_2}) = E_0\,\psi_b(r_{b_2}). \tag{79.3}$$

We are interested in the normal state of the hydrogen molecule. Therefore
$E_0$ must be understood to be the lowest energy level of the hydrogen atom.

The same energy is also possessed by the state

$$\psi_2^0 = \psi_a(r_{a_2})\,\psi_b(r_{b_1}), \tag{79.4}$$

which differs from the first state by electron exchange. We stress that $\psi_1^0$ and
$\psi_2^0$ are eigenfunctions of different operators and are not orthogonal to each
other.

We write the wave functions of the zero order approximation in the form
of symmetrized combinations of the functions $\psi_1^0$ and $\psi_2^0$, i.e. as

$$\psi_s^0 = A_s\left[\psi_a(r_{a_1})\psi_b(r_{b_2}) + \psi_a(r_{a_2})\psi_b(r_{b_1})\right],$$
$$\psi_a^0 = A_a\left[\psi_a(r_{a_1})\psi_b(r_{b_2}) - \psi_a(r_{a_2})\psi_b(r_{b_1})\right]. \tag{79.5}$$

The constants $A_s$ and $A_a$ are defined by the normalization condition

$$\int|\psi_s^0|^2\,dV_1\,dV_2 = \int|\psi_a^0|^2\,dV_1\,dV_2 = 1.$$

They are equal to

$$A_s = [2(1+s^2)]^{-1/2}, \qquad A_a = [2(1-s^2)]^{-1/2},$$

where the quantity s represents the degree of non-orthogonality of the
functions $\psi_1^0$ and $\psi_2^0$ and is equal to

$$s^2 = \int\psi_1^0\,\psi_2^0\,dV_1\,dV_2. \tag{79.6}$$


For such a choice of unperturbed functions, which represent the sym- 
metrized wave functions of individual atoms, the perturbation operator is 
expressed by 2 




Direct application of perturbation theory is inadmissible: the zero order wave 
functions are not orthogonal to each other. Therefore it is necessary to 
modify the perturbation theory somewhat. We write the perturbed wave 
functions and the energy of the perturbed system in the form 


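$$\psi_s = \psi_s^0 + \varphi_s, \qquad \psi_a = \psi_a^0 + \varphi_a, \qquad E = 2E_0 + \varepsilon. \tag{79.7}$$

Then eq. (79.1) for the function $\psi_s$ is written in the form

$$\frac{\hbar^2}{2m}(\nabla_1^2+\nabla_2^2)\,A_s\left[\psi_a(r_{a_1})\psi_b(r_{b_2}) + \psi_a(r_{a_2})\psi_b(r_{b_1})\right] + \frac{\hbar^2}{2m}(\nabla_1^2+\nabla_2^2)\,\varphi_s +$$

$$+ \left[E - \frac{e^2}{R} + \frac{e^2}{r_{a_1}} + \frac{e^2}{r_{a_2}} + \frac{e^2}{r_{b_1}} + \frac{e^2}{r_{b_2}} - \frac{e^2}{r_{12}}\right] \times A_s\left[\psi_a(r_{a_1})\psi_b(r_{b_2}) + \psi_a(r_{a_2})\psi_b(r_{b_1})\right] +$$

$$+ \left[E - \frac{e^2}{R} + \frac{e^2}{r_{a_1}} + \frac{e^2}{r_{a_2}} + \frac{e^2}{r_{b_1}} + \frac{e^2}{r_{b_2}} - \frac{e^2}{r_{12}}\right]\varphi_s = 0.$$

Using eq. (79.3) and dropping small terms containing the product of the
perturbation operator with the perturbed wave function, we obtain

$$\left[\frac{\hbar^2}{2m}(\nabla_1^2+\nabla_2^2) + \frac{e^2}{r_{a_1}} + \frac{e^2}{r_{b_2}} + 2E_0\right]\varphi_s = A_s\left(\frac{e^2}{R} + \frac{e^2}{r_{12}} - \varepsilon\right)\left[\psi_1^0 + \psi_2^0\right] -$$

$$- A_s\left[\left(\frac{e^2}{r_{a_2}} + \frac{e^2}{r_{b_1}}\right)\psi_1^0 + \left(\frac{e^2}{r_{a_1}} + \frac{e^2}{r_{b_2}}\right)\psi_2^0\right]. \tag{79.8}$$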


An analogous expression is obtained upon substituting $\psi_a$.
For further calculations we shall make use of the following general
theorem. In order that the solution of the inhomogeneous equation

$$(\hat{H}_0 - E_n^0)\,\psi = f$$

may exist it is necessary that the right-hand side f be orthogonal to the
function $\psi_n^0$ satisfying the homogeneous equation.
(The proof is particularly easy for the case of a non-degenerate spectrum.
We write the equation in the form

$$(\hat{H}_0 - E_n^0)\,\psi = f, \tag{79.9}$$

where $\hat{H}_0$ is a linear operator having the non-degenerate spectrum of
eigenvalues

$$\hat{H}_0\psi_k^0 = E_k^0\psi_k^0.$$

We expand the function $\psi$ in terms of the functions $\psi_k^0$:

$$\psi = \sum_k a_k\psi_k^0.$$

Substituting into (79.9), we have

$$\sum_{k\neq n} a_k\,(E_k^0 - E_n^0)\,\psi_k^0 = f.$$

Calculating the integral $\int\psi_n^{0*}f\,dV$, we find that it is equal to zero. This proves
our statement.)

Applying this theorem to (79.8), we require that the right-hand side of
(79.8) be orthogonal to the solution of the homogeneous equation, i.e. to the
unperturbed eigenfunction $\psi_1^0$. Multiplying the right-hand side of (79.8) by
$\psi_1^0$ and integrating, we find


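$$\int\left(\varepsilon - \frac{e^2}{R} - \frac{e^2}{r_{12}}\right)\left[\psi_1^0 \pm \psi_2^0\right]\psi_1^0\,dV_1\,dV_2 \;+$$

$$+ \int\left[\left(\frac{e^2}{r_{a_2}} + \frac{e^2}{r_{b_1}}\right)\psi_1^0 \pm \left(\frac{e^2}{r_{a_1}} + \frac{e^2}{r_{b_2}}\right)\psi_2^0\right]\psi_1^0\,dV_1\,dV_2 = 0,$$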


where the plus sign refers to the symmetric function. Analogous calculations 
lead to a formula with minus sign in the case of the antisymmetric function. 
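Taking into account the normalization condition, we have

$$\varepsilon\,(1 \pm s^2) - J \mp K = 0.$$

Solving this equation for $\varepsilon$, we find the correction to the energy

$$\varepsilon = \frac{J \pm K}{1 \pm s^2}, \tag{79.10}$$

where

$$J(R) = e^2\int\psi_a^2(r_{a_1})\,\psi_b^2(r_{b_2})\left(\frac{1}{R} + \frac{1}{r_{12}} - \frac{1}{r_{a_2}} - \frac{1}{r_{b_1}}\right)dV_1\,dV_2,$$

$$K(R) = e^2\int\psi_a(r_{a_1})\,\psi_b(r_{b_1})\,\psi_a(r_{a_2})\,\psi_b(r_{b_2}) \times \left(\frac{1}{R} + \frac{1}{r_{12}} - \frac{1}{r_{a_1}} - \frac{1}{r_{b_2}}\right)dV_1\,dV_2. \tag{79.11}$$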


In formula (79.10) the plus sign corresponds to the symmetric state 
characterized by the wave function we, and the minus sign corresponds to the 
antisymmetric state (the wave function 9). If yo and y? were mutually 
orthogonal, then we would find that s =0. In this case (79.10) would be the 
same as the ordinary formula of perturbation theory for the correction to 
the energy. 

The expressions found for J and K are analogous to the integrals obtained 
in §67. The integral J defines the Coulomb interaction of the nuclei and 
electrons with each other. The quantity K represents the exchange energy. 
The total wave function of the hydrogen molecule represents the product of 
the spatial functions and the spin function. Since the total wave function of 
the system must be antisymmetric, the spatial function symmetric in coor- 
dinates must be multiplied by the antisymmetric spin function and vice versa. 
Hence the state ($\psi_{\mathrm{s}}^0$) is a state with zero spin for the
electrons (singlet state). Analogous reasoning shows that the wave function
$\psi_{\mathrm{a}}^0$ describes a state of the molecule with a spin of one. To
find out which of these two states is the bound state (molecule), it is
necessary to find the dependence of the quantity $\varepsilon$ on the radius
$R$. This can be done if the wave functions of the normal state of the
hydrogen atom are substituted for $\psi_a$ and $\psi_b$ in the integrals
(79.11). The results of the calculation are conveniently presented in a graph
(fig. V.23). Here $E_1$ and $E_2$ are the energies of the molecule
corresponding respectively to the singlet and triplet states. We see that two
hydrogen atoms having the total electron spin equal to one cannot form a bound
state, since $E_2$ has no minimum. The bound state can only be the singlet
state. Knowing the form of $E_1(R)$, it is possible to determine the binding
energy as well as the effective size of the molecule. The minimum of the
potential energy lies at $R_0 = 0.79$ Å.
The binding energy is not in very good agreement with experimental values.
This is due to the fact that the operator chosen as the perturbation does not
contain a small parameter and is not small in comparison with the unperturbed
operator. Perturbation theory is therefore not quantitatively applicable.
Somewhat better results are obtained by other approximate methods. However,
the general qualitative conclusions on the nature of chemical binding with the
formation of a stable molecule from two atoms are correct. The stability of
the molecule is wholly determined by the magnitude and sign of the exchange
integral. Namely, for a stable chemical compound to be formed it is necessary
(although not sufficient) that the spins of the atoms be antiparallel. This is
often expressed by the not strictly correct statement that ‘the forces of
chemical binding are exchange forces’. In §67 we have discussed in detail the
quantum-mechanical theory of exchange forces and the meaning of such
formulations. In any case, it is beyond doubt that the forces responsible for
the formation of homopolar chemical compounds are of a specific
quantum-mechanical character. It is also often said that ‘antiparallel spins
are coupled’. The preceding calculation clearly shows how relative such
terminology is. In the example of the hydrogen molecule one can show not only
the quantum-mechanical nature of the forces of chemical binding but also the
difficulties arising in calculating the formation of molecular systems.

Fig. V.23. The energy $E(R)$ of the hydrogen molecule as a function of the
internuclear distance $R$ (curves $E_1$ and $E_2$).

The above calculation allows one to draw the general conclusion that to 
each valence in a chemical compound there corresponds a pair of electrons 
with antiparallel spins, bound to each other by the exchange interaction. 

Hence it follows that unpaired outer electrons, which are in open shells, 
are responsible for the chemical properties of atoms. The valence of the atom 
is determined by the number of such unpaired electrons. When a homopolar 
chemical compound is formed these electrons get ‘collectivized’, i.e. can no 
longer be considered to belong to a given atom. At the same time the con- 
figuration of the unfilled state is modified in such a way that it approaches a 
filled structure. In other words, the electrons get paired in such a way that
the spins of all the electrons in the molecule tend to compensate for each
other. Stable homopolar molecules tend to have all their electrons paired. 
In the most stable molecules all electrons are paired, the total spin of the
molecule is $S = 0$ and the multiplicity is equal to one.

These qualitative results are in good agreement with experimental data for 
most molecules. We should also mention some facts which are important for 
understanding the formation of molecules. The first of these is the property 
(mentioned in §73) of atoms to enter into chemical binding in an excited 
state rather than in the normal state. In the light of the theory just discussed 
this property becomes comprehensible. When two atoms interact, the action 
of the perturbation can cause one of them to make a transition into an 
excited state. If the energy gained in forming a compound with an atom in 
an excited state is larger than that for a compound with the atom in the 
normal state, then the former will correspond to a more stable molecular 
configuration. 

Thus, for example, we have seen in §73 that the stable term of the carbon
atom is $^3P$. Carbon atoms have two unpaired electrons in the 2p-state, and
the valence in the normal state is equal to two. However, the carbon atom has
an excited state with the configuration $1s^2 2s\,2p^3$, in which the atom is
in the state $^5S$. This state is higher than the normal state by 4.2 eV. In
the $^5S$-state carbon has four unpaired electrons and its valence is equal to
four. In compounds for which an energy larger than 4.2 eV is gained, carbon
has valence four.

Another property of molecules following from the general theory is their 
geometric form. Formation of a chemical compound is related to the values 
of the Coulomb and exchange integrals which contain the products of the 
wave functions. If among these wave functions there are wave functions of 
electrons in the p-state, which have anisotropy in space (see §38), then the 
largest overlap of the wave functions is reached in selected spatial directions. 
As an example we can mention the molecule NH$_3$. The nitrogen atom, having
the configuration $1s^2 2s^2 2p^3$ ($^4S$), has three p-electrons responsible
for chemical binding. The wave functions of these electrons have their largest
value in three mutually perpendicular directions. In the molecule NH$_3$ the
angles between the bonds N-H are close (although not exactly equal) to 90°.

The last fact important for understanding the structure of molecules is 
that the wave function of an excited state represents a linear combination of 
the wave functions of electrons. For example, in the excited state of the 
carbon atom mentioned above the wave function is a linear combination of 
the wave functions of one 2s-electron and three 2p-electrons. 2s- and 2p- 
states may be involved with different weights in forming the wave function.
This fact, called the hybridization of states, makes it possible, for instance, to
understand why all four valences of the carbon atom are completely identical 
to each other. 

Referring the reader to the specialist literature* for details, we stress that
very great mathematical difficulties arise in attempting quantitative calcula- 
tions of the forces of chemical binding and of the structure of the molecules 
formed. These difficulties are associated with the fact that the interaction 
responsible for the formation of a chemical bond cannot be considered as a 
small perturbation. 


§80. The interaction of atoms at large distances 


In addition to the forces of chemical character which in certain cases bind 
atoms into molecules, there is a weak interaction between atoms which are at 
relatively large distances from each other. Let us consider two widely sepa- 
rated atoms. Because of their spherical symmetry, the atoms have no mean 
dipole moments. However, the non-diagonal matrix elements of the dipole 
moments are different from zero. 

One can imagine dipole moments of atoms arising in an obvious way as a 
result of the quantum-mechanical motion of the electrons, with the appear- 
ance and disappearance of a dipole moment, equal to zero only on the aver- 
age. As a result instantaneous dipole moments are induced in both atoms. As 
an obvious illustration, one often considers the interaction of two oscillators 
in which the induced dipole moments are directly expressed in terms of their 
zero-point oscillations. 
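
The two-oscillator illustration can be sketched numerically. The model below is an assumption made only for this sketch: two identical one-dimensional oscillators of frequency ω₀ whose dipole coupling shifts the normal-mode frequencies to ω₀√(1 ± a), with a dimensionless coupling a = c/R³ (c is a hypothetical constant). The change in the zero-point energy, in units of ħω₀, is then negative and falls off as R⁻⁶ for weak coupling.

```python
import math


def zero_point_shift(R, c=1.0):
    """Zero-point energy shift (in units of hbar*omega_0) of two coupled oscillators.

    Assumed model: normal-mode frequencies omega_0*sqrt(1 +/- a) with a = c / R**3.
    """
    a = c / R**3
    if a >= 1.0:
        raise ValueError("coupling too strong for this model (need c / R**3 < 1)")
    return 0.5 * (math.sqrt(1.0 + a) + math.sqrt(1.0 - a)) - 1.0


def log_slope(R1, R2, c=1.0):
    """Effective power n in |shift| ~ R**n, estimated between R1 and R2."""
    s1 = abs(zero_point_shift(R1, c))
    s2 = abs(zero_point_shift(R2, c))
    return math.log(s2 / s1) / math.log(R2 / R1)


print(zero_point_shift(10.0))   # negative: the interaction is attractive
print(log_slope(10.0, 20.0))    # close to -6
```

Expanding the square roots to second order in a gives a shift of about −a²/8, which is where the R⁻⁶ dependence comes from in this model.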

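The interaction energy of atoms can be calculated by perturbation theory.
As the perturbation operator one takes the energy of interaction of two
dipoles given by formula (17.12) of Part I. The first order correction to the
energy is equal to

$$E^{(1)} = H'_{nn} = \int \psi_n^{0*}\,\frac{\mathbf{d}_1\cdot\mathbf{d}_2 - 3(\mathbf{d}_1\cdot\mathbf{n})(\mathbf{d}_2\cdot\mathbf{n})}{R^3}\,\psi_n^0\,dV = \frac{\bar{\mathbf{d}}_1\cdot\bar{\mathbf{d}}_2 - 3(\bar{\mathbf{d}}_1\cdot\mathbf{n})(\bar{\mathbf{d}}_2\cdot\mathbf{n})}{R^3} = 0$$

by virtue of $\bar{\mathbf{d}}_1 = \bar{\mathbf{d}}_2 = 0$. The second order
correction can be written in the form

$$E^{(2)} = \sum_{m \neq n} \frac{|H'_{mn}|^2}{E_n^{(0)} - E_m^{(0)}} = \frac{1}{R^6} \sum_{m \neq n} \frac{\left|\left[\mathbf{d}_1\cdot\mathbf{d}_2 - 3(\mathbf{d}_1\cdot\mathbf{n})(\mathbf{d}_2\cdot\mathbf{n})\right]_{mn}\right|^2}{E_n^{(0)} - E_m^{(0)}} .$$

* See, for example, H. Eyring, J. Walter and G. Kimball, Quantum chemistry
(Wiley, New York, 1944, 1958); W. Kauzmann, Introduction to quantum chemistry
(Academic Press, New York, 1957).

According to the results of §53, the quantity

$$\sum_{m \neq n} \frac{\left|\left[\mathbf{d}_1\cdot\mathbf{d}_2 - 3(\mathbf{d}_1\cdot\mathbf{n})(\mathbf{d}_2\cdot\mathbf{n})\right]_{mn}\right|^2}{E_n^{(0)} - E_m^{(0)}}$$

is always negative. Hence finally

$$U_{\mathrm{vdW}} = -\frac{\mathrm{const}}{R^6} .$$
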
This formula expresses the law of the van der Waals interaction. This inter- 
action has no specific character, in the sense that it corresponds to attractive 
forces which decrease as $R^{-7}$ for all atoms irrespective of their nature. The
value of the constant can be expressed in terms of the polarizability of the 
atoms and varies for different atoms. Thus the van der Waals interaction 
between atoms represents the same specific quantum-mechanical effect as the 
chemical interaction. It cannot be understood on the basis of classical con- 
cepts, since atoms do not have a ‘permanent’ dipole moment. 

The van der Waals forces, in contrast to forces leading to the formation of 
a chemical bond, possess additivity. If the interaction involves not two but 
three and more atoms, then the energy of interaction of the system, as any 
other weak perturbation, is obtained by the addition of the energies of pair 
interactions. 

This result is of a general character, since we have not used a particular 
wave function. 

However, if the atoms are not in S-states, then they can have a mean
quadrupole moment different from zero. In this case, in addition to the van
der Waals interaction a quadrupole-quadrupole interaction ($\sim R^{-5}$) will
exist between the atoms.

As distinct from atoms, molecules may have a mean dipole moment. If,
however, its value as well as the value of the quadrupole moment is small,
then the formula for $U_{\mathrm{vdW}}$ also applies to the molecules.

A particular situation arises when two identical atoms in different states 
interact, say, when an excited and a non-excited atom of one and the same
element interact. In this case an additional degeneracy, associated with the
possibility of excitation exchange between the atoms, arises in the system. 

The perturbation operator in this case is also the dipole—dipole interaction 
operator. However, the interaction energy is defined not by the mean value 
of this operator but by the solution of the corresponding secular equation 
(see §54). If the given atoms have non-zero matrix elements of the transition 
between the ground state and the excited state considered, then the inter- 
action energy already turns out to be different from zero in the first approxi- 
mation of perturbation theory. In this case the dipole—dipole interaction with 
resonant transfer of excitation takes place between the atoms. As can easily 
be seen, the energy of this interaction decreases only in inverse proportion to 
the cube of the distance between the atoms, $U \sim R^{-3}$.

Suppose, for example, that one of the atoms is in the ground $^1S_0$ state,
while the other is in the excited $^1P_1$ state. We denote the wave functions
corresponding to these states respectively by $\psi$ and $\psi_m$ (the
subscript $m$ characterizes the angular momentum component in the state
$^1P_1$, $m = -1, 0, 1$). Thus a system of two non-interacting atoms turns out
to be six-fold degenerate. It is described by unperturbed functions of the
form $\psi(1)\psi_m(2)$ and $\psi(2)\psi_m(1)$. The matrix elements of the
interaction operator

$$\hat H' = \frac{\mathbf{d}_1\cdot\mathbf{d}_2}{R^3} - \frac{3(\mathbf{d}_1\cdot\mathbf{R})(\mathbf{d}_2\cdot\mathbf{R})}{R^5}$$

with respect to these wave functions are not equal to zero for transitions
between states differing by excitation transfer. The calculations are
conveniently carried out in a system of coordinates with the z-axis directed
along the vector $\mathbf{R}$. Solving a secular equation of the form (54.4),
we find the expression for the interaction energy

$$U_1 = \pm\frac{g^2}{R^3} , \qquad U_2 = \mp\frac{2g^2}{R^3} , \qquad (80.1)$$

where $g$ denotes a matrix element of the form

$$g = \int \psi^{*}\,d_z\,\psi_0\,dV .$$

The upper signs in formulae (80.1) refer to symmetric, and the lower signs to
antisymmetric, excitation states. Energies $U_1$ correspond to states with
$\Lambda = 1$, while energies $U_2$ correspond to states with $\Lambda = 0$.
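
The structure of the secular problem can be sketched as follows. This is an illustration under stated assumptions, not the calculation of the text: for each m the two degenerate states (excitation on atom 1 or on atom 2) are taken to have zero diagonal elements and an exchange coupling w, and with the z-axis along R the coupling is assumed to be −2g²/R³ for m = 0 and +g²/R³ for |m| = 1, where g is the dipole matrix element. All function names are hypothetical.

```python
import math


def two_level_eigenvalues(h11, h22, w):
    """Eigenvalues of the real symmetric 2x2 matrix [[h11, w], [w, h22]]."""
    mean = 0.5 * (h11 + h22)
    half_gap = math.hypot(0.5 * (h11 - h22), w)
    return mean - half_gap, mean + half_gap


def resonant_energies(g, R):
    """Interaction energies for excitation exchange between two identical atoms.

    Assumptions of this sketch: z-axis along R, zero diagonal elements, and an
    exchange coupling of -2*g**2/R**3 for m = 0 and +g**2/R**3 for |m| = 1.
    """
    coupling = {0: -2.0 * g**2 / R**3, 1: g**2 / R**3}
    return {lam: two_level_eigenvalues(0.0, 0.0, w) for lam, w in coupling.items()}


levels = resonant_energies(g=1.0, R=2.0)
print(levels)   # both signs appear for each Lambda; magnitudes scale as R**-3
```

Because the unperturbed states are degenerate, the splitting is linear in the coupling, which is why the energy appears already in first order and falls off only as R⁻³.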

In gaseous systems in which there is a considerable concentration of 
excited atoms the dipole—dipole interaction with resonant transfer of excita- 
tion can play a more important role than the van der Waals interaction. It
does not disappear in correct averaging over the orientations of the dipole
moment of the atom and gives a basic contribution to the thermodynamic 
functions of the system*. The resonant dipole—dipole interaction is not 
additive. 


§81. The comparison of molecular terms with atomic terms 


The states of a molecule formed from two atoms can be related to the 
states of the atoms if the process of formation of the molecule is imagined 
as a result of their infinitely slow approach to each other. 

The angular momentum component along the axis joining the two nuclei
is conserved in the course of the process. On the other hand, as we have seen
before, the component of the total angular momentum $\Lambda$ along this axis
will also be conserved for this molecule (see §78). We shall determine the
possible values of $\Lambda$, as well as the number of energy states of the
molecule formed.

Let the atoms be characterized by total angular momenta $L_1$ and $L_2$,
respectively. We assume that $L_1 \geq L_2$. The components of the angular
momenta of the atoms can take on, respectively, the following values:

$$M_1 = L_1,\ L_1 - 1,\ L_1 - 2,\ \ldots,\ -L_1 ,$$
$$M_2 = L_2,\ L_2 - 1,\ L_2 - 2,\ \ldots,\ -L_2 .$$

In accordance with the definition of the quantity $\Lambda$ (see §78), there
corresponds to the maximum value $\Lambda = L_1 + L_2$ the only state in which
the components of the angular momenta of the atoms are equal to $M_1 = L_1$,
$M_2 = L_2$. The next possible value of $\Lambda$ is equal to
$\Lambda = L_1 + L_2 - 1$. To this value of $\Lambda$ there correspond two
terms arising respectively from two states: in the first $M_1 = L_1$,
$M_2 = L_2 - 1$, and in the second $M_1 = L_1 - 1$, $M_2 = L_2$. Analogously,
to the value $\Lambda = L_1 + L_2 - 2$ there correspond 3 terms arising from
the states: $M_1 = L_1$, $M_2 = L_2 - 2$; $M_1 = L_1 - 1$, $M_2 = L_2 - 1$;
$M_1 = L_1 - 2$, $M_2 = L_2$. The results obtained are conveniently expressed
in the table:


for $\Lambda = L_1 + L_2$          1 term is possible,
for $\Lambda = L_1 + L_2 - 1$      2 terms are possible,
for $\Lambda = L_1 + L_2 - 2$      3 terms are possible,
. . . . . . . . . . . . . . . . . . . . . . . . . .
for $\Lambda = L_1 - L_2$          $2L_2 + 1$ terms are possible.


* V.I. Malnev and S.I. Pekar, Soviet Physics JETP 24 (1967) 1220; 31 (1970) 597;
Yu.A. Vdovin, Soviet Physics JETP 27 (1968) 242.


We can see by a simple calculation that the number of terms for
$\Lambda < L_1 - L_2$ is equal to $2L_2 + 1$ and does not depend on $\Lambda$.
In determining all possible states of the system account must be taken of the
fact that each energy level with $\Lambda \neq 0$ is degenerate, since the
energy of the system cannot depend on the orientation of the angular momentum
in space.
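
The counting rule in the table can be checked by direct enumeration. The sketch below assumes integer angular momenta and a hypothetical function name; it simply lists all pairs of components and groups them by the value of their sum.

```python
from collections import Counter


def count_terms(L1, L2):
    """Number of molecular terms for each Lambda, from integer atomic momenta.

    Every pair (M1, M2) with M1 + M2 = Lambda >= 0 gives one term; the pairs with
    the opposite sign of M1 + M2 are the degenerate partners for Lambda != 0.
    """
    if L1 < L2:
        L1, L2 = L2, L1
    counts = Counter()
    for M1 in range(-L1, L1 + 1):
        for M2 in range(-L2, L2 + 1):
            lam = M1 + M2
            if lam >= 0:
                counts[lam] += 1
    return dict(sorted(counts.items(), reverse=True))


print(count_terms(2, 1))   # {3: 1, 2: 2, 1: 3, 0: 3}
```

For L1 = 2, L2 = 1 the count rises by one per step down to Λ = L1 − L2 = 1 and then stays at 2·L2 + 1 = 3. Counting each Λ ≠ 0 term twice recovers the (2·L1 + 1)(2·L2 + 1) = 15 states of the separated atoms.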

The $\Sigma$ term requires particular consideration.

A molecule turns out to be in the $\Sigma$ state if $M_1 = -M_2$. This
condition is fulfilled in $L_2$ cases where we have for the angular momentum
components $M_1 > 0$ and $M_2 < 0$, and also in $L_2$ cases where $M_1 < 0$
and $M_2 > 0$. Furthermore, $M_1$ and $M_2$ can be equal to zero.
Consequently, in the $\Sigma$ state the molecule can also be formed from
$2L_2 + 1$ energy states.

In §78 we have pointed out that $\Sigma$ terms are divided into $\Sigma^+$ and
$\Sigma^-$ terms, depending on the symmetry properties of the system. The
symmetry properties of the system do not change when the atoms are put an
infinite distance apart. Hence the wave functions of the system for the states
$|M_1| = |M_2|$ can be written in the form of symmetric or antisymmetric
combinations

$$\psi_s = \psi^{(1)}_{M}\,\psi^{(2)}_{-M} + \psi^{(1)}_{-M}\,\psi^{(2)}_{M} , \qquad (81.1)$$
$$\psi_a = \psi^{(1)}_{M}\,\psi^{(2)}_{-M} - \psi^{(1)}_{-M}\,\psi^{(2)}_{M} . \qquad (81.2)$$

The $\Sigma$ state corresponding to the values $M_1 = M_2 = 0$ is determined
by the behaviour of the function $\psi = \psi^{(1)}_0\,\psi^{(2)}_0$ under
reflection in a plane passing through the nuclei of the atoms. Depending on
the actual properties of the wave functions $\psi^{(1)}_0$ and $\psi^{(2)}_0$
there arise $\Sigma^+$ or $\Sigma^-$ terms. Thus in $L_2$ cases a molecule in
the $\Sigma^+$ state is formed, while in the other $L_2$ cases a molecule in
the $\Sigma^-$ state is formed. One more $\Sigma^+$ or $\Sigma^-$ term arises
depending on the form of the function $\psi^{(1)}_0\,\psi^{(2)}_0$.

So far we have considered molecules formed from two different atoms. 
If the molecule is made up of identical atoms, then the calculation of its
possible states is somewhat modified. Two cases are possible; either the 
separated atoms are in different states or they are in identical states. In the 
first case the number of possible terms must be doubled in comparison with 
the number of terms of a molecule consisting of different atoms, since the 
state of a molecule made up of identical atoms is invariant under the inversion 
transformation, and even and odd terms can be formed. If the atoms are in 
identical states, then the total number of states remains the same as for a 
molecule with different atoms. The problem of the parity of these states is 
rather complex*. 


* See L.D. Landau and E.M. Lifshitz, Quantum mechanics (Pergamon Press, Oxford,
1965) or E. Wigner and E. Witmer, Z. Phys. 51 (1928) 859.


§82. Rotation and vibration of diatomic molecules 


We can now turn to the quantitative consideration of the motion of the 
nuclei in diatomic molecules. We shall not be interested in the translational 
motion of the molecule as a whole. 

The motion of the nuclei in a molecule depends only on the distance
between the nuclei. In the adiabatic approximation, according to eq. (78.3),
the wave function satisfies the Schrödinger equation, which in spherical
coordinates has the form

$$\left[-\frac{\hbar^2}{2\mu}\frac{1}{R^2}\frac{\partial}{\partial R}\left(R^2\frac{\partial}{\partial R}\right) + \frac{\hbar^2 \hat K^2}{2\mu R^2} + E_n(R) - E\right]\Phi = 0 , \qquad (82.1)$$

where $E_n(R)$ is the electron energy, and $\hat K$ is the angular momentum
operator of the motion of the nuclei. We assume the electron energy to be
fixed and consider the motion of the nuclei for a given $E_n$. Then the motion
of the nuclei amounts to rotations and vibrations about an equilibrium
position. The angular momentum operator of the nuclei must be expressed in
terms of the total angular momentum operator of the molecule, which can be
written in the form $\mathbf{J} = \mathbf{K} + \mathbf{L}$, where $\mathbf{L}$
is the angular momentum of the system of electrons.

It is clear that the molecule is in a state with a definite value of the total
angular momentum. The angular momentum of the nuclei can run over a
sequence of values corresponding to different rotational states of the
electrons. Hence we shall be interested only in the mean value of the quantity

$$\overline{K^2} = \overline{(\mathbf{J} - \mathbf{L})^2} = J^2 + \overline{L^2} - 2\,\mathbf{J}\cdot\bar{\mathbf{L}} ,$$

since $J^2$ has a definite value and is conserved.

According to what was said in §78, the component of the angular momentum
of the electrons along the axis of the molecule, $L_z = \Lambda$, is
conserved. The two other components are on the average equal to zero:
$\bar L_x = 0$, $\bar L_y = 0$. Hence we have $\bar{\mathbf{L}}^2 = \Lambda^2$.
Furthermore, since the direction of the vector $\mathbf{n}$ (the axis of the
molecule) is the only specified direction, the following equality holds:

$$\bar{\mathbf{L}} = \mathbf{n}\,\Lambda .$$

The vector of the mean angular momentum of the nuclei in a diatomic molecule
is perpendicular to $\mathbf{n}$, i.e.

$$\bar{\mathbf{K}}\cdot\mathbf{n} = (\mathbf{J} - \bar{\mathbf{L}})\cdot\mathbf{n} = 0 .$$

(This statement follows directly from the fact, known from classical
mechanics, that the angular momentum vector in a two-body system is
perpendicular to the axis joining the two bodies.) Hence it follows that

$$\mathbf{J}\cdot\mathbf{n} = \bar{\mathbf{L}}\cdot\mathbf{n} = \Lambda . \qquad (82.2)$$

From (82.2) we obtain

$$\mathbf{J}\cdot\bar{\mathbf{L}} = \Lambda^2 .$$

Finally, we find

$$\overline{K^2} = J^2 + \overline{L^2} - 2\Lambda^2 = J(J+1) + \overline{L^2} - 2\Lambda^2 , \qquad (82.3)$$

where the quantum number $J$ runs over a sequence of integers $J \geq \Lambda$.
In formula (82.3) the last two terms depend only on the state of the
system of electrons, whereas the first term characterizes the rotation of the
molecule as a whole.
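The Schrödinger equation assumes the form

$$\left[-\frac{\hbar^2}{2\mu}\frac{1}{R^2}\frac{\partial}{\partial R}\left(R^2\frac{\partial}{\partial R}\right) + E_n(R) + \frac{\hbar^2(\overline{L^2} - 2\Lambda^2)}{2\mu R^2} + \frac{\hbar^2 J(J+1)}{2\mu R^2} - E\right]\Phi = 0 . \qquad (82.4)$$

Denoting

$$U(R) = E_n(R) + \frac{\hbar^2(\overline{L^2} - 2\Lambda^2)}{2\mu R^2} , \qquad (82.5)$$

we see that $U(R)$ plays the role of an effective potential energy. We shall
consider those states of the nuclei for which the distance between the nuclei
remains close to the equilibrium distance $R_0$.
The effective potential energy can be written in the form

$$U(R) = U(R_0) + \frac{1}{2}\left.\frac{d^2U}{dR^2}\right|_{R=R_0}(R - R_0)^2 = U(R_0) + \frac{1}{2}\mu\omega_0^2\,(R - R_0)^2 ,$$

where $\omega_0$ is the frequency of vibration. Eq. (82.4) finally assumes the
form

$$\left[-\frac{\hbar^2}{2\mu}\frac{1}{R^2}\frac{\partial}{\partial R}\left(R^2\frac{\partial}{\partial R}\right) + U(R_0) + \frac{1}{2}\mu\omega_0^2\,(R - R_0)^2 + \frac{\hbar^2}{2\mu R^2}J(J+1) - E\right]\Phi = 0 . \qquad (82.6)$$

We see that in the adiabatic approximation used, the motion of the molecule
for a given electron state amounts to its rotation as a whole together with
harmonic vibrations.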
The total energy of the molecule is given by the formula

$$E = E^{\mathrm{el}} + \frac{\hbar^2}{2\mu R_0^2}\,J(J+1) + \hbar\omega_0\left(v + \tfrac{1}{2}\right) , \qquad (82.7)$$

where $v$ is the vibrational quantum number.
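
Formula (82.7) translates directly into a short function. The sketch below is only an illustration: the function name and the numerical constants are hypothetical, chosen to show how the three contributions add up for the lowest levels.

```python
def level_energy(v, J, E_el, hbar_omega0, B):
    """Energy of a rovibrational level from formula (82.7).

    E_el        -- electronic contribution
    hbar_omega0 -- vibrational quantum
    B           -- rotational constant hbar**2 / (2*mu*R0**2)
    All three are given in the same energy unit.
    """
    if v < 0 or J < 0:
        raise ValueError("quantum numbers must be non-negative")
    return E_el + B * J * (J + 1) + hbar_omega0 * (v + 0.5)


# Hypothetical constants in eV, chosen only to show the ordering of the scales.
E_el, hw0, B = 0.0, 0.5, 0.0075
for v in range(2):
    for J in range(3):
        print(v, J, level_energy(v, J, E_el, hw0, B))
```

Each vibrational level carries its own ladder of closely spaced rotational levels, with spacings growing linearly in J.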

Let us also estimate the degree of accuracy of the adiabatic approximation.
According to (57.6) the parameter of non-adiabaticity is in order of
magnitude

$$\hat C\,\Phi \sim \frac{\hbar^2}{\mu}\,\nabla\Phi \int \psi^{*}\,\nabla\psi\,d\tau . \qquad (82.8)$$

We are interested in the dependence of this expression on the reduced mass
$\mu$. We estimate in order of magnitude the derivative $\nabla\Phi$ in the
vibrational ground state. Evidently,
$\nabla\Phi \sim \Phi\,[\overline{(R - R_0)^2}]^{-1/2}$. In order to estimate
the mean displacement $[\overline{(R - R_0)^2}]^{1/2}$ we note that in the
ground state the mean potential energy is equal to one half of the total
energy, i.e.

$$\tfrac{1}{2}\mu\omega_0^2\,\overline{(R - R_0)^2} = \tfrac{1}{4}\hbar\omega_0 .$$

Hence $[\overline{(R - R_0)^2}]^{1/2} \sim (\omega_0\mu)^{-1/2}$. Thus
$\nabla\Phi \sim (\omega_0\mu)^{1/2}\,\Phi$. But by definition

$$\omega_0 = \left[\mu^{-1}\left(\frac{d^2U}{dR^2}\right)_{R=R_0}\right]^{1/2} \sim \mu^{-1/2} .$$

The integral in (82.8) depends only on the electron part of the system and
does not depend on $\mu$. Hence, finally, $C \sim \mu^{-3/4}$. From
dimensionality considerations it follows that

$$C \sim (m/\mu)^{3/4} . \qquad (82.9)$$

Indeed, the total Schrödinger equation describing the motion of all particles
in the molecule involves only two quantities of the dimensionality of mass:
$\mu$ and the electron mass $m$. One cannot construct any other quantities of
the dimensionality of mass from the quantities involved in the Schrödinger
equation. Thus the parameter $C$ is very small even for the hydrogen molecule.
The quantity $m/\mu$ is the basic small parameter of the theory of molecules.

The spacings between electron energy levels $\Delta E^{\mathrm{el}}$ for
molecules do not depend on the mass of the nuclei and are of the same order of
magnitude as for atoms (i.e. are of the order of a few eV).

The spacing between vibrational levels is

$$\Delta E^{\mathrm{vib}} = \hbar\omega_0 \sim (m/\mu)^{1/2}\,\Delta E^{\mathrm{el}} . \qquad (82.10)$$

This amounts to a few tenths of an electronvolt.
Finally, the spacing between rotational levels is

$$\Delta E^{\mathrm{rot}} = \frac{\hbar^2}{2\mu R_0^2}\left[J'(J'+1) - J(J+1)\right] , \qquad (82.11)$$

where $\Delta J = J' - J = \pm 1$.

Since $\Delta E^{\mathrm{rot}} \sim \mu^{-1}$, this spacing is much smaller
than that between vibrational levels and amounts to a few meV.
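
These orders of magnitude can be checked for the hydrogen molecule. The sketch below uses standard values of the Rydberg energy, the Bohr radius and the proton-to-electron mass ratio; the electronic spacing of 5 eV and the internuclear distance of 0.74 Å (the experimental value for H₂) are inputs assumed for the illustration, and the function names are hypothetical.

```python
# Order-of-magnitude estimates for H2 (a sketch under the assumptions stated above).
RYDBERG_EV = 13.6057          # hbar**2 / (2 * m * a0**2) in eV
BOHR_ANGSTROM = 0.529177      # Bohr radius a0 in angstrom
PROTON_TO_ELECTRON = 1836.15  # proton mass in units of the electron mass


def mass_ratio_h2():
    """m / mu for H2, where mu = m_p / 2 is the reduced mass of the nuclei."""
    return 2.0 / PROTON_TO_ELECTRON


def rotational_constant_ev(R0_angstrom, m_over_mu):
    """B = hbar**2 / (2 * mu * R0**2) in eV."""
    return RYDBERG_EV * m_over_mu * (BOHR_ANGSTROM / R0_angstrom) ** 2


ratio = mass_ratio_h2()
dE_el = 5.0                               # assumed electronic spacing, eV
dE_vib = ratio ** 0.5 * dE_el             # square-root scaling of the vibrational spacing
B = rotational_constant_ev(0.74, ratio)   # assumed R0 = 0.74 angstrom
print(f"m/mu = {ratio:.2e}")
print(f"vibrational scale ~ {dE_vib:.2f} eV")
print(f"rotational constant ~ {B * 1e3:.1f} meV")
```

The vibrational scale comes out at a few tenths of an eV and the rotational constant at a few meV, in line with the estimates above.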

Knowing the distribution of levels, one can find the emission (or absorption)
spectra of molecules, which differ strongly in character from atomic
line spectra. However, one then has to take into account an important fact.
In ch. 12, devoted to radiation theory, it will be shown that transitions
between levels are limited by so-called selection rules. It turns out that
transitions are possible only between levels for which a change in the quantum
numbers defined by the conditions

$$J' = \begin{cases} J + 1 \\ J \\ J - 1 \end{cases} , \qquad v' = v \pm 1$$

takes place (except for the transition $J' = J = 0$, which is forbidden).

Taking into account the selection rules, the proof of which will be given
in ch. 12, it is possible to find the frequencies emitted or absorbed. In the
case of transitions for which the electron state of the molecule does not
change, we obtain from (82.7)

$$\hbar\omega = \hbar\omega_n(v' - v'') + B\left[J'(J'+1) - J''(J''+1)\right] , \qquad (82.12)$$

where $B = \hbar^2/2\mu R_0^2$ and $\omega_n$ is the vibration frequency in
the given electron state. Taking into account the selection rules, we obtain
two frequency branches for a given difference $v' - v''$. For $J'' = J' + 1$
we have the first branch of frequencies:

$$\hbar\omega_1 = \hbar\omega_n(v' - v'') - 2B(J' + 1) , \qquad J' = 0, 1, 2, \ldots \qquad (82.13)$$

For $J'' = J' - 1$ we find the second branch of frequencies:

$$\hbar\omega_2 = \hbar\omega_n(v' - v'') + 2BJ' , \qquad J' = 1, 2, 3, 4, \ldots \qquad (82.14)$$

We note that in (82.14) $J'$ cannot be equal to zero, since this would
correspond to $J'' = -1$.

Let us consider the order and distribution of these frequencies for a given
difference $v' - v''$. The frequency $\omega_1$ decreases beginning with
$\hbar\omega_1 = \hbar\omega_n(v' - v'') - 2B$, and the frequency $\omega_2$
increases from the lowest value equal to $\hbar\omega_n(v' - v'') + 2B$. The
spacing between the lines in each branch is equal to $2B$, while the spacing
between the branches is equal to $4B$. The frequency $\omega_n(v' - v'')$
lying between the branches is not observed. The set of lines $\omega_1$ is
also called the P-branch of frequencies, and the set of frequencies
$\omega_2$ is the R-branch. These frequencies lie in the infrared part of the
spectrum.
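
The line positions in the two branches follow directly from (82.13) and (82.14). In the sketch below the function names and the numerical values are hypothetical; `hw_vib` stands for the vibrational term ħω(v′ − v″) and all energies are in eV.

```python
def p_branch(hw_vib, B, n_lines):
    """Line energies of (82.13): hw_vib - 2B(J'+1), J' = 0, 1, 2, ..."""
    return [hw_vib - 2.0 * B * (Jp + 1) for Jp in range(n_lines)]


def r_branch(hw_vib, B, n_lines):
    """Line energies of (82.14): hw_vib + 2B*J', J' = 1, 2, 3, ..."""
    return [hw_vib + 2.0 * B * Jp for Jp in range(1, n_lines + 1)]


# Hypothetical values in eV.
hw_vib, B = 0.5, 0.0075
P = p_branch(hw_vib, B, 4)
R = r_branch(hw_vib, B, 4)
print("P:", P)
print("R:", R)
print("gap between branches:", R[0] - P[0])   # equals 4B
```

Within each branch neighbouring lines differ by 2B, and the missing line at `hw_vib` itself leaves a gap of 4B between the first P line and the first R line.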

We now turn to the study of frequencies which arise in transitions associated
with a change in the electron state. The character of such a spectrum
differs fundamentally from the infrared spectrum considered. Radiation
frequencies are defined in this case by the formula

$$\hbar\omega = E_0 + \hbar\omega_n\left(v' + \tfrac{1}{2}\right) - \hbar\omega_m\left(v'' + \tfrac{1}{2}\right) + B_n J'(J'+1) - B_m J''(J''+1) . \qquad (82.15)$$

Here it should be stressed that $\omega_n \neq \omega_m$ and
$B_n \neq B_m$. As a matter of fact, the vibration frequencies $\omega$ and
the quantities $B$ are determined by the electron state of the molecule and,
consequently, when this state changes these quantities change substantially.

Since the change in the energy of the molecule in transitions associated
with a change in the electron state is rather large, the frequencies observed
in this case lie in the visible part of the spectrum. The set of lines
corresponding to the chosen pair of quantum numbers $v'$ and $v''$ is called a
band. The band is in its turn made up of three branches. These branches are
obtained in the following way. In accordance with the selection rules the
quantum number $J''$ can be equal to $J'' = J' - 1$, $J'' = J' + 1$ and
$J'' = J'$. To the first case there corresponds the R-branch, whose
frequencies are defined by the relation

$$\hbar\omega_R = A + BJ'^2 + CJ' , \qquad (82.16)$$

where

$$A = E_0 + \hbar\omega_n\left(v' + \tfrac{1}{2}\right) - \hbar\omega_m\left(v'' + \tfrac{1}{2}\right) , \qquad B = B_n - B_m , \qquad C = B_n + B_m .$$

Transitions $J'' = J'$ constitute the Q-branch, and frequencies are in this
case defined by the formula

$$\hbar\omega_Q = A + BJ'(J'+1) , \qquad (82.17)$$

and, finally, for the P-branch ($J'' = J' + 1$) we obtain

$$\hbar\omega_P = A - 2B_m + BJ'^2 + (B_n - 3B_m)J' . \qquad (82.18)$$


In all three cases $\omega$ is a quadratic function of the quantum number $J$.
To examine the distribution of frequencies it is convenient to refer to the
diagram of fig. V.24. Here a parabola corresponding to eq. (82.18) is shown
for $B > 0$. The quantum number $J$ is plotted on the vertical axis, and the
frequency $\omega$ on the horizontal axis. Experimentally observed frequencies
can easily be obtained by means of this diagram. If the points of intersection
of horizontal lines drawn through integer values of $J'$ and the parabola are
projected on the horizontal axis, then we obtain the observed values of
frequencies. These radiation frequencies are shown below the horizontal axis.

Fig. V.24. Parabola of the quantum number $J$ (vertical axis) against the
frequency $\omega$ (horizontal axis).

We see that the frequencies of the spectrum are not equally spaced, as was the
case in the infrared part of the spectrum, but become denser in a certain part
of it; as $\omega$ increases the spacings between the observed frequencies
become larger. The place of convergence of the lines is called the band edge.
In the case given the band edge lies on the low frequency side of the
spectrum. If $B_n < B_m$, then the parabola is curved in the opposite way. In
this case the band edge lies on the high frequency side. However, it should be
stressed that many exceptions to the above rule are observed. For example, in
the case where electron terms are degenerate more than three branches are
observed; sometimes, for example, the Q-branch is not observed and so on*.
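
The crowding of lines at the band edge can be seen by tabulating the three branches. The sketch below evaluates the quadratic expressions obtained from (82.15) for J″ = J′ − 1, J″ = J′ and J″ = J′ + 1; the function names and the constants (in eV, with B_n > B_m) are hypothetical.

```python
def branch_energies(A, Bn, Bm, J_values):
    """Line energies of the R-, Q- and P-branches as functions of J'."""
    B, C = Bn - Bm, Bn + Bm
    R = [A + B * J**2 + C * J for J in J_values]                      # J'' = J' - 1
    Q = [A + B * J * (J + 1) for J in J_values]                       # J'' = J'
    P = [A - 2.0 * Bm + B * J**2 + (Bn - 3.0 * Bm) * J for J in J_values]  # J'' = J' + 1
    return R, Q, P


def turning_point(Bn, Bm):
    """Vertex J' of the P-branch parabola, treating J' as a continuous variable."""
    return -(Bn - 3.0 * Bm) / (2.0 * (Bn - Bm))


# Hypothetical constants (eV) with Bn > Bm.
A, Bn, Bm = 2.0, 0.0012, 0.0010
R, Q, P = branch_energies(A, Bn, Bm, range(1, 12))
print(turning_point(Bn, Bm))   # the P lines crowd together near this J'
print(min(P), min(R))          # the lowest lines belong to the P-branch
```

Near the vertex of the parabola consecutive values of J′ give almost the same frequency, which is the convergence of lines described above; exchanging the roles of B_n and B_m moves the vertex to the high-frequency side.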


* For a detailed exposition of these problems see L.D.Landau and E.M.Lifshitz, 
Quantum mechanics (Pergamon Press, Oxford, 1965). 


11 





Scattering Theory 


§83. Scattering amplitude and cross section 


We mean by a scattering process the deflection of particles from their 
initial directions of motion caused by their interaction with a system which 
we shall call the scatterer. 

The investigation of scattering processes of charged and neutral particles 
is one of the basic experimental methods of studying the structure of atoms, 
atomic nuclei and elementary particles. 

Indeed, the very existence of the atomic nucleus was established in 
Rutherford’s experiments on the scattering of a-particles. The analysis of the 
results of experiments on the scattering of neutrons by nuclei enabled Bohr 
to formulate modern concepts on the structure of the nucleus. The study of 
the laws of scattering of fast particles is the main source of information about 
nuclear forces and of the properties of elementary particles. 

From these examples, although they are far from being complete, it is 
easy to evaluate the significance of scattering theory, one of the most im- 
portant branches of quantum mechanics. 

The scattering of a flux of particles is characterized by the differential
scattering cross section. This quantity is defined as the ratio of the number
of particles $dN_{\mathrm{scat}}$ scattered per unit time into the solid angle
$d\Omega$ to the flux density of the incident particles, $j_{\mathrm{inc}}$,
i.e. the differential cross section is defined by the relation

$$d\sigma(\theta, \varphi) = \frac{dN_{\mathrm{scat}}(\theta, \varphi)}{j_{\mathrm{inc}}} ,$$

where the angles $\theta$ and $\varphi$ define the direction of motion of the
scattered particles. The z-axis is taken along the direction of motion of the
incident particles.
For our purposes it is convenient to write $dN_{\mathrm{scat}}$ in the form

$$dN_{\mathrm{scat}}(\theta, \varphi) = j_{\mathrm{scat}}(\theta, \varphi)\,ds ,$$

where $j_{\mathrm{scat}}$ is the flux density of the scattered particles at
large distances from the scattering centre, and $ds$ is an element of area
perpendicular to the radius vector drawn from the scattering centre at the
angles $\theta$, $\varphi$. The quantity $ds$ is related to the solid angle
element $d\Omega$ by the equality

$$ds = r^2\,d\Omega .$$

Thus the differential cross section is defined by the formula

$$d\sigma = \frac{j_{\mathrm{scat}}}{j_{\mathrm{inc}}}\,r^2\,d\Omega . \qquad (83.1)$$

This definition of the cross section is the same as that introduced in §43
of Part I. In quantum mechanics $j_{\mathrm{scat}}$ and $j_{\mathrm{inc}}$ are
probability current densities defined in §7.

In the mutual scattering of two quantum-mechanical systems, for example 
the scattering of an electron by an atom, a neutron by a nucleus, an atom by 
an atom and so on, one has to distinguish between elastic and inelastic scatter- 
ing. In elastic scattering the internal state of both the scatterer and the 
scattered system remains unchanged. For example, in the elastic scattering of 
electrons by atoms the state of the latter remains unchanged. In inelastic 
scattering the internal state of one or both systems changes. For example, 
the scattering of electrons by atoms is inelastic if in the process of scattering 
the atoms make a transition into an excited state. 

In inelastic scattering a part of the kinetic energy goes over into the inter- 
nal energy or, conversely, the internal energy goes over into kinetic energy. 
Collisions of this latter type are called collisions of the second kind. 

We shall begin the exposition of scattering theory with the simpler case of 
elastic scattering. In this case one need not be concerned with the internal 
state of the systems and can call the interacting systems particles (although 
these systems may have a complex internal structure, for example they may 
be atoms, molecules or nuclei). 

In the scattering process there is an interaction between two particles: the
scattered particle and the scatterer. In this case the potential energy of the
interaction very often has the form U(r). Then the problem of the motion
of two interacting particles can always be reduced to the study of the motion
of one particle (with reduced mass $\mu$) in the field of the motionless centre of
force at the centre of mass of the system.

In practice it is always necessary to know how the process appears in the 
laboratory system of coordinates. Therefore if the problem of motion of 
one particle in the field of external forces is solved, then in the final formulae 
one has to transform to the laboratory system. This can easily be done, 
knowing that the cross section (83.1) is invariant under transformation from 
one Galilean system to another, and that the angles transform by means of 
the relations (see §43 of Part I)

$$\tan\theta_1 = \frac{m_2\sin\theta}{m_1+m_2\cos\theta}, \qquad \theta_2 = \frac{\pi-\theta}{2}. \tag{83.2}$$

Here $\theta$ is the scattering angle of the two particles in the centre-of-mass
system; $\theta_1$ and $\theta_2$ are the scattering angles of the first and second particles
in the laboratory system of coordinates, in which the second particle was at
rest before the collision.
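The transformation of angles to the laboratory system can be sketched numerically. The block below is only an illustration, assuming the standard relations tan θ1 = m2 sin θ / (m1 + m2 cos θ) and θ2 = (π − θ)/2 for a target initially at rest; the function name `lab_angles` and the sample masses are hypothetical.

```python
import math

def lab_angles(theta_cm, m1, m2):
    """Return (theta1, theta2) in radians: laboratory angles of the projectile
    (mass m1) and of the recoiling target (mass m2, initially at rest) for a
    centre-of-mass scattering angle theta_cm."""
    # atan2 keeps theta1 in [0, pi], so backward scattering (m1 < m2) is handled.
    theta1 = math.atan2(m2 * math.sin(theta_cm), m1 + m2 * math.cos(theta_cm))
    theta2 = 0.5 * (math.pi - theta_cm)
    return theta1, theta2

# Assumed example: equal masses, centre-of-mass angle of 60 degrees.
t1, t2 = lab_angles(math.radians(60.0), 1.0, 1.0)
```

For equal masses the projectile angle is half the centre-of-mass angle, and the two particles separate at a right angle, which gives a quick sanity check of any implementation.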

We now consider the wave function of the particle scattered by the force 
centre. We shall not, as yet, make any assumptions about the concrete form 
of the potential energy of interaction. 

We let the motionless scattering centre be at the origin and take the
direction of the incident particle flux to be the z-axis. At a distance from
the scattering centre the incident particle moves as a free particle, and its
wave function has the form of a plane wave $e^{ikz}$. Near the force centre the
particle undergoes scattering and the form of its wave function is different.

However, when the scattered particle goes sufficiently far from the force
centre, it will again move as a free particle. Since the scattered particle flux
at a large distance will always be directed from the scattering centre, the
motion of the scattered particles must be described by a diverging wave
$f(\theta,\varphi)\,e^{ikr}/r$.

The total wave function describing the motion of the incident and scattered
particles at large distances from the scattering centre can be written in
the form

$$\psi = e^{ikz} + f(\theta,\varphi)\,\frac{e^{ikr}}{r}, \tag{83.3}$$

where the first term describes the motion of the incident particles, and the
second term the motion of the scattered particles.
The amplitude of the diverging wave $f(\theta,\varphi)$, called the scattering amplitude,
depends, generally speaking, on the angles $\theta$ and $\varphi$. According to (83.1), one
has to calculate the incident and scattered particle flux densities. In accord
with formula (7.6), the flux density in the plane wave $e^{ikz}$ incident on the
scattering centre is equal to $p/m = v$, where v is the velocity of the particles.
Analogously, the flux density in the diverging wave is given by the expression

$$j_{\rm scat} = v\,\frac{|f(\theta,\varphi)|^2}{r^2}. \tag{83.4}$$

Taking the ratio of the scattered and incident fluxes we obtain, in correspondence
with formula (83.1), the differential cross section

$$d\sigma = |f(\theta,\varphi)|^2\,d\Omega. \tag{83.5}$$


Thus we see that the cross section is completely determined by the value 
of the scattering amplitude. The calculation of the latter is usually carried out 
in the following way. One finds the solution of the Schrödinger equation for 
the motion of the particle in the field of the scattering centre, which at large 
distances from the centre has the form (83.3). Then the coefficient of the
factor $e^{ikr}/r$ gives the scattering amplitude to be determined.

The wave function describing the motion of the particle at a distance from 
the scattering centre was written in the form (83.3), i.e. as the sum of incident 
and diverging waves, on the basis of simple and obvious physical considera- 
tions. 

However, it can be shown rigorously that at a large distance from a
fixed scattering centre with field U(r) the solution of the Schrödinger equation has
indeed the form (83.3). For this we write the Schrödinger equation in the
form

$$(\nabla^2+k^2)\,\psi = \frac{2m}{\hbar^2}\,U\psi,$$

where $k^2 = 2mE/\hbar^2$, and m and E are respectively the mass and energy of the
scattered particle. By means of a Green's function the solution can (see §15)
be written in the form

$$\psi(\mathbf r) = \psi_0(\mathbf r) + \frac{2m}{\hbar^2}\int G(\mathbf r,\mathbf r')\,U(\mathbf r')\,\psi(\mathbf r')\,dV', \tag{83.6}$$

where the function $\psi_0$ satisfies the equation

$$(\nabla^2+k^2)\,\psi_0 = 0.$$

The solution of this equation evidently has the form of a plane wave $e^{ikz}$.
The Green's function satisfies the equation

$$(\nabla^2+k^2)\,G(\mathbf r,\mathbf r') = \delta(\mathbf r-\mathbf r').$$

This equation is formally identical with eq. (24.20) of Part I, if $\omega^2/c^2$ in
it is replaced by $k^2$ and $-4\pi j_0/c$ by $\delta(\mathbf r-\mathbf r')$. Without reproducing the
calculations of §24 of Part I, we make use of a formula similar to eq. (24.19) of
Part I and write the solution for $G(\mathbf r,\mathbf r')$ in the form

$$G(\mathbf r,\mathbf r') = -\frac{1}{4\pi}\int \delta(\mathbf r''-\mathbf r')\,\frac{e^{ik|\mathbf r-\mathbf r''|}}{|\mathbf r-\mathbf r''|}\,dV''.$$

Carrying out the integration with respect to $dV''$, we obtain

$$G(\mathbf r,\mathbf r') = -\frac{1}{4\pi}\,\frac{e^{ik|\mathbf r-\mathbf r'|}}{|\mathbf r-\mathbf r'|}.$$

Substituting the values of $\psi_0$ and G into (83.6), we arrive at the integral
equation

$$\psi(\mathbf r) = e^{ikz} - \frac{m}{2\pi\hbar^2}\int \frac{e^{ik|\mathbf r-\mathbf r'|}}{|\mathbf r-\mathbf r'|}\,U(\mathbf r')\,\psi(\mathbf r')\,dV'. \tag{83.7}$$

Further, we consider the integral involved in formula (83.7) and determine its
values at large distances r. We define large distances in the following way. Let
the range of values of r' in which the integrand differs considerably from
zero and which gives the basic contribution to the value of the integral be R.
We shall call large distances those distances $|\mathbf r|$ for which the inequality

$$|\mathbf r| \gg R \tag{83.8}$$

is fulfilled; such distances always exist when U(r) decreases sufficiently
rapidly. In calculating integral (83.7) at large distances it can be assumed that
$|\mathbf r| \gg |\mathbf r'|$.

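Expanding $|\mathbf r-\mathbf r'|$ in a series, we have

$$|\mathbf r-\mathbf r'| = [(\mathbf r-\mathbf r')^2]^{1/2} \approx (r^2-2\,\mathbf r\cdot\mathbf r')^{1/2} \approx r - \frac{\mathbf r\cdot\mathbf r'}{r}.$$

Substituting this expansion into (83.7), we find

$$\psi = e^{ikz} - \frac{m}{2\pi\hbar^2}\,\frac{e^{ikr}}{r}\int U(\mathbf r')\,\psi(\mathbf r')\,e^{-i\mathbf k\cdot\mathbf r'}\,dV'. \tag{83.9}$$

Here $\mathbf k = k\mathbf r/r$. The wave vector $\mathbf k$ is evidently directed along the radius vector.
It characterizes the direction of propagation of the diverging spherical wave.
Comparing (83.9) with (83.3) we see that this last expression is of a general
character. The scattering amplitude is equal to

$$f(\theta,\varphi) = -\frac{m}{2\pi\hbar^2}\int U(\mathbf r')\,\psi(\mathbf r')\,e^{-i\mathbf k\cdot\mathbf r'}\,dV'. \tag{83.10}$$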


We shall need formulae (83.9) and (83.10) for what follows. 
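The large-distance expansion |r − r'| ≈ r − r·r'/r, on which (83.9) rests, can be checked numerically. The following is a minimal sketch; the function names and the sample vectors are hypothetical and chosen only so that |r| is much larger than |r'|.

```python
import math

def exact_distance(r_vec, rp_vec):
    """|r - r'| computed directly."""
    return math.dist(r_vec, rp_vec)

def far_field_distance(r_vec, rp_vec):
    """First-order expansion r - (r . r')/r, valid for |r| >> |r'|."""
    r = math.hypot(*r_vec)
    dot = sum(a * b for a, b in zip(r_vec, rp_vec))
    return r - dot / r

# Assumed sample point: observation point far along z, source point near the origin.
r_obs = (0.0, 0.0, 1000.0)
r_src = (0.3, -0.4, 0.5)
error = exact_distance(r_obs, r_src) - far_field_distance(r_obs, r_src)
```

The neglected term is of order |r'|²/(2r), so the error shrinks in inverse proportion to the distance of the observation point.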


§84. The Born approximation 


Although we could find the asymptotic expression for the wave function, 
the problem of obtaining the concrete form of the scattering amplitude is 
still far from being solved. Indeed, according to formula (83.10) the scattering
amplitude is expressed in terms of the unknown wave function $\psi$. The
exact solution of the Schrödinger equation and the determination of $f(\theta,\varphi)$ in
most problems of practical interest is associated with very great mathematical 
difficulties. Therefore approximate methods are widely used in scattering 
theory. The most important of these is the Born approximation. This method 
is based on the assumption that the potential energy of interaction of the 
scattered particle with the force centre is small, so that it can be considered 
as a small perturbation. 

If the potential energy is a small perturbation, then it can be assumed 
that the initial motion of the particle is changed only slightly. Then the 
integral equation (83.9) can easily be solved by a method of successive 
approximation. In the zero order approximation the small term containing 
the potential energy can be dropped. Then 


$$\psi_0 = e^{ikz} = e^{i\mathbf k_0\cdot\mathbf r}, \tag{84.1}$$

where $\mathbf k_0$ is the vector $\mathbf k_0 = k\mathbf n_0$, and $\mathbf n_0$ is the unit vector along the
z-axis. In the first approximation, in place of the wave function on the right-hand
side of (83.9), one has to substitute the value of its zero order approximation
(84.1). We obtain

$$\psi = e^{ikz} - \frac{m}{2\pi\hbar^2}\,\frac{e^{ikr}}{r}\int U(\mathbf r')\,e^{ikz'-i\mathbf k\cdot\mathbf r'}\,dV'. \tag{84.2}$$

In this approximation the scattering amplitude is equal to

$$f(\theta,\varphi) = -\frac{m}{2\pi\hbar^2}\int U(\mathbf r')\,e^{i\mathbf K\cdot\mathbf r'}\,dV'. \tag{84.3}$$


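Here we have introduced the notation

$$\mathbf K = \mathbf k_0 - \mathbf k, \tag{84.4}$$

where, in correspondence with fig. V.25, the modulus of the vector $\mathbf K$ is
defined by the relation

$$K = k\,|\mathbf n-\mathbf n_0| = 2k\sin\tfrac12\theta = \frac{2mv}{\hbar}\sin\tfrac12\theta. \tag{84.5}$$

Here $\mathbf n = \mathbf k/k$ is the unit vector along the direction of the scattered wave.

Fig. V.25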
The vector $\mathbf K$ is often called the collision vector. Correspondingly the vector
$\mathbf p = \hbar\mathbf K$ is called the momentum transfer vector. If the potential energy does
not depend on angles, $U = U(|\mathbf r|)$, then in (84.3) one can carry out the
integration over angles:

$$f(\theta) = -\frac{m}{2\pi\hbar^2}\int_0^\infty U(r')\,r'^2\,dr'\int_0^\pi e^{iKr'\cos\vartheta}\sin\vartheta\,d\vartheta\int_0^{2\pi}d\varphi = -\frac{2m}{\hbar^2}\int_0^\infty U(r')\,\frac{\sin Kr'}{Kr'}\,r'^2\,dr'. \tag{84.6}$$


In the first approximation the scattering amplitude is determined by the
potential energy to the first power. Then, if (84.3) is substituted into definition
(83.5), we find

$$d\sigma = |f(\theta)|^2\,d\Omega = \frac{m^2}{4\pi^2\hbar^4}\left|\int U(\mathbf r')\,e^{i\mathbf K\cdot\mathbf r'}\,dV'\right|^2 d\Omega = \frac{4m^2}{\hbar^4}\left|\int_0^\infty U(|\mathbf r'|)\,\frac{\sin Kr'}{Kr'}\,r'^2\,dr'\right|^2 d\Omega. \tag{84.7}$$
Expression (84.7) is called the Born approximation. It is widely used in 
nuclear physics. 
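As an illustration of how the first Born amplitude of a central potential, f(θ) = −(2m/ħ²) ∫ U(r) [sin Kr/(Kr)] r² dr with K = 2k sin(θ/2), may be evaluated in practice, the sketch below integrates it with the composite Simpson rule. Everything specific here is an assumption made for the example: units with ħ = m = 1, a square well of depth U0 and radius R as the test potential, and the helper names `simpson`, `born_amplitude` and `collision_vector`.

```python
import math

def simpson(func, a, b, n=2000):
    """Composite Simpson rule on [a, b] with an even number n of intervals."""
    h = (b - a) / n
    total = func(a) + func(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * func(a + i * h)
    return total * h / 3.0

def born_amplitude(U, K, r_max, m=1.0, hbar=1.0):
    """First Born amplitude for a central potential U(r) that vanishes
    (or is negligible) beyond r_max; K is the modulus of the collision vector."""
    def integrand(r):
        # sin(Kr)/(Kr) tends to 1 as Kr tends to 0
        s = 1.0 if K * r < 1e-12 else math.sin(K * r) / (K * r)
        return U(r) * s * r * r
    return -2.0 * m / hbar**2 * simpson(integrand, 0.0, r_max)

def collision_vector(k, theta):
    """K = 2 k sin(theta / 2)."""
    return 2.0 * k * math.sin(0.5 * theta)

# Hypothetical example: attractive square well, U = -U0 for r <= R.
U0, R, k = 0.1, 1.0, 3.0
well = lambda r: -U0 if r <= R else 0.0
K = collision_vector(k, math.radians(40.0))
f_numeric = born_amplitude(well, K, r_max=R)
# Closed form of the same radial integral for the square well.
f_exact = 2.0 * U0 * (math.sin(K * R) - K * R * math.cos(K * R)) / K**3
dsigma_domega = f_numeric**2  # |f|^2, the cross section per unit solid angle
```

The closed form is used only as a check on the quadrature; for a potential without a closed-form integral the numerical routine can be used alone, provided r_max is taken well beyond the range of the potential.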

Continuing with successive approximations, i.e. substituting the wave
function $\psi$ from (84.2) into (83.9), it would be possible to find the wave
function and scattering amplitude in the second approximation. The addition
to the scattering amplitude in the second approximation would be determined
by the integral of the square of the potential energy of interaction. Corrections
of higher orders can be found in an analogous way.

For small values of the scattering angle we have from (84.7)

$$d\sigma = \frac{4m^2}{\hbar^4}\left|\int_0^\infty U(r')\,r'^2\,dr'\right|^2 d\Omega,$$

i.e. the cross section turns out to be independent of the velocity of the 
particle. In the next section we shall give an example of a concrete calculation 
of the cross section by the Born approximation. We now turn to a discussion 
of the applicability of the Born approximation. 

For a rapid convergence of the series of successive approximations it is
necessary that the correction to the wave function of the first approximation
$\psi_1$ be small in comparison with the wave function of the zero order approximation
$\psi_0$, i.e. the following condition must be fulfilled:

$$|\psi_1| \ll |\psi_0|. \tag{84.8}$$

By means of (83.7) one can find the value of the function $\psi_1$ which is
valid for arbitrary values of r; then (84.8) is written in the form

$$\frac{m}{2\pi\hbar^2}\left|\int \frac{e^{ikz'}\,e^{ik|\mathbf r-\mathbf r'|}}{|\mathbf r-\mathbf r'|}\,U(\mathbf r')\,dV'\right| \ll 1. \tag{84.9}$$

Since $U(\mathbf r')$ decreases with distance from the scattering centre, condition
(84.9) will be fulfilled if it is fulfilled at the origin. Hence condition
(84.9) can be replaced by the inequality

$$\frac{m}{2\pi\hbar^2}\left|\int e^{i(kr'+kz')}\,\frac{U(\mathbf r')}{r'}\,dV'\right| \ll 1. \tag{84.10}$$

Further estimates of the integral can be carried out in two limiting cases:
1. When the relation $kR \ll 1$ is satisfied, where R is the effective range of
interaction. This corresponds to small energies of the particles,

$$E \ll \frac{\hbar^2}{mR^2}.$$

2. When the inverse inequality $kR \gg 1$ is fulfilled. This corresponds to the
condition

$$E \gg \frac{\hbar^2}{mR^2}.$$

In the first case, in estimating the integral one can set $e^{i(kr'+kz')}$ in (84.10)
equal to unity.
Then (84.10) gives in order of magnitude

$$\frac{m}{2\pi\hbar^2}\int \frac{|U(r')|}{r'}\,dV' \sim \frac{m}{\hbar^2}\,|U_0|\,R^2 \ll 1.$$

Here $U_0$ is a mean value of the interaction energy in the range R.
We write this last relation in the form

$$|U_0| \ll \frac{\hbar^2}{mR^2}. \tag{84.11}$$

According to (37.9), the expression $\hbar^2/mR^2$ is equal in order of magnitude
to the minimum depth of the potential well of radius R for which a level
arises. We see that the condition of applicability of the Born approximation
for the scattering of slow particles has a simple meaning. Namely, the mean
interaction energy must be small in comparison with the minimum potential
energy of the particle in the well for which a bound state is formed.

In the case of a large energy of the particle the region of applicability of 
the Born approximation is considerably enlarged. The exponential factor in 
formula (84.10) oscillates very rapidly, leading to a decrease in the total value 
of the integral. 

In calculating the integral one can take the slowly varying factors out,
writing

$$|\psi_1(0)| = \frac{m}{2\pi\hbar^2}\left|\int U(r')\,\frac{e^{ikz'}e^{ikr'}}{r'}\,dV'\right| \approx \frac{m|U_0|}{\hbar^2}\left|\int_0^R\!\!\int_0^\pi e^{ikr'(1+\cos\vartheta)}\sin\vartheta\,d\vartheta\,r'\,dr'\right| = \frac{m|U_0|}{\hbar^2k}\left|\int_0^R\left(e^{2ikr'}-1\right)dr'\right| \approx \frac{m|U_0|R}{\hbar^2k} \ll 1.$$

Here we have dropped the integral of the rapidly oscillating quantity $e^{2ikr'}$
since it is small in comparison with the integral retained.

Rewriting this last inequality in the form

$$\frac{|U_0|R}{\hbar v} \ll 1, \tag{84.12}$$

we see that the Born approximation becomes valid for particles of larger
energy, the larger the product $|U_0|R$ determined by the properties of the
scattering centre.

In the important case of the Coulomb field the potential $Ze^2/r$ decreases
so slowly that the concept of an effective range of interaction R cannot be
introduced.

However, we note that for $U_0 = Ze^2/R$ the product $U_0R$ contained in the
inequality (84.12) does not depend on R.

Hence for the Coulomb field inequality (84.12) assumes the form

$$\frac{Ze^2}{\hbar v} \ll 1. \tag{84.13}$$

This has an obvious meaning: if the velocity of the electron in the first
Bohr orbit of a hydrogen-like atom with nuclear charge Ze (the quantity
$v_0 = Ze^2/\hbar$) is introduced, then formula (84.13) assumes the form

$$\frac{v_0}{v} \ll 1, \tag{84.14}$$

i.e. the velocity of the particle must be large in comparison with the velocity
of the electron in the first Bohr orbit.

For the Born approximation to be applicable inequality (84.13) requires
a larger energy the larger the charge of the scattering nucleus.
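The three applicability criteria of this section (slow particles, fast particles, Coulomb field) are simple dimensionless ratios, and a short sketch shows how they might be evaluated. The function names are hypothetical; the Coulomb parameter is written as Zα c/v, assuming the fine-structure constant α = e²/ħc ≈ 1/137.036.

```python
def born_low_energy_parameter(U0, R, m, hbar=1.0):
    """m |U0| R^2 / hbar^2; must be << 1 for slow particles (kR << 1)."""
    return m * abs(U0) * R**2 / hbar**2

def born_high_energy_parameter(U0, R, v, hbar=1.0):
    """|U0| R / (hbar v); must be << 1 for fast particles (kR >> 1)."""
    return abs(U0) * R / (hbar * v)

def born_coulomb_parameter(Z, v_over_c, alpha=1.0 / 137.036):
    """Z e^2 / (hbar v) = Z * alpha / (v/c); must be << 1 for the Coulomb field."""
    return Z * alpha / v_over_c

# Assumed example: a particle with v/c = 0.1 scattered by carbon (Z = 6)
# and by gold (Z = 79).
carbon = born_coulomb_parameter(6, 0.1)
gold = born_coulomb_parameter(79, 0.1)
```

With these numbers the parameter is below unity for carbon and well above unity for gold, which illustrates the closing remark of the section: the heavier the nucleus, the faster the particle must be.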


§85. The scattering of fast charged particles by atoms 


Let us apply the Born approximation to the calculation of the cross 
section for the scattering of fast charged particles by atoms. 

We shall assume that the nucleus of the atom with charge Ze is at the 
origin, and that the charge of the atomic shell is distributed in space with 
density n(r). We shall disregard the size of the nucleus treating it as a point. 

The differential scattering cross section is given by formula (84.7), which
for $U = e\varphi$, where $\varphi$ is the potential of the electric field acting on the particle
to be scattered and e is its charge, assumes the form

$$d\sigma = \frac{m^2e^2}{4\pi^2\hbar^4}\left|\int \varphi(\mathbf r')\,e^{i\mathbf K\cdot\mathbf r'}\,dV'\right|^2 d\Omega. \tag{85.1}$$

The integral in formula (85.1) is conveniently expressed in terms of the
charge density distribution in the atom.

For this we note that $\int \varphi(\mathbf r')\,e^{i\mathbf K\cdot\mathbf r'}\,dV'$ represents the Fourier component
of the potential. It can be expressed in terms of the Fourier component of
the charge density analogously to formula (24.25) of Part I which relates the
Fourier component of the current density to the Fourier component of the
potential.

We then have

$$d\sigma = \frac{4m^2e^2}{\hbar^4K^4}\left|\int \rho(\mathbf r')\,e^{i\mathbf K\cdot\mathbf r'}\,dV'\right|^2 d\Omega. \tag{85.2}$$

The charge density in the atom can be written in the form

$$\rho(\mathbf r) = Ze\,\delta(\mathbf r) - e\,n(\mathbf r). \tag{85.3}$$

For the differential cross section we finally obtain

$$d\sigma = \frac{4m^2e^4}{\hbar^4K^4}\left|Z\int \delta(\mathbf r')\,e^{i\mathbf K\cdot\mathbf r'}\,dV' - \int n(\mathbf r')\,e^{i\mathbf K\cdot\mathbf r'}\,dV'\right|^2 d\Omega = \frac{4m^2e^4}{\hbar^4K^4}\,|Z-F(\mathbf K)|^2\,d\Omega, \tag{85.4}$$

where

$$F(\mathbf K) = \int n(\mathbf r')\,e^{i\mathbf K\cdot\mathbf r'}\,dV'. \tag{85.5}$$


The quantity F is called the atomic form factor. Its value is determined by the 
electron charge density distribution. 

Substituting into (85.4) the value of the collision vector $\mathbf K$ according to
(84.5), we rewrite the differential cross section in the form

$$d\sigma = \left(\frac{e^2}{2mv^2}\right)^2 |Z-F(\mathbf K)|^2\,\frac{d\Omega}{\sin^4\frac12\theta}. \tag{85.6}$$

Let us first consider a particular case of formula (85.6). If the scattering
takes place on a point nucleus without an electron shell, n = 0, then, consequently,
F = 0. We then obtain for the differential cross section

$$d\sigma = \left(\frac{Ze^2}{2mv^2}\right)^2 \frac{d\Omega}{\sin^4\frac12\theta}. \tag{85.7}$$


This is the well-known Rutherford formula, which is obtained in classical
mechanics. Here the Rutherford formula was obtained by means
of the approximate Born method. However, it is interesting to note that the
same expression is obtained in an exact solution of the problem*. Since the
scattering cross section in the exact solution does not contain Planck's constant
$\hbar$, the results given by classical and quantum physics must naturally be
the same.

*See, for example, N.F.Mott and H.S.W.Massey, The theory of atomic collisions
(Clarendon Press, Oxford, 1965).

The fact that the cross section becomes infinite for scattering at infinitely
small angles is associated with the slow change of the Coulomb potential.
Hence particles are scattered no matter how far away they pass from the
scattering centre. However, as we shall see later, in practice the screening
effect of the electron shell ensures a finite value of the scattering cross
section.

Let us now consider the atomic form factor (85.5). The effective range of
integration in it has a size of the order of the atomic size a. Outside this
range n(r) reduces to zero. Hence for small angles $\theta$, for which $Ka \ll 1$, the
exponential in integral (85.5) can be expanded in a series. We then have

$$Z - F(\mathbf K) = Z - Z - i\mathbf K\cdot\int n(\mathbf r')\,\mathbf r'\,dV' + \frac12\int n(\mathbf r')\,(\mathbf K\cdot\mathbf r')^2\,dV'. \tag{85.8}$$

In formula (85.8) the first two terms mutually cancel, since the charge of
the electron shell of the atom is equal to the charge of the nucleus. The third
term represents the dipole moment of the atom, which, as we have seen (see
§72), is equal to zero. In the last term, integrating over angles we obtain

$$Z - F = \frac{2\pi K^2}{3}\int_0^\infty n(|\mathbf r|)\,r^4\,dr.$$

The differential cross section in the limiting case $Ka \ll 1$ will have the form

$$d\sigma = \left(\frac{4\pi me^2}{3\hbar^2}\right)^2\left|\int_0^\infty n(r)\,r^4\,dr\right|^2 d\Omega.$$

Thus owing to the screening by the charge of the electron shell the differential
cross section for small scattering angles turns out to be a finite and
constant (independent of angle) quantity. On the contrary, for large scattering
angles, when the inverse inequality $Ka \gg 1$ is fulfilled, the exponential in
integral (85.5) begins to oscillate rapidly and the form factor turns out to be
small. Neglecting it in comparison with Z, we arrive at (85.7). The screening
of the nuclear charge is not manifested for large scattering angles.

As an example let us calculate the form factor for the hydrogen atom.
According to §38, the charge density in the hydrogen atom in the ground
state is equal to

$$n(r) = |\psi(r)|^2 = \frac{1}{\pi a^3}\,e^{-2r/a}, \qquad a = \frac{\hbar^2}{me^2}.$$

Consequently, the form factor is defined by the integral

$$F(\mathbf K) = \frac{1}{\pi a^3}\int e^{-2r/a}\,e^{i\mathbf K\cdot\mathbf r}\,r^2\,dr\,\sin\vartheta\,d\vartheta\,d\varphi. \tag{85.9}$$

Directing the z-axis along the vector $\mathbf K$, we have

$$F(K) = \frac{1}{\pi a^3}\int e^{-2r/a}\,e^{iKr\cos\vartheta}\,r^2\,dr\,\sin\vartheta\,d\vartheta\,d\varphi.$$

Carrying out the integration, we finally find

$$F(K) = \frac{16}{(4+K^2a^2)^2}.$$

Then the differential cross section for the hydrogen atom can be written
in the form

$$d\sigma = \left(\frac{e^2}{2mv^2}\right)^2\left[1-\frac{16}{(4+K^2a^2)^2}\right]^2\frac{d\Omega}{\sin^4\frac12\theta}.$$

The total cross section is obtained by integration over all values of the
scattering angle.

For other atoms of the periodic system of elements the charge density and
the potential of interaction with the scattered particle can be calculated by
means of Hartree or Thomas-Fermi approximate methods. After this the
calculation of the form factor can be carried out in accordance with formula
(85.5).
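The hydrogen form factor lends itself to a numerical cross-check. The sketch below compares the closed form F(K) = 16/(4 + K²a²)² with a direct radial quadrature of the form-factor integral for n(r) = exp(−2r/a)/(πa³), and evaluates the resulting Born cross section. It assumes atomic units (ħ = m = e² = 1, so a = 1) by default; the function names, the cutoff `r_max` and the grid size are choices made for this illustration.

```python
import math

def form_factor_hydrogen(K, a=1.0):
    """Closed form of the ground-state hydrogen form factor."""
    return 16.0 / (4.0 + (K * a) ** 2) ** 2

def form_factor_numeric(K, a=1.0, r_max=40.0, n=20000):
    """F(K) = 4 pi * int_0^inf n(r) sin(Kr)/(Kr) r^2 dr, Simpson rule,
    with n(r) = exp(-2 r / a) / (pi a^3) cut off at r = r_max * a."""
    h = r_max * a / n
    total = 0.0
    for i in range(n + 1):
        r = i * h
        s = 1.0 if K * r < 1e-12 else math.sin(K * r) / (K * r)
        w = 1 if i in (0, n) else (4 if i % 2 else 2)
        total += w * math.exp(-2.0 * r / a) / (math.pi * a**3) * s * r * r
    return 4.0 * math.pi * total * h / 3.0

def dsigma_domega_hydrogen(theta, k, a=1.0, e2=1.0, m=1.0, hbar=1.0):
    """Born cross section per unit solid angle for a fast particle of charge e
    scattered by ground-state hydrogen (Z = 1)."""
    v = hbar * k / m
    K = 2.0 * k * math.sin(0.5 * theta)
    rutherford = (e2 / (2.0 * m * v * v)) ** 2 / math.sin(0.5 * theta) ** 4
    return rutherford * (1.0 - form_factor_hydrogen(K, a)) ** 2
```

Two limits are worth inspecting with these functions: for Ka ≪ 1 the combination 1 − F behaves as K²a²/2, so the cross section stays finite at small angles, while for Ka ≫ 1 the form factor is negligible and the Rutherford expression is recovered.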





§86. Partial wave scattering theory 


In the preceding sections we have considered one form of approximate 
scattering theory. 

In addition to approximate theories it is possible to develop an exact 
scattering theory, which is often called partial wave theory. 

The general scheme of partial wave scattering theory does not differ from
that assumed in §83. We consider the motion of a particle in the field of a
scattering centre. We assume that the scattering field is spherically symmetric
and that at a distance from the centre the incident particle is described by the
plane wave $e^{ikz}$ and the scattered particle by a diverging spherical wave. Let
the general solution of the Schrödinger equation in the centrally symmetric
field be found. At a distance from the scattering centre this solution must be
written in the form (83.3), i.e. in the form of an incident plane wave and a
diverging spherical wave. As we know, the amplitude of the latter determines
the scattering cross section which is of interest to us.

According to (35.31), the general solution of the Schrödinger equation in
a centrally symmetric field independent of the angle $\varphi$ can be represented by
the expansion

$$\psi = \sum_{l=0}^{\infty} A_l\,R_l(r)\,P_l(\cos\theta). \tag{86.1}$$

We shall call a given term of the series (86.1) the lth partial wave. At a large
distance from the force centre the asymptotic form of the radial functions $R_l$
is given by formulae (35.26) and (35.27)

$$R_l \approx \frac{\sin(kr+\delta_l-\frac12\pi l)}{kr} = \frac{\exp[i(kr+\delta_l-\frac12\pi l)] - \exp[-i(kr+\delta_l-\frac12\pi l)]}{2ikr}. \tag{86.2}$$

We recall (see §36) that if the potential energy U(r) is equal to zero over
all space, then the set of phase shifts $\delta_l$ reduces to zero. The asymptotic
expression which we need for $\psi$ for the motion of a particle in the potential
field U(r) can be written in the following form:

$$\psi = \sum_{l=0}^{\infty} C_l\,P_l(\cos\theta)\,\frac{\exp[i(kr+\delta_l-\frac12\pi l)] - \exp[-i(kr+\delta_l-\frac12\pi l)]}{2ikr}. \tag{86.3}$$


We now have to write expression (83.3) in the form (86.3). This will allow
us to relate the coefficients $C_l$ and the phase shifts $\delta_l$ to the scattering
amplitude $f(\theta)$. Expression (83.3) is most simply brought into the form
(86.3) by expanding (83.3) in a series in terms of Legendre polynomials. We
need the expansion of the plane wave $e^{ikz}$ only for large distances, which can
be found very simply. We write the plane wave in the form

$$e^{ikz} = e^{ikr\cos\theta} = \sum_{l=0}^{\infty} i^l(2l+1)\,P_l(\cos\theta)\,G_l(r), \tag{86.4}$$

where $G_l(r)$ is an unknown function of the radius. Multiplying this equation
by $P_l(\cos\theta)\sin\theta$ and integrating with respect to $\theta$, we find

$$\frac{1}{2i^l}\int_{-1}^{+1} e^{ikrx}\,P_l(x)\,dx = G_l(r). \tag{86.5}$$

We have made use of the conditions of orthogonality and normalization of
the Legendre polynomials

$$\int_{-1}^{+1} P_l(x)\,P_{l'}(x)\,dx = \frac{2}{2l+1}\,\delta_{ll'}.$$

Integrating the left-hand side of (86.5) by parts, we have

$$G_l(r) = \frac{1}{2i^l}\left[\frac{e^{ikrx}\,P_l(x)}{ikr}\right]_{-1}^{+1} + \text{terms of the order of } r^{-2}.$$

Finally, using the known property of the Legendre polynomials $P_l(1) = 1$,
$P_l(-1) = (-1)^l$, we obtain for the function $G_l(r)$ at large distances

$$G_l(r) \approx \frac{\sin(kr-\frac12\pi l)}{kr}.$$

Thus the expansion of the plane wave at large distances is written in the
form

$$e^{ikz} \approx \sum_{l} i^l(2l+1)\,P_l(\cos\theta)\,\frac{\sin(kr-\frac12\pi l)}{kr}. \tag{86.6}$$
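The large-distance form of the radial coefficient in the plane-wave expansion can be verified numerically. The sketch below evaluates G_l(r) = (1/2i^l) ∫ e^{ikrx} P_l(x) dx over −1 ≤ x ≤ 1 by the Simpson rule and compares it with sin(kr − πl/2)/(kr). The function names and the grid size are assumptions of this illustration, not part of the text.

```python
import cmath
import math

def legendre(l, x):
    """P_l(x) from the recurrence (n+1) P_{n+1} = (2n+1) x P_n - n P_{n-1}."""
    p_prev, p = 1.0, x
    if l == 0:
        return p_prev
    for n in range(1, l):
        p_prev, p = p, ((2 * n + 1) * x * p - n * p_prev) / (n + 1)
    return p

def radial_coefficient(l, kr, n=20000):
    """(1 / (2 i^l)) * int_{-1}^{1} exp(i kr x) P_l(x) dx, composite Simpson rule."""
    h = 2.0 / n
    total = 0j
    for j in range(n + 1):
        x = -1.0 + j * h
        w = 1 if j in (0, n) else (4 if j % 2 else 2)
        total += w * cmath.exp(1j * kr * x) * legendre(l, x)
    return total * h / 3.0 / (2 * 1j**l)

def asymptotic_coefficient(l, kr):
    """Large-distance form sin(kr - pi l / 2) / (kr)."""
    return math.sin(kr - 0.5 * math.pi * l) / kr
```

The difference between the two functions is of relative order l(l+1)/(kr), so the agreement improves as kr grows and worsens for high l at fixed kr, which is the sense in which the expansion holds only at large distances.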
We also expand $f(\theta)$ in a series in terms of the Legendre polynomials

$$f(\theta) = \sum_{l=0}^{\infty} D_l\,P_l(\cos\theta). \tag{86.7}$$

Substituting series (86.6) and (86.7) into (83.3) and equating the expression
found and the asymptotic expression (86.3), we have

$$\sum_l C_l\,\frac{P_l(\cos\theta)}{2ikr}\left\{\exp[i(kr-\tfrac12\pi l+\delta_l)] - \exp[-i(kr-\tfrac12\pi l+\delta_l)]\right\} = \sum_l\left\{\frac{i^l(2l+1)}{2ikr}\left(\exp[i(kr-\tfrac12\pi l)] - \exp[-i(kr-\tfrac12\pi l)]\right) + D_l\,\frac{e^{ikr}}{r}\right\}P_l(\cos\theta). \tag{86.8}$$

For eq. (86.8) to be fulfilled for arbitrary values of the angle $\theta$ it is
necessary that the coefficients of the polynomials $P_l$ on each side be equal to
each other. Equating these coefficients we find

$$\frac{C_l}{2ikr}\left\{\exp[i(kr-\tfrac12\pi l+\delta_l)] - \exp[-i(kr-\tfrac12\pi l+\delta_l)]\right\} = \frac{i^l(2l+1)}{2ikr}\left\{\exp[i(kr-\tfrac12\pi l)] - \exp[-i(kr-\tfrac12\pi l)]\right\} + D_l\,\frac{e^{ikr}}{r}. \tag{86.9}$$

This relation must be fulfilled for any arbitrary value of the radius r. This
means that the coefficients of exponentials with the same exponents must be
equal to each other. Hence we find the following relations between the
coefficients:

$$C_l = i^l(2l+1)\exp(i\delta_l), \qquad i^l(2l+1) + 2ikD_l\exp(\tfrac12 i\pi l) = C_l\exp(i\delta_l). \tag{86.10}$$

Finding $D_l$ from this and substituting it into expansion (86.7), we find for
the scattering amplitude the expression

$$f(\theta) = \frac{1}{2ik}\sum_{l=0}^{\infty}(2l+1)\left[e^{2i\delta_l}-1\right]P_l(\cos\theta). \tag{86.11}$$
1=0 


Consequently, the differential cross section will be equal to

$$d\sigma = \frac{1}{4k^2}\left|\sum_{l=0}^{\infty}(2l+1)\left(e^{2i\delta_l}-1\right)P_l(\cos\theta)\right|^2 d\Omega. \tag{86.12}$$

We find the total cross section by integrating (86.12) and taking into account
the orthogonality relations for the Legendre polynomials. A simple
calculation gives

$$\sigma = \frac{4\pi}{k^2}\sum_{l=0}^{\infty}(2l+1)\sin^2\delta_l. \tag{86.13}$$


We see that the differential cross section and the total cross section for the
scattering of a particle in a given force field are expressed in terms of the set of
phase shifts $\delta_l$. Hence it follows that for the calculation of scattering cross
sections it is necessary to find the solution of the Schrödinger equation (35.8)
for a particle moving in the given force field. Defining the form of the solution
for large distances and comparing it with (86.2), we find $\delta_l$.
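Once a set of phase shifts is known, the amplitude f(θ) = (1/2ik) Σ (2l+1)(e^{2iδ_l} − 1) P_l(cos θ) and the total cross section σ = (4π/k²) Σ (2l+1) sin²δ_l follow directly. The sketch below implements both and checks them against each other by integrating |f|² over the full solid angle. The function names and the sample phase shifts are hypothetical.

```python
import cmath
import math

def legendre(l, x):
    """P_l(x) from the standard three-term recurrence."""
    p_prev, p = 1.0, x
    if l == 0:
        return p_prev
    for n in range(1, l):
        p_prev, p = p, ((2 * n + 1) * x * p - n * p_prev) / (n + 1)
    return p

def amplitude(theta, k, deltas):
    """Scattering amplitude; deltas[l] is the phase shift of the l-th wave."""
    x = math.cos(theta)
    s = sum((2 * l + 1) * (cmath.exp(2j * d) - 1.0) * legendre(l, x)
            for l, d in enumerate(deltas))
    return s / (2j * k)

def total_cross_section(k, deltas):
    """Sum of the partial cross sections."""
    return 4.0 * math.pi / k**2 * sum((2 * l + 1) * math.sin(d) ** 2
                                      for l, d in enumerate(deltas))

def integrated_cross_section(k, deltas, n=2000):
    """2 pi * int_0^pi |f(theta)|^2 sin(theta) d(theta), Simpson rule."""
    h = math.pi / n
    total = 0.0
    for j in range(n + 1):
        theta = j * h
        w = 1 if j in (0, n) else (4 if j % 2 else 2)
        total += w * abs(amplitude(theta, k, deltas)) ** 2 * math.sin(theta)
    return 2.0 * math.pi * total * h / 3.0

# Assumed sample values: wave number and s-, p-, d-wave phase shifts.
k_example = 1.2
deltas_example = [0.8, 0.3, 0.05]
sigma_sum = total_cross_section(k_example, deltas_example)
sigma_int = integrated_cross_section(k_example, deltas_example)
```

The agreement of `sigma_sum` and `sigma_int` reflects the orthogonality of the Legendre polynomials; with a single s-wave phase shift the amplitude is independent of angle.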

The exact solution of the Schrödinger equation makes it possible to find
the infinite set of phase shifts $\delta_l$ and, consequently, the value of the scattering
cross section. The exact or partial wave theory of scattering was first developed
by Rayleigh, who studied the scattering of sound waves. Faxen and
Holtsmark were the first to use Rayleigh's method for solving the problems of
quantum mechanics.

From (86.13) it is seen that the total cross section can be written in the
form of a sum of the so-called partial cross sections

$$\sigma = \sum_{l=0}^{\infty}\sigma_l = \frac{4\pi}{k^2}\sum_{l=0}^{\infty}(2l+1)\sin^2\delta_l.$$

Each of the partial cross sections corresponds to taking into account one of
the terms of the series (86.1), with the radial function (86.2),

$$A_l\,P_l(\cos\theta)\,\frac{\sin(kr-\frac12\pi l+\delta_l)}{kr}.$$

It is clear that it describes a state of the particle with definite angular
momentum $L^2 = \hbar^2 l(l+1)$. For this reason a notation analogous to the notation
of atomic terms is adopted in scattering theory. For example, to l = 0 there
corresponds s-wave scattering, which is characterized by the partial cross
section $\sigma_0$; to l = 1 there corresponds p-wave scattering with the partial cross
section $\sigma_1$ and so on.

The total particle flux through an arbitrary surface surrounding the scattering
centre for a particle in the state with angular momentum L is equal to
zero. It could be calculated by the general formula (7.3). However, this can
also be seen without carrying out the calculation, on the basis of the general
theorem presented in §7. There it was pointed out that the total flux is
always equal to zero in the case of a real wave function. In our case this is so,
since the wave function is expressed by formula (86.2).

The equality to zero of the total flux of scattered particles has an obvious
meaning; it expresses the law of conservation of the number of particles in the
process of scattering. It is important to note that the conservation law holds
for particles of each value of l separately. We shall come back to the discussion
of this fact in §91.

Finding the values of all the phase shifts $\delta_l$ is, as a rule, a very complex
problem. Furthermore, the practical value of formulae written in the form of
a series is not great if the series does not have a sufficiently rapid convergence.
We cannot dwell here on the problems of convergence of the series (86.12)
and (86.13) and shall give only the final result*.

For the convergence of series (86.13) it is necessary that the potential
energy U(r) should decrease at large distances more rapidly than $r^{-n}$, where
n > 2. Further, the series for the differential cross section diverges for $\theta = 0$,
if U(r) has the form $r^{-n}$, where $n \le 3$, at large distances. When $r \to 0$, U(r)
must increase more slowly than $r^{-2}$.

The practical value of formulae (86.12) and (86.13) for the scattering
cross section becomes greater, the smaller the number of terms of the series
which play an essential role. Simple reasoning shows that as the energy of the
particle increases the number of phase shifts $\delta_l$ which must be taken into
account in series (86.12) and (86.13) increases.

Indeed, let R be the radius of the sphere in which the interaction energy is
substantially different from zero. For a sufficiently rapid decrease of U(r) the
introduction of such a quantity is always possible. The wave function $R_l$ has
its first maximum at the distance r defined by the relation $kr \sim l$. At the next
maximum $R_l$ has a considerably smaller value because of the decrease of the
factor $r^{-1}$.

For small values of r the wave function is also small. Thus the wave function
$R_l$ has its basic value for $r \sim l/k$. If $r \sim l/k > R$, then the wave function
is small in the interaction sphere. But in this case the scattering amplitude
will also be small. Thus only those particles for which $l/k < R$ undergo
effective scattering.

The angular momentum l of the effectively scattered particles increases
with increasing energy of the particle. For small energies the number of
terms which must be taken into account in the series (86.12) and (86.13) is
relatively small. Therefore partial wave scattering theory is particularly
important for the study of the scattering of slow particles. This qualitative
reasoning can be replaced by a quantitative rule, which we shall give without
proof.


* L.D.Landau and E.M.Lifshitz, Quantum mechanics (Pergamon Press, Oxford, 1965). 


If a classical particle having momentum p and impact parameter

$$\rho_l = \frac{\hbar\sqrt{l(l+1)}}{p} = \frac{\sqrt{l(l+1)}}{k} \tag{86.14}$$

in moving does not penetrate the region where the potential energy of interaction
differs considerably from zero, then the phase $\delta_l$ corresponding to the
angular momentum $\hbar\sqrt{l(l+1)}$ is small*.

We apply this rule to the investigation of the scattering of a slow particle.
Let the scattering centre produce a field effective to the range R. By slow
particles we shall mean particles with wave number k for which $kR \ll 1$.

In this case

$$\rho_l = \frac{\sqrt{l(l+1)}}{k} \gg R \tag{86.15}$$

for all values l > 0, so that all phase shifts, except $\delta_0$, are small. We thus see that
only s-wave scattering is important for the scattering of slow particles.

The differential cross section is then equal to

$$d\sigma = \frac{1}{4k^2}\left|e^{2i\delta_0}-1\right|^2 d\Omega = \frac{\sin^2\delta_0}{k^2}\,d\Omega, \tag{86.16}$$

since $P_0(\cos\theta) = 1$.

The cross section for s-wave scattering does not depend on the scattering
angle. This means that the scattering is spherically symmetric. As the energy
of the particle increases phase shifts of higher order begin to play a role and
the scattering progressively assumes an ever more asymmetric character.

For large energies the cross section becomes substantially different from
zero only for very small angles $\theta$. This can best be seen by means of the Born
approximation (84.3). For large energies the vector $\mathbf K$ is large, the integral
rapidly oscillates, and hence the cross section is small. For $\theta = 0$ the vector $\mathbf K$
is equal to zero, there is no oscillation, and the cross section is large.
Finally, we note that partial wave scattering theory in the form in which it
has been described here is inapplicable to scattering in the Coulomb field.
The wave function in this case does not have the asymptotic form (83.3).
This fact is associated with the very slow decrease of the Coulomb potential
as a function of the distance. This case requires particular consideration**.

* The derivation of this statement is given in the book: N.F.Mott and H.S.W.Massey,
The theory of atomic collisions (Clarendon Press, Oxford, 1965).
** For the exact solution of this problem see, for example, L.D.Landau and E.M.
Lifshitz, Quantum mechanics (Pergamon Press, Oxford, 1965).







§87. Scattering by a spherical potential well (the concept of resonance 
scattering) 


As an example of the use of partial wave theory we shall consider the 
scattering of a particle in a potential field which we define in the following 
way: 


$$U = -U_0 \quad \text{for } r<R, \qquad U = 0 \quad \text{for } r>R. \tag{87.1}$$


For simplicity we confine ourselves to the case where the scattered particle has a small energy, i.e. $kR \ll 1$. In this case, as we know, only s-wave scattering is important, and we need only determine the phase shift $\delta_0$. In the case of the potential field given by formula (87.1) the solution of the problem offers no difficulty. By means of the relations already found we can also illustrate a very interesting phenomenon occurring in the scattering process, so-called resonance scattering. It consists in the fact that the scattering cross section under certain conditions turns out to be very large. This effect occurs when there exists an energy level in the potential field close to zero and the energy of the particle to be scattered is sufficiently small. We write the wave function in the form $\psi = A_0 R_0(r) = \chi/r$. The function $\chi(r)$ satisfies the equation


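$$\frac{d^2\chi}{dr^2} + k^2\chi = 0 \quad \text{for } r>R, \tag{87.2}$$

$$\frac{d^2\chi}{dr^2} + \beta^2\chi = 0 \quad \text{for } r<R, \tag{87.3}$$

where

$$\beta^2 = \frac{2m(E+U_0)}{\hbar^2}.$$

The form of the function $\chi$ for $r>R$ is easily obtained from the solution of eq. (87.2):

$$\chi = C\sin(kr+\delta_0). \tag{87.4}$$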


In the general case the function $\chi$ has the form (87.4) only for large distances (see formula (86.2)). However, in our case, by virtue of the sharp boundary of the potential energy, the function $R_0$ has the form (86.2) for all distances $r>R$.

For $r<R$ we obtain

$$\chi = A\sin\beta r + B\cos\beta r,$$

where $A$ and $B$ are constants. The function $R_0 = \chi/r$ must remain finite for $r \to 0$. Hence the coefficient $B$ must be set equal to zero. Thus we get

$$\chi = A\sin\beta r \quad \text{for } r<R.$$


The function $\chi$ and its first derivative must be continuous at the point $r=R$. These two relations are conveniently replaced by the equality of the logarithmic derivatives. We then find

$$\beta\cot\beta R = k\cot(kR+\delta_0). \tag{87.5}$$

We have obtained a transcendental equation for the phase shift $\delta_0$. We first assume that $\delta_0$ is small. Then $\cot(kR+\delta_0)$ can be expanded in a series in terms of the small argument $kR+\delta_0$. As a result we have

$$\beta\cot\beta R = \frac{k}{kR+\delta_0},$$

whence we can find the phase shift $\delta_0$, which is equal to

$$\delta_0 = \frac{k}{\beta\cot\beta R} - kR. \tag{87.6}$$
We see from relation (87.6) that the phase shift will indeed be considerably smaller than unity if the following relation is fulfilled:

$$\frac{k}{\beta\cot\beta R} \ll 1. \tag{87.7}$$

The differential cross section can easily be found by making use of formula (86.16) and recalling that $\delta_0 \ll 1$. It has the form

$$d\sigma = \frac{\delta_0^2}{k^2}\,d\Omega = \frac{1}{k^2}\left(\frac{k}{\beta\cot\beta R} - kR\right)^2 d\Omega.$$
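The relation between the transcendental equation (87.5) and the small-phase approximation (87.6) can be checked numerically. The following is only a sketch: the function names and the numerical values (units with $\hbar = m = 1$) are illustrative assumptions, and the exact root is taken on the principal branch of the arctangent, since $\delta_0$ is defined only up to a multiple of $\pi$.

```python
import math

def phase_shift_exact(k, beta, R):
    """Solve (87.5), beta*cot(beta*R) = k*cot(k*R + delta0), for delta0."""
    # Equivalent form: tan(kR + delta0) = (k/beta) * tan(beta*R).
    # The principal branch of atan is used (delta0 is defined modulo pi).
    return math.atan((k / beta) * math.tan(beta * R)) - k * R

def phase_shift_small(k, beta, R):
    """Approximation (87.6), valid while k / (beta*cot(beta*R)) << 1."""
    return k / (beta / math.tan(beta * R)) - k * R

# Illustrative (hypothetical) values in units with hbar = m = 1.
k, beta, R = 0.01, 1.0, 1.0
d_exact = phase_shift_exact(k, beta, R)
d_small = phase_shift_small(k, beta, R)
print(d_exact, d_small)
```

For these values condition (87.7) holds and the two results differ only in the sixth decimal place; as $\beta R$ approaches $\pi/2$ the approximation breaks down, which is the resonance case discussed next.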

However, a form of the potential well for which $\beta\cot\beta R$ approaches zero is possible. In this case inequality (87.7) is violated, and the phase shift $\delta_0$ is large. To find the conditions under which $\delta_0$ is large, we shall establish the relation between the quantity $\beta\cot\beta R$ involved in formula (87.5) and the energy level of the particle in a bound state. In §37 we have obtained for the energy levels of a particle in a potential well formula (37.6)

$$\frac{2m(U_0-\varepsilon)}{\hbar^2}\cot^2\left[\left(\frac{2m(U_0-\varepsilon)R^2}{\hbar^2}\right)^{1/2}\right] = \frac{2m\varepsilon}{\hbar^2}. \tag{87.8}$$

Here $\varepsilon$ is the energy level of the particle in the well. If the energy level of the particle in the well is close to zero, i.e. if $\varepsilon \ll U_0$, then relation (87.8) may be rewritten in the form

$$\frac{2mU_0}{\hbar^2}\cot^2\left[\left(\frac{2mU_0R^2}{\hbar^2}\right)^{1/2}\right] = \frac{2m\varepsilon}{\hbar^2}. \tag{87.9}$$

In the case being considered the energy of the scattered particle is also small ($E \ll U_0$), hence relation (87.9) can be written in the form

$$\beta^2\cot^2\beta R = \frac{2m\varepsilon}{\hbar^2}. \tag{87.10}$$

Thus we see that the increase of $\delta_0$ is associated with the presence of an energy level $\varepsilon$ close to zero.

We now turn to the calculation of the phase shift in the case where relation (87.7) is violated and $\delta_0$ is large. We denote this value of the phase shift by $\delta_{0r}$. We find $\delta_{0r}$ again from relation (87.5). For this we expand $\cot(kR+\delta_{0r})$ in a series in terms of the small parameter $kR$ and restrict ourselves to the zero order term of the expansion. Then we obtain

$$\beta\cot\beta R = k\cot\delta_{0r}.$$

Squaring this relation and making use of formula (87.10) together with $k^2 = 2mE/\hbar^2$, we find

$$\cot^2\delta_{0r} = \frac{\varepsilon}{E}. \tag{87.11}$$

We now see that the phase shift $\delta_{0r}$ will not be a small quantity if $\varepsilon \lesssim E$. We find the scattering cross section by means of the general formula (86.16). In this case we have

$$\sigma_r = \frac{4\pi\sin^2\delta_{0r}}{k^2} = \frac{2\pi\hbar^2}{m(E+\varepsilon)}. \tag{87.12}$$

This expression is called the Wigner formula. It is easily seen that the cross section in the case of a resonance is considerably larger than in the absence of a resonance. The ratio of the cross sections is equal to

$$\frac{\sigma_r}{\sigma_0} = \frac{\sin^2\delta_{0r}}{\delta_0^2}.$$

Since $\delta_0 \ll 1$, and $\sin\delta_{0r}$, as is seen from formula (87.11), for $\varepsilon \sim E$ is close to one, then it is evident that

$$\frac{\sigma_r}{\sigma_0} \gg 1.$$

We have obtained formula (87.12) for a particular form of the potential energy. It should be stressed, however, that the dependence of the cross section on $\varepsilon$ (87.12) is general and is not related to the actual form of the potential energy*.
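The step from the resonance phase shift (87.11) to the Wigner formula (87.12) is short enough to verify numerically. The sketch below assumes $k^2 = 2mE/\hbar^2$ and uses illustrative function names and values (units with $\hbar = m = 1$); it is a consistency check, not part of the original derivation.

```python
import math

def cross_section_from_phase(E, eps, m=1.0, hbar=1.0):
    """sigma = 4*pi*sin^2(delta_0r)/k^2 with cot^2(delta_0r) = eps/E, eq. (87.11)."""
    k_squared = 2.0 * m * E / hbar ** 2
    sin_squared = 1.0 / (1.0 + eps / E)   # sin^2 = 1 / (1 + cot^2)
    return 4.0 * math.pi * sin_squared / k_squared

def wigner_cross_section(E, eps, m=1.0, hbar=1.0):
    """Closed form (87.12): sigma_r = 2*pi*hbar^2 / (m*(E + eps))."""
    return 2.0 * math.pi * hbar ** 2 / (m * (E + eps))

# Hypothetical energies, both small compared with the well depth.
E, eps = 0.01, 0.02
print(cross_section_from_phase(E, eps), wigner_cross_section(E, eps))
```

The two expressions agree identically, and the cross section grows as both $E$ and $\varepsilon$ tend to zero.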

Resonance scattering also occurs in the case where the system does not have a real level close to zero but the configuration of the field is similar to that for which such a level appears. In such a situation the function $\cot\beta R$ is positive, whereas for the real level we necessarily have $\cot\beta R < 0$ (see §37). Relation (87.10) contains $\cot^2\beta R$ and hence is fulfilled independently of the sign of the function $\cot\beta R$. In the case where $\cot\beta R > 0$, scattering takes place at the virtual level, not at the real level.

By means of the relations obtained above one can also easily find the 
differential cross section for scattering by a potential barrier, i.e. by a 
potential field having the following form: 


$$U = 0 \quad \text{for } r>R, \qquad U = |U_0| \quad \text{for } r<R.$$

For this it is sufficient to carry out the replacement $\beta \to i|\beta|$. Then for the differential cross section we obtain

$$d\sigma = \frac{1}{|\beta|^2}\left(\tanh|\beta|R - |\beta|R\right)^2 d\Omega. \tag{87.13}$$

Formula (87.13) is simplified in the case of an infinitely high potential barrier $U_0 \to \infty$. In this case we find the following expression for the total cross section:

$$\sigma = 4\pi R^2. \tag{87.14}$$


It is interesting to note that the scattering cross section is larger in this case than the geometric cross section $\pi R^2$ of the scatterer by a factor of four.
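The limit leading from (87.13) to (87.14) can be illustrated with a few lines of code. This is a sketch only; the function name and the sample values of $|\beta|$ are assumptions made for illustration.

```python
import math

def barrier_cross_section(abs_beta, R):
    """Total low-energy cross section obtained by integrating (87.13) over all angles."""
    return 4.0 * math.pi * (math.tanh(abs_beta * R) - abs_beta * R) ** 2 / abs_beta ** 2

R = 1.0
for abs_beta in (1.0, 10.0, 1e3, 1e6):
    # Ratio to the geometric cross section pi*R^2; it tends to 4 as |beta| grows.
    print(abs_beta, barrier_cross_section(abs_beta, R) / (math.pi * R ** 2))
```

For a low barrier the ratio is much smaller than four; the factor of four is reached only in the limit of an impenetrable sphere.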


* For a more general derivation of the formula for resonance scattering see L.D. 
Landau and E.M.Lifshitz, Quantum mechanics (Pergamon Press, Oxford, 1965). 





§88. The elastic scattering of identical particles 


Up till now we have assumed that the scattered particle and the target are different particles. We now consider the case where the scattered and target particles are identical. As we shall now see, the identity of the particles has an essential effect on the scattering process. We shall begin with the consideration of particles of zero spin. We first suppose that the identical particles move towards each other with equal velocities. In this case the centre of mass of the system is at rest and the wave function will, in correspondence with (14.14), have the form

$$\psi = \psi_0(x,y,z)$$

and will depend only on the relative coordinates. The wave function $\psi_0(x,y,z)$ satisfies eq. (14.11)

$$-\frac{\hbar^2}{2\mu}\nabla^2\psi_0 + U\psi_0 = E\psi_0(x,y,z). \tag{88.1}$$


The reduced mass of two identical particles is equal to $\mu = \tfrac12 m$. We cannot for our case write the wave function in the form

$$\psi = e^{ikz} + \frac{f(\theta)}{r}e^{ikr},$$

since this function does not satisfy the symmetry requirements.

As a matter of fact, according to (14.6) there corresponds to the exchange of two particles (i.e. to the replacement $x_1 \leftrightarrow x_2$, $y_1 \leftrightarrow y_2$, $z_1 \leftrightarrow z_2$) the transformation $\mathbf{r} \to -\mathbf{r}$. Here the modulus of the vector $\mathbf{r}$ does not change, and the angle $\theta$ is replaced by $\pi-\theta$. Taking into account this last transformation, it is easily found that the symmetrized wave function must have the form

$$\psi = e^{ikz} + e^{-ikz} + \frac{e^{ikr}}{r}\left[f(\theta)+f(\pi-\theta)\right]. \tag{88.2}$$

A diverging wave again describes the scattered particles. The differential scattering cross section is now given by the expression

$$d\sigma = |f(\theta)+f(\pi-\theta)|^2\,d\Omega = |f(\theta)+f(\pi-\theta)|^2\sin\theta\,d\theta\,d\varphi. \tag{88.3}$$
Thus we have found the differential cross section for the process in which one of the colliding identical particles is scattered at an angle $\theta$ with respect to the direction of its initial flight.

From formula (88.3) it follows that the number of particles scattered at angle $\theta$ and at angle $\pi-\theta$ is the same. If one of the particles was at rest before the collision, then the differential cross section in this system of coordinates can be found in the following way. In the system of coordinates in which the centre of mass is at rest the differential cross section is given by expression (88.3). The transition to the laboratory system is carried out by means of formulae (83.2). In the case given the mass of the particles is the same, and we obtain

$$\tan\vartheta_1 = \frac{\sin\theta}{1+\cos\theta} = \tan\tfrac12\theta$$

and, correspondingly, $\vartheta_1 = \tfrac12\theta$.

Expressing the differential cross section as a function of the angle $\vartheta_1$, we find

$$d\sigma = |f(2\vartheta_1)+f(\pi-2\vartheta_1)|^2\,4\cos\vartheta_1\sin\vartheta_1\,d\vartheta_1\,d\varphi_1 = |f(2\vartheta_1)+f(\pi-2\vartheta_1)|^2\,4\cos\vartheta_1\,d\Omega_1, \tag{88.4}$$


where $d\Omega_1$ is a solid angle element in the laboratory system.

Expression (88.4) gives the differential cross section for the process in which one of the particles is scattered into the solid angle element $d\Omega_1$. Since the two particles are identical, the question as to which one of the particles entered $d\Omega_1$, that which was initially moving or that which was initially at rest, makes no physical sense.

As an example of the application of formula (88.4) let us consider the collision of two identical particles in which the interaction energy has the simple form

$$U = U_0 \quad \text{for } r<R, \qquad U = 0 \quad \text{for } r>R.$$

We suppose that before the collision one of the particles was at rest while the other was moving sufficiently slowly that the relation $kR \ll 1$ was fulfilled. In this case $\delta_0 \ll 1$ and in accord with (86.11) the scattering amplitude can be written in the form $f(\theta) = \delta_0/k$.

For the differential scattering cross section in the laboratory system we obtain

$$d\sigma = |f(2\vartheta_1)+f(\pi-2\vartheta_1)|^2\,4\cos\vartheta_1\,d\Omega_1 = \frac{16\delta_0^2}{k^2}\cos\vartheta_1\,d\Omega_1.$$


Thus we see that if in the centre-of-mass system the scattering is spherically 
symmetric, then in the laboratory system the differential cross section is 
proportional to the cosine of the scattering angle. 
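The transformation (88.4) to the laboratory system can be sketched in code. The function names, the midpoint integration and the numerical values of $\delta_0$ and $k$ are illustrative assumptions; the check compares the cross section integrated over the forward hemisphere of the laboratory system with the centre-of-mass expression (88.3) integrated over the full sphere.

```python
import math

def lab_cross_section(f, theta_lab):
    """d(sigma)/d(Omega_1) from (88.4); f is the centre-of-mass amplitude f(theta)."""
    theta_cm = 2.0 * theta_lab
    amplitude = f(theta_cm) + f(math.pi - theta_cm)
    return abs(amplitude) ** 2 * 4.0 * math.cos(theta_lab)

def total_lab(f, n=2000):
    """Midpoint-rule integral over the laboratory angles 0 <= theta_1 <= pi/2."""
    h = (math.pi / 2.0) / n
    total = 0.0
    for i in range(n):
        t = (i + 0.5) * h
        total += lab_cross_section(f, t) * 2.0 * math.pi * math.sin(t) * h
    return total

# Hypothetical slow-particle values; s-wave amplitude f = delta0 / k.
delta0, k = 0.05, 0.1
s_wave = lambda theta: delta0 / k

print(lab_cross_section(s_wave, 0.3), total_lab(s_wave))
```

For the s-wave amplitude the integrated value equals $16\pi\delta_0^2/k^2$, the same as $|2\delta_0/k|^2$ integrated over $4\pi$ in the centre-of-mass system.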







The theory of the scattering of identical particles with spin different from zero is constructed according to the same scheme as for spinless particles. For concreteness we assume that both colliding particles have spin $\tfrac12$. The generalization of the theory to the case of arbitrary spin offers no difficulty.

We consider the collision of two identical particles in the centre-of-mass 
system in the case where the total spin of the system is equal to zero (i.e. 
the spins of the particles are antiparallel). Then the spin part of the wave 
function must be antisymmetric and, consequently, the coordinate part must 
be symmetric. In other words, the coordinate part of the wave function can, 
as in the case of spinless particles, be written in the form 


$$\psi_s = e^{ikz} + e^{-ikz} + \frac{e^{ikr}}{r}\left[f(\theta)+f(\pi-\theta)\right]. \tag{88.5}$$

Correspondingly we have for the differential scattering cross section

$$d\sigma_s = |f(\theta)+f(\pi-\theta)|^2\,d\Omega. \tag{88.6}$$

If the total spin is equal to one (i.e. the spins are parallel), then the spin part of the wave function is symmetric, and the coordinate part is antisymmetric. Hence for this case we can write the following asymptotic expression:

$$\psi_a = e^{ikz} - e^{-ikz} + \frac{e^{ikr}}{r}\left[f(\theta)-f(\pi-\theta)\right]. \tag{88.7}$$

Then for the differential cross section we obtain

$$d\sigma_a = |f(\theta)-f(\pi-\theta)|^2\,d\Omega. \tag{88.8}$$


We have considered above processes in which the scattered particles had a definite spin orientation. However, in scattering, particles are often in a state with indefinite spin. In this case one is usually interested in the mean cross section which is obtained by averaging over all possible spin states. The mean cross section for particles with spin $\tfrac12$ can easily be found as follows. The colliding particles can be in four states: in one state with spin 0 and in three states with spin 1 (three possible projections on the z-axis). Since all these states are equally probable, the state with spin 0 has a statistical weight equal to $\tfrac14$, and the weight of the state with spin 1 is equal to $\tfrac34$. Hence the mean differential cross section can be written in the form

$$d\sigma = \tfrac14\,d\sigma_s + \tfrac34\,d\sigma_a. \tag{88.9}$$


As an example let us consider the scattering of two slow identical particles with spin $\tfrac12$, for which the interaction energy can be written in the form

$$U = U_0 \quad \text{for } r<R, \qquad U = 0 \quad \text{for } r>R.$$

In the case of parallel spins the scattering cross section given by expression (88.8) turns out to be equal to zero, since the s-wave amplitude $f(\theta) = \delta_0/k$ does not depend on the angle:

$$d\sigma_a = |f(\theta)-f(\pi-\theta)|^2\,d\Omega = 0.$$

Consequently, the scattering of particles with parallel spins is associated with effects of higher orders, i.e. with p-wave, d-wave etc. scattering. The cross section for the scattering of particles with antiparallel spins at small energies is the same as for particles with spin zero,

$$d\sigma_s = \frac{4\delta_0^2}{k^2}\,d\Omega.$$

The mean cross section according to formula (88.9) is given by the expression

$$d\sigma = \tfrac14\,d\sigma_s = \frac{\delta_0^2}{k^2}\,d\Omega.$$

Thus we see that taking into account the identity of the particles leads to 
the appearance of a basic dependence of the scattering cross section on the 
mutual orientation of their spins. 
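The spin averaging of (88.6), (88.8) and (88.9) can be written out as a short sketch. The function names and numerical values are assumptions for illustration; `f` stands for any centre-of-mass amplitude $f(\theta)$.

```python
import math

def sigma_s(f, theta):
    """Total spin 0, symmetric coordinate part, eq. (88.6)."""
    return abs(f(theta) + f(math.pi - theta)) ** 2

def sigma_a(f, theta):
    """Total spin 1, antisymmetric coordinate part, eq. (88.8)."""
    return abs(f(theta) - f(math.pi - theta)) ** 2

def sigma_mean(f, theta):
    """Average over the four equally probable spin states, eq. (88.9)."""
    return 0.25 * sigma_s(f, theta) + 0.75 * sigma_a(f, theta)

# Hypothetical slow-particle values; s-wave amplitude f = delta0 / k.
delta0, k = 0.05, 0.1
s_wave = lambda theta: delta0 / k

print(sigma_s(s_wave, 1.0), sigma_a(s_wave, 1.0), sigma_mean(s_wave, 1.0))
```

With an angle-independent amplitude the antisymmetric contribution vanishes and the mean value reduces to $\delta_0^2/k^2$.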

The transition from cross sections calculated in the centre-of-mass system 
to cross sections calculated in the laboratory system is carried out in the same 
way as for spinless particles. 


§89. The effect of polarization in scattering processes 


All the results obtained up to now apply to the scattering of beams in 
which all the particles are in one and the same state, i.e. are described by one 
and the same wave function. However, the particles of a beam can be in 
different spin states. We shall now confine ourselves to considering beams 
made up of particles with spin $\tfrac12$ scattered by non-polarized targets. As is
known, each of the particles of the beam is described by a two-component 
spinor. 

In §61 it was shown that an arbitrary state of a particle is at the same time 
a state with a definite projection of the spin on a certain direction in space. 
In other words, for a state with an indefinite z-component of the spin one can 
always find some z’-axis with respect to which the given state will be a state 




with a definite spin projection. Consequently, we see that if the beam consists of particles which are in the same state, then it will be fully polarized along a certain direction. If the beam is partially polarized, then the particles are described by different spinors. In this case the beam cannot be described by means of a wave function, and we have a mixture of states (see §23). Nevertheless, for the description of the spin properties of the particles of the beam one can introduce a function $\psi$ defined by the formula*

$$\psi = c_1\varphi_1 e_1 + c_2\varphi_2 e_2 + \ldots$$

The summation is carried out over the spin states of the particles of the beam. We denote the spinor which describes the group of particles in the kth spin state by $\varphi_k$. The coefficient $c_k$ determines the weight of this state. It is proportional to the number of particles in a given group. The quantities $e_i$ and $e_k$, satisfying the conditions $e_i^2 = 1$ and $e_i e_k = 0$ ($i \neq k$), are introduced in order to eliminate the interference between wave functions of particles which are in different spin states in the quadratic expressions defining the mean values. We define the polarization vector as the spin vector averaged over the beam:

$$\mathbf{P} = \frac{\sum_n |c_n|^2\,(\varphi_n^\dagger\boldsymbol{\sigma}\varphi_n)}{\sum_n |c_n|^2} = \frac{\sum_n \psi_n^\dagger\boldsymbol{\sigma}\psi_n}{\sum_n \psi_n^\dagger\psi_n}, \tag{89.1}$$

where $\psi_n = c_n\varphi_n$.

Formula (89.1) has a simple meaning: $(\varphi_n^\dagger\boldsymbol{\sigma}\varphi_n)$ represents the mean value of the spin vector in the nth state, and the ratio $|c_n|^2/\sum_n|c_n|^2$ determines the probability of realization of the nth state in the beam. This probability is equal to $N_n/N$, where $N_n$ is the number of particles in the nth state, and $N$ is the total number of particles in the beam.

We write $c_n\varphi_n$ in the form

$$c_n\varphi_n = \begin{pmatrix} u_n \\ v_n \end{pmatrix}. \tag{89.2}$$

Substituting expression (89.2) into (89.1), we easily find the components of the polarization vector


* L.Wolfenstein, Phys. Rev. 75 (1949) 1664.





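$$P_x = \frac{2\,\mathrm{Re}\sum_n u_n^* v_n}{\sum_n\left(|u_n|^2+|v_n|^2\right)}, \qquad P_y = \frac{2\,\mathrm{Im}\sum_n u_n^* v_n}{\sum_n\left(|u_n|^2+|v_n|^2\right)}, \qquad P_z = \frac{\sum_n\left(|u_n|^2-|v_n|^2\right)}{\sum_n\left(|u_n|^2+|v_n|^2\right)}. \tag{89.3}$$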

If one half of the particles constituting the beam is polarized in some direction, for example in the positive direction of the z-axis, and the other half is polarized in the opposite direction, then the polarization vector $\mathbf{P}$ of the beam will be equal to zero. Indeed, one group of particles is described by the spin functions

$$c_1\varphi_1 = \begin{pmatrix} c_1 \\ 0 \end{pmatrix},$$

while the other group has the spin functions

$$c_2\varphi_2 = \begin{pmatrix} 0 \\ c_2 \end{pmatrix}, \qquad |c_1|^2 = |c_2|^2.$$

Substituting these values into (89.3), we find that the polarization vector is equal to zero.
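The components (89.3) are easy to evaluate for a given mixture of spin states. The sketch below assumes that each group of particles is specified by the pair $(u_n, v_n)$ of (89.2), i.e. the components of the spinor already multiplied by its weight coefficient $c_n$; the function name and the sample beams are hypothetical.

```python
def polarization(beam):
    """Polarization vector (P_x, P_y, P_z) of eq. (89.3).

    beam: list of pairs (u_n, v_n), the two components of c_n * spinor, eq. (89.2).
    """
    norm = sum(abs(u) ** 2 + abs(v) ** 2 for u, v in beam)
    cross = sum(complex(u).conjugate() * v for u, v in beam)
    p_z = sum(abs(u) ** 2 - abs(v) ** 2 for u, v in beam)
    return (2 * cross.real / norm, 2 * cross.imag / norm, p_z / norm)

# Equal numbers of particles with spin along +z and along -z: unpolarized beam.
print(polarization([(1, 0), (0, 1)]))
# All particles with spin along +z: fully polarized beam.
print(polarization([(1, 0)]))
# Two groups along +z and one along -z: partially polarized beam.
print(polarization([(1, 0), (1, 0), (0, 1)]))
```

The first call reproduces the example in the text; the last gives $P_z = 1/3$, in line with the weights $N_n/N$.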

We now turn to the case of the scattering of particles with spin $\tfrac12$ by a target with spin 0. Then the wave function $\psi$, describing the process of elastic scattering, at large distances has the form

$$\psi = \varphi\,e^{ikz} + \frac{e^{ikr}}{r}\,f\varphi. \tag{89.4}$$

Here $\varphi$ is the spinor characterizing the state of the incident particle, and $f$ is a certain two-row matrix depending on the scattering angles. Let us establish the general form of this matrix. First of all we note that any two-row matrix can be expressed in terms of a unit matrix $I$ and the Pauli matrices $\sigma_x$, $\sigma_y$, $\sigma_z$, since the matrices mentioned make up a complete system (see §60). Correspondingly we have

$$f = g(\theta)\,I + \mathbf{h}(\theta)\cdot\boldsymbol{\sigma}. \tag{89.5}$$


The further form of the functions $g$ and $\mathbf{h}$ can be obtained from the following considerations. The laws of transformation of the first and second terms of formula (89.4) must be the same under spatial rotations and reflections. Since the first term transforms as a spinor, the second term in this formula must also have the character of a spinor. Hence it follows that the function $g$ must be a scalar. Since the operator $\boldsymbol{\sigma}$ transforms as a pseudovector, $\mathbf{h}$ also must be a pseudovector. On the other hand, the pseudovector $\mathbf{h}$ depends on the quantities which characterize the scattering process, and can be defined by only two vectors $\mathbf{k}_0$ and $\mathbf{k}_1$ (the wave vectors of the particle before and after scattering). From these two vectors one can construct the single unit pseudovector

$$\mathbf{n} = \frac{\mathbf{k}_0\times\mathbf{k}_1}{|\mathbf{k}_0\times\mathbf{k}_1|}.$$

Hence $\mathbf{h} = h(\theta)\,\mathbf{n}$, where $h(\theta)$ is a scalar. Finally we obtain

$$f = g(\theta)\,I + \mathbf{n}\cdot\boldsymbol{\sigma}\,h(\theta). \tag{89.6}$$

Correspondingly, the elastic scattering cross section has the form

$$\frac{d\sigma}{d\Omega} = \varphi^\dagger f^\dagger f\varphi = |g|^2+|h|^2+2\,\mathrm{Re}(g^*h)\,\boldsymbol{\mu}\cdot\mathbf{n}, \tag{89.7}$$

where $\boldsymbol{\mu} = \varphi^\dagger\boldsymbol{\sigma}\varphi$.
We average expression (89.7) over the spin states of the particles of the incident beam. Then, making use of (89.1), we find

$$\frac{d\sigma}{d\Omega} = |g|^2+|h|^2+2\,\mathrm{Re}(g^*h)\,\mathbf{P}_{\rm inc}\cdot\mathbf{n} = \left(|g|^2+|h|^2\right)\left(1+\frac{2\,\mathrm{Re}(g^*h)\,\mathbf{P}_{\rm inc}\cdot\mathbf{n}}{|g|^2+|h|^2}\right), \tag{89.8}$$

where $\mathbf{P}_{\rm inc}$ is the polarization vector of the incident beam.

If the incident beam is not polarized ($\mathbf{P}_{\rm inc} = 0$), then the differential cross section is equal to

$$\frac{d\sigma}{d\Omega} = |g|^2+|h|^2. \tag{89.9}$$


We now turn to the investigation of the state of the scattered beam. We stress that a polarization of the beam can arise after scattering even in the case where the incident beam was not polarized. From general considerations it is easy to indicate the direction of polarization of the scattered beam. As a matter of fact, the polarization is described by the pseudovector $\mathbf{P}$, which can be oriented only in the direction of the pseudovector $\mathbf{n}$. Consequently, for the scattered beam, which was not polarized before scattering, we have

$$\mathbf{P}_{\rm scat} = P_{\rm scat}\,\mathbf{n}. \tag{89.10}$$

We shall determine the value of the polarization of the scattered beam. Based on definition (89.1), we have

$$\mathbf{P}_{\rm scat} = \frac{\sum_n |c_n|^2\,(f\varphi_n)^\dagger\boldsymbol{\sigma}\,(f\varphi_n)}{\sum_n |c_n|^2\,(f\varphi_n)^\dagger(f\varphi_n)} = \frac{\sum_n |c_n|^2\,\varphi_n^\dagger f^\dagger\boldsymbol{\sigma}f\varphi_n}{\sum_n |c_n|^2\,\varphi_n^\dagger f^\dagger f\varphi_n}. \tag{89.11}$$
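Since by assumption the incident beam is not polarized, it can be represented in the form of two beams consisting of the same number of particles but with oppositely directed spins. Then the summation over n reduces to the summation over two states characterized by oppositely directed spins. Consequently, we have

$$\mathbf{P}_{\rm scat} = \frac{\sum_{i=1}^{2}\varphi_i^\dagger f^\dagger\boldsymbol{\sigma}f\varphi_i}{\sum_{i=1}^{2}\varphi_i^\dagger f^\dagger f\varphi_i}.$$

We see that to calculate the polarization it is necessary to find the sums of the diagonal elements (traces) of the matrices of certain operators. In the notation of §45 this last formula can be rewritten in the form

$$\mathbf{P}_{\rm scat} = \frac{\mathrm{Tr}\,f^\dagger\boldsymbol{\sigma}f}{\mathrm{Tr}\,f^\dagger f}. \tag{89.12}$$

We calculate first $\mathrm{Tr}\,f^\dagger\boldsymbol{\sigma}f$. Making use of expression (89.6), we have

$$\mathrm{Tr}\,f^\dagger\boldsymbol{\sigma}f = \mathrm{Tr}\left\{\left[g^*I+h^*(\mathbf{n}\cdot\boldsymbol{\sigma})\right]\boldsymbol{\sigma}\left[gI+h(\mathbf{n}\cdot\boldsymbol{\sigma})\right]\right\}.$$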







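It is easily seen from formulae (60.15) and (60.16) that $\mathrm{Tr}\,\sigma_i = 0$ ($i = 1, 2, 3$). From relations of the type $\sigma_x\sigma_y = i\sigma_z$ it follows that

$$\mathrm{Tr}\,\sigma_i\sigma_k = 0 \quad (i \neq k).$$

Since $\sigma_i^2 = I$, then $\mathrm{Tr}\,\sigma_i^2 = 2$ ($i = 1, 2, 3$). Using these relations, we obtain

$$\mathrm{Tr}\,f^\dagger\boldsymbol{\sigma}f = \mathrm{Tr}\left[g^*h\,\boldsymbol{\sigma}(\mathbf{n}\cdot\boldsymbol{\sigma}) + h^*g\,(\mathbf{n}\cdot\boldsymbol{\sigma})\boldsymbol{\sigma}\right] = \mathrm{Tr}\,(g^*h+h^*g)\left(\sigma_x^2 n_x\mathbf{i}+\sigma_y^2 n_y\mathbf{j}+\sigma_z^2 n_z\mathbf{k}\right) = 4\mathbf{n}\,\mathrm{Re}(g^*h).$$

By means of analogous calculations we find

$$\mathrm{Tr}\,f^\dagger f = 2\left(|h|^2+|g|^2\right).$$

Thus the polarization vector of the scattered beam has the form

$$\mathbf{P}_{\rm scat} = \frac{2\,\mathrm{Re}(g^*h)}{|g|^2+|h|^2}\,\mathbf{n} = \frac{2\,\mathrm{Re}(g^*h)}{|g|^2+|h|^2}\,\frac{\mathbf{k}_0\times\mathbf{k}_1}{|\mathbf{k}_0\times\mathbf{k}_1|}. \tag{89.13}$$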
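The trace calculation leading to (89.13) can be checked with explicit $2\times2$ matrices. The sketch below uses only plain Python lists; the helper names and the numerical values of $g$, $h$ and $\mathbf{n}$ are assumptions chosen for illustration.

```python
# Pauli matrices sigma_x, sigma_y, sigma_z as nested lists.
SIGMA = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]

def matmul(a, b):
    return [[sum(a[i][m] * b[m][j] for m in range(2)) for j in range(2)] for i in range(2)]

def dagger(a):
    return [[complex(a[j][i]).conjugate() for j in range(2)] for i in range(2)]

def trace(a):
    return a[0][0] + a[1][1]

def scattering_matrix(g, h, n):
    """f = g*I + h*(n . sigma), eq. (89.6)."""
    return [[g * (1 if i == j else 0) + h * sum(n[c] * SIGMA[c][i][j] for c in range(3))
             for j in range(2)] for i in range(2)]

def polarization_from_traces(g, h, n):
    """P_scat = Tr(f^+ sigma f) / Tr(f^+ f), eq. (89.12)."""
    f = scattering_matrix(g, h, n)
    fd = dagger(f)
    norm = trace(matmul(fd, f))
    return [(trace(matmul(matmul(fd, s), f)) / norm).real for s in SIGMA]

def polarization_closed_form(g, h, n):
    """P_scat = 2 Re(g* h) n / (|g|^2 + |h|^2), eq. (89.13)."""
    factor = 2 * (complex(g).conjugate() * h).real / (abs(g) ** 2 + abs(h) ** 2)
    return [factor * c for c in n]

# Hypothetical amplitudes and a unit vector n.
g, h, n = 1 + 0.5j, 0.3 - 0.2j, (0.6, 0.0, 0.8)
print(polarization_from_traces(g, h, n))
print(polarization_closed_form(g, h, n))
```

Both routes give the same vector, directed along $\mathbf{n}$; it vanishes whenever $g^*h$ is purely imaginary.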


Making use of (89.13) and (89.8), we express the scattering cross section in terms of the polarization vector $\mathbf{P}_{\rm scat}$:

$$\frac{d\sigma}{d\Omega} = \left(|g|^2+|h|^2\right)\left(1+\mathbf{P}_{\rm inc}\cdot\mathbf{P}_{\rm scat}\right), \tag{89.14}$$

where $\mathbf{P}_{\rm scat}$ is the polarization vector of the beam of scattered particles in the case where the beam was not polarized before scattering.

Thus we see that the scattering cross section depends on the polarization of the incident and scattered beams. Experimentally such dependences can be observed in experiments on double scattering. An unpolarized beam of particles (fig. V.26) becomes polarized after scattering. Then the polarized beam of particles falls on the second scatterer. In this case the cross section for scattering to the left (vector $\mathbf{k}_2$) turns out to be different from the cross section for scattering to the right (vector $\mathbf{k}_2'$).

Fig. V.26

For simplicity we assume that all the vectors $\mathbf{k}_0$, $\mathbf{k}_1$, $\mathbf{k}_2$ and $\mathbf{k}_2'$ lie in one plane. The vector $\mathbf{n}$, characterizing the polarization after the first scattering, is directed upwards perpendicular to the plane of the drawing. The vectors $\mathbf{P}_{\rm scat}$ involved in formula (89.14) have opposite directions for beams scattered a second time to the left and to the right because of the different directions of the vectors $\mathbf{k}_2$ and $\mathbf{k}_2'$. Thus the cross section for the beam scattered to the left is equal to

$$\left(\frac{d\sigma}{d\Omega}\right)_{\rm left} = \left(|g|^2+|h|^2\right)\left[1+P_{\rm scat}(\theta_1)\,P_{\rm scat}(\theta_2)\right]. \tag{89.15}$$

Correspondingly, for the beam scattered to the right we have

$$\left(\frac{d\sigma}{d\Omega}\right)_{\rm right} = \left(|g|^2+|h|^2\right)\left[1-P_{\rm scat}(\theta_1)\,P_{\rm scat}(\theta_2)\right]. \tag{89.16}$$

We see that the ratio of the number of particles scattered to the left and to the right is determined by the polarization $P_{\rm scat}$. We have

$$R = \frac{1+P_{\rm scat}(\theta_1)\,P_{\rm scat}(\theta_2)}{1-P_{\rm scat}(\theta_1)\,P_{\rm scat}(\theta_2)}. \tag{89.17}$$

As an example let us consider the scattering of a neutron by a nucleus taking into account the spin-orbit interaction between them. The concept of this interaction was first introduced by Fermi to account for the phenomenon of polarization of fast neutrons. It has the form

$$H' = V(r) + W(r)\,\boldsymbol{\sigma}\cdot\mathbf{l}. \tag{89.18}$$


Here $V(r)$ and $W(r)$ are functions depending only on the radius, and $\mathbf{l}$ is the neutron orbital angular momentum operator.

From experiment it follows that parity is conserved in nuclear interactions. Operator (89.18) is constructed in such a way that it automatically satisfies this conservation law. For what follows it is convenient to write the function $W(r)$ in the form

$$W(r) = \frac{1}{r}\frac{dY(r)}{dr}.$$


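Let us find the functions $g$ and $h$ by making use of the Born approximation. As was shown in §84, the amplitude $f$ in this approximation is equal to

$$f = -\frac{m}{2\pi\hbar^2}\int e^{-i\mathbf{k}_1\cdot\mathbf{r}}\,H'\,e^{i\mathbf{k}_0\cdot\mathbf{r}}\,dV = -\frac{m}{2\pi\hbar^2}\left[V_{\mathbf{k}_0-\mathbf{k}_1} + \frac{1}{i}\,\boldsymbol{\sigma}\cdot\int e^{-i\mathbf{k}_1\cdot\mathbf{r}}\,\frac{1}{r}\frac{dY}{dr}\,[\mathbf{r}\times\nabla]\,e^{i\mathbf{k}_0\cdot\mathbf{r}}\,dV\right],$$

where $V_{\mathbf{k}_0-\mathbf{k}_1}$ is the Fourier component of the function $V$. By means of elementary transformations we find

$$f = -\frac{m}{2\pi\hbar^2}\left\{V_{\mathbf{k}_0-\mathbf{k}_1} - \boldsymbol{\sigma}\cdot\left[\mathbf{k}_0\times\int e^{i(\mathbf{k}_0-\mathbf{k}_1)\cdot\mathbf{r}}\,\nabla Y\,dV\right]\right\}.$$

Integrating by parts we obtain

$$f = -\frac{m}{2\pi\hbar^2}\left\{V_{\mathbf{k}_0-\mathbf{k}_1} + \boldsymbol{\sigma}\cdot\left[\mathbf{k}_0\times\int Y\,\nabla e^{i(\mathbf{k}_0-\mathbf{k}_1)\cdot\mathbf{r}}\,dV\right]\right\} = -\frac{m}{2\pi\hbar^2}\left\{V_{\mathbf{k}_0-\mathbf{k}_1} - i\,\boldsymbol{\sigma}\cdot[\mathbf{k}_0\times\mathbf{k}_1]\,Y_{\mathbf{k}_0-\mathbf{k}_1}\right\}. \tag{89.19}$$

Comparing (89.19) and (89.6) we find the functions $h$ and $g$ to be

$$g = -\frac{m}{2\pi\hbar^2}\,V_{\mathbf{k}_0-\mathbf{k}_1}, \qquad h = \frac{imk^2\sin\theta}{2\pi\hbar^2}\,Y_{\mathbf{k}_0-\mathbf{k}_1}. \tag{89.20}$$

We note that in the first approximation of the perturbation theory considered there is no polarization of the scattered particles. Indeed, substituting relation (89.20) into formula (89.13) we obtain $\mathbf{P}_{\rm scat} = 0$, since for real Fourier components $V_{\mathbf{k}_0-\mathbf{k}_1}$ and $Y_{\mathbf{k}_0-\mathbf{k}_1}$ the product $g^*h$ is purely imaginary. However, in a more accurate calculation $\mathbf{P}_{\rm scat} \neq 0$.

A more general formalism, suitable for the treatment of the scattering of particles by polarized targets, may be found, for example, in the book by Davydov*.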


§90. The transition to the classical limit in the quantum scattering formulae 


We first of all transform the exact formula for the scattering amplitude 
into a form convenient for transition to the classical limit. 


* A.S.Davydov, Theorie des Atomkerns (Deutscher Verlag der Wissenschaften, Berlin, 1963).


If we make use of the expansion of the $\delta$-function in terms of the Legendre polynomials (III.11), then the scattering amplitude (86.11) can be written in the form

$$f(\theta) = \frac{1}{2ik}\left[\sum_l (2l+1)\,P_l(\cos\theta)\,e^{2i\delta_l} - 4\,\delta(1-\cos\theta)\right]. \tag{90.1}$$

For all angles $\theta \neq 0$ formula (90.1) assumes the form

$$f(\theta) = \frac{1}{2ik}\sum_l (2l+1)\,P_l(\cos\theta)\,e^{2i\delta_l}. \tag{90.2}$$


In the quasi-classical approximation the radial part of the wave function has the form (43.2)

$$R_l = \frac{A}{r\sqrt{p}}\,\sin\left[\frac{1}{\hbar}\int_a^r\left(2m[E-U(r)]-\frac{\hbar^2(l+\frac12)^2}{r^2}\right)^{1/2}dr+\tfrac14\pi\right].$$

The expression for $R_l$ must be understood as an asymptotic expression, i.e. it must be assumed that $r \to \infty$; $a$ denotes the coordinate of the turning point, where the total energy $E$ is equal to the sum of the potential and centrifugal energies, i.e.

$$E = U(a) + \frac{\hbar^2(l+\frac12)^2}{2ma^2}.$$

In §43 the condition for definition of the turning point did not involve the centrifugal energy, since the motion was assumed to be one-dimensional.

Comparing the expression for $R_l$ with formula (86.2) we see that the scattering phase shift can be written in the form

$$\delta_l = \frac{1}{\hbar}\int_a^r\left(2m[E-U(r)]-\frac{\hbar^2(l+\frac12)^2}{r^2}\right)^{1/2}dr+\tfrac12\pi\left(l+\tfrac12\right)-kr. \tag{90.3}$$

In (90.3) it is necessary to assume that $r \to \infty$, $l \gg 1$. Then the values of the phase shifts $\delta_l$ are very large in absolute magnitude. The formula for the scattering amplitude (90.2) can be simplified by taking into account that in the quasi-classical approximation $l$ must be assumed $\gg 1$. Then for the Legendre polynomials $P_l(\cos\theta)$ one can write an asymptotic expression for $l \gg 1$. They have the form*


* N.N. Lebedev, Special functions and their application (Prentice Hall, Englewood 
Cliffs, N.J., 1965). 




$$P_l(\cos\theta) \approx \frac{1}{i(2\pi l\sin\theta)^{1/2}}\left[e^{i(l+\frac12)\theta+\frac14 i\pi} - e^{-i(l+\frac12)\theta-\frac14 i\pi}\right].$$
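The quality of this asymptotic expression for $l \gg 1$ can be judged numerically. In the sketch below the exact value is obtained from the standard three-term recurrence for Legendre polynomials, and the asymptotic form is rewritten in the equivalent real form $\sqrt{2/(\pi l\sin\theta)}\,\sin[(l+\tfrac12)\theta+\tfrac14\pi]$; the function names and the sample values of $l$ and $\theta$ are assumptions for illustration.

```python
import math

def legendre(l, x):
    """P_l(x) from the recurrence (n+1) P_{n+1} = (2n+1) x P_n - n P_{n-1}."""
    p_prev, p = 1.0, x
    if l == 0:
        return p_prev
    for n in range(1, l):
        p_prev, p = p, ((2 * n + 1) * x * p - n * p_prev) / (n + 1)
    return p

def legendre_asymptotic(l, theta):
    """Large-l form, valid for theta not too close to 0 or pi."""
    amplitude = math.sqrt(2.0 / (math.pi * l * math.sin(theta)))
    return amplitude * math.sin((l + 0.5) * theta + math.pi / 4.0)

l = 200
for theta in (0.5, 1.0, 2.0):
    print(theta, legendre(l, math.cos(theta)), legendre_asymptotic(l, theta))
```

The agreement deteriorates near $\theta = 0$ and $\theta = \pi$, where $\sin\theta$ in the denominator vanishes.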
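Then for the scattering amplitude we obtain

$$f(\theta) = \frac{1}{ik}\sum_{l\gg1} l\,P_l(\cos\theta)\,e^{2i\delta_l} = -\frac{1}{k}\sum_l B(l)\left(e^{i\alpha}-e^{i\beta}\right), \tag{90.4}$$

where

$$B(l) = \left(\frac{l}{2\pi\sin\theta}\right)^{1/2},$$

$$\alpha(l) = 2\delta_l + \left(l+\tfrac12\right)\theta + \tfrac14\pi,$$

$$\beta(l) = 2\delta_l - \left(l+\tfrac12\right)\theta - \tfrac14\pi.$$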

To obtain $f(\theta)$ it is necessary to sum the series

$$\sum_l B(l)\,e^{i\alpha(l)} \quad \text{and} \quad \sum_l B(l)\,e^{i\beta(l)}.$$

We shall consider one of these series, since, as will be clear from what follows, only one of the series has a sum different from zero in a given force field (repulsive or attractive). The quantities $\alpha(l)$, as can be seen from their definition, are large for large $l$. Hence the terms of the series $\sum B(l)\,e^{i\alpha(l)}$, containing rapidly oscillating factors, are mutually cancelled. An exception is possible in the case where for a certain value $l = l_0$ the quantity $\alpha(l)$ has an extremum, i.e.

$$\left(\frac{d\alpha}{dl}\right)_{l=l_0} = 0. \tag{90.5}$$

Near the extremum the function $\alpha(l)$ changes slowly and the sum of the series reduces to a sum of terms with values of $l$ close to $l_0$.

In this case, to carry out the summation the sum can be replaced by an integral. In the integral the integrand is substantially different from zero only for $l \approx l_0$, and the integral can be calculated by the method of steepest descent (see Part III, §20). Thus one can write

$$\sum_l B(l)\,e^{i\alpha(l)} = B(l_0)\,e^{i\alpha(l_0)}\int_{-\infty}^{\infty} e^{i\gamma(l-l_0)^2}\,dl, \tag{90.6}$$


where

$$\gamma = \frac12\left(\frac{d^2\alpha}{dl^2}\right)_{l=l_0}.$$

The calculation of the integral in (90.6) is carried out directly, and we obtain

$$\sum_l B(l)\,e^{i\alpha(l)} = B(l_0)\,e^{i\alpha(l_0)}\left(\frac{i\pi}{\gamma}\right)^{1/2}. \tag{90.7}$$

By means of this relation the scattering amplitude can be written in the form

$$f(\theta) = -\frac{B(l_0)}{k}\,e^{i\alpha(l_0)}\left(\frac{i\pi}{\gamma}\right)^{1/2}. \tag{90.8}$$

We shall later find the quantity $\gamma$, but for the present we shall consider the physical meaning of eq. (90.5). For this we define the derivative $(d\alpha/dl)_{l=l_0}$. By means of relations (90.4) we have

$$\left(\frac{d\alpha}{dl}\right)_{l=l_0} = 2\left(\frac{d\delta_l}{dl}\right)_{l=l_0} + \theta = 0. \tag{90.9}$$
12l [=I 


In differentiating it should be recalled that the angle $\theta$ is given, and that we determine the cross section for a definite value of the angle. If we differentiate with respect to $l$ and make use of formula (90.3), we then obtain

$$\left(\frac{d\delta_l}{dl}\right)_{l=l_0} = -\int_a^\infty\frac{\hbar(l_0+\frac12)\,dr}{r^2\left[2m(E-U)-\hbar^2(l_0+\frac12)^2/r^2\right]^{1/2}} - \frac{1}{\hbar}\left[2m(E-U)-\frac{\hbar^2(l_0+\frac12)^2}{r^2}\right]^{1/2}_{r=a}\frac{da}{dl} + \frac{\pi}{2}$$

$$= -\int_a^\infty\frac{\hbar(l_0+\frac12)\,dr}{r^2\left[2m(E-U)-\hbar^2(l_0+\frac12)^2/r^2\right]^{1/2}} + \frac{\pi}{2},$$

since at the turning point $r=a$ the square root reduces to zero. Condition (90.9) assumes the form

$$-\int_a^\infty\frac{\hbar(l_0+\frac12)\,dr}{r^2\left[2m(E-U)-\hbar^2(l_0+\frac12)^2/r^2\right]^{1/2}} + \tfrac12\pi \pm \tfrac12\theta = 0. \tag{90.10}$$


If we carried out the corresponding calculations for the second sum, then the lower sign in (90.10) would correspond to the extremum of $\beta(l)$. For brevity the two conditions are combined. Formula (90.10) defines the value of $l_0$.

The quantity $\hbar(l_0+\tfrac12) = L$ represents the angular momentum. After introducing the angular momentum $L$ formula (90.10) can be transformed into the form

$$\int_a^\infty\frac{L\,dr}{r^2\left[2m(E-U)-L^2/r^2\right]^{1/2}} = \tfrac12(\pi\pm\theta). \tag{90.11}$$

In classical mechanics the angular momentum can be connected with the impact parameter $\rho$ by means of the following relation:

$$L = m\rho v,$$

where $v$ is the velocity of the particle at infinity. Substituting this value for the angular momentum into formula (90.11), we obtain an expression which is exactly the same as the classical relation connecting the impact parameter with the scattering angle $\theta$*

$$\int_a^\infty\frac{mv\rho\,dr}{r^2\left[2m(E-U)-(mv\rho/r)^2\right]^{1/2}} = \tfrac12(\pi\pm\theta). \tag{90.12}$$

The value of the impact parameter $\rho$ is determined by the positive root of eq. (90.12). It is known from mechanics that in a repulsive force field the positive root of this equation exists only for a negative $\theta$. On the contrary, in an attractive force field this root exists for a positive $\theta$.

Let us consider the case of repulsive forces. Then condition (90.9) can be fulfilled only for $\alpha(l)$ but not for $\beta(l)$. Correspondingly only the first of the series in (90.4) has a sum different from zero.

We now go on to the calculation of the cross section. According to formulae (83.5) and (90.8), it is defined by the expression

$$d\sigma = |f(\theta)|^2\,d\Omega = \frac{1}{k^2}\,|B(l_0)|^2\,\frac{\pi}{|\gamma|}\,d\Omega.$$

The quantity $\gamma$ is defined by expression (90.6). By means of (90.4) and (90.3) we obtain


* See, for example, L.D.Landau and E.M.Lifshitz, Mechanics (Pergamon Press, Ox- 
ford, 1960). 


§90 TRANSITION TO THE CLASSICAL LIMIT 383 
=h As a [2m(E—U) —L?/r2]z dr = 
aL? 


= 2 fF o 
ðL a [2m(E—U) — L?/r2?]z 





(90.13) 


Making use of (90.11) we transform the expression for y into the form 


ñ 00 
yar 


+2 ðL ` 
If the value of B from (90.4) is substituted and the value found for x is 
used, then the differential cross section takes the form 


L 


o =|f(0)|? dQ = 
) v2 sind 


A z zI dQ. (90.14) 


2 


Replacing the quantity $L$ in (90.14) by its classical value, we obtain


$$ d\sigma = \frac{\rho}{\sin\theta}\left|\frac{d\rho}{d\theta}\right| d\Omega . \tag{90.15} $$


Expression (90.15) represents the ordinary scattering formula given by 
classical mechanics. 
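
As an illustration, the classical formula $d\sigma = (\rho/\sin\theta)\,|d\rho/d\theta|\,d\Omega$ can be checked numerically on a case with a known answer. The sketch below assumes a repulsive Coulomb field $U = \alpha/r$, for which classical mechanics gives $\rho = (\alpha/mv^2)\cot(\theta/2)$; this relation, the function names and the parameter values are not taken from the text and serve only as an example. The result is compared with the Rutherford formula.

```python
import math

def rho_coulomb(theta, alpha, m, v):
    """Impact parameter for U = alpha/r (assumed example field):
    rho = alpha / (m v^2) * cot(theta / 2)."""
    return alpha / (m * v**2) / math.tan(theta / 2.0)

def dsigma_domega_classical(rho, theta, h=1e-6):
    """Classical cross section (rho / sin theta) * |d rho / d theta|.
    The derivative is taken by a central finite difference."""
    drho = (rho(theta + h) - rho(theta - h)) / (2.0 * h)
    return rho(theta) / math.sin(theta) * abs(drho)

def rutherford(theta, alpha, m, v):
    """Rutherford cross section (alpha / (2 m v^2))^2 / sin^4(theta / 2)."""
    return (alpha / (2.0 * m * v**2))**2 / math.sin(theta / 2.0)**4

# Hypothetical parameter values in arbitrary units.
alpha, m, v = 1.0, 1.0, 2.0
for theta in (0.3, 1.0, 2.5):
    numeric = dsigma_domega_classical(lambda t: rho_coulomb(t, alpha, m, v), theta)
    print(theta, numeric, rutherford(theta, alpha, m, v))
```

The two printed columns should agree to within the accuracy of the finite-difference derivative.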

We now examine the limits of applicability of formula (90.15) for the
scattering cross section. They can be established from the following obvious
considerations.

One can speak of the motion of a particle in a trajectory in the case 
where the corresponding wavelength is small in comparison with the size of 
the system. In the given case the wavelength must be small in comparison 
with the size of the region in which a considerable interaction takes place. 
If the size of this region is denoted by R, then this requirement can be 
written in the form 


$$ \lambda \ll R , \tag{90.16} $$


where $\lambda$ is the de Broglie wavelength.
Substituting the value of $\lambda$ into formula (90.16) we find


$$ R \gg \frac{2\pi\hbar}{p} . \tag{90.17} $$


In order that the behaviour of the particle may be characterized by classical
concepts, it is necessary that the quantum-mechanical uncertainties be
small. In other words, it is necessary that the following relations be fulfilled:


$$ \frac{\Delta\theta}{\theta} \ll 1 , \qquad \frac{\Delta\rho}{\rho} \ll 1 , \tag{90.18} $$

where $\rho$ is the classical impact parameter, and $\Delta\theta$ and $\Delta\rho$ are respectively
the quantum-mechanical uncertainties for the scattering angle $\theta$ and the
impact parameter $\rho$.
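For the quantity $\Delta\theta$ one can write an expression valid in order of
magnitude


$$ \Delta\theta \sim \frac{\Delta p}{p} , \tag{90.19} $$


where $\Delta p$ is the uncertainty in the transverse component of the momentum
and $p = mv$ is the momentum of the particle.
Making use of the uncertainty relation for the coordinate and momentum


$$ \Delta\rho\,\Delta p \sim \hbar $$

and eliminating the quantity $\Delta p$ from (90.19), and then using (90.18), we
obtain


$$ \theta \gg \Delta\theta \sim \frac{\hbar}{p\,\Delta\rho} \gg \frac{\hbar}{p\rho} . \tag{90.20} $$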

This condition assumes a considerably simpler form if the scattering angles
are small. Namely, in this case the scattering angle $\theta$ can be found in a simple
way. It is equal to the ratio of the value of the transverse momentum acquired
by the scattered particle in traversing the field of the scatterer to the
longitudinal momentum. The transverse momentum is equal to the force $|U'(\rho)|$
acting on the particle multiplied by the time $\tau$ for which this force acts:
$\tau = \rho/v$. Thus the scattering angle $\theta$ is, in order of magnitude, equal to


$$ \theta \sim \frac{|U'(\rho)|\,\rho}{pv} . \tag{90.21} $$

Substituting (90.21) into (90.20), we find the condition of applicability
of the theory


$$ |U'(\rho)|\,\rho^2 \gg \hbar v . $$


But if the derivative $U'(\rho)$ is replaced by $U(\rho)/\rho$, then the condition of
applicability can be rewritten in the form


$$ |U(\rho)| \gg \frac{\hbar v}{\rho} . \tag{90.22} $$







Comparing (90.22) with the condition of applicability of the Born ap- 
proximation (84.12), we see that the conditions are opposite to each other. 
Thus these methods supplement each other to a considerable degree. 


§91. The general theory of inelastic scattering and the absorption of particles 


So far we have confined ourselves to consideration of the elastic scattering 
process. We now turn to the more general case where inelastic scattering is 
also possible. 

Any process in which the internal state of the particles changes is said to 
be inelastic. Thus, for example, collisions accompanied by an excitation (for 
instance, an excitation of the atom or nucleus), by a decay or by the produc- 
tion of new particles, for example, are inelastic. Each of the possible processes 
is called a reaction channel. If the process is compatible with conservation 
laws, the channel is said to be open. In what follows we shall consider 
processes for which the inelastic and elastic reaction channels are open. We 
shall begin with a generalization of partial wave scattering theory. This will 
allow us to cover, at the same time, processes of elastic and inelastic scatter- 
ing and absorption. For a formal description of any scattering process we 
shall surround the scattering centre by a fictitious sphere of sufficiently 
large radius $R_0$.

Let us consider the character of the $l$th partial wave for $r > R_0$ in three
cases: 

(1) at the origin there is no scattering centre, 

(2) at the origin there is a scattering centre at which the particle undergoes 
only elastic scattering, 

(3) at the origin there is a scattering centre at which the particle undergoes 
inelastic scattering. 

In the first case the radial function of the $l$th partial wave can be written
(see (36.10)) in the form of a superposition of two waves


$$ R_l = a_l\,\frac{e^{i(kr-\frac{1}{2}\pi l)} - e^{-i(kr-\frac{1}{2}\pi l)}}{2ikr} = a_l\,\frac{\sin(kr-\frac{1}{2}\pi l)}{kr} . $$





The second term represents a converging wave, and the first term a diverging
wave. Here we make use of asymptotic expressions, since by assumption
$R_0$ is sufficiently large. The amplitudes and phases of the two waves are the
same and the wave function $R_l$ is the product of a real function and a constant
factor. Hence the flux through a closed surface is equal to zero:





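$$ j_l = \frac{\hbar}{2mi}\oint\left(\psi_l^*\frac{\partial\psi_l}{\partial r} - \psi_l\frac{\partial\psi_l^*}{\partial r}\right)dS = 0 , $$


where


$$ \psi_l = P_l(\cos\theta)\,R_l(r) . $$

In the second case the radial function of the $l$th partial wave is written,
according to (86.2), in the form

$$ R_l = B_l\,\frac{\sin(kr+\delta_l-\frac{1}{2}\pi l)}{kr} = b_l\,\frac{e^{2i\delta_l}\,e^{i(kr-\frac{1}{2}\pi l)} - e^{-i(kr-\frac{1}{2}\pi l)}}{2ikr} , \tag{91.1} $$

where $b_l = B_l e^{-i\delta_l}$.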








The amplitudes of the converging and diverging waves differ from each other
by the phase factor $e^{2i\delta_l}$, where $|e^{2i\delta_l}| = 1$. In this case the total partial flux
through the surface of the sphere is also equal to zero (the partial wave
function depending on $l$ is real). Hence it follows that the diverging and
converging fluxes of the $l$th partial wave are equal to each other. The fact
that the diverging and converging waves have different coefficients, $e^{2i\delta_l}$ and
unity, does not contradict this equality, since $|e^{2i\delta_l}| = 1$.

In the third case, where the particles undergo inelastic scattering, it is 
impossible to write a general expression for the radial function taking into 
account all possible inelastic processes. We can, however, simplify the problem 
if we consider elastic scattering separately from all possible forms of inelastic 
scattering. 

In this case we can write the following formal expression for the radial
function of the $l$th partial wave describing the elastic scattering of a particle:

$$ R_l = b_l\,\frac{S_l\,e^{i(kr-\frac{1}{2}\pi l)} - e^{-i(kr-\frac{1}{2}\pi l)}}{2ikr} . \tag{91.2} $$

This expression is constructed according to the same principle as (91.1),
but it takes into account the particular process, in which inelastic processes
or absorption may exist along with elastic scattering. The coefficient $S_l$
introduced is in magnitude less than one. This expresses the fact that in the
presence of absorption or inelastic scattering the converging flux is larger
than the diverging flux of elastically scattered particles. Then the wave
function is written in the form


$$ \psi = \sum_l b_l\,\frac{S_l\,e^{i(kr-\frac{1}{2}\pi l)} - e^{-i(kr-\frac{1}{2}\pi l)}}{2ikr}\,P_l(\cos\theta) . $$


The coefficients $b_l$ are again defined by the requirement that the wave
function $\psi$ be the same as (83.3). On carrying out calculations analogous to those
for elastic scattering we find the wave function in the form


$$ \psi = \sum_{l=0}^{\infty}\frac{i^l(2l+1)}{2ikr}\left[S_l\,e^{i(kr-\frac{1}{2}\pi l)} - e^{-i(kr-\frac{1}{2}\pi l)}\right]P_l(\cos\theta) . \tag{91.2'} $$


It is easily shown that the flux of elastically scattered particles with given
angular momentum through a sphere of radius $r > R_0$ is different from zero.
Indeed, we have

$$ \frac{\partial\psi_l}{\partial r} = \frac{i^l(2l+1)}{2ikr}\left[ikS_l\,e^{i(kr-\frac{1}{2}\pi l)} + ik\,e^{-i(kr-\frac{1}{2}\pi l)}\right]P_l(\cos\theta) . $$


Here only terms proportional to $r^{-1}$ are retained in the expression for
$\partial\psi_l/\partial r$. Terms proportional to $r^{-2}$ are dropped, since we desire to find the
flux through a sphere of large radius. We further calculate the total flux of
particles through a sphere of radius $r > R_0$. It is equal to


$$ j_l = \frac{\hbar}{2mi}\oint\left(\psi_l^*\frac{\partial\psi_l}{\partial r} - \psi_l\frac{\partial\psi_l^*}{\partial r}\right)dS . $$

Substituting the functions $\psi_l$ and $\partial\psi_l/\partial r$ into the expression for the flux
and taking into account the conditions of normalization of the Legendre
polynomials $P_l(\cos\theta)$, we obtain


$$ j_l = -\frac{\pi v}{k^2}\,(2l+1)\left(1-|S_l|^2\right) . \tag{91.3} $$


Since $|S_l| < 1$, the flux is negative. This means that the total flux is directed
inwards through the sphere.

It is easy to understand the meaning of this result: the flux of particles
incident on the centre with angular momentum $l$ turns out to be larger than
the flux of elastically scattered particles. The particles undergo inelastic
scattering or absorption, and the intensity of the beam of elastically scattered
particles is reduced. It is clear that on dividing the magnitude of the flux $j_l$
by the flux density of incident particles we find, by definition, the partial
inelastic scattering cross section. Here inelastic scattering is understood to be
all processes reducing the intensity of elastic scattering. Since the flux density
of incident particles is equal to $v$, then for the $l$th partial inelastic scattering
cross section we obtain


$$ \sigma_l = \frac{\pi}{k^2}\,(2l+1)\left(1-|S_l|^2\right) . \tag{91.4} $$







As for the elastic scattering amplitude, we can, without reproducing the
calculations of §86, write for it the expression


$$ f(\theta) = \frac{1}{2ik}\sum_{l=0}^{\infty}(2l+1)(S_l-1)\,P_l(\cos\theta) , \tag{91.5} $$


since formula (91.2) differs from (91.1) by the substitution of $S_l$ for $e^{2i\delta_l}$.
The set of complex quantities $S_l$ defines the cross section for inelastic
as well as elastic scattering. In particular, if $S_l = e^{2i\delta_l}$, where $\delta_l$ is real, then
the inelastic scattering cross section reduces to zero, and the elastic scattering
amplitude is the same as expression (86.11).
Besides the $l$th partial cross section for elastic and inelastic scattering
processes one can also write the total cross sections for the processes.
The total inelastic scattering cross section is evidently equal to


$$ \sigma_{\mathrm{inel}} = \frac{\pi}{k^2}\sum_{l=0}^{\infty}(2l+1)\left(1-|S_l|^2\right) = \sum_l \sigma_l , \tag{91.6} $$


and the total elastic scattering cross section is equal to


$$ \sigma_{\mathrm{el}} = \int |f(\theta)|^2\,d\Omega = \frac{\pi}{k^2}\sum_{l=0}^{\infty}(2l+1)\,|1-S_l|^2 . \tag{91.7} $$


We now turn to the consideration of formula (91.6). 

Each cross section $\sigma_l$ can be pictured as a characteristic of the process of
inelastic scattering or absorption of particles with angular momentum $l$. Since
the quantity $|S_l|^2 \leq 1$, it can be stated that the partial cross section $\sigma_l$ has
the upper limit $\sigma_{l,\max} = \pi k^{-2}(2l+1)$.

The structure of formula (91.6) and the physical meaning of the coefficient
$1-|S_l|^2$ can easily be understood by means of the following reasoning
based on the quasi-classical approximation.

The collision parameter of a particle can be (see (86.14)) written in the
form


$$ \rho_l = \frac{\hbar}{p}\sqrt{l(l+1)} . \tag{91.8} $$


For large $l$ we obtain


$$ \rho_l \approx \frac{\hbar l}{p} . $$


The area of the annulus lying between two circles of radii $\rho_l$ and $\rho_{l+1}$ is equal
to


$$ \pi\left(\rho_{l+1}^2 - \rho_l^2\right) \approx \frac{\pi\hbar^2}{p^2}\,(2l+1) . $$


The number of particles passing through this annulus oriented
perpendicularly to the incident flux can easily be found. If the flux density of
incident particles is equal to unity, then the number of particles crossing the ring
is numerically equal to $\pi\hbar^2p^{-2}(2l+1)$.

We introduce the so-called absorption coefficient $\xi_l$, which by definition
represents the ratio of the number of absorbed particles incident on a given
surface to the total number of particles incident on this surface. The number
of particles absorbed by the surface of the annulus defined by radii $\rho_l$ and
$\rho_{l+1}$ is defined by the expression $\pi\hbar^2p^{-2}(2l+1)\xi_l$, and, correspondingly, the
absorption cross section will have the form


$$ \sigma_l = \frac{\pi}{k^2}\,(2l+1)\,\xi_l . \tag{91.9} $$


Comparing formulae (91.6) and (91.9) we see that

$$ 1 - |S_l|^2 = \xi_l , \tag{91.10} $$

i.e. the quantity $1-|S_l|^2$ is the absorption coefficient.
Finally, we also obtain the formula relating elastic and inelastic scattering
cross sections. It turns out that the following equality holds:


$$ \frac{4\pi}{k}\,\mathrm{Im}\,f(0) = \sigma_{\mathrm{inel}} + \sigma_{\mathrm{el}} . \tag{91.11} $$


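To obtain this relation we shall calculate the sum of elastic and inelastic
cross sections. By means of (91.6) and (91.7) we have


$$ \sigma_{\mathrm{inel}} + \sigma_{\mathrm{el}} = \frac{\pi}{k^2}\sum_{l=0}^{\infty}(2l+1)\left(1-|S_l|^2+|1-S_l|^2\right) = \frac{\pi}{k^2}\sum_{l=0}^{\infty}(2l+1)(2-2\,\mathrm{Re}\,S_l) . \tag{91.12} $$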


On the other hand, since the Legendre polynomials for $\theta = 0$ are equal to
one, we have for the scattering amplitude


$$ f(0) = \frac{1}{2ik}\sum_{l=0}^{\infty}(2l+1)(S_l-1) , $$


and the imaginary part of the scattering amplitude is equal to


$$ \mathrm{Im}\,f(0) = \frac{1}{2k}\sum_{l=0}^{\infty}(2l+1)(1-\mathrm{Re}\,S_l) . $$


Comparing the expressions obtained, we see that eq. (91.11) is valid. Thus
we have shown that the sum of inelastic and elastic scattering cross sections
is proportional to the imaginary part of the scattering amplitude taken for the
value of the angle $\theta = 0$. Formula (91.11) is called the optical theorem.
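
The optical theorem can be checked numerically for any set of coefficients $S_l$ with $|S_l| \leq 1$. The following is a minimal sketch using formulae (91.5), (91.6) and (91.7) as read here; the function names and the sample values of $k$ and $S_l$ are purely illustrative assumptions.

```python
import cmath
import math

def cross_sections(S, k):
    """Return (sigma_el, sigma_inel) from a list S of coefficients S_l, l = 0, 1, ...

    sigma_inel = (pi/k^2) * sum (2l+1) (1 - |S_l|^2)      -- formula (91.6)
    sigma_el   = (pi/k^2) * sum (2l+1) |1 - S_l|^2        -- formula (91.7)
    """
    sigma_inel = math.pi / k**2 * sum((2 * l + 1) * (1.0 - abs(s)**2) for l, s in enumerate(S))
    sigma_el = math.pi / k**2 * sum((2 * l + 1) * abs(1.0 - s)**2 for l, s in enumerate(S))
    return sigma_el, sigma_inel

def forward_amplitude(S, k):
    """f(0) = (1/2ik) * sum (2l+1)(S_l - 1), since P_l(1) = 1."""
    return sum((2 * l + 1) * (s - 1.0) for l, s in enumerate(S)) / (2j * k)

# Hypothetical sample: four partial waves with partial absorption.
k = 1.7
S = [0.5 * cmath.exp(0.8j), 0.9 * cmath.exp(-0.3j), cmath.exp(0.1j), 0.2 + 0.0j]

sigma_el, sigma_inel = cross_sections(S, k)
lhs = 4.0 * math.pi / k * forward_amplitude(S, k).imag
print(lhs, sigma_el + sigma_inel)  # the two numbers should coincide
```

Partial waves with $|S_l| = 1$ contribute only to the elastic cross section, while $S_l = 0$ gives equal elastic and inelastic contributions.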

In conclusion we note that the absorption of particles can be described by
introducing a complex potential $U = V_1 - iV_2$, where $V_1$ and $V_2$ are real
functions. The imaginary part of the potential characterizes the absorption
or emission of particles. As a matter of fact, in this case the Schrödinger
equation has the form


$$ i\hbar\frac{\partial\psi}{\partial t} = \left(-\frac{\hbar^2}{2m}\nabla^2 + V_1 - iV_2\right)\psi . \tag{91.13} $$


Carrying out calculations analogous to those of §7, we obtain


$$ \frac{\partial\rho}{\partial t} + \nabla\cdot\mathbf{j} = -\frac{2V_2}{\hbar}\,\psi^*\psi , \tag{91.14} $$

where

$$ \rho = \psi^*\psi , \qquad \mathbf{j} = \frac{\hbar}{2mi}\left[\psi^*\nabla\psi - \psi\nabla\psi^*\right] . $$


In the stationary case for $V_2$ equal to zero, $\nabla\cdot\mathbf{j} = 0$, which corresponds to
the absence of absorption or emission of particles. If $V_2 \neq 0$, then we obtain


$$ \nabla\cdot\mathbf{j} = -2V_2\rho/\hbar . $$


Depending on the sign of $V_2$ this formula describes the absorption or
emission of particles.
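
The stationary relation $\nabla\cdot\mathbf{j} = -2V_2\rho/\hbar$ can be illustrated on the simplest case. The sketch below assumes a one-dimensional plane wave $\psi = e^{iKx}$ in a uniform complex potential $V_1 - iV_2$, so that $\hbar^2K^2/2m = E - V_1 + iV_2$; this one-dimensional setting, the units $\hbar = m = 1$ and all numerical values are assumptions made only for the example.

```python
import cmath

def continuity_check(E, V1, V2, m=1.0, hbar=1.0, x=0.7, h=1e-5):
    """Compare dj/dx with -2*V2*rho/hbar at the point x for psi = exp(iKx).

    K is the complex wave number of a stationary state with real energy E in the
    uniform potential V1 - i*V2 (a hypothetical one-dimensional model).
    """
    K = cmath.sqrt(2.0 * m * (E - V1 + 1j * V2)) / hbar

    def psi(y):
        return cmath.exp(1j * K * y)

    def current(y):
        # j = (hbar/2mi)(psi* psi' - psi psi'*) = (hbar/m) Im(psi* psi')
        dpsi = 1j * K * psi(y)
        return hbar / m * (psi(y).conjugate() * dpsi).imag

    div_j = (current(x + h) - current(x - h)) / (2.0 * h)  # central difference
    rho = abs(psi(x))**2
    return div_j, -2.0 * V2 * rho / hbar

print(continuity_check(E=2.0, V1=0.5, V2=0.3))   # V2 > 0: flux decreases (absorption)
print(continuity_check(E=2.0, V1=0.5, V2=-0.3))  # V2 < 0: flux increases (emission)
```

In each printed pair the two numbers should agree up to the finite-difference error.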








§92. The diffraction of fast neutrons by nuclei 


The study of the interaction of fast neutrons with nuclei shows that in 
the region of neutron energies above a few tens of MeV for light nuclei and a 
few hundreds of MeV for heavy nuclei a very intense capture of neutrons 
takes place. 

The intense absorption of neutrons is also accompanied by their elastic 
scattering. In describing the strong absorption of fast neutrons the following 
optical analogy turns out to be very useful. The nucleus behaves with respect 
to the neutrons as a perfectly absorbing (black) sphere on which a light wave 
is incident. The absorption of the light wave by the black sphere is accom- 
panied by its perturbation in the region of space near the absorber. This 
means that in addition to absorption, light scattering occurs. Analogously to 
this the absorption of neutrons by a nucleus will perturb their wave function 
and the neutrons will undergo elastic scattering. 

To calculate the neutron elastic scattering cross section we shall make use 
of the analogy with optical phenomena. In §36 of Part IV we have seen that 
diffraction phenomena occur when the wavelength of the light is less than the 
radius of the scattering sphere. In this case the intensity of light scattered by 
a black sphere of radius $R$ into solid angle $d\Omega$ is given by expression (36.13)
of Part IV

$$ dI = \frac{I}{\pi}\,\frac{J_1^2(kR\theta)}{\theta^2}\,d\Omega , \tag{92.1} $$


where $\theta$ is the angle between the direction of motion of the scattered light
and the initial direction of its incidence, $I$ is the total intensity of light
incident on the screen, and $J_1$ is the Bessel function of first order.

Simple estimation shows that the wavelength of neutrons of an energy of 
the order of 1 MeV is smaller by a factor of several hundred than the nuclear 
size. Therefore the optical formula (92.1) can be applied to the scattering of 
neutrons by the absorbing nucleus. To obtain the neutron differential
scattering cross section the flux of neutrons scattered into the solid angle $d\Omega$
must be divided by the incident neutron flux density $I/\pi R^2$. We then have


$$ d\sigma = R^2\left|\frac{J_1(kR\theta)}{\theta}\right|^2 d\Omega . \tag{92.2} $$


This expression, of course, can also be obtained from the general formula 
(91.5). 
From the condition of 'blackness' of the nucleus it follows that the
absorption coefficient $\xi_l$ is equal to one for those $l$ for which $\rho_l < R$, and
$\xi_l = 0$ if $\rho_l > R$. Since $\rho_l \approx \hbar l/p$ (see §91), then

$$ S_l = \begin{cases} 0 , & l < pR/\hbar = kR , \\ 1 , & l > pR/\hbar = kR , \end{cases} $$


where $kR \gg 1$. Substituting these values of $S_l$ into (91.5) we find the elastic
scattering amplitude


$$ f(\theta) = -\frac{1}{2ik}\sum_{l=0}^{kR}(2l+1)\,P_l(\cos\theta) . $$


The dominant part in the sum is played by terms with large $l$. Hence we
can disregard unity in comparison with $2l$, use for the Legendre polynomial
$P_l(\cos\theta)$ the approximate expression valid for $\theta \ll 1$*


$$ P_l(\cos\theta) \approx J_0\left[(l+\tfrac{1}{2})\theta\right] \approx J_0(l\theta) , $$


and pass from a summation over $l$ to an integration


$$ f(\theta) = \frac{i}{k}\int_0^{kR} l\,J_0(l\theta)\,dl = \frac{iR}{\theta}\,J_1(kR\theta) , $$


from which we immediately obtain expression (92.2) for the cross section.
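
The passage from the partial-wave sum to the Bessel-function formula can be checked numerically. The sketch below sums $(2l+1)P_l(\cos\theta)$ over the absorbed partial waves $l < kR$ directly and compares $|f(\theta)|^2$ with $R^2J_1^2(kR\theta)/\theta^2$. The helper names, the choice $k = 1$, $R = 50$ and the evaluation of $J_1$ from Bessel's integral are assumptions made for this illustration only.

```python
import math

def bessel_j1(x, n=400):
    """J_1(x) from Bessel's integral (1/pi) * int_0^pi cos(t - x sin t) dt,
    evaluated with the trapezoidal rule."""
    h = math.pi / n
    total = 0.5 * (math.cos(0.0) + math.cos(math.pi - x * math.sin(math.pi)))
    for i in range(1, n):
        t = i * h
        total += math.cos(t - x * math.sin(t))
    return total * h / math.pi

def legendre_sum(n_waves, x):
    """Sum over l = 0 .. n_waves-1 of (2l+1) P_l(x), via the three-term recurrence."""
    p_l, p_next = 1.0, x  # P_0 and P_1
    total = 0.0
    for l in range(n_waves):
        total += (2 * l + 1) * p_l
        # (l+2) P_{l+2} = (2l+3) x P_{l+1} - (l+1) P_l
        p_after = ((2 * l + 3) * x * p_next - (l + 1) * p_l) / (l + 2)
        p_l, p_next = p_next, p_after
    return total

def dsigma_partial_waves(k, R, theta):
    """|f|^2 with f = (i/2k) * sum_{l < kR} (2l+1) P_l(cos theta)."""
    n_waves = math.ceil(k * R)
    f = 1j / (2.0 * k) * legendre_sum(n_waves, math.cos(theta))
    return abs(f)**2

def dsigma_black_sphere(k, R, theta):
    """Diffraction formula R^2 J_1(kR theta)^2 / theta^2."""
    return (R * bessel_j1(k * R * theta) / theta)**2

k, R = 1.0, 50.0  # hypothetical values with kR >> 1
for theta in (0.02, 0.05):
    print(theta, dsigma_partial_waves(k, R, theta), dsigma_black_sphere(k, R, theta))
```

For small angles and $kR \gg 1$ the two columns should agree closely; the agreement deteriorates as $\theta$ approaches unity, where the approximation for $P_l(\cos\theta)$ is no longer valid.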
Let us consider the dependence of the differential cross section (92.2) on
the scattering angle $\theta$ in more detail. The differential cross section does not
depend on the azimuthal angle. Evidently we have

$$ d\sigma = 2\pi R^2\left|\frac{J_1(kR\theta)}{\theta}\right|^2 \sin\theta\,d\theta . \tag{92.3} $$


For small angles, $kR\theta \ll 1$, we expand the Bessel function in a series and
find $J_1(kR\theta) \approx \frac{1}{2}kR\theta$. Consequently, for small angles the cross section
assumes the form


$$ d\sigma = \tfrac{1}{4}k^2R^4\,d\Omega . \tag{92.4} $$

This is independent of the scattering angle $\theta$.


* See, for example, L.D.Landau and E.M.Lifshitz, Quantum mechanics (Pergamon 
Press, Oxford, 1965). 







For larger angles, lying in the interval $1 \gg \theta \gg 1/kR$, one can
write the asymptotic expression for the Bessel function


$$ J_1(kR\theta) \approx \left(\frac{2}{\pi kR\theta}\right)^{1/2}\sin\left(kR\theta - \tfrac{1}{4}\pi\right) . $$

In this range of angles the cross section, which oscillates, decreases rapidly
with increasing $\theta$. The value of the cross section at the maxima decreases in
proportion to $\theta^{-3}$.

Thus the cross section has a sharp maximum for scattering at an angle
$\theta \approx 0$, i.e. for forward scattering in directions close to the direction of the
incident beam.

The total scattering cross section $\sigma$ can be found by integrating (92.3)
over the entire solid angle,


$$ \sigma = 2\pi R^2\int_0^{\pi}\frac{J_1^2(kR\theta)}{\theta^2}\,\sin\theta\,d\theta . $$


In view of the rapid convergence of the integral the contribution to it by
large values of $\theta$ is small, so that $\sin\theta$ can be replaced by $\theta$ and the upper
limit of the integral can approximately be replaced by infinity. Then, making
use of the formula


$$ \int_0^{\infty}\frac{J_1^2(x)}{x}\,dx = \frac{1}{2} , $$


we find finally

$$ \sigma = \pi R^2 . \tag{92.5} $$


The total cross section for the scattering of neutrons with $\lambda \ll R$ is the
same as the geometric cross section of the nucleus.

Let us also define the total cross section for the absorption of neutrons by
a nucleus. Making use of the expression for $S_l$ and substituting it into
formula (91.6) we obtain


$$ \sigma_{\mathrm{inel}} = \frac{\pi}{k^2}\sum_{l=0}^{kR}(2l+1) = \pi R^2 . \tag{92.6} $$


Consequently, the cross section for the absorption of neutrons by a black 
nucleus is also the same as the geometric cross section of the nucleus. 

From relations (92.5) and (92.6) it follows that the total cross section for
the interaction of neutrons with a nucleus is equal to twice the geometric
cross section of the nucleus,


$$ \sigma_{\mathrm{inel}} + \sigma_{\mathrm{el}} = 2\pi R^2 . \tag{92.7} $$
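
The equality of the two cross sections for a black nucleus follows directly from the partial-wave sums (91.6) and (91.7) with $S_l = 0$ for $l < kR$ and $S_l = 1$ otherwise. A minimal sketch, in which the function name and the sample values of $k$ and $R$ are illustrative assumptions:

```python
import math

def black_nucleus_cross_sections(k, R):
    """Return (sigma_el, sigma_inel) for S_l = 0 when l < kR and S_l = 1 when l > kR.

    Partial waves with S_l = 1 contribute to neither sum, so only l < kR is kept.
    """
    n_waves = math.ceil(k * R)   # partial waves l = 0 .. n_waves - 1
    S = [0.0] * n_waves
    sigma_inel = math.pi / k**2 * sum((2 * l + 1) * (1.0 - abs(s)**2) for l, s in enumerate(S))
    sigma_el = math.pi / k**2 * sum((2 * l + 1) * abs(1.0 - s)**2 for l, s in enumerate(S))
    return sigma_el, sigma_inel

k, R = 2.0, 30.0  # hypothetical values, kR = 60
sigma_el, sigma_inel = black_nucleus_cross_sections(k, R)
print(sigma_el, sigma_inel, math.pi * R**2)
```

Since the sum of $2l+1$ over $l = 0, \ldots, N-1$ equals $N^2$, both cross sections reduce to $\pi N^2/k^2$, which is $\pi R^2$ when $N = kR$.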


By analogous methods one can also calculate the cross sections for scatter- 
ing by nuclei which only partially absorb the neutrons incident on them, as 
well as the diffraction of charged particles by nuclei*. 


§93. The scattering of slow particles. The threshold approximation 


We shall apply the partial wave scattering formulae already obtained to 
find the cross sections for the elastic and inelastic scattering of slow particles. 
As in §86, we shall understand slow particles to be those whose de Broglie 
wavelength $\lambda$ is large compared with the size of the region of interaction. We
shall restrict ourselves to the case where the interaction energy decreases 
sufficiently rapidly with increasing distance, that an effective radius R of the 
interaction sphere can be introduced. 

As we have seen in §86, only s-wave scattering is significant at small
energies. The radial part of the wave function corresponding to the angular
momentum $l = 0$ satisfies eq. (35.8)

$$ -\frac{\hbar^2}{2m}\,\frac{1}{r^2}\frac{d}{dr}\left(r^2\frac{dR_0}{dr}\right) + U(r)R_0 = ER_0 . $$

Introducing the wave number $k = (2mE/\hbar^2)^{1/2}$, the above equation can be
rewritten in the form


$$ R_0'' + \frac{2}{r}R_0' + k^2R_0 - \frac{2m}{\hbar^2}U(r)R_0 = 0 . \tag{93.1} $$


For $r > a$, i.e. outside the interaction region, the potential energy is equal to
zero and the equation for the function $R_0$ assumes the form


$$ R_0'' + \frac{2}{r}R_0' + k^2R_0 = 0 . \tag{93.2} $$

It is clear that the potential energy does not reduce to zero sharply at a
certain limit, but changes over a transitional region according to a complex
and usually unknown law.

Therefore at first sight the finding of the wave function over all space 


seems to be an extremely complex problem. In reality, however, this is not so. 


* For more details see A.I.Akhiezer and I.Ya.Pomeranchuk, Nekotorye voprosy teorii 
yadra (Some problems in nuclear theory) (Gostekhizdat, Moscow, 1950). 







It turns out that use can be made of the large value of $\lambda$ or, what is the
same, of the small $k$, to simplify the problem substantially. Namely, in the
region $r < a$, i.e. in the region of effective interaction, the term $k^2$ can be
neglected since it is small in comparison with $2m\hbar^{-2}U(r)$. Then we have


$$ R_0'' + \frac{2}{r}R_0' - \frac{2m}{\hbar^2}U(r)R_0 = 0 . \tag{93.3} $$

The solution of eq. (93.3) for a given function $U(r)$ can be written in the
form $R_0(r, c_1, c_2)$, where $c_1$ and $c_2$ are two arbitrary constants. Since (93.3)
does not involve the quantity $k$, we shall assume the wave function to be
independent of $k$ in the region $r < a$.
The solution of eq. (93.2) is


$$ R_0 = \frac{1}{r}\left(c_3e^{ikr} + c_4e^{-ikr}\right) . \tag{93.4} $$

Here $c_3$ and $c_4$ are two constants of integration which do not depend on $r$
but, generally speaking, are functions of $k$.

The equation for the wave function cannot be written for the transitional
region, since the behaviour of the potential energy here is unknown. However,
the width of the intermediate region is small in comparison with the size
of the interaction region and is very small in comparison with the wavelength
$\lambda$.

A substantial change of the wave function, on the other hand, takes place
only over a distance of the order of $\lambda$. Hence one can disregard the change of
the wave function in the transition region and replace it by a sharp boundary
at $r = a$. At this surface the two solutions must join smoothly.

It is clear, however, that two functions, one of which depends on k as a 
parameter and the other of which does not depend on k at all, can only be 
joined when in the neighbourhood of the boundary of the region the func- 
tion (93.4) also becomes independent of k. 

For $r \sim a$ the quantity $ka$ is small by definition. Hence expanding the
exponentials in (93.4) in a series in powers of $kr$ and retaining the first two
terms of the expansion, we obtain


$$ R_0 \approx \frac{c_3(k)(1+ikr) + c_4(k)(1-ikr)}{r} . \tag{93.5} $$


This expression will not depend on $k$ (apart from an inessential common
factor) when the following relations are fulfilled:

$$ c_4(k) + c_3(k) = 2ika_2 , \qquad c_3(k) - c_4(k) = 2a_1 , \tag{93.6} $$

where $a_1$ and $a_2$ are constant quantities independent of $k$.
Solving the system of equations (93.6), we find for $c_3$ and $c_4$

$$ c_4 = -a_1 + ika_2 , \qquad c_3 = ika_2 + a_1 . \tag{93.7} $$

Comparing expression (93.4) with (91.2), we find the quantity $S_0$ to have
the form

$$ S_0 = -\frac{c_3}{c_4} = \frac{a_1 + ika_2}{a_1 - ika_2} . \tag{93.8} $$

Expanding in a series in small values of $k$, we obtain

$$ S_0 \approx 1 + 2ika_2/a_1 . $$


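Making use of formulae (91.6) and (91.7) we find the expression for the
cross sections for elastic and inelastic processes:


$$ \sigma_{\mathrm{el}} = \frac{\pi}{k^2}\,|1-S_0|^2 = 4\pi\left|\frac{a_2}{a_1}\right|^2 , \qquad \sigma_{\mathrm{inel}} = \frac{\pi}{k^2}\left(1-|S_0|^2\right) = \frac{4\pi}{k}\,\mathrm{Im}\,\frac{a_2}{a_1} . \tag{93.9} $$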


From these formulae it follows that the elastic scattering cross section in 
the case considered does not depend on the energy of the scattered particle. 
The inelastic scattering cross section is inversely proportional to the wave 
number k, i.e. inversely proportional to the velocity v of the particle. 
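
The low-energy behaviour of the two cross sections can be illustrated numerically. The sketch below takes $S_0 = (a_1 + ika_2)/(a_1 - ika_2)$, the form of (93.8) as read here, and evaluates the s-wave terms of (91.6) and (91.7) for decreasing $k$. The complex values chosen for $a_1$ and $a_2$ are hypothetical; a positive imaginary part of $a_2/a_1$ corresponds to absorption.

```python
import math

def s0(k, a1, a2):
    """S_0 = (a1 + i k a2) / (a1 - i k a2), with k-independent constants a1, a2."""
    return (a1 + 1j * k * a2) / (a1 - 1j * k * a2)

def sigma_el(k, a1, a2):
    """s-wave elastic cross section (pi/k^2) |1 - S_0|^2."""
    return math.pi / k**2 * abs(1.0 - s0(k, a1, a2))**2

def sigma_inel(k, a1, a2):
    """s-wave inelastic cross section (pi/k^2) (1 - |S_0|^2)."""
    return math.pi / k**2 * (1.0 - abs(s0(k, a1, a2))**2)

a1, a2 = 1.0 + 0.0j, 2.0 + 0.3j  # hypothetical constants, Im(a2/a1) = 0.3 > 0
for k in (1e-1, 1e-2, 1e-3):
    # sigma_el tends to a constant, while k * sigma_inel tends to a constant
    print(k, sigma_el(k, a1, a2), k * sigma_inel(k, a1, a2))

print(4.0 * math.pi * abs(a2 / a1)**2, 4.0 * math.pi * (a2 / a1).imag)  # limiting values
```

As $k$ decreases, the second column approaches $4\pi|a_2/a_1|^2$ and the third approaches $4\pi\,\mathrm{Im}(a_2/a_1)$, which is the $1/k$ (that is, $1/v$) law for the inelastic cross section.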

The method which we have used of neglecting the width of the transition 
zone and the replacement of eq. (93.1) by eq. (93.3) in the internal region is
of a very general character and is called the threshold approximation. This 
approximation can be applied successfully in all cases where the wavelength 
can be considered to be large in comparison with the width of the transition 
region. 

We shall encounter the use of the threshold approximation later. 


§94. The Breit—Wigner formula 


In the preceding sections we have considered the laws of elastic scattering
as well as of the absorption of particles. We now turn to the study of some
phenomena occurring in nuclear reactions of the type


$$ A + a \to B + b . \tag{94.1} $$


Here A and B are the initial and final nuclei, a is the incident particle and b is 
the particle emerging as a result of the reaction. In order to avoid complica- 
tions associated with the effect of the nuclear electric field, we shall confine 
ourselves to the case where the incident and outgoing particles are neutrons. 

The study of reactions caused by neutrons of relatively small energies 
showed that the cross sections for the reactions as a function of the energy of 
the incident neutrons display maxima at definite energy values. The pheno- 
menon is of a pronounced resonant character and the maxima correspond to 
very narrow neutron energy intervals. 

To account for the resonant character of nuclear reactions Bohr proposed 
the following general scheme of nuclear reactions. Neutron a, penetrating the 
nucleus, strongly interacts with the nuclear particles and transfers its excess 
energy to them. This latter energy is evidently equal to the sum of its kinetic
energy and the binding energy of the particle in the nucleus $U_0$. The energy
brought by the neutron is rapidly distributed among all the nucleons in the
nucleus, since they interact strongly with each other. As a result a new,
so-called compound nucleus C arises from nucleus A and the neutron. The
compound nucleus is not a stable system, since its energy is higher than the
energy of the normal state by an amount $E + U_0$. After a certain lapse of
time the compound nucleus will make a transition to the normal state. At
small excitation energies this transition may proceed in one of two ways.

First, as a result of a fluctuation all the excitation energy may be con- 
centrated on one of the nuclear particles. This particle (for simplicity of the 
argument, a neutron) then has the possibility of escaping from the nucleus, 
with energy $E$. Evidently this mode of reaction corresponds to the elastic
scattering of the neutron by the nucleus. 

Secondly, only part of the excitation energy may be carried away by the 
escaping neutron. The rest of the excitation energy is emitted by the system 
of nuclear particles in the form of a $\gamma$-quantum. In this case inelastic
scattering of the neutron occurs. A particular case of this reaction is the radiative
capture of the neutron, in which the entire excitation energy is carried away
by a $\gamma$-quantum and the neutron remains in the nucleus.

For the cross sections for the elastic and inelastic scattering use can be 
made of formulae (91.6) and (91.7). We shall restrict ourselves to the case of 
slow neutrons, described by an s-wave, and shall consider the nucleus to be a 
sphere of radius $R$. Although the nucleus cannot be considered to have sharp
geometric bounds, its diffuseness is very small in comparison with the
wavelength of the incident neutron, $\lambda \gg R$.

The neutron inside the nucleus must be in a state to which there
corresponds a wavelength $\lambda_{\mathrm{int}}$. At the surface $r = R$ the wave functions describing
the neutron outside and inside the nucleus must join, for which the following
conditions are to be fulfilled:

$$ \psi = \psi_{\mathrm{int}} , \qquad \frac{d\psi}{dr} = \frac{d\psi_{\mathrm{int}}}{dr} . $$





It follows from the second condition that the orders of magnitude of the
amplitudes of the wave functions of the external and internal motions are in
the ratio $\sim\lambda_{\mathrm{int}}/\lambda$. This means that the probability for the particle to get
inside the nucleus is $\sim(\lambda_{\mathrm{int}}/\lambda)$, i.e. is very small.

The corresponding energy is determined by the value of the normal 
derivative of the wave function at the surface of the nucleus. 

We denote by $f(E)$ the quantity


$$ f(E) = R\left(\frac{d\psi/dr}{\psi}\right)_{r=R} . \tag{94.2} $$


The quantity $f(E)$ is related directly to the normal derivative $(d\psi/dr)_{r=R}$ and
depends on the neutron energy $E$. The quantity $S_0$, defining the cross section
for the elastic and inelastic scattering of the s-wave, can easily be expressed in
terms of $f$.
Substituting the value of $\psi$ from (91.2) for $l = 0$ (with $\psi$ understood as the
radial function multiplied by $r$) into (94.2), we find

$$ f(E) = -i\,\frac{kR\,e^{-ikR} + kR\,S_0\,e^{ikR}}{e^{-ikR} - S_0\,e^{ikR}} . $$


Hence it follows that


$$ S_0 = -e^{-2ikR}\,\frac{kR - if(E)}{kR + if(E)} . \tag{94.3} $$

Since $f(E)$ is, generally speaking, a complex quantity, one can write

$$ f(E) = f_1(E) - if_2(E) , \tag{94.4} $$


where $f_1(E)$ and $f_2(E)$ are real functions. Since $|S_0|$ is always $< 1$, then the
function $f_2(E) > 0$.
Taking into account (94.4) we have for $S_0$


$$ S_0 = -e^{-2ikR}\,\frac{kR - f_2(E) - if_1(E)}{kR + f_2(E) + if_1(E)} . \tag{94.5} $$


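Substituting this value of $S_0$ into (91.6), we find

$$ \sigma_{\mathrm{inel}} = \frac{\pi}{k^2}\left(1-|S_0|^2\right) = \frac{4\pi}{k^2}\,\frac{kR\,f_2}{(kR+f_2)^2 + f_1^2} . \tag{94.6} $$

Analogously from (91.7) it follows that

$$ \begin{aligned} \sigma_{\mathrm{el}} = \frac{\pi}{k^2}\,|1-S_0|^2 &= \frac{\pi}{k^2}\left|1 + e^{-2ikR}\,\frac{kR - f_2 - if_1}{kR + f_2 + if_1}\right|^2 \\ &= \frac{4\pi}{k^2}\left|\frac{kR\cos kR - f_1\sin kR + if_2\sin kR}{kR + f_2 + if_1}\right|^2 \\ &= \frac{4\pi}{k^2}\left|\frac{kR}{i(kR+f_2) - f_1} + e^{ikR}\sin kR\right|^2 . \end{aligned} \tag{94.7} $$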





Let us first discuss the formula for $\sigma_{\mathrm{inel}}$. Since $f_2 > 0$, the cross section
has a maximum for $f_1(E_0) = 0$. When the neutron has an energy equal to $E_0$
it has a relatively high probability of penetrating the nucleus. Accordingly the
energy $E_0$ corresponds to a resonance value of the energy of the nucleus. Near
the resonance energy we can expand the function $f_1(E)$ in a series in powers
of $E - E_0$ and restrict ourselves to the first term of the expansion


$$ f_1(E) \approx f_1'(E_0)(E - E_0) . $$


It can be shown* that the quantity $f_1'(E_0) < 0$. We introduce the notation

$$ \Gamma_e = -\frac{2kR}{f_1'(E_0)} , \qquad \Gamma_r = -\frac{2f_2}{f_1'(E_0)} , \qquad \Gamma = \Gamma_e + \Gamma_r . \tag{94.8} $$

Then we find

$$ \sigma_{\mathrm{inel}} = \frac{\pi}{k^2}\,\frac{\Gamma_e\Gamma_r}{(E-E_0)^2 + \frac{1}{4}\Gamma^2} . \tag{94.9} $$
In formula (94.7) for $\sigma_{\mathrm{el}}$, for $E \approx E_0$ the first term is usually large in
comparison with the second, and one can write

$$ \sigma_{\mathrm{el}} \approx \frac{\pi}{k^2}\,\frac{\Gamma_e^2}{(E-E_0)^2 + \frac{1}{4}\Gamma^2} . \tag{94.10} $$

The formulae for the cross sections for elastic and inelastic scattering of slow
neutrons are called the Breit-Wigner formulae. To explain the physical meaning
of the quantities $\Gamma_e$, $\Gamma_r$ and $\Gamma$ which have been introduced it is useful to
compare the Breit-Wigner formulae with the dispersion formulae of the theory
of light scattering (§108). We see that the general structure of the formulae
is the same. This is quite natural, since the Breit-Wigner formulae could be
obtained by considering the reaction as the transition of the system (nucleus
+ neutron) from the initial into the final state via a compound nucleus as an
intermediate state (i.e. according to the same scheme as in the scattering of
photons). Direct application of perturbation theory leads to the Breit-Wigner
formulae. However, this way of obtaining the Breit-Wigner formulae cannot be
substantiated, since the perturbation of the state of the neutron is not weak.


* See A.I.Akhiezer and I.Ya.Pomeranchuk, Nekotorye voprosy teorii yadra (Some
problems in nuclear theory) (Gostekhizdat, Moscow, 1950).

Nevertheless, such an intuitive although not rigorous calculation shows that
the quantities $\Gamma_e$ and $\Gamma_r$ characterize transition probabilities.
Namely, $\Gamma_e$ is proportional to the matrix element of the transition of
the system from the intermediate state (nucleus C) to the final state (nucleus
A and the neutron with energy $E$). Hence the quantity $\Gamma_e$, which is
called the partial width of the resonance level $E_0$ corresponding to elastic
scattering, determines the probability of decay of the nucleus C with elastic
scattering of the neutron. The quantity $\Gamma_r$ is called the partial width
of the resonance level with respect to the reaction. It determines the
probability of decay of the nucleus with inelastic scattering and neutron
capture. In the case of slow neutrons the probability of inelastic scattering
is small and the reaction reduces to the capture of the neutron. Finally,
$\Gamma$ determines the total probability of decay of nucleus C. It is equal to
the width of the resonance maximum of the cross section at half its height.
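The meaning of the total width can be illustrated numerically. The following is only a sketch: it assumes the usual single-level resonance factor $1/[(E-E_0)^2+\Gamma^2/4]$ and $\Gamma=\Gamma_e+\Gamma_r$, omits all prefactors of the Breit-Wigner formulae, and uses function names and numerical values that are purely illustrative.

```python
def resonance_factor(energy, e0, gamma):
    """Assumed Lorentzian factor shared by the elastic and reaction cross sections."""
    return 1.0 / ((energy - e0) ** 2 + gamma ** 2 / 4.0)


def branching(gamma_e, gamma_r):
    """Fractions of compound-nucleus decays into the elastic and reaction channels."""
    gamma = gamma_e + gamma_r   # assumption: total width = sum of the partial widths
    return gamma_e / gamma, gamma_r / gamma


# Illustrative numbers (eV), not taken from the text.
E0, GAMMA_E, GAMMA_R = 5.0, 0.002, 0.1
GAMMA = GAMMA_E + GAMMA_R

peak = resonance_factor(E0, E0, GAMMA)
half = resonance_factor(E0 + GAMMA / 2.0, E0, GAMMA)
print(half / peak)                    # ratio of the factor at E0 + Gamma/2 to its peak value
print(branching(GAMMA_E, GAMMA_R))    # capture dominates when gamma_e << gamma_r
```

At a distance $\Gamma/2$ from the resonance energy the factor has fallen to half of its peak value, which is the sense in which $\Gamma$ measures the width of the maximum.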

Fig. V.27 shows the energy dependence of the cross sections for the elastic
scattering and radiative capture of slow neutrons. We have introduced the
Breit-Wigner formulae in the particular case where the energy of the neutron
is close to one of the resonance levels $E_0$ of the nucleus. They can be
generalized to the case of many levels. They may also take into account the
spin states of the nucleus and of light particles. Finally, the Breit-Wigner
formulae can be generalized to the case of charged particles and particles with
an angular momentum. We now discuss certain properties of the widths of the
resonance levels of a nucleus. The reaction width $\Gamma_r$ for slow neutrons
reduces to the radiative capture width $\Gamma_\gamma$, since no inelastic
scattering occurs at small energies. The value of $\Gamma_\gamma$ amounts to
about $10^{-1}$ eV and does not depend on the velocity of the neutron. The
width $\Gamma_e \sim k \sim v$, where $v$ is the velocity of the neutron, and
at small energies for heavy and medium nuclei $\Gamma_e \ll \Gamma_\gamma$.
This means that neutron capture predominates over elastic scattering. In the
case of light nuclei the situation is the reverse: resonance scattering
predominates over capture.

Fig. V.27

The Bohr concept of the formation of a compound nucleus is valid for
nuclear reactions proceeding at not too large energies. As the energy of the
incident particles increases their cross section for scattering by individual
nuclear nucleons decreases sharply. Hence at energies $E > 50$–$100$ MeV the
interaction of the particles with the nucleus reduces to an interaction with an
individual nucleon. The Breit-Wigner formulae turn out to be no longer
applicable.


§95. The scattering matrix (S-matrix) 


The mathematical technique of scattering theory described above is associated
with an explicit form of the interaction potential distribution over all
space. However, in a number of important cases there is no potential energy
(independent of the velocity). Hence in modern scattering theory an important
role is played by a more general statement of the problem. Let the wave
function $\psi_a(t\to-\infty)$ of a system of particles be given in the initial
state before the interaction. The general problem of scattering theory is to
find the wave function of the system a long time after the interaction,
$\psi(t\to\infty)$. The wave function $\psi(t)$ can be expressed in terms of
the initial function $\psi_a(t\to-\infty)$ by means of the operator
$\hat U(t,t_0)$ introduced in §49 and describing the development of the wave
function in time. By the scattering matrix $\hat S$ we shall mean the limiting
expression of the operator $\hat U(t,t_0)$ (in the interaction representation)
describing the development of the process in time (see §49):


$$\hat S = \lim_{\substack{t_0\to-\infty\\ t\to\infty}} \hat U(t,t_0). \tag{95.1}$$


Thus the scattering matrix $\hat S$ carries out the transformation of the
initial state $\psi_a(-\infty)$ into the final state $\psi(\infty)$,

$$\psi(\infty) = \hat S\,\psi_a(-\infty). \tag{95.2}$$


The index $a$ denotes the complete set of quantum numbers defining the state
of the system before scattering. It is assumed that the particles in the initial
as well as in the final state are separated by sufficiently large distances from
each other that the interactions between them need not be taken into account
(the so-called adiabatic hypothesis).

We expand the function $\psi(\infty)$ in a series in terms of a certain complete
system of functions $\psi_b$, where $b$ denotes the corresponding set of quantum
numbers:

$$\psi(\infty) = \sum_b c_b\,\psi_b .$$

Here the symbol $\sum_b$ denotes summation over a discrete sequence of
quantum numbers and integration over quantum numbers changing continuously.
As follows from (95.2), the expansion coefficients $c_b$ are expressed in
terms of the matrix elements of the operator $\hat S$:

$$c_b = \langle\psi_b|\psi(\infty)\rangle = \langle\psi_b|\hat S\,\psi_a\rangle = \langle b|\hat S|a\rangle = S_{ba}. \tag{95.3}$$

The squared modulus of the amplitude $c_b$ gives the total probability of the
transition of the system in scattering from state $a$ into state $b$:

$$W_{ba} = |S_{ba}|^2. \tag{95.4}$$


Thus the matrix elements of the operator $\hat S$ are directly related to the
corresponding transition probabilities.

Since the operator $\hat U(t,t_0)$ is unitary (see §96), its limiting value is
also unitary, i.e. for the operator $\hat S$ we can write

$$\hat S\hat S^\dagger = \hat S^\dagger\hat S = \hat I, \tag{95.5}$$

where $\hat I$ denotes the unit operator.
Taking the diagonal matrix elements of one of the relations (95.5), we
obtain the obvious result

$$\sum_b S^\dagger_{ab}S_{ba} = \sum_b |S_{ba}|^2 = 1, \tag{95.6}$$

i.e. the sum of the probabilities of all possible transitions is equal to one.
Using relation (95.4), one can express the cross section for the process in
terms of the matrix elements of the operator $\hat S$. However, it is first
necessary to obtain the expression for the transition probability per unit time.
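The statement that unitarity guarantees a total transition probability of one can be checked on a small example. The sketch below uses a hypothetical two-channel S-matrix parametrized by an elasticity `eta` and two phases; this parametrization is an assumption made for illustration and is not taken from the text.

```python
import cmath
import math


def two_channel_s_matrix(eta, delta1, delta2):
    """Hypothetical unitary 2x2 S-matrix; eta in [0, 1] is the modulus of the elastic elements."""
    off = 1j * math.sqrt(1.0 - eta ** 2) * cmath.exp(1j * (delta1 + delta2))
    return [[eta * cmath.exp(2j * delta1), off],
            [off, eta * cmath.exp(2j * delta2)]]


def transition_probabilities(s, a):
    """W_ba = |S_ba|^2 for a fixed initial channel a, as in (95.4)."""
    return [abs(s[b][a]) ** 2 for b in range(len(s))]


S = two_channel_s_matrix(eta=0.8, delta1=0.3, delta2=-1.1)
for a in range(2):
    w = transition_probabilities(S, a)
    print(a, w, sum(w))   # the probabilities out of each initial channel add up to one
```

For `eta = 0.8` the probability of leaving the initial channel is $1-\eta^2=0.36$, and the two probabilities add up to one as required by (95.6).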

We assume that the initial state $\psi_a$ is characterized by a definite energy
value $E_a$. The total energy of the system is conserved in time. Hence the
matrix $S_{ba}$ can be written in the form

$$S_{ba} = S^E_{ba}\,\delta(E_a-E_b).$$

The matrix $S^E_{ba}$ is said to be given on the energy surface. Then the total
transition probability (95.4) is written in the form

$$W_{ba} = |S^E_{ba}|^2\,\delta^2(E_a-E_b). \tag{95.7}$$

This probability is proportional to the square of the $\delta$-function. We write
one of the $\delta$-functions in the form (see Vol. 1, Appendix III)

$$\delta(E_a-E_b) = \lim_{T\to\infty}\frac{1}{2\pi\hbar}\int_{-T/2}^{T/2} e^{(i/\hbar)(E_a-E_b)t}\,dt .$$

Substituting this expression into (95.7) and integrating the transition
probability over the energy of the final state, we obtain the transition
probability in time $T$

$$\overline W_{ba} = \int W_{ba}\,dE_b = \frac{1}{2\pi\hbar}\,|S^E_{ba}|^2\,T. \tag{95.8}$$

Hence we find for the transition probability per unit time

$$w_{ba} = \frac{1}{2\pi\hbar}\,|S^E_{ba}|^2. \tag{95.9}$$

To find the cross section for the process we have to divide the transition
probability by the incident particle flux density.

In the initial state there are two particles. As usual we consider the
scattering process in the centre-of-mass system. The wave function of the
initial state $\psi_a$ describes states with given energy of relative motion
$E_a$ and a direction of the momentum of relative motion
$\mathbf n_a = \mathbf p_a/p_a$, and is normalized by the condition

$$\int\psi^*_{E'_a\mathbf n'_a}\,\psi_{E_a\mathbf n_a}\,dV = \delta(E_a-E'_a)\,\delta(\mathbf n_a-\mathbf n'_a) = p_a^2\,\frac{dp_a}{dE_a}\,\delta(\mathbf p_a-\mathbf p'_a). \tag{95.10}$$

Then

$$\psi_{E_a\mathbf n_a} \equiv |E_a,\mathbf n_a\rangle = p_a\left(\frac{dp_a}{dE_a}\right)^{1/2}\psi_{\mathbf p_a}, \tag{95.11}$$

where $\psi_{\mathbf p_a}$ is the plane wave normalized to
$\delta(\mathbf p_a-\mathbf p'_a)$. The incident particle flux density is equal to

$$j = \frac{p_a^2}{(2\pi\hbar)^3}. \tag{95.12}$$


As always when we deal with a continuously changing quantity we have
to introduce the differential transition probability $dW_{ba}$ and, consequently,
the differential cross section $d\sigma_{ba}$. Denoting the solid angle interval
in which the vector $\mathbf n_b$ lies by $d\Omega_b$, we obtain from (95.9) and
(95.12)

$$d\sigma_{ba} = \frac{4\pi^2}{k_a^2}\,\bigl|\langle b,E,\mathbf n_b|\hat S^E|a,E,\mathbf n_a\rangle\bigr|^2\,d\Omega_b, \tag{95.13}$$

where $k_a = \hbar^{-1}p_a$.
Let us now consider the case where elastic and different forms of inelastic
scattering may occur as a result of the interaction of two particles, i.e.

$$A+B \to \begin{cases} A+B,\\ C+D,\\ \dots \end{cases}$$

We shall call each form of transformation a reaction channel. Formula (95.13)
for $b\neq a$ corresponds to an inelastic reaction channel. The cross section
taking into account elastic and inelastic channels can be written in the form

$$d\sigma_{ba} = \frac{4\pi^2}{k_a^2}\,\bigl|\langle b,E,\mathbf n_b|\hat S^E-\hat I|a,E,\mathbf n_a\rangle\bigr|^2\,d\Omega_b, \tag{95.14}$$

where $\hat I$ is the unit matrix. Since in the matrix $\hat I$ only diagonal
elements are different from zero, for $b\neq a$ the cross section (95.14) is the
same as (95.13).
Expressions (95.13) and (95.14) can be written in a form analogous to
(86.12) if the initial state $|a,E,\mathbf n_a\rangle$ is expanded in terms of
partial waves

$$|a,E,\mathbf n_a\rangle = \sum_{l,m}|a,E,l,m\rangle\langle l,m|\mathbf n_a\rangle. \tag{95.15}$$

The transformation functions $\langle l,m|\mathbf n_a\rangle$ were found in §48:

$$\langle l,m|\mathbf n_a\rangle = Y^*_{lm}(\mathbf n_a). \tag{95.16}$$

Choosing the z-axis to be along the direction of the vector $\mathbf n_a$, we
obtain

$$Y^*_{lm}(\mathbf n_a) = Y^*_{lm}(0) = \left(\frac{2l+1}{4\pi}\right)^{1/2}\delta_{m0}. \tag{95.17}$$


In substituting expressions (95.16), (95.17) into (95.13) and (95.14)
matrix elements of the following form arise:

$$\langle b,E,\mathbf n_b|\hat S^E|a,E,l,0\rangle .$$

In motion in a central field angular momentum is conserved. Hence the
S-matrix is diagonal with respect to the quantum numbers $l$, $m$, and one can
write

$$\langle b,E,\mathbf n_b|\hat S^E|a,E,l,0\rangle = \langle\mathbf n_b|l,0\rangle\,\langle b,E,l,0|\hat S^E|a,E,l,0\rangle = Y_{l0}(\mathbf n_b)\,\langle b,E,l,0|\hat S^E|a,E,l,0\rangle = P_l(\cos\theta_b)\left(\frac{2l+1}{4\pi}\right)^{1/2}S^l_{ba}. \tag{95.18}$$


Correspondingly for the differential cross section for scattering into solid
angle $d\Omega_b$ we obtain

$$d\sigma_{ba} = \frac{1}{4k_a^2}\,\Bigl|\sum_l(2l+1)\,(S^l_{ba}-\delta_{ba})\,P_l(\cos\theta_b)\Bigr|^2\,d\Omega_b. \tag{95.19}$$

Integrating this expression over all directions of the vector $\mathbf n_b$, we
obtain the cross section for the scattering $a\to b$

$$\sigma_{ba} = \frac{\pi}{k_a^2}\sum_l(2l+1)\,|S^l_{ba}-\delta_{ba}|^2. \tag{95.20}$$

From this formula it follows that the total elastic scattering cross section
has the form

$$\sigma_{aa} = \frac{\pi}{k_a^2}\sum_l(2l+1)\,|1-S^l_{aa}|^2. \tag{95.21}$$

We can also write down the expression for the total cross section for all
inelastic processes $\sigma_{inel}$, which is obtained by summing $\sigma_{ba}$
over all channels $b\neq a$:

$$\sigma_{inel} = \sum_{b\neq a}\sigma_{ba} = \frac{\pi}{k_a^2}\sum_{b\neq a}\sum_l(2l+1)\,|S^l_{ba}|^2 .$$

This expression can be transformed by making use of the unitarity of the
S-matrix. Namely, we have

$$\sum_{b\neq a}|S^l_{ba}|^2 = 1-|S^l_{aa}|^2. \tag{95.22}$$

Correspondingly for $\sigma_{inel}$ we obtain

$$\sigma_{inel} = \frac{\pi}{k_a^2}\sum_l(2l+1)\,\bigl(1-|S^l_{aa}|^2\bigr). \tag{95.23}$$

Formulae (95.23) and (95.21) are the same as formulae (91.6) and (91.7)
of partial wave scattering theory. We see that the quantities $S_l$ introduced in
§91 are the diagonal matrix elements of the scattering matrix $\hat S$. If
inelastic processes are absent, i.e. $S^l_{ba}=0$ for $b\neq a$, then from
unitarity relation (95.22) it follows that $|S^l_{aa}|^2=1$, i.e. that

$$S^l_{aa} = e^{2i\delta_l}. \tag{95.24}$$

Then expressions (95.19)–(95.21) are the same as the expressions for the
elastic scattering cross section obtained in §86.
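The elastic and inelastic cross sections can be evaluated directly from the diagonal elements $S^l_{aa}$. The sketch below is illustrative only: the function name, the truncation to two partial waves and the numerical values are assumptions, and the two formulae used are the partial-wave sums for the elastic cross section, $\propto\sum_l(2l+1)|1-S^l_{aa}|^2$, and for the inelastic one, $\propto\sum_l(2l+1)(1-|S^l_{aa}|^2)$.

```python
import cmath
import math


def cross_sections(k, s_diag):
    """Return (sigma_elastic, sigma_inelastic) from the diagonal elements S^l_aa.

    s_diag[l] is S^l_aa for l = 0, 1, ...; k is the wave number k_a.
    """
    pref = math.pi / k ** 2
    sigma_el = pref * sum((2 * l + 1) * abs(1 - s) ** 2 for l, s in enumerate(s_diag))
    sigma_in = pref * sum((2 * l + 1) * (1 - abs(s) ** 2) for l, s in enumerate(s_diag))
    return sigma_el, sigma_in


k = 2.0
# Purely elastic scattering: S = exp(2 i delta), so the inelastic part vanishes.
elastic_only = cross_sections(k, [cmath.exp(2j * 0.4), cmath.exp(2j * 0.1)])
# Complete absorption of the s- and p-waves: S = 0 in both.
black = cross_sections(k, [0.0, 0.0])
print(elastic_only)
print(black)
```

In the completely absorbing case the two cross sections coincide: absorption is necessarily accompanied by elastic (shadow) scattering of the same magnitude.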

From relations (95.19), (95.21) and (95.22) it follows that the cross
section for a process is determined by the matrix elements of the operator
$\hat F$, $i\hat F = \hat S-\hat I$ (the factor $i$ is introduced for
convenience). The unitarity of the S-matrix leads to the following relation:

$$\hat S^\dagger\hat S = (\hat I-i\hat F^\dagger)(\hat I+i\hat F) = \hat I ,$$

or

$$-i\hat F+i\hat F^\dagger = \hat F^\dagger\hat F .$$

Taking the matrix elements of the left-hand and right-hand sides of this
relation with respect to the wave functions (95.11), we obtain

$$F_{ba}-F^*_{ab} = i\sum_c F^*_{cb}F_{ca}, \tag{95.25}$$

where $\sum_c$ denotes the summation over the discrete and integration over the
continuous states of the system of two particles after collision. In fact, the
system (95.25) is a system of integral equations expressing the property of
unitarity of the S-matrix.

The system of equations (95.25) is substantially simplified if only elastic
scattering is possible. As is seen from comparison of (95.19), (86.11) and
(86.12), in this case the matrix elements of the operator $\hat F$ are to within
a factor the same as the elastic scattering amplitude $f(\mathbf n',\mathbf n)$

$$\frac{2\pi}{k}\,F_{\mathbf n'\mathbf n} = f(\mathbf n',\mathbf n), \tag{95.26}$$

where $\mathbf n$ and $\mathbf n'$ are the unit vectors characterizing the
direction of the momentum vector of the relative motion of the incident and
scattered particles. From (95.25) we obtain

$$f(\mathbf n',\mathbf n)-f^*(\mathbf n,\mathbf n') = \frac{ik}{2\pi}\int f^*(\mathbf n'',\mathbf n')\,f(\mathbf n'',\mathbf n)\,do''. \tag{95.27}$$

Relation (95.27) expresses the condition of unitarity for elastic scattering.
For scattering in a central field the amplitude $f$ depends only on the angle
$\vartheta$ between the vectors $\mathbf n$ and $\mathbf n'$, and relation
(95.27) can be rewritten in the form

$$\mathrm{Im}\,f(\mathbf n',\mathbf n) = \frac{k}{4\pi}\int f^*(\mathbf n'',\mathbf n')\,f(\mathbf n'',\mathbf n)\,do''. \tag{95.28}$$

For $\mathbf n=\mathbf n'$ we obtain the relation connecting the imaginary part
of the amplitude of the scattering at zero angle with the total cross section
(optical theorem; see §91).
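The optical theorem for purely elastic scattering can be verified numerically: the imaginary part of the forward amplitude must equal $k/4\pi$ times the angle-integrated $|f|^2$. The sketch below assumes the standard partial-wave expansion $f(\vartheta)=(2ik)^{-1}\sum_l(2l+1)(e^{2i\delta_l}-1)P_l(\cos\vartheta)$; the phase shifts, the wave number and the function names are illustrative choices, not values from the text.

```python
import cmath
import math


def legendre(l_max, x):
    """P_0(x), ..., P_{l_max}(x) by the standard three-term recurrence."""
    p = [1.0, x]
    for l in range(1, l_max):
        p.append(((2 * l + 1) * x * p[l] - l * p[l - 1]) / (l + 1))
    return p[:l_max + 1]


def amplitude(k, deltas, cos_theta):
    """Elastic amplitude f(theta) built from real phase shifts deltas[l]."""
    p = legendre(len(deltas) - 1, cos_theta)
    total = sum((2 * l + 1) * (cmath.exp(2j * d) - 1) * p[l] for l, d in enumerate(deltas))
    return total / (2j * k)


def total_cross_section(k, deltas, n=20000):
    """Integrate |f|^2 over the full solid angle (midpoint rule in cos(theta))."""
    h = 2.0 / n
    s = sum(abs(amplitude(k, deltas, -1.0 + (i + 0.5) * h)) ** 2 for i in range(n))
    return 2.0 * math.pi * h * s


k, deltas = 1.3, [0.9, 0.4, -0.2]       # illustrative values
lhs = amplitude(k, deltas, 1.0).imag    # Im f at zero angle
rhs = k * total_cross_section(k, deltas) / (4.0 * math.pi)
print(lhs, rhs)
```

The two printed numbers agree to the accuracy of the quadrature, whatever real phase shifts are chosen.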

We note that eq. (95.28) makes it possible, in principle, to find the
scattering amplitude if its modulus, which is defined by the scattering law, is
known. Setting

$$f(\vartheta) = \left(\frac{d\sigma}{d\Omega}\right)^{1/2}e^{i\alpha(\vartheta)}$$

and substituting this expression into (95.28), we obtain the integral equation
for the phase $\alpha(\vartheta)$. Thus knowing the scattering cross section
$d\sigma/d\Omega$ we can, in principle, also determine the scattering amplitude
$f(\vartheta)$. We note, however, that eq. (95.28) does not change under the
replacement $\alpha\to\pi-\alpha$, i.e. it determines the scattering amplitude
only to within the transformation $f(\vartheta)\to-f^*(\vartheta)$.

Let us consider the effect of this uncertainty on the values of the scattering
phase shifts. For this we calculate the integral

$$\int_0^\pi|f(\theta)|\,e^{i\alpha(\theta)}\,P_l(\cos\theta)\sin\theta\,d\theta = \frac{1}{2ik}\sum_{l'=0}^{\infty}(2l'+1)\,(e^{2i\delta_{l'}}-1)\int_0^\pi P_{l'}P_l\sin\theta\,d\theta .$$

Making use of the properties of orthogonality of the Legendre polynomials,
we obtain

$$\int_0^\pi|f(\theta)|\,e^{i\alpha(\theta)}\,P_l(\cos\theta)\sin\theta\,d\theta = \frac{1}{ik}\,(e^{2i\delta_l}-1). \tag{95.29}$$

Equating the real parts of relation (95.29), we find

$$\int_{-1}^{1}|f(\theta)|\cos\alpha(\theta)\,P_l(\cos\theta)\,d\cos\theta = \frac{\sin2\delta_l}{k}. \tag{95.30}$$

From formula (95.30) it is clear that the replacement of $\alpha$ by
$\pi-\alpha$ leads to a change of sign of the left-hand side. To conserve the
equality it is necessary to reverse the sign of all the phase shifts $\delta_l$.
Thus the uncertainty in the quantity $\alpha$ leads to an uncertainty in the
sign of all the phase shifts.

If the sign of only one of the phase shifts is determined in an independent
way, then the signs of all the other phase shifts become unambiguous. The sign
of one phase shift (namely, the s-wave one) can be established, for example,
from the study of the scattering and interference of slow particles. It should
be pointed out that although the calculations show us the possibility of
determining the scattering amplitude, the solution of the integral equation
(95.28) is a difficult problem.
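The sign ambiguity can be made explicit with a few lines of code. The sketch below assumes the standard partial-wave form of the elastic amplitude and illustrative phase shifts; it shows that reversing the sign of every phase shift turns $f$ into $-f^*$ and therefore leaves the differential cross section unchanged.

```python
import cmath


def legendre(l_max, x):
    """P_0(x), ..., P_{l_max}(x) by the standard three-term recurrence."""
    p = [1.0, x]
    for l in range(1, l_max):
        p.append(((2 * l + 1) * x * p[l] - l * p[l - 1]) / (l + 1))
    return p[:l_max + 1]


def amplitude(k, deltas, cos_theta):
    """Elastic amplitude f(theta) built from real phase shifts deltas[l]."""
    p = legendre(len(deltas) - 1, cos_theta)
    total = sum((2 * l + 1) * (cmath.exp(2j * d) - 1) * p[l] for l, d in enumerate(deltas))
    return total / (2j * k)


k, deltas = 1.3, [0.9, 0.4, -0.2]        # illustrative values
flipped = [-d for d in deltas]           # all phase shifts with reversed sign
mismatch = []
for cos_theta in (-0.6, 0.1, 0.8):
    f = amplitude(k, deltas, cos_theta)
    g = amplitude(k, flipped, cos_theta)
    mismatch.append(abs(g + f.conjugate()))          # vanishes if g = -f*
    print(cos_theta, abs(f) ** 2, abs(g) ** 2)       # identical differential cross sections
```

A measurement of $d\sigma/d\Omega$ alone therefore cannot distinguish the two sets of phase shifts; one independently known sign is needed.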


§96. S-matrix and perturbation theory 


If the total Hamiltonian can be written in the form of a sum
$\hat H=\hat H_0+\hat H'$, where $\hat H_0$ describes the behaviour of
non-interacting particles and $\hat H'$ their interaction, then to find the
explicit form of the S-matrix it is convenient to make use of the interaction
representation. The wave function in this representation is defined by
eq. (49.21). The operator $\hat U(t,t_0)$ defined by formula (49.1), which
transforms the wave function at a given instant of time $t_0$ into the wave
function at the instant of time $t$, can also be defined in the interaction
representation. That is, writing

$$\psi(t) = \hat U(t,t_0)\,\psi(t_0) \tag{96.1}$$

and substituting into (49.21), we find

$$i\hbar\,\frac{\partial\hat U(t,t_0)}{\partial t} = \hat H'_{int}(t)\,\hat U(t,t_0), \tag{96.2}$$

$$\hat U(t_0,t_0) = 1. \tag{96.3}$$

System (96.2) and (96.3) is equivalent to the integral equation

$$\hat U(t,t_0) = 1-\frac{i}{\hbar}\int_{t_0}^{t}dt'\,\hat H'_{int}(t')\,\hat U(t',t_0). \tag{96.4}$$

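Integral equation (96.4) can be solved by a method of successive
approximation

$$\hat U(t,t_0) = 1-\frac{i}{\hbar}\int_{t_0}^{t}dt_1\,\hat H'_{int}(t_1)+\left(-\frac{i}{\hbar}\right)^2\int_{t_0}^{t}dt_1\int_{t_0}^{t_1}dt_2\,\hat H'_{int}(t_1)\hat H'_{int}(t_2)+\dots \tag{96.5}$$

The general term of the series is of the form

$$\hat U^{(n)}(t,t_0) = \left(-\frac{i}{\hbar}\right)^n\int_{t_0}^{t}dt_1\int_{t_0}^{t_1}dt_2\cdots\int_{t_0}^{t_{n-1}}dt_n\,\hat H'_{int}(t_1)\hat H'_{int}(t_2)\cdots\hat H'_{int}(t_n). \tag{96.6}$$

It is evident that the variables of integration $t_1, t_2, \dots, t_n$ are
ordered as follows:

$$t\geq t_1\geq t_2\geq\dots\geq t_n\geq t_0. \tag{96.7}$$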

In order to simplify the notation and so that one need not follow the order
of carrying out the integration, it is convenient to symmetrize formula (96.6).
In the case of a function symmetric with respect to its variables use can be
made of the formula

$$\int_a^b dt_1\int_a^{t_1}dt_2\cdots\int_a^{t_{n-1}}dt_n\,f(t_1,\dots,t_n) = \frac{1}{n!}\int_a^b dt_1\int_a^b dt_2\cdots\int_a^b dt_n\,f(t_1,\dots,t_n). \tag{96.8}$$



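For the purpose mentioned we introduce the so-called chronological operator
$\hat P$, which by definition arranges time-dependent operators in chronological
sequence, i.e. in order of decreasing times (96.7):

$$\hat P[\hat L(t_1)\hat M(t_2)] = \begin{cases}\hat L(t_1)\hat M(t_2) & \text{for } t_1>t_2,\\ \hat M(t_2)\hat L(t_1) & \text{for } t_2>t_1.\end{cases} \tag{96.9}$$

A representation of this operator could, for example, be the expression

$$\hat P[\hat L(t_1)\hat M(t_2)] = \frac{1+\varepsilon(t_1-t_2)}{2}\,\hat L(t_1)\hat M(t_2)+\frac{1-\varepsilon(t_1-t_2)}{2}\,\hat M(t_2)\hat L(t_1),$$

where $\varepsilon(x)$ is the so-called sign function

$$\varepsilon(x) = \frac{x}{|x|} = \begin{cases}1 & \text{for } x>0,\\ -1 & \text{for } x<0.\end{cases}$$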
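By means of the chronological operator we can write

$$\int_{-\infty}^{t}dt_1\int_{-\infty}^{t_1}dt_2\cdots\int_{-\infty}^{t_{n-1}}dt_n\,\hat H'_{int}(t_1)\cdots\hat H'_{int}(t_n) = \frac{1}{n!}\int_{-\infty}^{t}dt_1\int_{-\infty}^{t}dt_2\cdots\int_{-\infty}^{t}dt_n\,\hat P[\hat H'_{int}(t_1)\cdots\hat H'_{int}(t_n)]. \tag{96.10}$$

Hence for $\hat U(t,-\infty)$ we find

$$\hat U(t,-\infty) = 1+\sum_{n=1}^{\infty}\left(-\frac{i}{\hbar}\right)^n\frac{1}{n!}\int_{-\infty}^{t}dt_1\cdots\int_{-\infty}^{t}dt_n\,\hat P[\hat H'_{int}(t_1)\cdots\hat H'_{int}(t_n)] = \hat P\exp\left(-\frac{i}{\hbar}\int_{-\infty}^{t}\hat H'_{int}(t')\,dt'\right).$$

In accordance with the definition of the S-matrix

$$\hat S = \lim_{\substack{t\to\infty\\ t_0\to-\infty}}\hat U(t,t_0) = \lim_{t\to\infty}\hat U(t,-\infty) = \hat P\exp\left(-\frac{i}{\hbar}\int_{-\infty}^{\infty}\hat H'_{int}(t)\,dt\right). \tag{96.11}$$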




The formula obtained, called Dyson's formula, allows one to relate the
S-matrix to the interaction energy $\hat H'$ (if this last exists). It is exact
in the sense that the summation of the entire perturbation series is carried
out in it.
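The relation between the chronological exponential and the perturbation series can be illustrated on a two-level system. The sketch below is built on assumptions that are not in the text: units with $\hbar=1$, a hypothetical interaction $\hat H'_{int}(t)=g\,e^{-t^2}(\sigma_x\cos\omega t+\sigma_y\sin\omega t)$ which does not commute with itself at different times, and a finite time window standing in for the infinite limits. The chronological exponential is approximated by an ordered product of short-time factors and compared with the series truncated at first and at second order.

```python
import math

IDENTITY = [[1.0 + 0j, 0j], [0j, 1.0 + 0j]]
ZERO = [[0j, 0j], [0j, 0j]]


def matmul(a, b):
    return [[sum(a[i][m] * b[m][j] for m in range(2)) for j in range(2)] for i in range(2)]


def add(a, b, c=1.0):
    """Return a + c * b for 2x2 matrices."""
    return [[a[i][j] + c * b[i][j] for j in range(2)] for i in range(2)]


def h_int(t, g=0.1, omega=2.0):
    """Hypothetical H'_int(t) = g exp(-t^2) [sigma_x cos(omega t) + sigma_y sin(omega t)]."""
    z = g * math.exp(-t * t) * complex(math.cos(omega * t), -math.sin(omega * t))
    return [[0j, z], [z.conjugate(), 0j]]


def step(t, dt):
    """exp(-i H'_int(t) dt), in closed form because H'^2 = |z|^2 times the unit matrix."""
    h = h_int(t)
    r = abs(h[0][1])
    s = math.sin(r * dt) / r if r > 0.0 else dt
    c = math.cos(r * dt) + 0j
    return add([[c, 0j], [0j, c]], h, -1j * s)


def s_matrices(t_max=6.0, n=1200):
    """Return (chronological product, first-order series, second-order series)."""
    dt = 2.0 * t_max / n
    exact, first, second = IDENTITY, ZERO, ZERO
    for j in range(n):
        t = -t_max + (j + 0.5) * dt
        h = h_int(t)
        exact = matmul(step(t, dt), exact)           # later times stand to the left
        inner = add(first, h, 0.5 * dt)              # integral of H' over earlier times t2 < t1
        second = add(second, matmul(h, inner), dt)   # accumulates H'(t1) * inner over t1
        first = add(first, h, dt)
    s1 = add(IDENTITY, first, -1j)
    s2 = add(s1, second, -1.0)                       # (-i)^2 = -1
    return exact, s1, s2


def max_diff(a, b):
    return max(abs(a[i][j] - b[i][j]) for i in range(2) for j in range(2))


exact, s1, s2 = s_matrices()
print(max_diff(exact, s1), max_diff(exact, s2))
```

The second-order truncation lies closer to the ordered product than the first-order one, and the ordered product itself is unitary, whereas any finite truncation of the series is unitary only approximately.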

It is easily seen that the first terms of the expansion of the general formula 
for the S-matrix lead to ordinary perturbation theory. 


For ease of calculation we restrict ourselves to first order perturbation
theory, writing

$$\hat S = 1-\frac{i}{\hbar}\int_{-\infty}^{\infty}dt\,\hat H'_{int}(t). \tag{96.12}$$

The operator $\hat P$ is in this case identically equal to one. Taking the
matrix element with respect to states $a\neq b$, which are eigenstates of the
Hamiltonian $\hat H_0$, we have

$$S^{(1)}_{ba} = -\frac{i}{\hbar}\int_{-\infty}^{\infty}dt\,\bigl(\hat H'_{int}(t)\bigr)_{ba} .$$

Passing to the Schrödinger representation and making use of definition
(49.19), we obtain

$$S^{(1)}_{ba} = -\frac{i}{\hbar}\int_{-\infty}^{\infty}dt\,\langle b|e^{(i/\hbar)\hat H_0t}\,\hat H'\,e^{-(i/\hbar)\hat H_0t}|a\rangle = -\frac{i}{\hbar}\,\langle b|\hat H'|a\rangle\int_{-\infty}^{\infty}\exp\left[\frac{i}{\hbar}(E_b-E_a)t\right]dt = -2\pi i\,H'_{ba}\,\delta(E_b-E_a) .$$

We see that $S^{(1)}_{ba}$ is the same as the transition amplitude in the first
approximation of perturbation theory.

Analogous, although more cumbersome calculations allow $S^{(2)}_{ba}$ to be
identified with the transition amplitude in the second order of perturbation
theory.

In spite of the convenience of the notation of Dyson’s formula, which is 
often used in intermediate calculations, for the actual calculation of the S- 
matrix one has to carry out an expansion in a series and integration by terms. 

An important feature of Dyson’s formula is the fact that it can easily be 
transformed into a relativistically invariant form. Hence it is of particular 
importance in calculating relativistic effects. 


§97. Analytic properties of the S-matrix 


As we have already stressed, a number of important results of scattering 
theory which are not associated with the use of a particular form of the inter- 
action potential can be obtained by means of the S-matrix technique. This is 
associated, in particular, with the analytic properties of the S-matrix*. For 
simplicity, we shall in what follows restrict ourselves to the case of elastic 
scattering. Then the elements of the S-matrix are given by formula (95.24). 

As was shown in §35, the asymptotic expression for the radial component
of the wave function of a particle of energy $E=\hbar^2k^2/2m$ and angular
momentum $l$, regular at the origin, has the form

$$\chi_{kl} = rR_{kl} = a_l(k)\,e^{i(kr-l\pi/2)}+b_l(k)\,e^{-i(kr-l\pi/2)}. \tag{97.1}$$

In deriving (97.1) it was assumed that the potential decreases more rapidly
than $r^{-1}$ at large distances.

Comparing expression (97.1) with (86.2) and taking into account (95.24), we can
express the matrix elements $S^l_{aa}=S_l$ in terms of the constants $a_l(k)$
and $b_l(k)$

$$S_l(k) = -\frac{a_l(k)}{b_l(k)}. \tag{97.2}$$
We shall now formally consider the wave function $\chi_{kl}$ and,
correspondingly, the function $S_l(k)$ to be functions of the complex variable
$k$. We shall show, first of all, that the function of a complex variable
$S_l(k)$ need be given only in one quadrant and not in the entire plane of the
complex variable $k$. Indeed, since the Schrödinger equation does not change
under the replacement of $k$ by $-k$, the function $\chi_{-k,l}$, by virtue of
the uniqueness of the solution, describes the same state as the function
$\chi_{kl}$. These two functions can differ only by a constant factor.
Replacing $k$ by $-k$ in (97.1), we obtain

$$\frac{a_l(k)}{b_l(k)} = \frac{b_l(-k)}{a_l(-k)} .$$

Hence it follows that

$$S_l(k) = S_l^{-1}(-k). \tag{97.3}$$








* For a more detailed consideration of the problems touched upon in this section and
for a bibliography see A.I.Baz, Ya.B.Zeldovich and A.M.Perelomov, Rasseyanie, reaktsii i
raspady v nerelyativistskoi kvantovoi mekhanike (Scattering, reactions and decays in non-
relativistic quantum mechanics) (Nauka, Moscow, 1966).


Further, we note that, since the Schrödinger equation is real, the function
$\chi^*_{kl}$ also must be the same to within a constant as the function
$\chi_{kl}$. Hence it is again easily found that

$$S_l(k) = \bigl(S_l^*(k)\bigr)^{-1}. \tag{97.4}$$

Formula (97.4) is obtained for real $k$. Carrying out analytic continuation to
the entire plane of the complex variable $k$, we have

$$S_l(k) = \bigl(S_l^*(k^*)\bigr)^{-1}. \tag{97.5}$$

Relations (97.3) and (97.5) connect the values of the function $S_l(k)$ given
in one of the quadrants of the plane of the complex variable $k$ with its
values at the corresponding points of the remaining three quadrants. From
relation (97.4) it follows that for real $k$, $|S_l(k)|^2=1$, i.e. the phase
shift $\delta_l$ is real ($S_l=e^{2i\delta_l}$). On the contrary, as is seen from
(97.5) and (97.3), the function $S_l(k)$ is real on the imaginary axis, so that
the phase shift $\delta_l$ is imaginary.
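These symmetry properties can be checked on an explicit example. The sketch below uses a hypothetical s-wave element $S_0(k)=(\kappa-ik)/(\kappa+ik)$, the form obtained for a zero-range interaction with $k\cot\delta_0=-\kappa$; this model and the value of $\kappa$ are assumptions introduced only for illustration.

```python
def s_wave(k, kappa=0.5):
    """Hypothetical S_0(k) with a single pole at k = i * kappa (upper imaginary semiaxis)."""
    return (kappa - 1j * k) / (kappa + 1j * k)


k = 0.7 + 0.2j                                           # arbitrary complex wave number
check_973 = s_wave(k) * s_wave(-k)                       # S(k) S(-k), cf. (97.3)
check_975 = s_wave(k) * s_wave(k.conjugate()).conjugate()  # S(k) S*(k*), cf. (97.5)
print(check_973, check_975)
print(abs(s_wave(1.3)))      # real k: unit modulus, i.e. a real phase shift
print(s_wave(0.3j))          # imaginary k: a real value of S
print(s_wave(-0.5j))         # the zero at k = -i * kappa, mirror image of the pole
```

Both products equal one for any complex $k$, the modulus is one on the real axis, and the function is real on the imaginary axis, with a zero in the lower half-plane opposite the pole in the upper one.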

Let us consider the position of the singularities of the function $S_l(k)$.
We assume that there corresponds to the potential $U(r)$ a bound state of the
particle with energy $-|E_0|$. The bound state is described by the wave function
$\chi_{k_0l}$ regular at the origin and falling off at large distances as
$e^{-|k_0|r}$, where $k_0=i(2m\hbar^{-2}|E_0|)^{1/2}$. Consequently, the function
$\chi_{kl}$ analytically continued to the complex plane must fall off for
$k=k_0$ as $e^{-|k_0|r}$. Hence the relation $b_l(k_0)=0$ must be satisfied at
the point $k=k_0$. In accordance with formula (97.2), the function $S_l(k)$ has
a pole at the point $k=k_0$. As follows from (97.3), the function $S_l(k)$
reduces to zero at the symmetric point lying in the lower half-plane, i.e. at
$k=-k_0$. Thus we arrive at the conclusion that to each bound state there
corresponds a pole of the function $S_l(k)$ lying at the corresponding point of
the upper imaginary semiaxis in the plane of the complex variable $k$. It should
be noted that so-called 'false' poles, which do not correspond to any
stationary state, may also arise on the imaginary semiaxis. It can be shown
(see the monograph cited in the footnote to this section) that 'false' poles do
not arise when the so-called cut-off radius $R$ is introduced, i.e. when the
condition $U(r)=0$ is introduced for $r>R$, where the radius $R$ may be as
large as one wants.

We note that the function $S_l(k)$ cannot have poles in the upper half-plane
lying anywhere off the imaginary axis. Indeed, to such a pole there would
correspond a complex value of the energy of the bound state, which is
impossible.

The function $S_l(k)$ may also have poles in the lower half-plane, and there
they may also lie off the imaginary semiaxis. As follows immediately from
relations (97.3) and (97.5), these poles must be situated in pairs symmetric
with respect to the imaginary semiaxis. In the upper half-plane there
correspond to these poles zeros of the function $S_l(k)$. It is easily seen
that to poles lying in the lower half-plane there correspond wave functions
exponentially increasing at large distances. Such wave functions cannot, of
course, correspond to a bound state. It can be shown that to poles in the
lower half-plane there correspond quasi-stationary states of the system, i.e.
states which decay in the course of a certain finite time $\tau$.

We find the residue of the function $S_l(k)$ with respect to the pole to
which there corresponds a bound state with energy $E=-|E_0|$, or the value
$k=k_0=i(2m\hbar^{-2}|E_0|)^{1/2}$. Denoting this residue by $c_l$, we write the
function $S_l$ in the neighbourhood of the point $k=k_0$ in the form

$$S_l \approx \frac{c_l}{k-k_0}. \tag{97.6}$$

The quantity $c_l$ is connected by a simple relation with the amplitude of
the wave function corresponding to the stationary state with energy
$E=-|E_0|$. In order to establish this relation we write down equations
satisfied by the function $\chi_{kl}$ and by its derivative with respect to
energy

$$\frac{\partial^2\chi_{kl}}{\partial r^2}+\frac{2m}{\hbar^2}\left(E-U-\frac{\hbar^2l(l+1)}{2mr^2}\right)\chi_{kl} = 0 ,$$

$$\frac{\partial^2}{\partial r^2}\left(\frac{\partial\chi_{kl}}{\partial E}\right)+\frac{2m}{\hbar^2}\left(E-U-\frac{\hbar^2l(l+1)}{2mr^2}\right)\frac{\partial\chi_{kl}}{\partial E} = -\frac{2m}{\hbar^2}\,\chi_{kl} .$$

We shall assume the function $\chi_{kl}$ to be normalized by the condition

$$\int_0^\infty\chi_{kl}^2\,dr = 1 .$$

Multiplying the first equation by $\partial\chi_{kl}/\partial E$ and the second
by $\chi_{kl}$, subtracting one from the other and integrating with respect to
$r$, we obtain

$$\chi'_{kl}\,\frac{\partial\chi_{kl}}{\partial E}-\chi_{kl}\left(\frac{\partial\chi_{kl}}{\partial E}\right)' = \frac{2m}{\hbar^2}\int_0^r\chi_{kl}^2\,dr. \tag{97.7}$$

We apply this relation for $E=-|E_0|$ and $r\to\infty$. We expand the functions
$a_l(k)$ and $b_l(k)$ near the point $k=k_0$ in a series and introduce the
notation

$$a_l(k)\approx a_l(k_0) = A_l,\qquad b_l(k)\approx\beta\,(k-k_0). \tag{97.8}$$


Making use of these expansions and of relations (97.7) and (97.1), we obtain

$$\beta = -\frac{i}{A_l}. \tag{97.9}$$

Substituting expression (97.9) into (97.2), we find the residue $c_l$ at the
point $k=k_0$:

$$c_l = -iA_l^2. \tag{97.10}$$

Thus we have related the value of the residue $c_l$ to the amplitude $A_l$ in
the asymptotic expression of the wave function $\chi_{k_0l}=A_le^{-|k_0|r}$ of
the bound state (written up to the constant phase factor $e^{-il\pi/2}$).

The study of the behaviour of the scattering phase shifts $\delta_l(k)$ and,
consequently, also of the function $S_l(k)=e^{2i\delta_l(k)}$, and the
extrapolation of these results to the complex region make it possible, on the
basis of (97.10), also to draw definite conclusions concerning the wave
function of the bound state.

The analytic properties of the quantities $S_l(k)$ make it possible to obtain
important relations which must be satisfied by the scattering amplitude. These
relations are called dispersion relations. Dispersion relations for the
amplitude of the scattering at zero angle $f(0,k)$ are the simplest and at the
same time the most important. Dispersion relations establish the connection
between the real and imaginary parts of the scattering amplitude $f(0,k)$

$$f(0,k) = \mathrm{Re}\,f(0,k)+i\,\mathrm{Im}\,f(0,k)$$

and are based on the use of the Cauchy formula in the theory of analytic
functions.

Suppose that $F(k)$ is a certain function which is analytic in the upper
half-plane of the complex variable $k$ except for simple poles on the upper
imaginary semiaxis. Let us consider the integral

$$\oint\frac{F(k')\,dk'}{k'-k}$$

taken over the contour shown in fig. V.28.


Fig. V.28 







The integral is determined by the sum of the residues of the integrand.
These residues are taken at the point $k'=k$ and at the points
$k'=k_1,k_2,\dots$, where the poles of the function $F(k')$ lie. If the function
$F(k)$ tends to zero sufficiently rapidly as $|k|\to\infty$, then the integral
over the upper semicircle is equal to zero. We then have

$$\int_{-\infty}^{\infty}\frac{F(k')\,dk'}{k'-k} = 2\pi i\left(F(k)+\sum_n\frac{\mathrm{Res}\,F(k_n)}{k_n-k}\right). \tag{97.11}$$

Here $\mathrm{Res}\,F(k_n)$ denotes the residue of the function $F$ at the point
$k'=k_n$. Now let the imaginary part of $k$ tend to zero, so that $k$ tends to
the point $k_0$ lying on the real axis. In this case

$$\int_{-\infty}^{\infty}\frac{F(k')\,dk'}{k'-k} = P\int_{-\infty}^{\infty}\frac{F(k')\,dk'}{k'-k_0}+i\pi F(k_0). \tag{97.12}$$

Here $P$ denotes that the integral is understood in the sense of the principal
value

$$P\int_{-\infty}^{\infty}\frac{F(k')\,dk'}{k'-k_0} = \lim_{\rho\to0}\left(\int_{-\infty}^{k_0-\rho}+\int_{k_0+\rho}^{\infty}\right)\frac{F(k')\,dk'}{k'-k_0} ,$$

and the second term on the right-hand side of (97.12) arises from the
integration over a small semicircle around the point $k'=k_0$.
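For a function without poles in the upper half-plane the two relations combine into $P\int F(k')\,dk'/(k'-k_0)=i\pi F(k_0)$, which can be checked numerically. The sketch below uses a hypothetical test function $F(k)=1/(k+i)$, whose only pole lies in the lower half-plane; the function, the point $k_0$ and the quadrature scheme are illustrative assumptions.

```python
import math


def model_f(k):
    """Hypothetical F(k): no poles in the upper half-plane, F -> 0 as |k| -> infinity."""
    return 1.0 / (k + 1j)


def principal_value(func, k0, n=20000):
    """P-integral of func(k') / (k' - k0) over the whole real axis.

    The integral is folded about k0, so that the 1/(k' - k0) singularity cancels
    symmetrically, and x = k' - k0 in (0, inf) is mapped to t in (0, 1).
    """
    h = 1.0 / n
    total = 0j
    for i in range(n):
        t = (i + 0.5) * h
        x = t / (1.0 - t)
        total += (func(k0 + x) - func(k0 - x)) / x / (1.0 - t) ** 2
    return total * h


k0 = 0.7
pv = principal_value(model_f, k0)
print(pv, 1j * math.pi * model_f(k0))
```

Separating real and imaginary parts of this equality gives a relation of exactly the dispersion type: the real part of $F$ at a point is determined by an integral over its imaginary part, and vice versa.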

Based on the results (97.11) and (97.12) we obtain the dispersion relation
for the amplitude of the scattering at zero angle $f(0,k)$. This amplitude is
connected with the matrix elements $S_l$ (95.24) by relation (86.11)

$$f(0,k) = \frac{1}{2ik}\sum_l(2l+1)\,(S_l-1). \tag{97.13}$$

From this expression it follows that the poles of the function $S_l(k)$ are
also the poles of the function $f(0,k)$, and the function $f(0,k)$ has no other
poles. The point $k=0$ is not a pole at all, since for $k\to0$,
$\delta_l\to0$, $S_l\to1$ (see §86). Thus the function $f(0,k)$ is analytic in
the upper half-plane of the complex variable $k$ except for poles on the upper
imaginary semiaxis. Dispersion relations for this function are easily obtained
if one substitutes into relations (97.11) and (97.12) the function $F(k)$ in
the form

$$F(k) = f(0,k)-f(0,\infty). \tag{97.14}$$


From the amplitude $f(0,k)$ one subtracts its value for $k\to\infty$ in order to
reduce the integral over the large semicircle to zero (fig. V.28). For
$k\to\infty$ the term with the potential $U(r)$ can be neglected in the
Schrödinger equation. The solution of such an equation has the form of a plane
wave. Substituting such a solution into (83.10), we obtain

$$f(0,\infty) = -\frac{m}{2\pi\hbar^2}\int U\,dV. \tag{97.15}$$

Expression (97.15) represents the scattering amplitude in the Born
approximation (see §84), i.e. $f(0,\infty)=f_B$. Substituting (97.14) into
relations (97.11) and (97.12) and taking into account that the integral with
the Born amplitude reduces to zero, we obtain

$$f(0,k) = f_B+\frac{1}{i\pi}\,P\int_{-\infty}^{\infty}\frac{f(0,k')\,dk'}{k'-k}-2\sum_{n,l}\frac{\mathrm{Res}\,f(0,k_n)}{k_n-k}. \tag{97.16}$$

In this relation $k$ is assumed to be real, and the index zero is dropped. The
points $k=k_n$ lie on the upper imaginary semiaxis and correspond to the
poles of the function $S_l(k)$. The summation in (97.16) is carried out over all
bound states. The expression for the residue of the function $S_l$ in terms of
the amplitude of the corresponding bound state is given by formula (97.10).
Taking into account (97.13), we have

$$\mathrm{Res}\,f(0,k_n) = \frac{2l+1}{2ik_n}\,c_l = -\frac{1}{2k_n}\,(2l+1)\,A_{nl}^2. \tag{97.17}$$


Relation (97.16) can be rewritten in a somewhat different form, if it is taken
into account, according to (97.3) and (97.4), that for real $k$
$S_l(-k)=S_l^*(k)$ and, correspondingly (see (97.13)), that
$f(0,-k)=f^*(0,k)$. Hence the integration in (97.16) can be carried out only
over positive values of $k$, having physical meaning. Equating the real parts
on the left and the right in (97.16), we have finally

$$\mathrm{Re}\,f(0,k) = f_B+\frac{2}{\pi}\,P\int_0^\infty\frac{\mathrm{Im}\,f(0,k')\,k'\,dk'}{k'^2-k^2}-\sum_{n,l}\frac{(2l+1)\,A_{nl}^2}{k^2+|k_n|^2}. \tag{97.18}$$


The imaginary part of the amplitude $\mathrm{Im}\,f(0,k)$ involved in the
right-hand side of the equation can be expressed, according to the optical
theorem (see (91.11)), in terms of a physically observed quantity: the total
scattering cross section $\sigma(k)$. Hence also the real part,
$\mathrm{Re}\,f(0,k)$, according to (97.18) can be expressed in terms of
physically observable quantities. Dispersion relations are at present widely
used. In particular, by means of them one can immediately remove the ambiguity
(noted in §95) in the choice of phase shifts for a known law of scattering,
i.e. for a known cross section. We stress that dispersion relations are based
on a property of the S-matrix as general as its analyticity, which results from
the causality principle.
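A dispersion relation of this type, including the bound-state term, can be tested on a solvable model. The sketch below rests on several assumptions that are not in the text: a zero-range s-wave interaction with $S_0(k)=(\kappa-ik)/(\kappa+ik)$, so that $f(0,k)=-1/(\kappa+ik)$ and $f_B=0$; a single bound state with $l=0$ whose normalized wave function $Ae^{-\kappa r}$ has $A^2=2\kappa$; and a bound-state contribution of the form $-A^2/(k^2+\kappa^2)$.

```python
import math

KAPPA = 0.5   # hypothetical bound-state parameter, k_n = i * KAPPA


def forward_amplitude(k):
    """Zero-range s-wave model: f(0, k) = (S_0 - 1) / (2ik) = -1 / (KAPPA + ik)."""
    return -1.0 / (KAPPA + 1j * k)


def dispersion_integral(k, n=20000):
    """(2/pi) P-integral over k' in (0, inf) of Im f(0, k') k' / (k'^2 - k^2).

    The P-integral of 1/(k'^2 - k^2) over (0, inf) vanishes, so the value of the
    numerator at k' = k is subtracted first, which leaves a regular integrand.
    """
    def numerator(q):
        return forward_amplitude(q).imag * q

    step = 1.0 / n
    total = 0.0
    for i in range(n):
        t = (i + 0.5) * step
        q = t / (1.0 - t)                  # maps t in (0, 1) to q in (0, inf)
        total += (numerator(q) - numerator(k)) / (q * q - k * k) / (1.0 - t) ** 2
    return 2.0 / math.pi * total * step


k = 0.8
amplitude_sq = 2.0 * KAPPA                 # A^2 of the normalized state A * exp(-KAPPA * r)
pole_term = -amplitude_sq / (k * k + KAPPA ** 2)
f_born = 0.0                               # the model amplitude vanishes as k -> infinity
lhs = forward_amplitude(k).real
rhs = f_born + dispersion_integral(k) + pole_term
print(lhs, rhs)
```

In this model the integral over the imaginary part alone gives $+\kappa/(k^2+\kappa^2)$, which has the wrong sign; only after the bound-state term is added does the right-hand side reproduce $\mathrm{Re}\,f(0,k)=-\kappa/(k^2+\kappa^2)$.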


§98. Time reversal and the principle of detailed balance 


Let us consider the properties of the S-matrix associated with the sym- 
metry of the Schrödinger equation with respect to time reversal. We have 
already touched upon this question in §6 and shall now consider it in more 
detail. 

The symmetry with respect to time reversal means that there exists a
solution $\psi_{rev}(x,t)$ of the 'reversed' Schrödinger equation expressed in
terms of the function $\psi(x,-t)$. If the operator $\hat H$ does not depend
explicitly on time, then

$$i\hbar\,\frac{\partial\psi^*(x,-t)}{\partial t} = \hat H^*\,\psi^*(x,-t). \tag{98.1}$$

For $\hat H^*=\hat H$ eq. (98.1) is the same as the initial equation (27.7), and
the function $\psi^*(x,-t)$ describes the process reversed in time (see (6.9)).
In the more general case (a charged particle in a magnetic field) we have to set

$$\psi_{rev}(x,t) = \hat V\,\psi^*(x,-t), \tag{98.2}$$

where $\hat V$ is a certain operator. Operating on eq. (98.1) from the left with
the operator $\hat V$, we obtain the equation for the function $\psi_{rev}$

$$i\hbar\,\frac{\partial\psi_{rev}(x,t)}{\partial t} = \hat V\hat H^*\hat V^{-1}\,\psi_{rev}(x,t). \tag{98.3}$$

This equation is the same as the initial Schrödinger equation (27.7) under
the condition

$$\hat V\hat H^* = \hat H\hat V. \tag{98.4}$$

From the Hermitian property of the operator $\hat H$ it follows that the
operator $\hat V$ must be unitary, i.e. $\hat V^{-1}=\hat V^\dagger$. To the law
of transformation of wave functions (98.2) there corresponds a definite law of
transformation of arbitrary operators $\hat F$. This law can be found by the
usual methods (see §§46, 48, 49).

The only departure from the general case arises from the fact that the
operator $\hat V$ acts not on the function $\Psi$ but on the function $\Psi^*$. We shall
find the operator $\hat F_{\rm rev}$ (reversed in time) from the requirement that the
matrix element of the operator $\hat F$ taken with respect to the functions
$\Psi_{\rm rev}$ must be the same as the matrix element of the operator $\hat F_{\rm rev}$ taken
with respect to the functions $\Psi(x,-t)$:


$$\langle \Psi_{\rm rev}|\hat F|\Psi_{\rm rev}\rangle = \langle \Psi(-t)|\hat F_{\rm rev}|\Psi(-t)\rangle. \qquad (98.5)$$


Making use of relation (98.2), we obtain
$$\langle \Psi_{\rm rev}|\hat F|\Psi_{\rm rev}\rangle = \langle \hat V\Psi^*(-t)|\hat F|\hat V\Psi^*(-t)\rangle = \langle \Psi^*(-t)|\hat V^\dagger \hat F \hat V|\Psi^*(-t)\rangle.$$


Hence it follows (see (17.3)) that


$$\tilde F_{\rm rev} = \hat V^\dagger \hat F \hat V, \qquad (98.6)$$
where $\tilde F_{\rm rev}$ denotes the transpose of the operator $\hat F_{\rm rev}$. As is easily seen from
(98.4) and (98.6), the operator $\hat H$ is invariant under time reversal, i.e.
$\hat H_{\rm rev} = \hat H$. Here we have made use of the hermiticity of the
Hamiltonian, $\hat H = \hat H^\dagger$. Relation (98.6) can serve as a basis for finding the
operator $\hat V$. Indeed, it is natural to require that the quantum operators
transform under time reversal in the same way as the corresponding classical
quantities. Quantities such as energy, coordinates, electric field strength and
so on are invariant under time reversal. The corresponding operators must
also be invariant. Velocity, momentum, angular momentum, magnetic field
strength and so on change sign under time reversal. The corresponding
operators must have the same property. For example, the relations


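$$\hat{\mathbf p}_{\rm rev} = -\hat{\mathbf p}, \qquad \hat{\mathbf L}_{\rm rev} = -\hat{\mathbf L} \qquad (98.7)$$


must be fulfilled. The spin transforms as the angular momentum, i.e. the
following relation must be fulfilled:


$$\hat{\mathbf s}_{\rm rev} = -\hat{\mathbf s}. \qquad (98.8)$$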


Let us consider, for example, a particle with spin ½. Proceeding from
relation (98.8) it is easy to find the operator $\hat V_s$ acting on the spin
variables under time reversal. Making use of expression (98.6) and taking into
account the form of the spin operators (60.15) and (60.16), we have


$$\hat V_s^\dagger \hat s_x \hat V_s = -\hat s_x, \qquad \hat V_s^\dagger \hat s_y \hat V_s = \hat s_y, \qquad \hat V_s^\dagger \hat s_z \hat V_s = -\hat s_z. \qquad (98.9)$$
From these relations by means of (60.12) we easily find
$$\hat V_s = i\sigma_y. \qquad (98.10)$$


(We have chosen the phase factor in such a way that the operator $\hat V_s$ is real.)
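
Relations (98.9) and (98.10) can be checked numerically. The following is only a minimal sketch: it assumes the standard Pauli matrices for (60.15) and (60.16), with spin measured in units of ħ/2, and the helper names are illustrative rather than taken from the text.

```python
import numpy as np

# Pauli matrices (assumed standard form); the spin operators are (hbar/2) * sigma.
sx = np.array([[0, 1], [1, 0]], dtype=complex)
sy = np.array([[0, -1j], [1j, 0]], dtype=complex)
sz = np.array([[1, 0], [0, -1]], dtype=complex)

V = 1j * sy  # candidate spin time-reversal operator, eq. (98.10)


def transformed(F, V):
    """Return V^dagger F V, which by (98.6) is the transpose of F_rev."""
    return V.conj().T @ F @ V


def check_spin_reversal(V):
    """True if V^dagger s V equals minus the transpose of s for every component.

    This is what (98.6) requires when s_rev = -s, eq. (98.8).
    """
    return all(np.allclose(transformed(s, V), -s.T) for s in (sx, sy, sz))


is_real = np.allclose(V.imag, 0.0)                    # phase choice makes V real
is_unitary = np.allclose(V.conj().T @ V, np.eye(2))   # V^{-1} = V^dagger
```

Because the transposes of σ_x and σ_z equal the matrices themselves while the transpose of σ_y equals −σ_y, the check reproduces the sign pattern of (98.9).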










For the motion of a particle in a magnetic field the operator $\hat V$ must also
reverse the direction of the magnetic field (or of the vector potential $\mathbf A$).
Taking this fact into account, relation (98.4) takes the form


$$\sigma_y \hat H^*(-\mathbf A) = \hat H(\mathbf A)\,\sigma_y. \qquad (98.11)$$


It is easily verified that the Hamiltonian H (see (63.3)) satisfies this relation. 
The invariance of the Schrödinger equation under time reversal means that 
one can always find an operator V satisfying condition (98.4) (for more 
details see the reference below*). However, the discovery in 1964 of the 
anomaly in the decay of K-mesons shows that under certain conditions the 
principle of time reversal may apparently be violated. 

From the invariance of the Hamiltonian $\hat H$ under the replacement $t \to -t$
there results the invariance of the S-matrix, i.e. (see (98.6)) the following
relation holds:


$$\hat V^\dagger \hat S \hat V = \tilde S, \qquad (98.12)$$


where $\tilde S$ is the transpose of $\hat S$. The validity of this relation is easily checked,
taking into account (98.4), for the operator $V(t,t_0)$ (see (96.5)). Since the
operator $\hat S$ is defined as the limit of the operator $V(t,t_0)$ (see (96.11)), it
also satisfies relation (98.12).

Based on relation (98.12), it is easy to relate directly the matrix elements
of the S-matrix for the direct and inverse reactions.
We denote by $\Psi_a$ and $\Psi_b$ the wave functions of the initial and final states of
the system. Then, taking into account (17.3), (98.2) and (98.12), we have


$$\langle \Psi_b|\hat S|\Psi_a\rangle = \langle \Psi_a^*|\tilde S|\Psi_b^*\rangle = \langle \Psi_a^*|\hat V^\dagger \hat S \hat V|\Psi_b^*\rangle = \langle \hat V\Psi_a^*|\hat S|\hat V\Psi_b^*\rangle = \langle \Psi_{a^*}|\hat S|\Psi_{b^*}\rangle, \qquad (98.13)$$


where $\Psi_{a^*}$ and $\Psi_{b^*}$ denote the 'reversed' wave functions of the states $a$ and $b$.
Thus the following equality is fulfilled:


$$S_{ba} = S_{a^*b^*}. \qquad (98.14)$$


Relation (98.14) establishes the connection between the matrix elements
of the S-matrix of the direct and 'reversed' processes. The states $\Psi_{a^*}$ and
$\Psi_{b^*}$ differ from the states $\Psi_a$ and $\Psi_b$ by the sign of quantities such as
velocities, momenta, angular momentum components, spin components and so on.
Relation (98.14) or the equivalent relation (98.13) is called the reciprocity
theorem. On the basis of this theorem a relation can be established between
the cross sections for direct and inverse reactions (principle of detailed
balance).


* A.M.Baldin, V.I.Goldanskii and I.L.Rozenthal, Kinematics of nuclear reactions
(Pergamon Press, Oxford, 1961).


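Let us consider the reaction


$$a + A \rightleftarrows b + B.$$

We denote by $j_a, m_a$; $j_A, m_A$; $j_b, m_b$; $j_B, m_B$ respectively the total angular
momenta and their components for the particles taking part in the reaction.
According to (95.14), the cross sections for the direct and inverse reactions
expressed in terms of the matrix elements of the S-matrix have the form


$$\frac{d\sigma_{ba}}{d\Omega_b} = \frac{4\pi^2}{k_a^2}\,\bigl|\langle j_b m_b\, j_B m_B\, \mathbf n_b|\hat S|j_a m_a\, j_A m_A\, \mathbf n_a\rangle\bigr|^2, \qquad (98.15)$$

$$\frac{d\sigma_{ab}}{d\Omega_a} = \frac{4\pi^2}{k_b^2}\,\bigl|\langle j_a m_a\, j_A m_A\, \mathbf n_a|\hat S|j_b m_b\, j_B m_B\, \mathbf n_b\rangle\bigr|^2. \qquad (98.16)$$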


Since the momentum vector of the relative motion of the particles in the 
final state is directed away from the centre of mass, it is assigned a minus 
sign. 

The relation between these cross sections cannot be written directly, since
the reciprocity theorem relates the cross section for the direct process to
that of the 'reversed' process, which differs from (98.16) by the change of the
signs of the angular momentum components $m_a, m_A, m_b, m_B$. However, one can
write the relation between averaged cross
sections, i.e. cross sections summed over the components of the angular
momenta of the final states and averaged over the components of the angular
momenta of the initial states. Such cross sections no longer depend on the
angular momentum components, and for them the reciprocity theorem
(98.14) gives


$$k_a^2\,(2j_a+1)(2j_A+1)\,\frac{d\bar\sigma_{ba}}{d\Omega_b} = k_b^2\,(2j_b+1)(2j_B+1)\,\frac{d\bar\sigma_{ab}}{d\Omega_a}, \qquad (98.17)$$

where

$$\frac{d\bar\sigma_{ba}}{d\Omega_b} = \frac{1}{(2j_a+1)(2j_A+1)} \sum_{m_a m_A m_b m_B} \frac{d\sigma_{ba}}{d\Omega_b} \qquad (98.18)$$

and

$$\frac{d\bar\sigma_{ab}}{d\Omega_a} = \frac{1}{(2j_b+1)(2j_B+1)} \sum_{m_a m_A m_b m_B} \frac{d\sigma_{ab}}{d\Omega_a}. \qquad (98.19)$$


A relation analogous to (98.17) can also be written for the total cross
sections:


$$k_a^2\,(2j_a+1)(2j_A+1)\,\sigma_{ba} = k_b^2\,(2j_b+1)(2j_B+1)\,\sigma_{ab}. \qquad (98.20)$$
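
As an illustration of how the detailed-balance relation for total cross sections is used in practice, the sketch below solves it for the inverse cross section. The function name and argument order are hypothetical, and the numbers in the usage note are not data from the text.

```python
def inverse_cross_section(sigma_ba, k_a, k_b, j_a, j_A, j_b, j_B):
    """Total cross section sigma_ab of the inverse reaction b + B -> a + A.

    Uses k_a^2 (2j_a+1)(2j_A+1) sigma_ba = k_b^2 (2j_b+1)(2j_B+1) sigma_ab,
    where k_a and k_b are the relative wave numbers in the two channels.
    """
    g_initial = (2 * j_a + 1) * (2 * j_A + 1)  # spin weight of channel a + A
    g_final = (2 * j_b + 1) * (2 * j_B + 1)    # spin weight of channel b + B
    return sigma_ba * (k_a ** 2 * g_initial) / (k_b ** 2 * g_final)


# Spinless particles, with the inverse channel at half the wave number:
example = inverse_cross_section(1.0, k_a=2.0, k_b=1.0, j_a=0, j_A=0, j_b=0, j_B=0)
```

Applying the function twice with the channels interchanged returns the original cross section, which is a quick consistency check on the spin weights.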


We note also that the relation between non-averaged cross sections for the
direct and inverse reactions can be established within the framework of
applicability of perturbation theory:


$$\frac{1}{k_b^2}\,\frac{d\sigma_{ba}}{d\Omega_b} = \frac{1}{k_a^2}\,\frac{d\sigma_{ab}}{d\Omega_a}. \qquad (98.21)$$


Indeed, in this case the transition probability and, consequently, also the
cross section for the process, is determined by the square of the modulus of
the matrix element $H_{ba}$ of the perturbation Hamiltonian, for which, by virtue
of hermiticity, the relation $|H_{ba}|^2 = |H_{ab}|^2$ is fulfilled. From this equality
there results relation (98.21).





12 





The Method of Second Quantization 
and Radiation Theory 


§99. Second quantization for systems of bosons and fermions* 


One of the important formal mathematical methods often used in the 
quantum mechanics of a system of many particles is the so-called second- 
quantization method. 

In this method a transition is made from the coordinate representation of
the wave function to new variables, namely the numbers of particles in given
quantum states. Thus the system of particles is now characterized not by the
wave function $\Psi(\xi_1,\xi_2,\ldots,\xi_N,t)$ but by a new function
$C(n_1,n_2,\ldots,t)$, where $n_1, n_2, \ldots$ are the numbers of
particles in the 1st, 2nd and so on states. We shall call the quantities
$n_1, n_2, \ldots$ the occupation numbers.

The quantity


$$|C(n_1,n_2,\ldots,t)|^2 \qquad (99.1)$$


gives the probability that at the instant of time $t$ there are $n_1$ particles in the
first state, $n_2$ particles in the second state and so on. The second-quantization
method turns out to be very convenient for those systems in which the
number of particles in a given state changes, and the production and
disappearance of particles of a given kind occurs (for example, in the emission
and absorption of photons, or in the β-decay of nuclei). The transition from
the ordinary description to second quantization is an example of a
transformation from one representation to another.


*In this section we follow L.D.Landau and E.M.Lifshitz, Quantum mechanics
(Pergamon Press, Oxford, 1965).

Let us formally consider a system of non-interacting identical particles.
We shall first assume that the particles obey Bose statistics.

We denote by $\psi_1(\xi), \psi_2(\xi), \ldots, \psi_k(\xi), \ldots$ the whole set of orthogonal and
normalized wave functions of an individual particle, forming a complete
system of functions chosen in an arbitrary way. The index $k$ denotes the set
of four quantum numbers characterizing the state of the particle. We pass to
the representation in which the occupation numbers $n_k$, and not the
coordinates $\xi_i$ of the particles, are chosen as independent variables.

In the new representation the basis functions (see §65) are the
symmetrized and normalized products of the wave functions $\psi_k(\xi_i)$ of the
individual particles. Formula (65.5) for the general case where $n_1$ particles
are in state $\psi_1$, $n_2$ particles in state $\psi_2$ and so on assumes the form


$$\Psi_{n_1 n_2 \ldots}(\xi_1,\xi_2,\ldots,\xi_N) = \left(\frac{n_1!\,n_2!\cdots}{N!}\right)^{1/2} \sum \psi_{k_1}(\xi_1)\,\psi_{k_2}(\xi_2)\cdots\psi_{k_N}(\xi_N). \qquad (99.2)$$


The summation is carried out only over all permutations of different indices
$k_1, k_2, \ldots, k_N$.

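We introduce the operators $\hat a_k^\dagger$ and $\hat a_k$ which act on the new variables, the
occupation numbers in state $k$. We define these operators by the formulae


$$\hat a_k\,\Psi_{n_1 \ldots n_k \ldots} = (n_k)^{1/2}\,\Psi_{n_1 \ldots n_k-1 \ldots}, \qquad (99.3)$$

$$\hat a_k^\dagger\,\Psi_{n_1 \ldots n_k \ldots} = (n_k+1)^{1/2}\,\Psi_{n_1 \ldots n_k+1 \ldots}. \qquad (99.4)$$
The operator $\hat a_k$ reduces the number of particles in state $k$ by one, i.e. it
replaces $n_k$ by $n_k-1$. The operator $\hat a_k^\dagger$ increases this number by one, i.e. it
replaces $n_k$ by $n_k+1$. It is obvious that the consecutive application of the
operators $\hat a_k$ and $\hat a_k^\dagger$ does not change the number of particles in the kth state,
i.e.


$$\hat a_k^\dagger \hat a_k\,\Psi_{n_1 \ldots n_k \ldots} = n_k\,\Psi_{n_1 \ldots n_k \ldots}. \qquad (99.5)$$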


The matrix elements of the operators $\hat a_k$ and $\hat a_k^\dagger$ are of the form


$$\langle n_1,\ldots,n_k-1,\ldots|\hat a_k|n_1,\ldots,n_k,\ldots\rangle = (\hat a_k)_{n_k-1,\,n_k} = (n_k)^{1/2}, \qquad (99.6)$$

$$\langle n_1,\ldots,n_k+1,\ldots|\hat a_k^\dagger|n_1,\ldots,n_k,\ldots\rangle = (\hat a_k^\dagger)_{n_k+1,\,n_k} = (n_k+1)^{1/2}, \qquad (99.7)$$

$$(\hat a_k^\dagger \hat a_k)_{n_k n_k} = n_k. \qquad (99.8)$$


In accordance with their meaning, the operators $\hat a_k$ and $\hat a_k^\dagger$ are called
respectively the annihilation and creation operators of a particle in the kth
state. The operator $\hat a_k^\dagger \hat a_k$ is called the operator of the number of particles $n_k$
in the state $k$.

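We have already encountered operators similar to $\hat a_k$ and $\hat a_k^\dagger$
in §50 in considering the problem of the harmonic oscillator. It is easily seen
that the operators $\hat a_k$ and $\hat a_k^\dagger$ satisfy the commutation relations


$$\hat a_k \hat a_l^\dagger - \hat a_l^\dagger \hat a_k = \delta_{kl},$$
$$\hat a_k \hat a_l - \hat a_l \hat a_k = 0, \qquad (99.9)$$
$$\hat a_k^\dagger \hat a_l^\dagger - \hat a_l^\dagger \hat a_k^\dagger = 0.$$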
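
The definitions (99.3) and (99.4) can be turned into explicit matrices for a single state k. The sketch below assumes the occupation number is truncated at a hypothetical maximum `n_max`, which is a numerical convenience and not part of the theory; the commutation relation then holds exactly everywhere except in the last, artificially truncated row and column.

```python
import numpy as np


def annihilation(n_max):
    """Matrix of a_k in the basis |0>, |1>, ..., |n_max>.

    The only non-zero elements are <n-1| a |n> = sqrt(n), as in (99.3).
    """
    return np.diag(np.sqrt(np.arange(1, n_max + 1)), k=1)


n_max = 6
a = annihilation(n_max)
a_dag = a.T                            # real matrix: Hermitian conjugate = transpose

number = a_dag @ a                     # number operator, diagonal with entries n
commutator = a @ a_dag - a_dag @ a     # equals 1 below the truncation level
```

The diagonal of `number` runs over 0, 1, ..., `n_max`, so the eigenvalues of the product of the creation and annihilation operators are the occupation numbers, in agreement with (99.5).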
We shall show how the ordinary operators acting on a wave function in the
coordinate representation can be expressed in terms of the creation and
annihilation operators of particles, i.e. in the second-quantization
representation.

Let us consider the operator $\hat L(\xi_i)$ acting on the coordinates of the ith
particle. The coordinates are understood to include the spin coordinates.

Since all the particles are equivalent, we introduce the operator
$\hat L_1 = \sum_{i=1}^{N} \hat L(\xi_i)$. Let us find the expression for it in the
second-quantization representation. We obtain the matrix elements of $\hat L_1$ by
means of the basis functions (99.2). We have by definition


$$\langle n'_1, n'_2, \ldots|\hat L_1|n_1, n_2, \ldots\rangle = \sum_{i=1}^{N} \langle n'_1, n'_2, \ldots|\hat L(\xi_i)|n_1, n_2, \ldots\rangle. \qquad (99.10)$$

Let us consider one term of the sum over the particles:

$$\langle n'_1, n'_2, \ldots|\hat L(\xi_i)|n_1, n_2, \ldots\rangle = \int \Psi^*_{n'_1 n'_2 \ldots}\,\hat L(\xi_i)\,\Psi_{n_1 n_2 \ldots}\,d\xi_1 \cdots d\xi_N. \qquad (99.11)$$


(Summation over spin variables is implied.) The operator $\hat L(\xi_i)$ acts only on
the variables of the ith particle. Hence we can write


$$\hat L(\xi_i)\,\Psi_{n_1 n_2 \ldots} = \left(\frac{n_1!\,n_2!\cdots}{N!}\right)^{1/2} \sum \psi_{k_1}(\xi_1) \cdots \hat L(\xi_i)\psi_{k_i}(\xi_i) \cdots \psi_{k_N}(\xi_N). \qquad (99.12)$$

Multiplying (99.12) by the function $\Psi^*_{n'_1 n'_2 \ldots}$ and integrating we note, first of
all, that the integrals over all variables except $\xi_i$ contain only the products of
wave functions.

By virtue of the orthogonality of the latter, all integrals involving factors of
the form $\psi_l^*(\xi_j)\psi_k(\xi_j)$ with $l \neq k$, i.e. containing the products of the wave
functions of the particles (other than the ith particle) referring to different
states, reduce to zero.

In the double sum over permutations in (99.11) only those terms which
contain the products of the wave functions of the particles (other than the ith
particle) referring to the same states differ from zero. The integral over the
variables $\xi_i$ is of the form


$$\langle l|\hat L|k\rangle = \int \psi_l^*(\xi_i)\,\hat L(\xi_i)\,\psi_k(\xi_i)\,d\xi_i.$$
This means that for $l \neq k$ a transition of the particle takes place from the kth
state into the lth state. Consequently, the number of particles in the kth state
decreases by one, and in the lth state increases by one. We denote the
corresponding matrix element by


$$\langle n_k-1,\,n_l|\hat L(\xi_i)|n_k,\,n_l-1\rangle \qquad (99.13)$$
(the operator is diagonal with respect to other occupation numbers and we
do not write them down). The functions involved in the matrix element are
of the form


$$\Psi^*_{n_k-1,\,n_l} = \left(\frac{n_1! \cdots (n_k-1)! \cdots n_l! \cdots}{N!}\right)^{1/2} \sum \psi^*_{k_1}(\xi_1) \cdots \psi^*_{k_N}(\xi_N),$$

$$\Psi_{n_k,\,n_l-1} = \left(\frac{n_1! \cdots n_k! \cdots (n_l-1)! \cdots}{N!}\right)^{1/2} \sum \psi_{k_1}(\xi_1) \cdots \psi_{k_N}(\xi_N).$$


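By virtue of the orthogonality of the wave functions, integration over the
coordinates of all particles gives (taking into account the permutations of the
$N-1$ particles other than the ith particle)


$$\langle n_k-1,\,n_l|\hat L(\xi_i)|n_k,\,n_l-1\rangle = \left(\frac{n_1! \cdots (n_k-1)! \cdots n_l! \cdots}{N!}\right)^{1/2} \left(\frac{n_1! \cdots n_k! \cdots (n_l-1)! \cdots}{N!}\right)^{1/2} \times$$
$$\times\,\frac{(N-1)!}{n_1! \cdots (n_k-1)! \cdots (n_l-1)! \cdots}\,\langle l|\hat L|k\rangle = \frac{(n_k n_l)^{1/2}}{N}\,\langle l|\hat L|k\rangle.$$


Since the operators $\hat L(\xi_1), \hat L(\xi_2), \ldots$ differ from each other only in the label
of the particle on whose coordinates they act, the matrix elements for
different particles are all equal to each other. Hence for the matrix element
(99.10) of the operator $\hat L_1$ we can finally write


$$\langle n_k-1,\,n_l|\hat L_1|n_k,\,n_l-1\rangle = \sum_{i=1}^{N} \langle n_k-1,\,n_l|\hat L(\xi_i)|n_k,\,n_l-1\rangle =$$
$$= N\,\langle n_k-1,\,n_l|\hat L(\xi_i)|n_k,\,n_l-1\rangle = (n_k n_l)^{1/2}\,\langle l|\hat L|k\rangle. \qquad (99.14)$$


In the case where the diagonal matrix element is considered, i.e. where the
distribution of the number of particles over states does not change, we have
analogously


$$\langle n_1, n_2, \ldots|\hat L_1|n_1, n_2, \ldots\rangle = \sum_k n_k\,\langle k|\hat L|k\rangle. \qquad (99.15)$$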


We now introduce the operators $\hat a^\dagger$ and $\hat a$ into formulae (99.14) and (99.15).
Then the operator $\hat L_1$ can be written in the form


$$\hat L_1 = \sum_{k,l} \langle l|\hat L|k\rangle\,\hat a_l^\dagger \hat a_k. \qquad (99.16)$$


Indeed, the matrix elements of this operator are, by virtue of (99.6) and
(99.8), the same as the matrix elements (99.14) and (99.15).
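
The statement that the operator (99.16) reproduces the matrix elements (99.14) and (99.15) can be verified numerically for two single-particle states. This is only a sketch: the 2×2 matrix `L` of single-particle matrix elements is an arbitrary made-up example, and the truncation `n_max` of each occupation number is a numerical assumption.

```python
import numpy as np


def annihilation(n_max):
    """Single-state annihilation matrix with <n-1| a |n> = sqrt(n)."""
    return np.diag(np.sqrt(np.arange(1, n_max + 1)), k=1)


n_max = 4
dim = n_max + 1
a = annihilation(n_max)
identity = np.eye(dim)

# Two-state Fock space: basis |n1, n2> stored at index n1 * dim + n2.
modes = [np.kron(a, identity), np.kron(identity, a)]

# Hypothetical single-particle matrix elements <l|L|k> (rows l, columns k).
L = np.array([[0.3, 0.7],
              [0.7, -0.2]])

# Second-quantized operator, eq. (99.16): sum over k, l of <l|L|k> a_l^dagger a_k.
L1 = sum(L[l, k] * modes[l].T @ modes[k] for l in range(2) for k in range(2))


def index(n1, n2):
    """Position of the basis state |n1, n2> in the product basis."""
    return n1 * dim + n2


n1, n2 = 3, 2
transfer = L1[index(n1 - 1, n2), index(n1, n2 - 1)]   # particle moves 1 -> 2
diagonal = L1[index(n1, n2), index(n1, n2)]
```

Here `transfer` should equal the square root of n1·n2 times `L[1, 0]`, as in (99.14), and `diagonal` should equal n1·`L[0, 0]` + n2·`L[1, 1]`, as in (99.15).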

An analogous result can be obtained in the same way for operators which
act on the coordinates of two particles $\xi_i$ and $\xi_j$.

The operator


$$\hat L_2 = \sum_{i \neq j} \hat L(\xi_i,\xi_j)$$


is expressed in the second-quantization representation by the formula


$$\hat L_2 = \sum_{k,p,l,m} \langle lm|\hat L|kp\rangle\,\hat a_l^\dagger \hat a_m^\dagger \hat a_p \hat a_k, \qquad (99.17)$$


where the matrix elements are equal to
$$\langle lm|\hat L|kp\rangle = \int \psi_l^*(\xi_1)\,\psi_m^*(\xi_2)\,\hat L(\xi_1,\xi_2)\,\psi_k(\xi_1)\,\psi_p(\xi_2)\,d\xi_1\,d\xi_2. \qquad (99.18)$$

By means of the general formulae (99.16) and (99.17) one can write the
Hamiltonian of a system of particles in the second-quantization
representation. In the case of a system of non-interacting particles in a given
external field we have


$$\hat H = \sum_{i=1}^{N} \bigl(\hat T_i + U(\xi_i)\bigr) = \sum_{i=1}^{N} \left(-\frac{\hbar^2}{2m}\nabla_i^2 + U(\xi_i)\right), \qquad (99.19)$$


where $U(\xi_i)$ is the potential energy of the ith particle in the external field,
and $\hat T_i$ is its kinetic energy operator. Operator (99.19) is evidently a particular
case of the operator $\hat L_1$. Correspondingly we can immediately write operator
(99.19) in the second-quantization representation:


$$\hat H = \sum_{k,l} H_{lk}\,\hat a_l^\dagger \hat a_k. \qquad (99.20)$$


Choosing as $\psi_k$ the eigenfunctions of the Hamiltonian $\hat H(\xi) = \hat T + U(\xi)$ of an
individual particle, we have


$$H_{lk} = \int \psi_l^*(\xi)\,\hat H(\xi)\,\psi_k(\xi)\,d\xi = E_k\,\delta_{lk},$$


where $E_k$ is the energy of the particle in the kth state.
Hence, finally,


$$\hat H = \sum_k E_k\,\hat a_k^\dagger \hat a_k. \qquad (99.21)$$


The energy of a system of particles is, by virtue of (99.8), equal to
$$E = \sum_k E_k n_k. \qquad (99.22)$$


If the eigenfunctions of the operator $\hat T$ corresponding to the eigenvalues
$\varepsilon_k$ are chosen as $\psi_k$, then (99.20) is rewritten in the form


$$\hat H = \sum_k \varepsilon_k\,\hat a_k^\dagger \hat a_k + \sum_{k,l} \hat a_l^\dagger \hat a_k \int \psi_l^*\,U\,\psi_k\,d\xi. \qquad (99.23)$$


In the case of a system of particles between which there is a pair interaction,
the interaction energy operator has the form $\frac{1}{2}\sum_{i \neq j} W(\xi_i,\xi_j)$. Making use of
(99.17) we write the Hamiltonian in the second-quantization representation:


$$\hat H = \sum_{k,l} H_{lk}\,\hat a_l^\dagger \hat a_k + \frac{1}{2} \sum_{k,p,l,m} \langle lm|W|kp\rangle\,\hat a_l^\dagger \hat a_m^\dagger \hat a_p \hat a_k, \qquad (99.24)$$


or, taking as the functions $\psi_k$ the eigenfunctions of the single-particle Hamiltonian,


$$\hat H = \sum_k E_k\,\hat a_k^\dagger \hat a_k + \frac{1}{2} \sum_{k,p,l,m} \langle lm|W|kp\rangle\,\hat a_l^\dagger \hat a_m^\dagger \hat a_p \hat a_k. \qquad (99.25)$$


We note that the pair interaction (the last term of formula (99.24)) has an
obvious interpretation. The interaction can be treated as the collision of two
particles which are in the pth and kth states. After the interaction they make
a transition to the lth and mth states.

It is useful to note that formula (99.20) can be obtained by means of the
following formal method. In the expression for the mean energy (65.9) we
replace the wave function by the operator in the space of occupation numbers
defined as


$$\psi(\xi) \to \hat\psi(\xi) = \sum_k \hat a_k\,\psi_k(\xi) \qquad (99.26)$$


and correspondingly


$$\psi^*(\xi) \to \hat\psi^\dagger(\xi) = \sum_l \hat a_l^\dagger\,\psi_l^*(\xi). \qquad (99.26')$$


Then on the right-hand side of (65.9) we have


$$\int \psi^*(\xi)\,\hat H\,\psi(\xi)\,d\xi \to \sum_{k,l} \int \hat a_l^\dagger\,\psi_l^*(\xi)\,\hat H\,\hat a_k\,\psi_k(\xi)\,d\xi = \sum_{k,l} \hat a_l^\dagger \hat a_k\,H_{lk}. \qquad (99.27)$$

Comparing (99.27) and (99.20) we see that when the ordinary wave
function is replaced by the operator, the right-hand side of (65.9) is the same
as (99.20). This means that in this case the mean energy can formally be
replaced by the Hamiltonian $\hat H$ in the second-quantization representation.

The name second quantization is due to the replacement of the wave
function $\psi$ by the operator $\hat\psi$. In second quantization not only are all
mechanical quantities replaced by quantum operators (ordinary quantization)
but the wave function itself is also quantized, i.e. replaced by an operator.
Although second quantization is a formal method, it turns out to be very
useful in a number of cases.

The Hamiltonian of a system of particles interacting in pairs can also be
obtained easily in an analogous way. For this we again replace the functions
$\psi$ and $\psi^*$ in formula (65.8) by the operators (99.26) and (99.26'). Then, in
correspondence with what was said above, we replace the mean energy by
$\hat H$, the Hamiltonian in the second-quantization representation.

After the replacement we obtain formula (99.24).


All the results obtained so far apply to bosons. It can be shown* that
formulae (99.20) and (99.24) remain valid also for a system of fermions.
However, the operators $\hat a_k$ and $\hat a_k^\dagger$ can then no longer satisfy relations (99.9).
Indeed, for operators $\hat a_k$ and $\hat a_k^\dagger$ obeying (99.9) the eigenvalues
of the product $\hat a_k^\dagger \hat a_k$ are arbitrary non-negative integers $n_k$. For a
system of fermions the occupation numbers can be equal only to zero or one,
in accordance with the Pauli principle. The operators $\hat a_k$ and $\hat a_k^\dagger$ must now be
defined in such a way that the eigenvalues of the operator $\hat a_k^\dagger \hat a_k$ are equal
either to zero or to one, i.e.


$$(\hat a_k^\dagger \hat a_k)_{n_k n_k} = n_k, \qquad n_k = 0,\ 1. \qquad (99.28)$$


We shall show that conditions (99.28) are fulfilled if the operators $\hat a_k$ and $\hat a_k^\dagger$
satisfy the following anticommutation rules:


$$\hat a_k \hat a_l^\dagger + \hat a_l^\dagger \hat a_k = \delta_{kl}, \qquad (99.29)$$

$$\hat a_k \hat a_l + \hat a_l \hat a_k = \hat a_k^\dagger \hat a_l^\dagger + \hat a_l^\dagger \hat a_k^\dagger = 0. \qquad (99.30)$$
For this we convince ourselves of the fact that

$$(\hat a_k^\dagger \hat a_k)^2 = \hat a_k^\dagger \hat a_k. \qquad (99.31)$$
Indeed, evaluating the left-hand side and making use of (99.29), we obtain

$$(\hat a_k^\dagger \hat a_k)^2 = \hat a_k^\dagger \hat a_k \hat a_k^\dagger \hat a_k = \hat a_k^\dagger (1 - \hat a_k^\dagger \hat a_k)\,\hat a_k = \hat a_k^\dagger \hat a_k - \hat a_k^\dagger \hat a_k^\dagger \hat a_k \hat a_k = \hat a_k^\dagger \hat a_k,$$
since $\hat a_k^2 = 0$, which follows from (99.30).


* See L.D.Landau and E.M.Lifshitz, Quantum mechanics (Pergamon Press, Oxford,
1965).

Taking the diagonal matrix elements of relation (99.31), we find $n_k^2 = n_k$.
This equality can be fulfilled only for $n_k = 0$ and $n_k = 1$. One can find the
explicit form of the matrices $\hat a_k$ based on relations (99.30). Since the numbers
$n_k$ take on only two values, 0 and 1, the operators $\hat a_k$ and $\hat a_k^\dagger$ are two-row
matrices with respect to these variables. We shall present the corresponding
matrix elements without derivation. They are


$$(\hat a_k)_{0,1} = (\hat a_k^\dagger)_{1,0} = \prod_{l=1}^{k-1} (1 - 2n_l). \qquad (99.32)$$


All other matrix elements are equal to zero. As a result of the
multiplication of the quantities $1 - 2n_l$, where $l = 1, 2, \ldots, k-1$, either $+1$ or $-1$ is
obtained, depending on the values of the occupation numbers of the states
preceding the given state.

Hence it is clear that the numbering of states 1, 2, ..., k, chosen initially,
must not be changed.
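
The sign factor built from the quantities 1 − 2n_l can be realized explicitly with Kronecker products, and the anticommutation rules then checked numerically. The following is a sketch under stated assumptions: three single-particle states, zero-based state labels, and helper names that are purely illustrative.

```python
from functools import reduce

import numpy as np

I2 = np.eye(2)
Z = np.diag([1.0, -1.0])                  # the factor 1 - 2n for n = 0, 1
A = np.array([[0.0, 1.0], [0.0, 0.0]])    # single-state element <0| a |1> = 1


def fermion_annihilation(k, n_states):
    """Annihilation matrix for state k (zero-based) among n_states states.

    Every state preceding k contributes the factor (1 - 2 n_l), as in (99.32).
    """
    factors = [Z] * k + [A] + [I2] * (n_states - k - 1)
    return reduce(np.kron, factors)


def anticommutator(x, y):
    return x @ y + y @ x


n_states = 3
ops = [fermion_annihilation(k, n_states) for k in range(n_states)]
numbers = [op.T @ op for op in ops]       # number operators (matrices are real)
```

Because the sign of each matrix element depends on the occupation of all preceding states, interchanging the labels of two states changes the matrices, which is why the initial numbering must be kept fixed.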

The Schrödinger equation in the occupation number representation, where
the Hamiltonian is given by formula (99.24), involves the law of conservation
of the total number of particles (see §7). However, the introduction of the
operators $\hat a^\dagger$ and $\hat a$ describing the production and absorption of particles
allows one, in a corresponding generalization, also to investigate processes in
which the number of particles of a given kind is not conserved.


§100. The quantum mechanics of the photon 


The experimental establishment of the quantum or corpuscular nature of 
light was a spur to the creation of quantum theory as a whole. 

On the other hand, the construction, as a consequence, of the quantum 
theory of the electromagnetic field has been one of the most notable successes 
of quantum theory. 








Light quanta or photons are elementary particles whose distinctive
property is that their rest mass is equal to zero. Hence they always move
with the velocity $c$ in vacuum. This fact leads to certain important features of
the method of describing their behaviour. Namely, the relation between the
energy and momentum of the photon is given by the general formula


$$\varepsilon = cp = \hbar ck. \qquad (100.1)$$


If the momentum of the photon is replaced by its momentum operator, then
the energy operator in the momentum representation has the form


$$\hat H = c\hat p = \hbar ck. \qquad (100.2)$$


Correspondingly the Schrödinger equation can be written in the momentum
representation as


$$i\hbar\,\frac{\partial \Psi_p}{\partial t} = \hat H\,\Psi_p, \qquad (100.3)$$


where $\Psi_p$ is the wave function of the photon in the momentum
representation.
The operator $\hat H$ is related to the photon energy $\varepsilon$ by the general formula


$$\varepsilon = \int \Psi_p^*\,\hat H\,\Psi_p\,d\mathbf p = \hbar c \int \Psi_p^*\,k\,\Psi_p\,d\mathbf p. \qquad (100.4)$$


On the other hand it can be assumed that there corresponds to a photon an
electromagnetic field over all space. Its energy is


$$\varepsilon = \int \frac{E^2 + H^2}{8\pi}\,dV = \frac{1}{4\pi} \int E^2\,dV. \qquad (100.5)$$


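It is natural to identify the energy of the photon with the energy of the
electromagnetic field. Both field vectors satisfy Maxwell's equations, which
in a vacuum reduce to the form


$$\nabla^2 \mathbf E - \frac{1}{c^2}\,\frac{\partial^2 \mathbf E}{\partial t^2} = 0,$$


and analogously for the vector $\mathbf H$.
Expanding $\mathbf E$ in a Fourier integral


$$\mathbf E(\mathbf r,t) = \int \mathbf E(\mathbf k,t)\,e^{i\mathbf k \cdot \mathbf r}\,d\mathbf k,$$


we have


$$\frac{\partial^2 \mathbf E(\mathbf k,t)}{\partial t^2} + c^2 k^2\,\mathbf E(\mathbf k,t) = 0,$$


or


$$\left(\frac{\partial}{\partial t} + ick\right)\left(\frac{\partial}{\partial t} - ick\right)\mathbf E(\mathbf k,t) = 0. \qquad (100.6)$$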


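By virtue of the fact that the field is real the following condition must be
fulfilled:


$$\mathbf E(\mathbf k,t) = \mathbf E^*(-\mathbf k,t). \qquad (100.7)$$


In place of the Fourier component $\mathbf E(\mathbf k,t)$ we introduce the new function
$\mathbf f(\mathbf k,t)$ defined by the relations


$$\mathbf E(\mathbf k,t) = N(k)\,[\mathbf f(\mathbf k,t) + \mathbf f^*(-\mathbf k,t)],$$
$$\qquad (100.8)$$
$$\dot{\mathbf E}(\mathbf k,t) = -ick\,N(k)\,[\mathbf f(\mathbf k,t) - \mathbf f^*(-\mathbf k,t)],$$


where $N$ is a factor of proportionality. The dot denotes differentiation with
respect to time.

It is easily seen that in such a representation of $\mathbf E(\mathbf k,t)$ the condition
(100.7) is automatically fulfilled.

Substituting the values of $\mathbf E(\mathbf k,t)$ and $\dot{\mathbf E}(\mathbf k,t)$ into (100.6) we arrive at two
equations


$$\dot{\mathbf f} = -ick\,\mathbf f, \qquad \dot{\mathbf f}^* = ick\,\mathbf f^*. \qquad (100.9)$$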


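We stress that eqs. (100.9) only represent another form of notation of
Maxwell's equations. Multiplying (100.9) by $i\hbar$ and using $p = \hbar k$, we obtain


$$i\hbar\,\frac{\partial \mathbf f}{\partial t} = cp\,\mathbf f, \qquad -i\hbar\,\frac{\partial \mathbf f^*}{\partial t} = cp\,\mathbf f^*. \qquad (100.10)$$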


We see that the function $\mathbf f(\mathbf k,t)$ satisfies an equation which is identical in form
with the Schrödinger equation. If $cp$ is replaced by the operator $\hat H$, then the
function $\mathbf f(\mathbf k,t)$ must be identified with the wave function of the photon in
the k-representation.
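
The correspondence between the first-order equation for f and the second-order oscillator equation (100.6) for the field amplitude can be checked numerically for a single scalar component. The sketch below assumes the time dependence f(k,t) = f0·exp(−ickt), sets N = 1, and uses made-up amplitudes; the function names are illustrative only.

```python
import cmath


def f_amplitude(f0, k, c, t):
    """Solution of the first-order equation: f(k, t) = f0 * exp(-i c k t)."""
    return f0 * cmath.exp(-1j * c * k * t)


def field_component(f0_plus, f0_minus, k, c, t, N=1.0):
    """E(k, t) built as in (100.8); f0_plus and f0_minus are f0 at +k and -k."""
    return N * (f_amplitude(f0_plus, k, c, t)
                + f_amplitude(f0_minus, k, c, t).conjugate())


def oscillator_residual(f0_plus, f0_minus, k, c, t, h=1e-4):
    """Finite-difference estimate of d2E/dt2 + (ck)^2 E, expected to vanish."""
    def E(s):
        return field_component(f0_plus, f0_minus, k, c, s)
    second = (E(t + h) - 2 * E(t) + E(t - h)) / h ** 2
    return second + (c * k) ** 2 * E(t)


fp, fm = 0.3 + 0.4j, -0.1 + 0.2j          # illustrative amplitudes at +k and -k
E_plus = field_component(fp, fm, k=2.0, c=1.0, t=0.7)
E_minus = field_component(fm, fp, k=2.0, c=1.0, t=0.7)   # the same field at -k
residual = oscillator_residual(fp, fm, k=2.0, c=1.0, t=0.7)
```

`E_plus` equals the complex conjugate of `E_minus`, which is the reality condition (100.7), and `residual` vanishes up to discretization error.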

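The factor of proportionality $N$, which has so far remained arbitrary, can
be determined by comparing (100.4) and (100.5).

Substituting expressions (100.8) into (100.5), we have


$$\varepsilon = \frac{1}{4\pi} \int \mathbf E(\mathbf k,t) \cdot \mathbf E(\mathbf k',t)\,e^{i(\mathbf k + \mathbf k') \cdot \mathbf r}\,d\mathbf k\,d\mathbf k'\,dV =$$

$$= \frac{1}{4\pi} \int \mathbf E(\mathbf k,t) \cdot \mathbf E(\mathbf k',t)\,d\mathbf k\,d\mathbf k' \int e^{i(\mathbf k + \mathbf k') \cdot \mathbf r}\,dV =$$

$$= 2\pi^2 \int \mathbf E(\mathbf k,t) \cdot \mathbf E(\mathbf k',t)\,\delta(\mathbf k + \mathbf k')\,d\mathbf k\,d\mathbf k' =$$

$$= 2\pi^2 \int \mathbf E(\mathbf k,t) \cdot \mathbf E(-\mathbf k,t)\,d\mathbf k = 4\pi^2 \int N^2(k)\,\mathbf f(\mathbf k) \cdot \mathbf f^*(\mathbf k)\,d\mathbf k.$$


For $N = (\hbar ck/4\pi^2)^{1/2}$ the energy of the electromagnetic field and the energy of
the photon turn out to be identical. Thus in the k-representation the photon
is described by the wave function


$$\Psi(\mathbf k,t) = \mathbf f(\mathbf k,t).$$
Then the following condition is fulfilled:
$$\int \mathbf f^* \cdot \mathbf f\,d\mathbf k = 1.$$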

In this case the Maxwell equations for the electromagnetic field of a
monochromatic wave turn out to be identical with the Schrödinger equation for
an individual photon. Introducing the explicit dependence on time, we can
write


$$\Psi(\mathbf k,t) = \mathbf f_0(\mathbf k)\,e^{-i\omega t} = \mathbf f_0(\mathbf k)\,e^{-(i/\hbar)\varepsilon t}.$$


By virtue of Maxwell's equation $\nabla \cdot \mathbf E = 0$, the amplitude in k-space satisfies
the condition $\mathbf k \cdot \mathbf f_0(\mathbf k) = 0$. We shall not dwell on the problems of
normalization of the wave function and on the calculation of other
quantum-mechanical quantities of photons, for example spin angular momentum,
parity and so on: we refer the reader to the monograph of Akhiezer and
Berestetskii*.

We confine ourselves only to some remarks of theoretical importance. We
stress, first of all, that since Maxwell's equations are relativistically invariant
so is the Schrödinger equation for the photon.

This is natural, since the photon always moves with the velocity of light.

We have found the wave function of the photon in the k-representation
(or, what is the same, in the p-representation). This wave function has the
usual probabilistic meaning. However, the wave function of the photon in the
x-representation, which would allow one to establish the probability of
localization of the photon at a given point of space, does not exist.


* A.I.Akhiezer and V.B.Berestetskii, Quantum electrodynamics (Interscience
Publishers, New York, 1965).


For free particles of rest mass $m_0$ different from zero the wave function
in the x-representation is obtained from the wave function in the
p-representation by means of the Fourier transformation.

In our case the Fourier transformation gives


$$\mathbf f(\mathbf r,t) = \int \mathbf f(\mathbf k,t)\,e^{i\mathbf k \cdot \mathbf r}\,d\mathbf k.$$


However, and here lies the fundamental difference between photons and
particles with $m_0 \neq 0$, the position of a photon can be determined only as a
result of interaction with charged particles, for example with electrons.

This interaction is determined by the value of the field vectors $\mathbf E$ and $\mathbf H$ at
the point at which the electron is localized. The strength of the field at a
certain point is defined by the inverse Fourier transformation, i.e.


$$\mathbf E(\mathbf r,t) = \int \mathbf E(\mathbf k,t)\,e^{i\mathbf k \cdot \mathbf r}\,d\mathbf k = \int N(k)\,[\mathbf f(\mathbf k,t) + \mathbf f^*(-\mathbf k,t)]\,e^{i\mathbf k \cdot \mathbf r}\,d\mathbf k.$$


This formula shows that the field strength is not expressed in terms of $\mathbf f(\mathbf r,t)$
at the same point, i.e. is not determined by the value of any wave function at
the same point of space. On the contrary, because of the k-dependent factor
$N(k)$ under the integral, $\mathbf E(\mathbf r,t)$ is determined by the distribution of $\mathbf f(\mathbf r,t)$
over space.

Photons have a spin equal to one. However, the definition of spin as the 
intrinsic angular momentum of the particle at rest makes no sense in the case 
of photons. Hence the division of the total angular momentum of the photon 
into an orbital part and a spin part is to a certain degree arbitrary. 

The last important remark concerns the description of a system
of photons.

Photons do not interact directly with each other. The very weak interac- 
tion existing between photons is due to their interaction with electrons of the 
background. Hence the wave function of a system of photons is the wave 
function of a system of non-interacting particles. Photons as particles with 
integer spin obey Bose—Einstein statistics. 

When photons interact with other particles the number of photons changes 
in the processes of emission and absorption. Photons are absorbed and 
emitted one at a time. The interaction of photons with charges can be de- 
scribed by means of their wave function (see the monograph of Akhiezer and 
Berestetskii cited above). However, this interaction is described in a much 
more effective and simple way by means of the second-quantization represen- 
tation. We note that the method of second quantization was itself devised by 
Dirac for just this purpose. 








§101. The quantization of the radiation field 


As is well known, the development of quantum theory began with the 
establishment of the quantum properties of the electromagnetic field and the 
creation of a semiempirical theory of light quanta. Hence it is natural to try 
to apply the mathematical apparatus of quantum mechanics to the electro- 
magnetic field. It turns out, however, that the electromagnetic field has a 
number of features which make this a complex problem. The modern 
quantum theory of the electromagnetic field, commenced by the studies of 
Dirac, is based on special methods, in particular on the method of second 
quantization*. 

We recall that in the classical theory of the electromagnetic field in vacuum 
it was shown that a charge-free electromagnetic field can formally be com- 
pared to a mechanical system with an infinitely large number of degrees of 
freedom. 

Expanding the vector potential $\mathbf A$ of the electromagnetic field in terms
of plane waves and taking the infinite set of amplitudes of the expansion $q_\lambda$
as generalized coordinates, it was possible to compare the electromagnetic
field with a certain mechanical system: a set of field oscillators (see §38 of
Part I). To each of the Fourier components of the expansion of $\mathbf A$ there
corresponds one of the oscillators. Hence the complete set of field oscillators
is infinitely large and, consequently, the electromagnetic
field could be compared to a mechanical system with an infinitely large
number of degrees of freedom.

We write the Hamiltonian of this system as follows:


$$H = \sum_\lambda \frac{1}{2}\,(p_\lambda^2 + \omega_\lambda^2 q_\lambda^2) = \sum_\lambda H_\lambda, \qquad (101.1)$$


where $H_\lambda$ is the Hamiltonian of the λth oscillator, $p_\lambda$ is the generalized
momentum corresponding to the coordinate $q_\lambda$, and $\omega_\lambda$ is the corresponding
frequency. The summation is carried out over all values of frequencies and
polarizations.

The quantum theory of the electromagnetic field is based on the
assumption that this analogy can be given a direct physical content. Namely, it is
assumed that a real electromagnetic field represents a quantum system which
obeys the ordinary laws of quantum mechanics. The Hamiltonian $\hat H$ is
obtained from the classical Hamiltonian (101.1) by means of the usual
replacement of mechanical quantities, generalized coordinates and momenta, by
corresponding quantum operators. That is, we replace $q_\lambda$ and $p_\lambda$ by operators
satisfying the commutation relations:


$$\hat p_\lambda \hat q_\mu - \hat q_\mu \hat p_\lambda = \frac{\hbar}{i}\,\delta_{\lambda\mu}, \qquad \hat q_\lambda \hat q_\mu - \hat q_\mu \hat q_\lambda = 0, \qquad \hat p_\lambda \hat p_\mu - \hat p_\mu \hat p_\lambda = 0.$$


* A more detailed exposition of the quantum theory of radiation may be found in
the book of W.Heitler, The quantum theory of radiation (Clarendon Press, Oxford,
1954).

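Since different field oscillators are independent, the operators $\hat p_\lambda$ and $\hat q_\mu$
referring to different oscillators commute with each other. Then $\hat H$ will
represent the Hamiltonian of a quantum system. It is advisable, however, to carry
out the canonical transformation to new variables (see formulae (50.11)).
Namely, we write


$$\hat a_\lambda = \frac{1}{(2\hbar\omega_\lambda)^{1/2}}\,(\omega_\lambda \hat q_\lambda + i\hat p_\lambda), \qquad \hat a_\lambda^\dagger = \frac{1}{(2\hbar\omega_\lambda)^{1/2}}\,(\omega_\lambda \hat q_\lambda - i\hat p_\lambda). \qquad (101.2)$$


In the new representation


$$\hat p_\lambda^2 + \omega_\lambda^2 \hat q_\lambda^2 = \hbar\omega_\lambda\,(\hat a_\lambda \hat a_\lambda^\dagger + \hat a_\lambda^\dagger \hat a_\lambda),$$
so that
$$\hat H = \frac{1}{2} \sum_\lambda \hbar\omega_\lambda\,(\hat a_\lambda \hat a_\lambda^\dagger + \hat a_\lambda^\dagger \hat a_\lambda).$$


To the operators $\hat a_\lambda$ and $\hat a_\lambda^\dagger$ there correspond the commutation relations
$$\hat a_\lambda \hat a_\mu^\dagger - \hat a_\mu^\dagger \hat a_\lambda = \delta_{\lambda\mu},$$
$$\hat a_\lambda \hat a_\mu - \hat a_\mu \hat a_\lambda = 0, \qquad (101.3)$$
$$\hat a_\lambda^\dagger \hat a_\mu^\dagger - \hat a_\mu^\dagger \hat a_\lambda^\dagger = 0,$$
which follow immediately from the definition and commutation relations for
$\hat p_\lambda$ and $\hat q_\lambda$.
The Hamiltonian can be transformed by means of (101.3), writing
$$\hat a_\lambda \hat a_\lambda^\dagger = 1 + \hat a_\lambda^\dagger \hat a_\lambda.$$
Then
$$\hat H = \sum_\lambda \hbar\omega_\lambda\,(\hat a_\lambda^\dagger \hat a_\lambda + \tfrac{1}{2}). \qquad (101.4)$$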
Comparing the expression (101.4) for $\hat H$ and the commutation relations
(101.3) for the operators $\hat a$ and $\hat a^\dagger$ with the corresponding expressions (99.9)
and (99.21), we see that they are completely analogous. This means that a
free electromagnetic field represents a system of bosons which are usually
called photons or light quanta.
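
The algebra of (101.3) and (101.4) can be checked numerically for a single mode. The sketch below is only an illustration, not part of the original derivation: it assumes the standard matrix elements $\langle n-1|\hat a|n\rangle = \sqrt{n}$ in the occupation-number basis, truncates the infinite matrices at a hypothetical dimension `DIM`, and measures the energy in units of $\hbar\omega$. The helper names are invented for this example.

```python
import math

def ladder_matrices(dim):
    """Truncated matrices of a and a† in the basis |0>, ..., |dim-1>."""
    a = [[0.0] * dim for _ in range(dim)]
    for n in range(1, dim):
        a[n - 1][n] = math.sqrt(n)  # <n-1| a |n> = sqrt(n)
    # a† is the transpose of a (all entries are real)
    a_dag = [[a[j][i] for j in range(dim)] for i in range(dim)]
    return a, a_dag

def matmul(x, y):
    """Plain matrix product of two square matrices given as nested lists."""
    dim = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(dim)) for j in range(dim)]
            for i in range(dim)]

DIM = 6  # hypothetical truncation size
a, a_dag = ladder_matrices(DIM)
a_adag = matmul(a, a_dag)
adag_a = matmul(a_dag, a)

# Diagonal of a a† - a† a: equals 1 everywhere except in the last row,
# which is an artefact of cutting off the infinite matrices.
commutator_diag = [a_adag[n][n] - adag_a[n][n] for n in range(DIM)]

# Mode energy in units of ħω from the symmetric form (1/2)(a a† + a† a).
energies = [0.5 * (a_adag[n][n] + adag_a[n][n]) for n in range(DIM)]
```

Away from the truncation edge, `commutator_diag` is 1 and `energies[n]` is $n + \tfrac{1}{2}$, which is the single-mode content of (101.4).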

To each plane wave in expansion (38.19) of Part I there corresponds a
photon. The energy of each photon, according to formula (101.4), is equal to
$\hbar\omega_\lambda$. The total energy of the electromagnetic field correspondingly has the
form

$$E = \sum_\lambda E_\lambda n_\lambda + \sum_\lambda \tfrac{1}{2}\hbar\omega_\lambda = \sum_\lambda E_\lambda n_\lambda + E_0 , \tag{101.5}$$

where $E_\lambda = \hbar\omega_\lambda$ and $n_\lambda$ is the number of photons with energy $E_\lambda$.

The second term of formula (101.5), denoted by $E_0$, is called the energy
of the zero-point oscillations of the electromagnetic field. Formula (101.5)
shows that if all $n_\lambda = 0$, i.e. if there are no photons in the field, then the
energy of the electromagnetic field is equal to $E_0$. Moreover, the quantity $E_0$
itself is infinitely large, since the sum for $E_0$ involves an infinitely large
number of positive terms $\tfrac{1}{2}\hbar\omega_\lambda$.

The presence of the infinitely large constant term in the energy of the 
electromagnetic field has no effect on the processes of interaction of the field 
with matter (the emission, absorption and scattering of light) which will be 
considered in this chapter. In these processes changes occur in the state of the 
electromagnetic field for which only the difference between the energies of 
two states is important. 

When an energy difference is formed the zero-point energy is cancelled. 
Hence, until recently, it was assumed that the zero-point energy could be 
taken as the zero of energy and that it could be formally omitted in all 
expressions. However, as quantum electrodynamics developed it turned out
that this is not so and that the appearance of the term $E_0$ in the formula for
the energy of the electromagnetic field has a profound meaning.

From the point of view of modern electrodynamics the ‘emptiness’, the
absence of particles and of photons, is not ‘nothing’ but is a definite state of
the field, called the vacuum. The existence of the vacuum state and of zero-point
oscillations with frequencies $\omega_\lambda$ is important in certain interactions
between the electromagnetic field and electrons and leads to a number of
observed effects.

We shall touch briefly upon the problem of the vacuum in §116 and §128. 
In the meanwhile we shall not consider the zero-point energy. 

Let us now find the momentum of a charge-free electromagnetic field.
According to (38.25) of Part I, we have for the momentum of a plane wave

$$\mathbf{P}_\lambda = \frac{E_\lambda}{c}\,\frac{\mathbf{k}_\lambda}{|\mathbf{k}_\lambda|} , \tag{101.6}$$

where $\mathbf{k}_\lambda$ and $E_\lambda$ are respectively the wave vector and the energy of the wave.
If we pass to quantized expressions and replace $E_\lambda$ by its eigenvalue, then
we easily obtain

$$\mathbf{p}_\lambda = \hbar \mathbf{k}_\lambda .$$


Just as $\hbar\omega_\lambda$ represents the energy of an individual photon, $\hbar\mathbf{k}_\lambda$ is its
momentum. We see that between the energy and momentum of the photon
there is the relation found from the analysis of experimental data even before
the creation of quantum mechanics:

$$p_\lambda = \frac{E_\lambda}{c} .$$

From (101.6) it follows, in particular, that the rest mass of the photon is
equal to zero (see §14 of Part II). The total momentum of the electromagnetic
field is equal to

$$\mathbf{P} = \sum_\lambda \hbar \mathbf{k}_\lambda n_\lambda . \tag{101.7}$$

It is determined by the occupation numbers $n_\lambda$.
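
As an illustration of (101.5) and (101.7), the following sketch sums the photon energy (without the zero-point term) and the momentum over a list of occupied modes. It is a minimal example under stated assumptions: Gaussian units as in the text, the vacuum dispersion relation $\omega = c|\mathbf{k}|$, and a hypothetical input format of `(k_vector, n)` pairs that does not appear in the original.

```python
import math

HBAR = 1.054571817e-27  # erg*s (Gaussian units)
C = 2.99792458e10       # cm/s

def field_energy_and_momentum(modes):
    """Return (energy, momentum) of the field for occupied modes.

    modes: list of (k, n), with k a 3-tuple wave vector in 1/cm and
    n the occupation number. The zero-point energy is left out.
    """
    energy = 0.0
    momentum = [0.0, 0.0, 0.0]
    for k, n in modes:
        k_abs = math.sqrt(sum(c * c for c in k))
        energy += n * HBAR * C * k_abs      # n * ħω with ω = c|k|
        for i in range(3):
            momentum[i] += n * HBAR * k[i]  # n * ħk
    return energy, momentum
```

For a single occupied mode the result satisfies $|\mathbf{P}| = E/c$, while two equally populated modes with opposite wave vectors carry energy but no net momentum.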
We now turn to the formulation of the Schrödinger equation for the
electromagnetic field. It has the usual form

$$i\hbar\,\frac{\partial \Psi}{\partial t} = \hat H \Psi .$$

The wave function of the electromagnetic field is usually called the
amplitude of the state of the field. If use is made of the Hamiltonian in the
representation of occupation numbers, then the amplitude of the state of the
electromagnetic field will also be a function of the occupation numbers $n_\lambda$:

$$\Psi = \Psi(n_1, n_2, n_3, \ldots, t) .$$

According to the conclusions of §99, the operators $\hat a_\lambda^\dagger$ and $\hat a_\lambda$ represent the
photon creation and annihilation operators. When they act on the wave
function they respectively increase and reduce by one the number of photons
of frequency $\omega_\lambda$. The matrix elements of these operators are given by
formulae (99.6) and (99.7).


§ 102. The interaction of an electron with radiation 


Having carried out the quantization of a free electromagnetic field, we can
turn to the consideration of a system consisting of an electromagnetic field
and particles. We shall assume that there is one electron in the radiation field
and shall find the interaction between the electron and the electromagnetic
field. In this chapter we shall suppose that the electron has a velocity small in
comparison with the velocity of light and that it is described by a
non-relativistic Hamiltonian. We write the Hamiltonian of the system (radiation
field + electron) in the form

$$\hat H = \frac{1}{2m}\left(\hat{\mathbf{p}} - \frac{e}{c}\hat{\mathbf{A}}\right)^2 + \hat H_{\rm rad} .$$
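We assume that the scalar potential $\varphi$ is chosen to be equal to zero, and that
the gauge condition (see (10.5) of Part I) of the vector potential $\mathbf{A}$ is of the
form $\nabla\cdot\mathbf{A} = 0$. From this relation it follows that the momentum operator $\hat{\mathbf{p}}$
commutes with the vector $\hat{\mathbf{A}}$ in the scalar product, $\hat{\mathbf{p}}\cdot\hat{\mathbf{A}} = \hat{\mathbf{A}}\cdot\hat{\mathbf{p}}$, and hence
the Hamiltonian $\hat H$ can be rewritten as

$$\hat H = \frac{\hat p^2}{2m} - \frac{e}{mc}\,(\hat{\mathbf{p}}\cdot\hat{\mathbf{A}}) + \frac{e^2}{2mc^2}\,\hat{\mathbf{A}}^2 + \hat H_{\rm rad} . \tag{102.1}$$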
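
The step leading to (102.1) can be written out explicitly. The short derivation below is a sketch that assumes only the gauge condition $\nabla\cdot\mathbf{A} = 0$ stated above and the coordinate representation $\hat{\mathbf{p}} = -i\hbar\nabla$; the test function $\psi$ is introduced here purely for illustration.

```latex
\begin{align*}
\Bigl(\hat{\mathbf p}-\tfrac{e}{c}\hat{\mathbf A}\Bigr)^{2}
  &= \hat p^{2}
     - \tfrac{e}{c}\bigl(\hat{\mathbf p}\cdot\hat{\mathbf A}
                        +\hat{\mathbf A}\cdot\hat{\mathbf p}\bigr)
     + \tfrac{e^{2}}{c^{2}}\hat{\mathbf A}^{2}, \\
(\hat{\mathbf p}\cdot\hat{\mathbf A})\,\psi
  &= -i\hbar\,\nabla\cdot(\mathbf A\psi)
   = -i\hbar\,(\nabla\cdot\mathbf A)\,\psi
     + (\hat{\mathbf A}\cdot\hat{\mathbf p})\,\psi
   = (\hat{\mathbf A}\cdot\hat{\mathbf p})\,\psi , \\
\frac{1}{2m}\Bigl(\hat{\mathbf p}-\tfrac{e}{c}\hat{\mathbf A}\Bigr)^{2}
  &= \frac{\hat p^{2}}{2m}
     - \frac{e}{mc}\,(\hat{\mathbf p}\cdot\hat{\mathbf A})
     + \frac{e^{2}}{2mc^{2}}\,\hat{\mathbf A}^{2} .
\end{align*}
```

The middle line is where the gauge condition enters: without $\nabla\cdot\mathbf{A} = 0$ the two orderings of the cross term would differ by $-i\hbar\,(\nabla\cdot\mathbf{A})$.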


The first term of (102.1) represents the Hamiltonian of the free particle, and
the last term the Hamiltonian of the free radiation field. The Hamiltonian of
the interaction of the electron with the radiation field, responsible for all
processes of emission and absorption of photons by the electron, is of the
form

$$\hat H' = -\frac{e}{mc}\,(\hat{\mathbf{p}}\cdot\hat{\mathbf{A}}) + \frac{e^2}{2mc^2}\,\hat{\mathbf{A}}^2 . \tag{102.2}$$


We shall formally assume the electron charge to be the small parameter in
terms of which the perturbation theory expansion is carried out. In what
follows we shall in fact see that the expansion is carried out in powers of the
small quantity $e^2/\hbar c \approx 1/137$, which figures in the corresponding matrix elements
and is called the interaction constant. We shall confine ourselves to the
consideration of some simple processes in the first non-vanishing approximation
of perturbation theory. We have, in §56, obtained the general expressions for
the probabilities of different processes, and our problem reduces to the
calculation of the matrix elements of the interaction operator $\hat H'$ considered
as the perturbation operator. The expansion of the vector potential is
conveniently written in the form (38.19) of Part I:

$$\mathbf{A} = \sum_\lambda \left(b_\lambda \mathbf{A}_\lambda + b_\lambda^* \mathbf{A}_\lambda^*\right) ,$$

where

$$\mathbf{A}_\lambda = \mathbf{e}_\lambda \left(\frac{4\pi c^2}{V}\right)^{1/2} e^{i\mathbf{k}_\lambda\cdot\mathbf{r}} . \tag{102.3}$$

We pass to the quantum operators

$$\hat{\mathbf{A}} = \sum_\lambda \left(\hat b_\lambda \mathbf{A}_\lambda + \hat b_\lambda^\dagger \mathbf{A}_\lambda^*\right) . \tag{102.3'}$$


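Making use of relations (38.20) of Part I, we express the operators $\hat b_\lambda$ and
$\hat b_\lambda^\dagger$ in terms of the operators $\hat q_\lambda$ and $\hat p_\lambda$:

$$\hat b_\lambda = \frac{1}{2\omega_\lambda}\,(\omega_\lambda \hat q_\lambda + i\hat p_\lambda) , \qquad \hat b_\lambda^\dagger = \frac{1}{2\omega_\lambda}\,(\omega_\lambda \hat q_\lambda - i\hat p_\lambda) .$$

Using formulae (101.2) we introduce the operators $\hat a_\lambda$ and $\hat a_\lambda^\dagger$. We then
obtain

$$\hat b_\lambda = \left(\frac{\hbar}{2\omega_\lambda}\right)^{1/2} \hat a_\lambda , \qquad \hat b_\lambda^\dagger = \left(\frac{\hbar}{2\omega_\lambda}\right)^{1/2} \hat a_\lambda^\dagger . \tag{102.4}$$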


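Comparing with (99.6) and (99.7) we find that the operators $\hat b_\lambda$ and $\hat b_\lambda^\dagger$
have the following matrix elements different from zero:

$$\langle \ldots, n_\lambda + 1, \ldots |\,\hat b_\lambda^\dagger\,| \ldots, n_\lambda, \ldots \rangle = \left[\frac{\hbar (n_\lambda + 1)}{2\omega_\lambda}\right]^{1/2} ,$$
$$\langle \ldots, n_\lambda - 1, \ldots |\,\hat b_\lambda\,| \ldots, n_\lambda, \ldots \rangle = \left[\frac{\hbar n_\lambda}{2\omega_\lambda}\right]^{1/2} . \tag{102.5}$$

Thus the matrix elements of the vector potential differ from zero only for the
processes of emission and absorption of one photon. For the operator $\hat{\mathbf{A}}^2$
involved in (102.2) we have

$$\hat{\mathbf{A}}^2 = \sum_{\lambda,\lambda'} \left[\hat b_\lambda \hat b_{\lambda'} (\mathbf{A}_\lambda\cdot\mathbf{A}_{\lambda'}) + \hat b_\lambda \hat b_{\lambda'}^\dagger (\mathbf{A}_\lambda\cdot\mathbf{A}_{\lambda'}^*) + \hat b_\lambda^\dagger \hat b_{\lambda'} (\mathbf{A}_\lambda^*\cdot\mathbf{A}_{\lambda'}) + \hat b_\lambda^\dagger \hat b_{\lambda'}^\dagger (\mathbf{A}_\lambda^*\cdot\mathbf{A}_{\lambda'}^*)\right] . \tag{102.6}$$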


From this expression it is seen that the matrix elements of the operator
$\hat{\mathbf{A}}^2$ differ from zero for two-photon transitions, i.e. for the emission or
absorption of two photons or the emission of one photon and the absorption
of another.

The term containing the operator $\hat{\mathbf{A}}^2$ as well as the term with operator
$-(e/mc)(\hat{\mathbf{p}}\cdot\hat{\mathbf{A}})$ gives a contribution to processes involving two photons, the
latter term being taken into account in the second approximation of
perturbation theory.

The vector potential (102.3) describes the state of a photon with given
momentum. One can also introduce the concept of the state of a photon
with given angular momentum. In order to find the expression for the vector 
potential describing the state of a photon with angular momentum and its 
component along the z-axis we should carry out the expansion of the vector 
potential A not in terms of plane waves, but in terms of spherical waves. The 
amplitudes of the expansion must be considered as operators in the space of 
occupation numbers satisfying commutation relations of the same type as 
(101.3). In a state with given momentum the angular momentum of the 
photon does not have a definite value. This corresponds to the fact that the 
plane wave can be written in the form of an expansion in terms of an infinite 
sequence of spherical waves. 

The photon possesses definite ‘internal’ degrees of freedom, since in 
describing its state it is necessary to take into account different possible 
polarizations. 

The ‘internal’ state of a system is usually associated with its spin. However, 
the definition of the spin of a system as its ‘intrinsic’ angular momentum, i.e. 
the angular momentum at rest, is inapplicable to the photon. The photon in 
any reference frame moves with velocity c. 

Nevertheless, it sometimes appears to be convenient to introduce the 
concept of spin also for the photon, writing the total angular momentum 
operator in the form of a superposition of the orbital angular momentum 
operator and the spin operator. In this case it turns out that the spin of the 
photon must be considered to be equal to one. In correspondence with three 
possible spin components $s_z = 0, \pm 1$, one would think that the photon may be
in three different states with different polarization. However, the condition of
the transverse nature of electromagnetic waves leads to the fact that actually
only two spin components are possible, which correspond to the two
independent polarization states of the photon. The reader may find a detailed
consideration of the problems which are touched upon here in the
monograph of Akhiezer and Berestetskii*.

* A. I. Akhiezer and V. B. Berestetskii, Quantum electrodynamics (Interscience
Publishers, New York, 1965).


§103. The absorption and emission of light

Let us consider the probability of a one-photon transition, i.e. the process
of absorption or emission. We shall first of all write down the matrix
elements corresponding to the absorption and emission of a photon of frequency
$\omega_\lambda$. Suppose the electron was in the initial state $\psi_1$ before absorption and in
the state $\psi_2$ after absorption. The transition 1 → 2 proceeds with the
absorption, and the transition 2 → 1 with the emission, of a photon of frequency $\omega_\lambda$.
The matrix element of the perturbation operator (102.2) for the transition
with the absorption of a photon is of the form


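$$\langle 2, n_\lambda - 1 |\,\hat H'\,| 1, n_\lambda \rangle = -\frac{e}{mc}\left(\frac{4\pi c^2}{V}\right)^{1/2}\left(\frac{\hbar n_\lambda}{2\omega_\lambda}\right)^{1/2} \int \psi_2^*\,(\hat{\mathbf{p}}\cdot\mathbf{e}_\lambda)\,e^{i\mathbf{k}_\lambda\cdot\mathbf{r}}\,\psi_1\,dV$$
$$= -\frac{e}{m}\left(\frac{2\pi\hbar n_\lambda}{V\omega_\lambda}\right)^{1/2} \int \psi_2^*\,(\hat{\mathbf{p}}\cdot\mathbf{e}_\lambda)\,e^{i\mathbf{k}_\lambda\cdot\mathbf{r}}\,\psi_1\,dV . \tag{103.1}$$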


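Analogously for the process of emission of a photon we have

$$\langle 1, n_\lambda + 1 |\,\hat H'\,| 2, n_\lambda \rangle = -\frac{e}{m}\left(\frac{2\pi\hbar (n_\lambda + 1)}{V\omega_\lambda}\right)^{1/2} \int \psi_1^*\,(\hat{\mathbf{p}}\cdot\mathbf{e}_\lambda)\,e^{-i\mathbf{k}_\lambda\cdot\mathbf{r}}\,\psi_2\,dV . \tag{103.2}$$
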
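The probability per unit time of the transition with the absorption of a
photon is given by the formula (see §56)

$$dW = \frac{2\pi}{\hbar}\,\bigl|\langle 2, n_\lambda - 1 |\,\hat H'\,| 1, n_\lambda \rangle\bigr|^2\,\rho(\omega)\,d\Omega . \tag{103.3}$$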


Here $d\Omega$ is the solid angle element corresponding to the direction of
propagation of the photon before absorption. We shall assume that states 1 and 2 of
the electron belong to a discrete spectrum. In this case the final state of the
system with energy $E_2$ belongs to a discrete spectrum, while the initial state
with energy $E_1 + \hbar\omega$ belongs to a continuous spectrum (since the frequency $\omega$
changes in a continuous way). Then the photon absorbed may belong to any
of the oscillators in the interval of states $d\omega\,d\Omega$ in volume $V$. The number of
such oscillators for given polarization per unit volume is given by formula
(38.23) of Part I. Passing to a continuous distribution of frequencies we shall
omit the index $\lambda$ where this cannot lead to misunderstanding, or replace it by
the index $k$.

By $\rho(\omega)$ in expression (103.3) is meant the number of oscillators in
volume $V$ corresponding to unit energy and angular intervals for a given
polarization:

$$\rho(\omega) = \frac{V\omega^2}{(2\pi c)^3\,\hbar} . \tag{103.4}$$













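For the transition probability per unit time, taking into account (103.1), we
obtain

$$dW = \frac{e^2\omega}{2\pi\hbar m^2 c^3}\,\bigl|\bigl((\hat{\mathbf{p}}\cdot\mathbf{e})\,e^{i\mathbf{k}\cdot\mathbf{r}}\bigr)_{21}\bigr|^2\,n_k\,d\Omega . \tag{103.5}$$

The absorption probability is equal to zero for all energies except those which
satisfy the conservation law

$$E_2 = E_1 + \hbar\omega . \tag{103.6}$$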


Let us determine the intensity $J_0(\omega)$ of the incident radiation corresponding
to the frequency interval $d\omega$ and angular interval $d\Omega$. Since to one oscillator
there correspond $n_k$ photons with given polarization, we have

$$J_0(\omega)\,d\omega\,d\Omega = n_k\,\hbar\omega\,c\,\frac{\rho(\omega)}{V}\,\hbar\,d\omega\,d\Omega = n_k\,\frac{\hbar\omega^3\,d\omega\,d\Omega}{(2\pi)^3 c^2} .$$

The total probability is proportional to the intensity of the incident radiation.
The probability of emission of a photon by the electron is easily calculated in
a completely analogous way.

The probability per unit time of the transition with the emission of a
photon with momentum $\hbar\mathbf{k}$ and polarization $\mathbf{e}$ is given by a formula of the
type of (103.3):

$$dW = \frac{e^2\omega}{2\pi\hbar m^2 c^3}\,\bigl|\bigl((\hat{\mathbf{p}}\cdot\mathbf{e})\,e^{-i\mathbf{k}\cdot\mathbf{r}}\bigr)_{12}\bigr|^2\,(n_k + 1)\,d\Omega . \tag{103.7}$$
The emission probability is different from zero if the frequency of the
emitted quantum is equal to

$$\hbar\omega = E_2 - E_1 . \tag{103.8}$$


We see further that the probability of the transition 2 → 1 with the
emission of a photon, given by formula (103.7), consists of two terms. One of
these is proportional to the intensity of radiation (to the number of photons
$n_k$) existing before the emission. The initially existing electromagnetic field
acts on the electron, favouring its transition into a new state with the emission
of an additional photon. This is called stimulated emission. The existence of
stimulated emission was first pointed out by Einstein before the creation of
the modern quantum theory of radiation. The second term of formula
(103.7) does not depend on the intensity of the initial radiation and also
ensures the possibility of emission in the case where before the emission the
electromagnetic field was not excited (the number of photons $n_k = 0$).
Emission of such a type is called spontaneous emission.

From the comparison of formulae (103.5) and (103.7), taking into account
the hermitian property of the matrix elements, it follows that for the ratio of
the probabilities of emission and absorption of a photon one can write

$$\frac{dW_{\rm emiss}}{dW_{\rm abs}} = \frac{n_k + 1}{n_k} . \tag{103.9}$$


We shall see in §12 of Part VI (Volume 4) that it is easy to obtain from 
(103.9) the Planck formula for the intensity distribution in black-body 
radiation. 
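
The route from (103.9) to the Planck distribution can be previewed numerically. The sketch below is not the derivation of Part VI; it assumes, as an extra ingredient not stated in this section, that in thermal equilibrium the populations of the two levels obey the Boltzmann ratio $N_2/N_1 = e^{-\hbar\omega/kT}$ and that emission and absorption balance, $N_2 (n+1) = N_1 n$. The function names are invented for this example.

```python
import math

def planck_occupation(x):
    """Mean photon number n = 1 / (exp(x) - 1), where x = ħω / kT."""
    return 1.0 / math.expm1(x)

def equilibrium_occupation(x):
    """Solve the balance N2 (n + 1) = N1 n for n, with N2/N1 = exp(-x)."""
    ratio = math.exp(-x)          # assumed Boltzmann population ratio
    return ratio / (1.0 - ratio)  # n = r / (1 - r)
```

Both functions return the same value for any $x > 0$, and the resulting $n$ satisfies $(n+1)/n = e^{\hbar\omega/kT}$, i.e. the emission-to-absorption ratio (103.9) exactly compensates the smaller population of the upper level.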

We shall now show that only electrons in bound states can absorb and
emit photons. For this we calculate the integrals involved in the matrix
elements for the transition probabilities, assuming the electron to be free. The
wave functions $\psi_1$ and $\psi_2$ are written in the form of plane waves

$$\psi_1 = C\,e^{(i/\hbar)\,\mathbf{p}_1\cdot\mathbf{r}} , \qquad \psi_2 = C\,e^{(i/\hbar)\,\mathbf{p}_2\cdot\mathbf{r}} ,$$

where $C$ is the normalization constant. Substituting these wave functions into
(103.2), we easily find

$$\int \psi_1^*\,(\hat{\mathbf{p}}\cdot\mathbf{e})\,e^{-i\mathbf{k}\cdot\mathbf{r}}\,\psi_2\,dV = |C|^2 \int e^{-(i/\hbar)\,\mathbf{p}_1\cdot\mathbf{r}}\,(\mathbf{p}_2\cdot\mathbf{e})\,e^{(i/\hbar)(\mathbf{p}_2 - \hbar\mathbf{k})\cdot\mathbf{r}}\,dV \sim \delta(\mathbf{p}_2 - \hbar\mathbf{k} - \mathbf{p}_1) . \tag{103.10}$$

Formula (103.10) expresses the momentum conservation law in the
interaction of a photon with a free electron. Furthermore, the energy conservation
law holds in the transition. Thus the following conditions must be fulfilled
simultaneously:

$$\mathbf{p}_2 = \mathbf{p}_1 + \hbar\mathbf{k} , \tag{103.11}$$
$$E_2 = E_1 + \hbar\omega . \tag{103.12}$$

It is easily seen that eqs. (103.11) and (103.12) are inconsistent. An analogous
conclusion, of course, also applies to the case of absorption.

For the laws of conservation of energy and momentum to hold simul- 
taneously it is necessary that a third body, to which the excess momentum is 
transferred, be involved. In the case of atomic electrons such a body can be 
the nucleus of the atom. 
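
The inconsistency of (103.11) and (103.12) can be made explicit. The following is a sketch rather than the author's own argument: it assumes $E = p^2/2m$ and $\omega = ck$ for the non-relativistic case, and in the second part uses the relativistic energy in the frame where $\mathbf{p}_1 = 0$; the angle $\theta$ between $\mathbf{p}_1$ and $\mathbf{k}$ and the velocity $v_1 = p_1/m$ are symbols introduced here.

```latex
\begin{align*}
\text{Non-relativistic:}\quad
\frac{(\mathbf p_1+\hbar\mathbf k)^2}{2m}-\frac{p_1^{2}}{2m}
  &= \hbar c k
  \;\Longrightarrow\;
  v_1\cos\theta+\frac{\hbar\omega}{2mc}=c , \\
\text{so that}\quad \hbar\omega &= 2mc\,(c-v_1\cos\theta)\approx 2mc^{2}
  \quad (v_1\ll c), \\[4pt]
\text{Relativistic } (\mathbf p_1=0):\quad
\sqrt{m^{2}c^{4}+\hbar^{2}\omega^{2}} &= mc^{2}+\hbar\omega
  \;\Longrightarrow\; 2mc^{2}\,\hbar\omega=0
  \;\Longrightarrow\; \omega=0 .
\end{align*}
```

The non-relativistic conditions could be met only for a photon energy of order $2mc^2$, far outside the region $\hbar\omega \ll mc^2$ in which the theory is applied, and the exact relativistic relations admit no solution with $\omega \neq 0$.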











§ 104. Dipole transitions in atomic systems 


The matrix element for the process of emission of a photon (103.2) can
in most cases be substantially simplified. Usually the wavelength of the
photon emitted is considerably larger than the linear size of the region of
space in which the wave functions of the electron $\psi_1$ and $\psi_2$ are appreciably
different from zero.

For example, let the electron move in an atom whose effective radius is
equal to $a$. Then the wave functions of the initial and final states are very
small outside the range $a$. The energy of the electron in the field of the
nucleus with effective charge $Z^*$ is in order of magnitude equal to $Z^* e^2/a$.
The change $\Delta E$ in the energy of the atom in the transition and, consequently,
the energy of the emitted photon is of the same order of magnitude. Then
the length of the emitted wave is $\lambda = c/\omega \sim \hbar c/\hbar\omega \sim \hbar c a/Z^* e^2$. The ratio of
the atomic size to the wavelength is of the order of

$$\frac{a}{\lambda} \sim \frac{Z^* e^2}{\hbar c} \approx \frac{Z^*}{137} .$$

For the outer electrons $Z^* \sim 1$ and the wavelength is substantially larger
than the atomic size. In the case of X-radiation arising in transitions in the
K-shell of heavy atoms this approximation turns out to be inadequate. For
$\lambda \gg a$ the index of the exponential function inside the integral in (103.2) is
very small within the limits of the effective range of integration, and hence
the factor $e^{-i\mathbf{k}\cdot\mathbf{r}}$ can be replaced by unity.
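
The order-of-magnitude estimate above is easy to evaluate. The sketch below simply computes $Z^* e^2/\hbar c$ for a given effective charge; the numerical value of the fine-structure constant is the standard one, and the two sample charges used afterwards are illustrative choices rather than values from the text.

```python
ALPHA = 1.0 / 137.035999  # fine-structure constant e²/ħc

def size_to_wavelength_ratio(z_eff):
    """Order-of-magnitude estimate a/λ ~ Z* e²/(ħc), with λ = c/ω."""
    return z_eff * ALPHA
```

For an outer electron (`z_eff = 1`) the ratio is below one per cent, so replacing the exponential by unity is well justified; for a K-shell electron of a heavy atom (say `z_eff = 80`) the ratio exceeds one half and the approximation fails, in line with the remark on X-radiation.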

The probability of the transition with emission, (103.7), is then rewritten
in the form

$$dW = \frac{e^2\omega}{2\pi m^2\hbar c^3}\,\bigl|(p_e)_{12}\bigr|^2\,(n_k + 1)\,d\Omega . \tag{104.1}$$

Here $p_e$ is the operator of the component of the momentum of the particle
along the direction of polarization of the emitted quantum.

The matrix element of the momentum operator can be expressed in terms
of the matrix element of the coordinate. According to (31.7) and (49.5) we
have

$$\mathbf{p}_{12} = m\,\mathbf{v}_{12} = m\,(\dot{\mathbf{r}})_{12} = \frac{im}{\hbar}\,(E_1 - E_2)\,\mathbf{r}_{12} = -\frac{im\omega}{e}\,\mathbf{d}_{12} , \tag{104.2}$$

where $\mathbf{d} = e\mathbf{r}$ is the dipole moment of the particle. Substituting (104.2) into
(104.1) we obtain

$$dW = \frac{\omega^3}{2\pi\hbar c^3}\,\bigl|(d_e)_{12}\bigr|^2\,(n_k + 1)\,d\Omega . \tag{104.3}$$
2nhic3 z 


Here $d_e$ is the component of the dipole moment vector of the particle along
the direction of polarization. We see that the transition probability (104.3)
depends on the matrix element of the dipole moment of the particle and
hence such transitions are called dipole transitions and the radiation is called
dipole radiation. If the angle between $\mathbf{d}_{12}$ and the direction of polarization
of the radiation is denoted by $\theta$, then expression (104.3) can be rewritten as

$$dW = \frac{\omega^3}{2\pi\hbar c^3}\,|\mathbf{d}_{12}|^2\,(n_k + 1)\cos^2\theta\,d\Omega . \tag{104.4}$$

We sum this expression over the polarizations of the quantum. As independent
directions of polarization we choose the polarization in the plane $(\mathbf{d}, \mathbf{k})$ and
the polarization in the direction perpendicular to this plane. Expression
(104.4) is then brought into the form

$$dW = \frac{\omega^3}{2\pi\hbar c^3}\,|\mathbf{d}_{12}|^2\,(n_k + 1)\sin^2\vartheta\,d\Omega , \tag{104.5}$$

where $\vartheta$ is the angle between the vector $\mathbf{d}_{12}$ and the direction of propagation
of the radiation $\mathbf{k}$.


The intensity of emission per unit time into the element of solid angle $d\Omega$
is obtained by multiplying (104.5) by the energy of the photon $\hbar\omega$. For
spontaneous emission we have

$$J\,d\Omega = \frac{\omega^4}{2\pi c^3}\,|\mathbf{d}_{12}|^2 \sin^2\vartheta\,d\Omega . \tag{104.6}$$

Integrating over angles we find the total spontaneous emission per unit time

$$\frac{dE}{dt} = \frac{4\omega^4}{3c^3}\,|\mathbf{d}_{12}|^2 . \tag{104.7}$$

This expression is very similar to the classical formula for the intensity of
dipole radiation (see (27.9) of Part I). The difference between the classical
and quantum formulae lies only in the fact that the averaged square of the
dipole moment $\overline{d^2}$ involved in the classical expression must be replaced by
the corresponding matrix element (doubled), $2|\mathbf{d}_{12}|^2$.
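
As a numerical illustration of (104.7), the sketch below estimates the spontaneous transition rate for the 2p → 1s line of hydrogen. It is an example under stated assumptions, none of which come from this section: atomic units ($\hbar = m = e = 1$, $c = 1/\alpha$), the hydrogen level energies $-1/2n^2$, and the standard value $|\langle 1s|z|2p, m{=}0\rangle|^2 = 2^{15}/3^{10}$ in units of the squared Bohr radius. Dividing (104.7) by the photon energy $\hbar\omega$ gives the rate $W = \tfrac{4}{3}\alpha^3\omega^3 |\mathbf{r}_{12}|^2$ in these units.

```python
ALPHA = 1.0 / 137.035999     # fine-structure constant e²/ħc
ATOMIC_TIME = 2.4188843e-17  # atomic unit of time ħ³/(m e⁴), in seconds

def spontaneous_rate_au(omega, r12_sq):
    """Rate (104.7)/(ħω) in atomic units: W = (4/3) α³ ω³ |r_12|²."""
    return 4.0 / 3.0 * ALPHA**3 * omega**3 * r12_sq

# Hydrogen 2p -> 1s: ħω = (1/2)(1 - 1/4) hartree (assumed level energies).
omega = 0.5 * (1.0 - 0.25)
# |<1s| z |2p, m=0>|² in units of a0² (standard tabulated value, assumed).
r12_sq = 2**15 / 3**10

rate_per_second = spontaneous_rate_au(omega, r12_sq) / ATOMIC_TIME
lifetime_ns = 1.0e9 / rate_per_second
```

The result is a rate of roughly $6.3\times10^{8}\ \mathrm{s^{-1}}$, i.e. a lifetime of about 1.6 ns for the 2p state, which is the typical scale for allowed optical dipole transitions.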

Dipole transitions in the absorption of light can be considered in an
analogous way. Setting $e^{i\mathbf{k}\cdot\mathbf{r}} = 1$ in (103.1) and taking into account (104.2),
we obtain for the transition probability per unit time

$$dW = \frac{\omega^3}{2\pi\hbar c^3}\,|\mathbf{d}_{21}|^2\,n_k \cos^2\theta\,d\Omega . \tag{104.8}$$

Averaging this expression over all orientations of the vector $\mathbf{d}$ with respect to
the direction of incident radiation, we find

$$\overline{\cos^2\theta} = \frac{1}{4\pi}\int \cos^2\theta\,d\Omega = \frac{1}{3} . \tag{104.9}$$

Expressing $n_k$ in terms of the intensity of incident radiation $J_0(\omega)$ and
multiplying (104.8) by $\hbar\omega$, we find the energy absorbed per unit time

$$J\,d\Omega = \frac{4\pi^2 e^2 \omega}{3\hbar c}\,|\mathbf{r}_{21}|^2\,J_0(\omega)\,d\Omega . \tag{104.10}$$


So far we have considered the absorption and emission of a photon by one
electron. If the absorbing or emitting system contains several electrons, then, 
disregarding the interaction between them, it can be assumed that formulae 
(104.10) and (104.5) will remain valid, provided that the dipole moment of 
the electron in them is replaced by the sum of the dipole moments of all the 


electrons. 


§105. Quadrupole and magnetic dipole radiation 


The matrix elements of a dipole transition are obtained from the general
expression (103.2) when the exponential function $e^{-i\mathbf{k}\cdot\mathbf{r}}$ is replaced by unity.
It may turn out, however, that the matrix element of the dipole transition
reduces to zero, whereas the precise matrix element (103.2) differs from zero.
In this case one has to expand the exponential $e^{-i\mathbf{k}\cdot\mathbf{r}}$ in a series, writing out
the higher terms of the expansion. Then the emission probability will be
different from zero, although substantially lower than the probability of
dipole radiation. For this reason such transitions are called forbidden. The
emission probability determined by the following terms of the expansion will
have the form

$$dW = \frac{e^2\omega^3}{2\pi\hbar c^3}\,\bigl|\bigl(r_e\,(\mathbf{k}\cdot\mathbf{r})\bigr)_{12}\bigr|^2\,(n_k + 1)\,d\Omega . \tag{105.1}$$

The intensity of spontaneous emission in such a transition, analogously to
(104.6), will be equal to

$$J\,d\Omega = \frac{e^2\omega^4}{2\pi c^3}\,\bigl|\bigl(\mathbf{r}\,(\mathbf{k}\cdot\mathbf{r})\bigr)_{12}\bigr|^2 \sin^2\vartheta\,d\Omega . \tag{105.2}$$

Comparing this formula with the classical expressions (see Vol. 1, §31 of
Part I), we see that (105.2) represents the magnetic dipole and quadrupole
radiation. The probability of the forbidden radiation (magnetic dipole and
quadrupole) is related to the probability of allowed dipole radiation as
$a^2/\lambda^2$ (since $k \sim \lambda^{-1}$ and $r \lesssim a$). If for some reason or other the matrix elements (105.1)
are equal to zero, then the probability of radiation of higher order can be
found in an analogous way.


§ 106. Selection rules 


We see that the character of radiation from atomic and nuclear systems is
determined by the matrix element $\mathbf{d}_{21} = e\,\mathbf{r}_{21}$. Let us now establish when
this matrix element can be different from zero, i.e. between which states of
the system transitions accompanied by dipole radiation are possible. The set
of requirements which must be satisfied by the wave functions of the initial
and final states of the system in order that the matrix element of the dipole
transition $\mathbf{r}_{21}$ may not reduce to zero are called the selection rules for dipole
radiation. The selection rules can easily be formulated in the general form if
the wave functions $\psi_1$ and $\psi_2$ describe the state of a particle moving in a
centrally symmetric field. In this case the dependence of $\psi_1$ and $\psi_2$ on
angles is characterized by spherical functions (see §35). For dipole
transitions to be possible in the system the matrix element of the projection of the
radius vector on the direction of polarization of the quantum $\mathbf{e}$ must be
different from zero. Let us first consider a quantum polarized along the z-axis.
In this case $r_e = z = r\cos\vartheta$. The matrix element of the dipole transition will
be proportional to the integral

$$\int_0^{\pi}\!\!\int_0^{2\pi} Y^*_{l_2 m_2}\,\cos\vartheta\;Y_{l_1 m_1}\,\sin\vartheta\,d\vartheta\,d\varphi . \tag{106.1}$$
W w 


Here $l_1, m_1$ and $l_2, m_2$ are the quantum numbers of the states of the system
before and after the emission of the quantum. Taking into account the
definition of spherical functions (30.16), the integral (106.1) can be rewritten in
the form

$$\int_0^{\pi} P_{l_1}^{m_1}(\cos\vartheta)\,P_{l_2}^{m_2}(\cos\vartheta)\,\cos\vartheta\,\sin\vartheta\,d\vartheta \int_0^{2\pi} e^{i(m_1 - m_2)\varphi}\,d\varphi . \tag{106.2}$$

The integral over the angle $\varphi$ is different from zero only for $m_2 = m_1 = m$. The
integral over the angle $\vartheta$ then has the form

$$\int_{-1}^{1} P_{l_1}^{m}(x)\,x\,P_{l_2}^{m}(x)\,dx . \tag{106.3}$$
=F 


It can be shown that the following relation is valid for associated Legendre
polynomials*:

$$x\,P_l^{m}(x) = \frac{l + |m|}{2l + 1}\,P_{l-1}^{m}(x) + \frac{l - |m| + 1}{2l + 1}\,P_{l+1}^{m}(x) . \tag{106.4}$$

Substituting this expression into (106.3) and taking into account the
conditions of orthogonality of associated Legendre polynomials, we find that
integral (106.3) is different from zero only for $l_2 = l_1 \pm 1$.

Thus we see that if the radiation is polarized along the z-axis, then the
matrix element of the dipole transition differs from zero only for transitions
with $m_2 = m_1$, $l_2 = l_1 \pm 1$.
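
The selection rule for the integral (106.3) can be verified exactly with rational arithmetic. The sketch below is an independent check, not the method of the text: it assumes the definition $P_l^m(x) = (1 - x^2)^{m/2}\,d^m P_l/dx^m$ for $m \ge 0$ (sign conventions do not affect whether the integral vanishes), so that the integrand of (106.3) is a polynomial. All function names are invented for this example.

```python
from fractions import Fraction

def poly_mul(a, b):
    """Product of two polynomials given as coefficient lists (lowest power first)."""
    out = [Fraction(0)] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out

def poly_deriv(a):
    """Derivative of a polynomial; the zero polynomial is returned as [0]."""
    return [k * a[k] for k in range(1, len(a))] or [Fraction(0)]

def legendre(l):
    """Coefficients of P_l(x) from (n+1) P_{n+1} = (2n+1) x P_n - n P_{n-1}."""
    p_prev, p = [Fraction(1)], [Fraction(0), Fraction(1)]
    if l == 0:
        return p_prev
    for n in range(1, l):
        nxt = [Fraction(2 * n + 1, n + 1) * c for c in [Fraction(0)] + p]
        for k, c in enumerate(p_prev):
            nxt[k] -= Fraction(n, n + 1) * c
        p_prev, p = p, nxt
    return p

def integrate_minus1_to_1(a):
    """Exact integral over [-1, 1]: only even powers contribute 2/(k+1)."""
    return sum(c * Fraction(2, k + 1) for k, c in enumerate(a) if k % 2 == 0)

def theta_integral(l1, l2, m):
    """Exact value of the integral (106.3) for m >= 0."""
    d1, d2 = legendre(l1), legendre(l2)
    for _ in range(m):
        d1, d2 = poly_deriv(d1), poly_deriv(d2)
    integrand = poly_mul(poly_mul(d1, d2), [Fraction(0), Fraction(1)])  # times x
    for _ in range(m):
        integrand = poly_mul(integrand, [Fraction(1), Fraction(0), Fraction(-1)])  # times (1 - x²)
    return integrate_minus1_to_1(integrand)
```

Scanning `theta_integral(l1, l2, m)` over small values with $l_1, l_2 \ge m$ shows that the result is non-zero exactly when $|l_1 - l_2| = 1$, as stated above.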

* See, for example, N. N. Lebedev, Special functions and their applications (Prentice
Hall, Englewood Cliffs, N.J., 1965).

Let us now define analogous selection rules for the quantum numbers $l$, $m$
in the case where the quantum is emitted in the direction of the z-axis and,
consequently, is polarized in the $(x, y)$-plane. We consider the case of circular
polarization with a phase shift equal to $\tfrac{1}{2}\pi$. Then the transition probability is
determined by the matrix element of the quantity $x \pm iy$:

$$(x \pm iy)_{21} = (r \sin\vartheta\,e^{\pm i\varphi})_{21} . \tag{106.5}$$

Separating the integral over the angle $\varphi$, we obtain

$$\int_0^{2\pi} e^{i(m_1 - m_2 \pm 1)\varphi}\,d\varphi . \tag{106.6}$$

This integral is different from zero under the condition

$$m_2 = m_1 \pm 1 . \tag{106.7}$$

The corresponding integral over the angle $\vartheta$ is different from zero if
$l_2 = l_1 \pm 1$. Thus the selection rules obtained for the quantum numbers $l$ and $m$
for dipole transitions can finally be formulated in the form

$$\Delta m = 0, \pm 1 ; \qquad \Delta l = \pm 1 . \tag{106.8}$$


It can easily be seen that the selection rules given by relations (106.8) express 
the angular momentum conservation law. The fact that / may change by one 
shows that in a dipole transition the emitted quantum carries away an angular 
momentum equal to one. At first sight this conclusion may seem to be 
strange. As a matter of fact, we have considered (formula (103.7)) transitions 
with the emission of a photon of given momentum. But in a state with given 
momentum and polarization the angular momentum of the photon does not 
have a sharp value. If, however, the wavelength of the photon is large in 
comparison with the size of the system, then it is possible to expand the 
function $e^{-i\mathbf{k}\cdot\mathbf{r}}$ in a series. Carrying out this expansion and retaining the first
non-vanishing term, i.e. the dominant term determining the value of the 
matrix element, we in fact separate photons with given total angular mo- 
mentum. To dipole radiation there correspond photons with angular mo- 
mentum one, to quadrupole radiation photons with angular momentum two 
and so on. Direct calculation, on which we cannot dwell, confirms this 
conclusion. 

Selection rules (106.8) automatically satisfy the requirements of the parity 
conservation law. Since the operator $\mathbf{r}$ is odd, the functions $\psi_1$ and $\psi_2$ must
have different parity. Then the entire matrix element remains invariant under 
the replacement r > (—r). 

In deriving relations (106.1) the spin states of the electron were not taken 
into account, i.e. it was assumed that the spin state is not related to the 
orbital motion. In this case conditions (106.8) must be supplemented by the 
relation $\Delta s = 0$ which expresses spin conservation in the dipole transition.
However, if the spin—orbit interaction cannot be disregarded, as, for example, 
in the case of heavy atoms and nuclei, then it is necessary to formulate 
selection rules for the total angular momentum J. Taking into account that in 
a dipole transition the quantum carries away an angular momentum equal to 
one, then, according to the rule of addition of angular momenta in quantum 
mechanics we obtain 


$$\Delta j = 0, \pm 1 \quad (\text{excluding } 0 \to 0 \text{ transitions}) . \tag{106.9}$$

In this case transitions with $\Delta j = 0$ are not forbidden, since the total
angular momentum is not directly related to the parity of the state. The
transition from the state $j_1 = 0$ into the state $j_2 = 0$ is forbidden, since in this
case the total angular momentum conservation law cannot be satisfied.







In the case of magnetic dipole radiation the quantum also carries away
an angular momentum equal to one. However, the magnetic dipole quantum 
has a parity opposite to the parity of the electric dipole quantum. This is 
associated with the fact that the magnetic moment operator does not change 
sign under the inversion of the system of coordinates, since the magnetic 
moment is a pseudovector. Consequently, the matrix elements of the mag- 
netic moment operator are different from zero only for transitions between 
states of the same parity. 

The electric quadrupole quantum carries away an angular momentum equal 
to two. In correspondence with this the total angular momentum selection 
rules are of the form 


$$\Delta j = 0, \pm 1, \pm 2 . \tag{106.10}$$

Transitions with the angular momenta

$$0 \to 0 ; \qquad \tfrac{1}{2} \to \tfrac{1}{2} ; \qquad 0 \leftrightarrow 1$$

are forbidden. The change in angular momentum in the emission given by
relations (106.9) and (106.10) refers either to one particle, if only its state
is changed, or to the system as a whole, for example to an atom or nucleus.
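
The rules (106.9) and (106.10), together with their exceptions, follow from the rule of addition of angular momenta: a quantum carrying angular momentum $L$ can connect states $j_1$ and $j_2$ only if $|j_1 - j_2| \le L \le j_1 + j_2$. The sketch below encodes this condition alone; it deliberately ignores the parity requirement, and the function name and argument order are assumptions made for the example.

```python
from fractions import Fraction

def momentum_allowed(j1, j2, photon_l):
    """Triangle condition |j1 - j2| <= L <= j1 + j2 (parity not checked)."""
    j1, j2 = Fraction(j1), Fraction(j2)
    return abs(j1 - j2) <= photon_l <= j1 + j2

HALF = Fraction(1, 2)

# Dipole quantum (L = 1): Δj = 0, ±1, but 0 -> 0 is excluded.
dipole_cases = {
    (0, 0): momentum_allowed(0, 0, 1),
    (HALF, HALF): momentum_allowed(HALF, HALF, 1),
    (0, 1): momentum_allowed(0, 1, 1),
}

# Quadrupole quantum (L = 2): 0 -> 0, 1/2 -> 1/2 and 0 <-> 1 are excluded.
quadrupole_cases = {
    (0, 0): momentum_allowed(0, 0, 2),
    (HALF, HALF): momentum_allowed(HALF, HALF, 2),
    (0, 1): momentum_allowed(0, 1, 2),
    (1, 1): momentum_allowed(1, 1, 2),
    (0, 2): momentum_allowed(0, 2, 2),
}
```

The three quadrupole exceptions listed above are exactly the cases in which $j_1 + j_2 < 2$, so that two units of angular momentum cannot be carried away.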

If the system is in a certain excited state and dipole transition to a lower 
energy state is forbidden, then the lifetime of the system in this excited state
can be rather large. States of such a type are said to be metastable. In gases 
which are not very rarefied a metastable atom usually transfers its excitation 
energy in collisions with other atoms without emission. 

Transitions associated with the angular momentum change $\Delta j \approx 4, 5$,
which are strongly forbidden, are observed in nuclei. The lifetime of the 
nucleus with respect to such a transition for small excitation energies may 
reach several months. Such nuclei are said to be isomeric. They were first 
observed by Kurchatov and Rusinov. 


§107. The photoelectric effect 


The process of absorption of a photon by a bound particle when the
energy of the photon exceeds the binding energy of the particle is called the
photoelectric effect. In particular, in the photoelectric effect in an atom an
electron in a state belonging to a discrete spectrum absorbs the photon and
makes a transition to the continuous spectrum. The kinetic energy $T$ of the
electron knocked out of the atom is defined by the Einstein relation

$$T = \hbar\omega - I , \tag{107.1}$$

where $I$ is the ionization energy of the atom.







The momentum excess arising when the photon is absorbed is transferred
to the nucleus. The more strongly the electron is bound in the atom, the
more easily the momentum is transferred to the nucleus. Hence it is to be
expected that the probability of occurrence of the photoelectric effect will
have a maximum value for the most strongly bound electrons, the electrons
of the K-shell.

In what follows we shall restrict ourselves to the consideration of this case.
The matrix element of the transition with the absorption of one quantum has
the form (103.1). The wave functions $\psi_1$ and $\psi_2$ in the matrix element
correspond respectively to the ground state of the electron in the atom and to a
state belonging to the continuous spectrum. Since we do not take into
account relativistic effects, it is obvious that the energy of the photon must
in any case be small in comparison with the rest energy of the electron,
$\hbar\omega \ll mc^2$.

On the other hand, we exclude the region close to the threshold of the 
photoelectric effect, and assume that the energy of the photon is large in 
comparison with the ionization energy of the atom. Taking into account 
(107.1) and (38.17), these requirements lead to the inequality 


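$$\frac{I}{T} = \frac{Z^2 e^4 m}{2\hbar^2}\,\frac{2}{mv^2} = \left(\frac{Ze^2}{\hbar v}\right)^2 \ll 1 , \tag{107.2}$$
where $v$ is the velocity of the photoelectron.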
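As a rough numerical illustration of the two conditions above, the following Python sketch evaluates $I/T$ and $\hbar\omega/mc^2$ for a K-shell electron. It assumes a hydrogen-like ionization energy $I = Z^2 \times 13.6$ eV; the function name `born_window` and the sample values ($Z = 13$, a 50 keV photon) are illustrative choices, not taken from the text.

```python
RYDBERG_EV = 13.6057      # ionization energy of hydrogen, eV
MC2_EV = 510998.95        # electron rest energy, eV


def born_window(Z, photon_energy_ev):
    """Return (I/T, hbar*omega/mc^2) for a hydrogen-like K-shell electron.

    Both numbers should be small for the treatment of this section to apply.
    """
    ionization = Z**2 * RYDBERG_EV            # I = Z^2 e^4 m / 2 hbar^2
    kinetic = photon_energy_ev - ionization   # Einstein relation (107.1)
    if kinetic <= 0:
        raise ValueError("photon energy is below the photoelectric threshold")
    return ionization / kinetic, photon_energy_ev / MC2_EV


ratio, rel = born_window(Z=13, photon_energy_ev=50e3)
print(f"I/T = {ratio:.3f}, hbar*omega/mc^2 = {rel:.3f}")
```

Both ratios come out well below unity for these sample values, so such a photon energy lies inside the window in which the Coulomb field can be treated as a perturbation and relativistic effects can be ignored.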


According to the results of §84, the fulfillment of inequality (107.2) means 
that the Coulomb field acting on the electron can be considered as a small 
perturbation. Consequently, one can take a plane wave for the wave function
$\psi_2$ of the free particle in the zero order approximation (disregarding the
action of the Coulomb field on the free electron). The wave function of an
electron in the K-shell can be written in the form of the hydrogen function
with the effective nuclear charge $Z$ (the action of other electrons on K-electrons
is small). We then have
$$\psi_1 = \left(\frac{Z^3}{\pi a^3}\right)^{1/2} e^{-Zr/a} ; \qquad \psi_2 = \frac{1}{(2\pi\hbar)^{3/2}}\, e^{(i/\hbar)(\mathbf p\cdot\mathbf r)} . \tag{107.3}$$
We normalize the wave function of the final state $\psi_2$ belonging to the
continuous spectrum to the $\delta$-function in momentum space. The transition
probability per unit time, according to (56.8′), is equal to
$$dW = \frac{2\pi}{\hbar}\,|H'_{21}|^2\,\delta(E_2 - E_1 - \hbar\omega)\,p^2\,dp\,d\Omega . \tag{107.4}$$
Here $d\Omega$ is the element of solid angle characterizing the direction of the
momentum $\mathbf p$ of the emitted photoelectron, $E_1 = -I$, and $E_2 = p^2/2m$.







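We integrate (107.4) over the energies of the final state:
$$dW = \frac{2\pi m}{\hbar}\,|H'_{21}|^2\,p\,d\Omega . \tag{107.5}$$
The value of the momentum $p$ is determined by relation (107.1). The
integral in the matrix element in (107.5) is easily calculated after the substitution
of the wave functions given above. For this we write it in the form
$$H'_{21} = -\frac{e}{m}\left(\frac{2\pi\hbar}{V\omega}\right)^{1/2} \frac{1}{(2\pi\hbar)^{3/2}} \left(\frac{Z^3}{\pi a^3}\right)^{1/2} \int e^{(i/\hbar)(\mathbf q\cdot\mathbf r)}\,(\mathbf e\cdot\hat{\mathbf p})\,e^{-Zr/a}\,dV ,$$
where $\mathbf e$ is the polarization vector of the photon.

Here we have introduced the vector $\mathbf q = \hbar\mathbf k - \mathbf p$, which represents the
momentum transferred to the nucleus, and used the transverse property of
electromagnetic waves, $\mathbf e\cdot\mathbf k = 0$. Integrating by parts we obtain
$$H'_{21} = -\frac{e}{m}\left(\frac{2\pi\hbar}{V\omega}\right)^{1/2} \frac{1}{(2\pi\hbar)^{3/2}} \left(\frac{Z^3}{\pi a^3}\right)^{1/2} (\mathbf e\cdot\mathbf p) \int e^{-Zr/a}\,e^{(i/\hbar)(\mathbf q\cdot\mathbf r)}\,dV .$$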


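Passing to spherical coordinates with the polar axis directed along the vector
$\mathbf q$, we find
$$\int e^{-Zr/a}\,e^{(i/\hbar)(\mathbf q\cdot\mathbf r)}\,dV = \frac{2\pi\hbar}{iq}\int_0^\infty \left(e^{(i/\hbar)qr} - e^{-(i/\hbar)qr}\right) e^{-Zr/a}\,r\,dr = \frac{8\pi a^3}{Z^3\,(1 + q^2a^2/Z^2\hbar^2)^2} .$$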
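The closed form of this integral can be checked numerically. The sketch below works in units where $a/Z = 1$ and uses the dimensionless variable $\kappa = qa/Z\hbar$ (both are conventions chosen here for convenience); the radial integral then reads $(4\pi/\kappa)\int_0^\infty r \sin(\kappa r)\,e^{-r}\,dr$ and should equal $8\pi/(1+\kappa^2)^2$. The function names and the Simpson-rule parameters are illustrative.

```python
import math


def radial_integral(kappa, r_max=60.0, n=60000):
    """Simpson estimate of (4*pi/kappa) * int_0^r_max r*sin(kappa*r)*exp(-r) dr.

    n must be even; exp(-r_max) is negligible for r_max = 60.
    """
    h = r_max / n

    def f(r):
        return r * math.sin(kappa * r) * math.exp(-r)

    s = f(0.0) + f(r_max)
    for k in range(1, n):
        s += (4 if k % 2 else 2) * f(k * h)
    return 4.0 * math.pi / kappa * s * h / 3.0


def closed_form(kappa):
    """Right-hand side 8*pi/(1 + kappa^2)^2 in units a/Z = 1."""
    return 8.0 * math.pi / (1.0 + kappa**2) ** 2


for kappa in (0.5, 2.0, 5.0):
    print(kappa, radial_integral(kappa), closed_form(kappa))
```

The two columns agree to within the quadrature error, which supports the $(1 + q^2a^2/Z^2\hbar^2)^{-2}$ dependence used in the matrix element.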


Consequently, the matrix element $H'_{21}$ is of the form
$$H'_{21} = -\frac{4e}{m\hbar}\left(\frac{a^3}{\pi Z^3 V\omega}\right)^{1/2} \frac{(\mathbf e\cdot\mathbf p)}{(1 + q^2a^2/\hbar^2Z^2)^2} . \tag{107.6}$$
We obtain the differential cross section for the photoelectric effect if we
divide the transition probability per unit time (107.5) by the incident photon
flux density. Since one quantum is absorbed and the process is normalized in
such a way that in volume $V$ there is one photon, the incident photon
flux density is equal to $c/V$. In correspondence with this we have
$$d\sigma = 32\,r_0\left(\frac{a}{Z}\right)^3 \frac{c\,p\,(\mathbf e\cdot\mathbf p)^2}{\hbar^3\omega}\,\frac{d\Omega}{(1 + q^2a^2/\hbar^2Z^2)^4} . \tag{107.7}$$










The constant $r_0 = e^2/mc^2 \sim 10^{-13}$ cm, as is well known, is called the classical
radius of the electron. Expression (107.7) can be simplified. First of all, we
denote the angle between the direction of momentum of the incident quantum
and that of the emitted photoelectron by $\vartheta$, i.e. the angle between the
vectors $\mathbf k$ and $\mathbf p$, and the angle between the $(\mathbf p,\mathbf k)$-plane and the $(\mathbf e,\mathbf k)$-plane by $\varphi$.
Then, writing $\hbar k = \kappa$, we have
$$\mathbf p\cdot\mathbf e = p\sin\vartheta\cos\varphi , \qquad q^2 = p^2 + \kappa^2 - 2p\kappa\cos\vartheta . \tag{107.8}$$


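The expression $1 + q^2a^2/\hbar^2Z^2$ involved in (107.7) can also be rewritten in a
simpler form:
$$1 + \frac{q^2a^2}{\hbar^2Z^2} = \frac{a^2}{\hbar^2Z^2}\left(\frac{\hbar^2Z^2}{a^2} + q^2\right) = \frac{a^2}{\hbar^2Z^2}\left(\frac{\hbar^2Z^2}{a^2} + p^2 + \kappa^2 - 2p\kappa\cos\vartheta\right) .$$
Taking into account that $I = Z^2e^4m/2\hbar^2$ and $a = \hbar^2/me^2$, it follows from relation (107.1)
that
$$\frac{\hbar^2Z^2}{a^2} + p^2 = \frac{Z^2m^2e^4}{\hbar^2} + p^2 = 2m\hbar\omega = 2m\kappa c .$$
Then the preceding relation is rewritten as
$$1 + \frac{q^2a^2}{\hbar^2Z^2} = \frac{a^2}{\hbar^2Z^2}\,\kappa\,(2mc + \kappa - 2p\cos\vartheta) \approx \frac{a^2}{\hbar^2Z^2}\,2m\hbar\omega\,(1 - \beta\cos\vartheta) , \tag{107.9}$$
where $\beta = v/c$.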

Here we have made use of the condition $\kappa c = \hbar\omega \ll mc^2$. The absolute
value of the momentum $p$ involved in (107.5) can be replaced, according to
(107.1) and (107.2), by the quantity $(2m\hbar\omega)^{1/2}$. Taking into account relations
(107.9) and (107.8), we obtain for the differential cross section (107.7) the
following expression:
$$d\sigma = 4\sqrt2\,\frac{Z^5}{137^4}\,r_0^2\left(\frac{mc^2}{\hbar\omega}\right)^{7/2} \frac{\sin^2\vartheta\cos^2\varphi}{(1 - \beta\cos\vartheta)^4}\,d\Omega . \tag{107.10}$$


Since expression (107.10) is obtained in the non-relativistic approximation, it
makes sense only to within terms of the first order with respect to $\beta$. Hence
we have finally
$$d\sigma = 4\sqrt2\,\frac{Z^5}{137^4}\,r_0^2\left(\frac{mc^2}{\hbar\omega}\right)^{7/2} \sin^2\vartheta\cos^2\varphi\,(1 + 4\beta\cos\vartheta)\,d\Omega . \tag{107.11}$$
From expression (107.11) it follows that photoelectrons are emitted
mainly in the direction of polarization of the photon, $\vartheta = \tfrac12\pi$, $\varphi = 0$. Photoelectrons
are not emitted in the direction of propagation of the quantum
($\vartheta = 0$). As the energy of the quantum increases the maximum is considerably
displaced in the forward direction. In order to obtain the total cross section
for the photoelectric effect on the K-shell it is necessary to integrate (107.11)
over all angles $\vartheta$, $\varphi$ and in addition to introduce the factor 2, since there are
two electrons in the K-shell:
$$\sigma = \frac{32\sqrt2}{3}\,\pi\,\frac{Z^5}{137^4}\,r_0^2\left(\frac{mc^2}{\hbar\omega}\right)^{7/2} . \tag{107.12}$$


We see that the total cross section increases rapidly with increasing charge of
the nucleus (as $Z^5$) and decreases with increasing frequency of the quantum
(as $\omega^{-7/2}$).
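The angular integration leading from (107.11) to (107.12), and the two scaling laws just stated, can be checked with a short Python sketch. The numerical constants (fine-structure constant, classical electron radius in cm, electron rest energy in keV) are standard values supplied here, and the function names are illustrative.

```python
import math

ALPHA = 1.0 / 137.036      # fine-structure constant (the text uses 1/137)
R0_CM = 2.8179e-13         # classical electron radius, cm
MC2_KEV = 511.0            # electron rest energy, keV


def angular_integral(beta, n=400):
    """Midpoint-rule integral of sin^2(th) cos^2(ph) (1 + 4 beta cos th) over the sphere."""
    dth, dph = math.pi / n, 2.0 * math.pi / n
    total = 0.0
    for i in range(n):
        th = (i + 0.5) * dth
        # sin^3 = sin^2 from the integrand times sin from the solid-angle element
        w_th = math.sin(th) ** 3 * (1.0 + 4.0 * beta * math.cos(th)) * dth
        for j in range(n):
            ph = (j + 0.5) * dph
            total += w_th * math.cos(ph) ** 2 * dph
    return total


def sigma_K(Z, photon_kev):
    """Total K-shell cross section (107.12) in cm^2 (non-relativistic, far from threshold)."""
    prefactor = 32.0 * math.sqrt(2.0) * math.pi / 3.0
    return prefactor * Z**5 * ALPHA**4 * R0_CM**2 * (MC2_KEV / photon_kev) ** 3.5


print(angular_integral(0.3), 4.0 * math.pi / 3.0)   # the beta term integrates to zero
print(sigma_K(13, 50.0))
```

The term linear in $\beta$ drops out on integration, so the angular factor is $4\pi/3$ and the prefactor becomes $2 \cdot 4\sqrt2 \cdot 4\pi/3 = 32\sqrt2\,\pi/3$, in agreement with (107.12).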


§108. The scattering of light by atoms 


As an important example of a process involving two photons we shall
consider the quantum theory of the scattering of light by atoms. Let a
photon with wave vector $\mathbf k_1$ be incident on an atom, and let a photon with
wave vector $\mathbf k_2$ be emitted. We denote the corresponding frequencies by $\omega_1$
and $\omega_2$, and the polarization vectors by $\mathbf e_1$ and $\mathbf e_2$. If the frequency of the
incident photon is equal to the frequency of the emitted photon, i.e.
$\omega_1 = \omega_2$, then after scattering the atom returns to its initial state. Such
scattering without change in the frequency is called coherent. We have seen
in Part I that from the point of view of classical radiation theory only
coherent scattering is possible. From the quantum-mechanical point of view
scattering with a change in the frequency is as natural as coherent scattering,
and was observed experimentally by Raman and independently by Mandelshtam
and Landsberg. It is called Raman scattering.

For generality we shall assume that the state of the atom changes in the 
act of scattering. We shall suppose that the energy of the incident photon is 
lower than the binding energy of the electron in the atom, which corresponds 
to the visible region of the spectrum. For photon energies large in comparison 
with this quantity the electron can be considered as free. However, we shall 
defer the consideration of the scattering of a photon by a free electron (the 
Compton effect) to §127. 




If in the initial state the energy of the atom is $E_1$ and in the final state $E_2$,
then the energy conservation law gives
$$\hbar\omega_2 = \hbar\omega_1 + E_1 - E_2 . \tag{108.1}$$
Let us write down the matrix elements of the interaction operator $\hat H'$
(102.2) for the scattering process. Since the process involves two photons,
it is necessary to take into account the contribution of the operator $\hat{\mathbf A}^2$ to
the matrix element. We denote the operator $-(e/mc)(\hat{\mathbf p}\cdot\hat{\mathbf A})$ by $\hat H'_1$, and the
operator $(e^2/2mc^2)\hat{\mathbf A}^2$ by $\hat H'_2$. Then the total perturbation operator is
$$\hat H' = \hat H'_1 + \hat H'_2 . \tag{108.2}$$


Taking into account (102.5), it follows from (102.4) that the operator $\hat{\mathbf A}^2$
gives a contribution to the process studied in the first approximation of
perturbation theory:
$$(\hat H'_2)_{21} = \frac{2\pi e^2\hbar}{mV(\omega_1\omega_2)^{1/2}} \int \psi_2^*\,e^{i(\mathbf k_1 - \mathbf k_2)\cdot\mathbf r}\,(\mathbf e_1\cdot\mathbf e_2)\,\psi_1\,dV . \tag{108.3}$$
For the frequencies considered the wavelength of the light is considerably
larger than the size of the atom. In correspondence with this the exponential
function in (108.3) can be set equal to unity. Also taking into account the
orthogonality of the functions $\psi_1$ and $\psi_2$, we have
$$(\hat H'_2)_{21} = \frac{2\pi e^2\hbar}{mV(\omega_1\omega_2)^{1/2}}\,(\mathbf e_1\cdot\mathbf e_2)\,\delta_{12} . \tag{108.4}$$

The operator $\hat H'_1$ has matrix elements different from zero only for processes
involving one photon. In the case of scattering the operator $\hat H'_1$ can give a
contribution to the transition probability only in the second order approximation
of perturbation theory. In §56 it was shown that in order to define
the probability of a process in the second order approximation it is necessary
to find the matrix elements of the operator $\hat H'_1$ corresponding to transitions
into intermediate states.

Two types of intermediate states, over which the summation should be
carried out, are possible in the process of scattering. (1) In the transition from
the initial to the intermediate state of the first type the photon $\mathbf k_1$ is absorbed
and the atom makes a transition into a certain state which we shall characterize
by the index $i$ (energy $E_i$). In the subsequent transition from the
intermediate into the final state the photon $\mathbf k_2$ is emitted. According to
(103.1) and (103.2) the matrix elements for the transition from the initial
into the intermediate state and from the intermediate into the final state are
of the form
$$(\hat H'_1)^{\rm (I)}_{i1} = -\frac{e}{m}\left(\frac{2\pi\hbar}{V\omega_1}\right)^{1/2} \int \psi_i^*\,(\mathbf e_1\cdot\hat{\mathbf p})\,e^{i\mathbf k_1\cdot\mathbf r}\,\psi_1\,dV , \qquad (\hat H'_1)^{\rm (I)}_{2i} = -\frac{e}{m}\left(\frac{2\pi\hbar}{V\omega_2}\right)^{1/2} \int \psi_2^*\,(\mathbf e_2\cdot\hat{\mathbf p})\,e^{-i\mathbf k_2\cdot\mathbf r}\,\psi_i\,dV . \tag{108.5}$$


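(2) In the transition to the intermediate state of the second type the emission
of the photon $\mathbf k_2$ first occurs. Then the photon $\mathbf k_1$ is absorbed and the atom
makes a transition from the intermediate into the final state. We recall that
the energy conservation law holds only for the initial and final states. The
matrix elements of the transition via the second intermediate state are written
as
$$(\hat H'_1)^{\rm (II)}_{i1} = -\frac{e}{m}\left(\frac{2\pi\hbar}{V\omega_2}\right)^{1/2} \int \psi_i^*\,(\mathbf e_2\cdot\hat{\mathbf p})\,e^{-i\mathbf k_2\cdot\mathbf r}\,\psi_1\,dV , \qquad (\hat H'_1)^{\rm (II)}_{2i} = -\frac{e}{m}\left(\frac{2\pi\hbar}{V\omega_1}\right)^{1/2} \int \psi_2^*\,(\mathbf e_1\cdot\hat{\mathbf p})\,e^{i\mathbf k_1\cdot\mathbf r}\,\psi_i\,dV . \tag{108.6}$$
The composite matrix element $M_{21}$ is given by formula (56.19):
$$M_{21} = \sum_i \left( \frac{(\hat H'_1)^{\rm (I)}_{2i}\,(\hat H'_1)^{\rm (I)}_{i1}}{E_{\rm init} - E_i} + \frac{(\hat H'_1)^{\rm (II)}_{2i}\,(\hat H'_1)^{\rm (II)}_{i1}}{E_{\rm init} - E_i - \hbar\omega_1 - \hbar\omega_2} \right) . \tag{108.7}$$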
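The energy of the initial state is made up of the energy of the atom $E_1$ and
the energy of the incident quantum $\hbar\omega_1$, i.e.
$$E_{\rm init} = E_1 + \hbar\omega_1 . \tag{108.8}$$
The energy of the intermediate state of the second type involves, in addition
to the energy of the atom, the total energy of the two photons.
Substituting expressions (108.5) and (108.6) into (108.7) we obtain
$$M_{21} = \frac{2\pi\hbar e^2}{m^2V(\omega_1\omega_2)^{1/2}} \sum_i \left( \frac{(\mathbf e_2\cdot\mathbf p)_{2i}(\mathbf e_1\cdot\mathbf p)_{i1}}{E_1 - E_i + \hbar\omega_1} + \frac{(\mathbf e_1\cdot\mathbf p)_{2i}(\mathbf e_2\cdot\mathbf p)_{i1}}{E_1 - E_i - \hbar\omega_2} \right) . \tag{108.9}$$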

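Here we have substituted unity for the exponential expressions $e^{i\mathbf k_1\cdot\mathbf r}$ and
$e^{-i\mathbf k_2\cdot\mathbf r}$. The summation over the energy states of the atom must also involve
integration over states belonging to the continuous spectrum. The total matrix
element for the process considered is obtained by adding to (108.9) the
matrix element (108.4):
$$M_{21} = \frac{2\pi e^2\hbar}{mV(\omega_1\omega_2)^{1/2}} \left[ \frac1m \sum_i \left( \frac{(\mathbf e_2\cdot\mathbf p)_{2i}(\mathbf e_1\cdot\mathbf p)_{i1}}{E_1 - E_i + \hbar\omega_1} + \frac{(\mathbf e_1\cdot\mathbf p)_{2i}(\mathbf e_2\cdot\mathbf p)_{i1}}{E_1 - E_i - \hbar\omega_2} \right) + \delta_{12}\,(\mathbf e_1\cdot\mathbf e_2) \right] . \tag{108.10}$$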

The transition probability per unit time is given, as always, by the formula
$$dW = \frac{2\pi}{\hbar}\,|M_{21}|^2\,\rho(\omega_2)\,d\Omega . \tag{108.11}$$
Here $\rho(\omega_2)$, as in §103, denotes the number of field oscillators in volume $V$
corresponding to unit energy interval (see (103.4)). The solid angle element
$d\Omega$ characterizes the direction of the momentum of the scattered photon.

Dividing the transition probability per unit time (108.11) by the incident
photon flux density which, as in the preceding section, is equal to $c/V$, we
obtain the expression for the differential cross section for the process
$$d\sigma = r_0^2\,\frac{\omega_2}{\omega_1} \left| \frac1m \sum_i \left( \frac{(\mathbf e_2\cdot\mathbf p)_{2i}(\mathbf e_1\cdot\mathbf p)_{i1}}{E_1 - E_i + \hbar\omega_1} + \frac{(\mathbf e_1\cdot\mathbf p)_{2i}(\mathbf e_2\cdot\mathbf p)_{i1}}{E_1 - E_i - \hbar\omega_2} \right) + \delta_{12}\,(\mathbf e_1\cdot\mathbf e_2) \right|^2 d\Omega . \tag{108.12}$$
It is clear that formula (108.12) is inapplicable in the case of perfect
resonance, $\hbar\omega_1 = E_i - E_1$.

Formula (108.12) characterizes the scattering capacity of the atom as a
function of the frequency of the incident light, and hence is called, as in
classical electrodynamics, a dispersion formula. The last term of (108.12) is
different from zero only for coherent scattering, $\omega_1 = \omega_2$, for which the
initial and final states of the atom are the same. If the initial and final states
of the atom are not identical, then the frequency of the scattered radiation is
shifted with respect to the frequency of the incident radiation by an amount
corresponding to the difference between the energy states of the atom
(108.1). Scattering of such a type is called the Raman effect.

Formula (108.12) can be rewritten in a somewhat different form if the 
matrix elements of the momentum are expressed in terms of the matrix ele- 
ments of the coordinates of the electron. First of all, it is convenient to write 
the last term in the bracket in (108.12) in the same form as the two preceding 
terms. 

For this we note* that, as a result of the commutation of the corresponding
components of the operators $\hat{\mathbf p}$ and $\mathbf r$, the scalar product $\mathbf e_1\cdot\mathbf e_2$ can be
expressed as
$$(\mathbf e_1\cdot\mathbf e_2) = \frac{i}{\hbar}\left[(\mathbf e_1\cdot\hat{\mathbf p})(\mathbf e_2\cdot\mathbf r) - (\mathbf e_2\cdot\mathbf r)(\mathbf e_1\cdot\hat{\mathbf p})\right] . \tag{108.13}$$

* See the monograph of A.I.Akhiezer and V.B.Berestetskii, Quantum electrodynamics
(Interscience Publishers, New York, 1965).


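We shall take into account the fact that the scalar product $\mathbf e_1\cdot\mathbf e_2$ is involved
in (108.12) only for coherent scattering if we take from (108.13) the
matrix element of the transition from the initial into the final state. This
matrix element is, by virtue of the orthogonality of the corresponding wave
functions, different from zero only under the condition that the initial state
of the atom is the same as the final state:
$$\delta_{12}\,(\mathbf e_1\cdot\mathbf e_2) = \frac{i}{\hbar}\left[(\mathbf e_1\cdot\hat{\mathbf p})(\mathbf e_2\cdot\mathbf r) - (\mathbf e_2\cdot\mathbf r)(\mathbf e_1\cdot\hat{\mathbf p})\right]_{21} = \frac{i}{\hbar}\sum_i \left( (\mathbf e_1\cdot\mathbf p)_{2i}(\mathbf e_2\cdot\mathbf r)_{i1} - (\mathbf e_2\cdot\mathbf r)_{2i}(\mathbf e_1\cdot\mathbf p)_{i1} \right) . \tag{108.14}$$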


We substitute the expression obtained into (108.12). The matrix elements of
the momentum, according to (49.5), are expressed in terms of the matrix
elements of the coordinate:
$$(\mathbf p)_{i1} = \frac{im}{\hbar}\,(E_i - E_1)\,(\mathbf r)_{i1} . \tag{108.15}$$


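Adding up in (108.12) the corresponding matrix elements in brackets and
taking into account the energy conservation law (108.1), we obtain the following
expression for the bracket involved in (108.12):
$$\frac{m\omega_2}{\hbar} \sum_i \left( \frac{(E_i - E_1)\,(\mathbf e_2\cdot\mathbf r)_{2i}(\mathbf e_1\cdot\mathbf r)_{i1}}{E_1 - E_i + \hbar\omega_1} + \frac{(E_2 - E_i)\,(\mathbf e_1\cdot\mathbf r)_{2i}(\mathbf e_2\cdot\mathbf r)_{i1}}{E_1 - E_i - \hbar\omega_2} \right) . \tag{108.16}$$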

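This expression can be simplified if the following term, equal to zero, is added
to it:
$$\frac{m\omega_2}{\hbar}\left[(\mathbf e_2\cdot\mathbf r)(\mathbf e_1\cdot\mathbf r) - (\mathbf e_1\cdot\mathbf r)(\mathbf e_2\cdot\mathbf r)\right]_{21} = \frac{m\omega_2}{\hbar}\sum_i \left( (\mathbf e_2\cdot\mathbf r)_{2i}(\mathbf e_1\cdot\mathbf r)_{i1} - (\mathbf e_1\cdot\mathbf r)_{2i}(\mathbf e_2\cdot\mathbf r)_{i1} \right) .$$
We then obtain
$$m\omega_1\omega_2 \sum_i \left( \frac{(\mathbf e_2\cdot\mathbf r)_{2i}(\mathbf e_1\cdot\mathbf r)_{i1}}{E_1 - E_i + \hbar\omega_1} + \frac{(\mathbf e_1\cdot\mathbf r)_{2i}(\mathbf e_2\cdot\mathbf r)_{i1}}{E_1 - E_i - \hbar\omega_2} \right) . \tag{108.17}$$
Substituting (108.17) into (108.12) we find the final expression for the
differential scattering cross section
$$d\sigma = \frac{e^4\omega_1\omega_2^3}{c^4} \left| \sum_i \left( \frac{(\mathbf e_2\cdot\mathbf r)_{2i}(\mathbf e_1\cdot\mathbf r)_{i1}}{E_1 - E_i + \hbar\omega_1} + \frac{(\mathbf e_1\cdot\mathbf r)_{2i}(\mathbf e_2\cdot\mathbf r)_{i1}}{E_1 - E_i - \hbar\omega_2} \right) \right|^2 d\Omega . \tag{108.18}$$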

A remarkable feature of the formulae obtained is the fact that for co- 
herent scattering they are the same as the classical formulae in §36 of Part I. 
Formula (108.18) is widely used in practice, since the study of Raman scatter- 
ing turned out to be a very effective method of investigating the energy levels 
and other properties of complex molecules. 
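As a brief consistency check of the statement about coherent scattering, one may evaluate (108.18) for an electron bound as an isotropic harmonic oscillator of frequency $\omega_0$ in its ground state. This model system is an assumption introduced here for illustration; it is not an example worked in the text. Only the three degenerate first excited states contribute, with $|x_{i1}|^2 = \hbar/2m\omega_0$ and $E_i - E_1 = \hbar\omega_0$, and the polarization vectors are taken to be real.

```latex
% Assumed model: isotropic oscillator, ground state 1, coherent case \omega_1 = \omega_2 = \omega.
\sum_i (\mathbf e_2\cdot\mathbf r)_{1i}\,(\mathbf e_1\cdot\mathbf r)_{i1}
   = \frac{\hbar}{2m\omega_0}\,(\mathbf e_1\cdot\mathbf e_2) ,
\\[4pt]
\sum_i \left( \frac{(\mathbf e_2\cdot\mathbf r)_{1i}(\mathbf e_1\cdot\mathbf r)_{i1}}{\hbar\omega - \hbar\omega_0}
            + \frac{(\mathbf e_1\cdot\mathbf r)_{1i}(\mathbf e_2\cdot\mathbf r)_{i1}}{-\hbar\omega_0 - \hbar\omega} \right)
   = \frac{(\mathbf e_1\cdot\mathbf e_2)}{2m\omega_0}
     \left( \frac{1}{\omega - \omega_0} - \frac{1}{\omega + \omega_0} \right)
   = \frac{(\mathbf e_1\cdot\mathbf e_2)}{m\,(\omega^2 - \omega_0^2)} ,
\\[4pt]
d\sigma = \frac{e^4\omega^4}{c^4}\,
          \frac{(\mathbf e_1\cdot\mathbf e_2)^2}{m^2(\omega_0^2 - \omega^2)^2}\, d\Omega
        = r_0^2\, \frac{\omega^4}{(\omega_0^2 - \omega^2)^2}\,
          (\mathbf e_1\cdot\mathbf e_2)^2\, d\Omega .
```

The result has the form of the classical cross section for scattering by an elastically bound electron without damping, and it reduces to the Thomson expression $r_0^2(\mathbf e_1\cdot\mathbf e_2)^2\,d\Omega$ for $\omega \gg \omega_0$.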

We note also that the matrix elements of the atomic dipole moment
induced by light can be expressed on the basis of the relations obtained. In
turn, as was shown in §34 of Part IV, it is easy to find the relation between
the dielectric constant and the induced dipole moment for a rarefied gas.
Hence formula (108.18) is the basis for the quantum-mechanical calculation
of the dielectric constant $\varepsilon$ and correspondingly of the refractive index
$n = \varepsilon^{1/2}$. In particular, for coherent scattering the following expression is
obtained for the quantity $n^2$:
$$n^2 = 1 + \frac{4\pi N e^2}{m} \sum_i \frac{f_i}{\omega_i^2 - \omega^2} , \tag{108.19}$$


where the quantity $f_i = (2m\omega_i/\hbar)\,|x_{i1}|^2$ is called the oscillator strength,
$\hbar\omega_i = E_i - E_1$, and $N$ is the number of atoms per unit volume. We have
chosen the direction of polarization of the photons as the x-axis. The
following relation holds for the oscillator strengths $f_i$:*
$$\sum_i f_i = 1 .$$
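The sum rule can be illustrated numerically on a simple model. The sketch below assumes a particle in a one-dimensional infinite well of width $L$ (a system chosen here only for illustration, not one discussed in the text), for which $E_n = n^2\pi^2\hbar^2/2mL^2$ and $x_{n1} = -8Ln/\pi^2(n^2-1)^2$ for even $n$ (zero for odd $n$). With these standard results $m$, $\hbar$ and $L$ cancel in $f_n$.

```python
import math


def oscillator_strength(n):
    """f_n = (2m/hbar^2)(E_n - E_1)|x_n1|^2 for the 1 -> n transition in an infinite well.

    Substituting E_n and x_n1 gives f_n = 64 n^2 / (pi^2 (n^2 - 1)^3) for even n.
    """
    if n % 2:          # odd n: the dipole matrix element vanishes by parity
        return 0.0
    return 64.0 * n**2 / (math.pi**2 * (n**2 - 1) ** 3)


total = sum(oscillator_strength(n) for n in range(2, 2001))
print(oscillator_strength(2), total)
```

Almost all of the strength (about 96 per cent) sits in the first allowed transition, and the partial sums approach unity as more levels are included.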
As was pointed out earlier the expressions obtained make no sense near a
resonance. Investigation of this phenomenon, called resonance fluorescence,
is impossible without introducing the concept of linewidth, to which the
next section is devoted. There the formulae of scattering theory taking into
account the linewidth will be presented.

* See, for example, H.Bethe and E.Salpeter, Quantum mechanics of one and two
electron systems, Handbuch der Physik, Bd. 35 (Springer-Verlag, Berlin, 1957).


§109. The theory of the natural linewidth 


In §103—105 we have found the total probability of emission of a photon 
by an atomic system. A more precise investigation of the corresponding 
equations makes it possible to find also the frequency distribution of the 
intensity of the radiation, i.e. to determine the form of the spectral line of 
the radiation. We recall that such a problem can also be solved within the 
framework of classical concepts (see Part I). In this case in order to obtain 
the natural form of a line it was necessary to take into account the damping 
of the amplitude of the radiation oscillator. 

Although the interaction of charged particles with the electromagnetic 
field is weak, we cannot confine ourselves to the usual approximation of 
perturbation theory in considering the form of a spectral line. As a matter of 
fact, finding the form of a line is associated with the necessity of taking into 
account the decay of the initial state of the atomic system. Such a decay 
takes place over a rather long time $t$ and, naturally, cannot be taken into
account by the methods of perturbation theory (see §56). 

We shall proceed from the general system of eqs. (55.5) for the amplitudes
$c_m$ of unperturbed states (atom and radiation field). We shall denote the
amplitude of the initial state of the system by $\varphi(t)$. In this state there are no
photons, and the atom is in an excited state with energy $E_2$. In the system of
eqs. (55.5) it is sufficient to take into account only those states whose energy
is approximately the same as the energy of the initial state. Only these states
play an important role. We shall assume for simplicity that the state with
energy $E_2$ is the first excited state of the atom, and that the energy of the
ground state is equal to $E_1$:
$$E_2 - E_1 = \hbar\omega_0 .$$
The emitted photon has a frequency $\omega_\lambda$ close to the frequency $\omega_0$. We
shall denote the amplitude of the state arising in the transition $2 \to 1$ by $f_\lambda(t)$
(the atom makes a transition into the ground state and emits a photon $\lambda$ with
wave vector $\mathbf k$, frequency $\omega_\lambda$, and polarization $\mathbf e$).

In the case given the system of eqs. (55.5) will have the form
$$i\hbar\,\dot f_\lambda(t) = (1,1_\lambda|\hat H'|2,0)\,e^{i(\omega_\lambda - \omega_0)t}\,\varphi(t) , \qquad i\hbar\,\dot\varphi(t) = \sum_\lambda (2,0|\hat H'|1,1_\lambda)\,e^{-i(\omega_\lambda - \omega_0)t}\,f_\lambda(t) . \tag{109.1}$$
From the initial conditions it follows that $f_\lambda(0) = 0$, $\varphi(0) = 1$. We introduce
the notation
$$h_\lambda(t) = f_\lambda(t)\,e^{-i\omega_\lambda t} , \quad (1,1_\lambda|\hat H'|2,0) = H_\lambda , \quad g(t) = \varphi(t)\,e^{-i\omega_0 t} , \quad (2,0|\hat H'|1,1_\lambda) = H_\lambda^* . \tag{109.2}$$
The Hamiltonian of the interaction of charged particles with the electromagnetic
field, $\hat H'$, is given by expression (102.2). The form of the line is
determined by the value of $|f_\lambda(t)|^2 = |h_\lambda(t)|^2$ for $t \to \infty$.

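Substituting expression (109.2) into (109.1) we obtain the system of
equations for the amplitudes $h_\lambda(t)$ and $g(t)$:
$$i\hbar\,\dot h_\lambda(t) = H_\lambda\,g(t) + \hbar\omega_\lambda\,h_\lambda(t) , \qquad i\hbar\,\dot g(t) = \hbar\omega_0\,g(t) + \sum_\lambda H_\lambda^*\,h_\lambda(t) . \tag{109.3}$$
The system of eqs. (109.3) is conveniently solved by means of a Laplace
transformation. We denote the amplitudes in the Laplace representation by
$h_\lambda(p)$ and $g(p)$:
$$h_\lambda(p) = \int_0^\infty h_\lambda(t)\,e^{-pt}\,dt , \qquad h_\lambda(t) = \frac{1}{2\pi i}\int_{-i\infty+\delta}^{i\infty+\delta} h_\lambda(p)\,e^{pt}\,dp . \tag{109.4}$$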


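Taking into account the initial conditions, we have
$$ip\,h_\lambda(p) = \frac{1}{\hbar}\,H_\lambda\,g(p) + \omega_\lambda\,h_\lambda(p) , \qquad ip\,g(p) = i + \omega_0\,g(p) + \frac{1}{\hbar}\sum_\lambda H_\lambda^*\,h_\lambda(p) . \tag{109.5}$$
Eliminating the amplitude $g(p)$ from this system, we obtain the equation
for the function $h_\lambda(p)$:
$$(ip - \omega_\lambda)(ip - \omega_0)\,h_\lambda(p) = \frac{i}{\hbar}\,H_\lambda + \frac{1}{\hbar^2}\,H_\lambda \sum_{\lambda'} H_{\lambda'}^*\,h_{\lambda'}(p) . \tag{109.6}$$
Multiplying the left-hand and right-hand sides of the equation by the function
$H_\lambda^*/(ip - \omega_\lambda)$ and summing over $\lambda$, we find the expression for the sum
$$\sum_\lambda H_\lambda^*\,h_\lambda(p) = \frac{\tfrac12\hbar\gamma}{ip - \omega_0 + \tfrac12 i\gamma} , \tag{109.7}$$
where $\gamma$ is defined by the relation
$$\tfrac12\,i\gamma = \frac{1}{\hbar^2}\sum_\lambda \frac{|H_\lambda|^2}{\omega_\lambda - ip} . \tag{109.8}$$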


We pass from a discrete to a continuous distribution of frequencies and
replace the summation by integration. The denominator in
expression (109.8) contains an infinitely small imaginary addition.
The integration over the frequency $\omega$ amounts to taking the integral in the
sense of the principal value and the semiresidue at the point $\omega_\lambda = ip$:
$$\int \frac{F(\omega)\,d\omega}{\omega - ip} = P\!\int \frac{F(\omega)\,d\omega}{\omega - ip} + i\pi F(ip) , \tag{109.9}$$
where $F$ is an arbitrary function.

We disregard the integral taken in the sense of the principal value (it gives
a small real shift of the spectrum of radiation frequencies). Since in integrating
over the Laplace variable $p$ the basic contribution is given by the region
where $ip \approx \omega_0$, in finding the value of $\gamma$ we can immediately take the residue
at this point.

We have finally
$$\gamma = \frac{2\pi}{\hbar^2} \sum_{\mathbf e} \int |H_\lambda|^2\,\frac{V\omega_0^2}{(2\pi c)^3}\,d\Omega . \tag{109.10}$$

On the right-hand side the summation is carried out over the polarizations
and the integration over the directions of the vector $\mathbf k$ of the emitted photon.
From comparison of (109.10) with the expressions obtained in §103 it follows
that the quantity $\gamma$ is the total probability of emission of a photon by
the atom per unit time.


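Substituting expression (109.7) into (109.6) we find the amplitude $h_\lambda(p)$:
$$h_\lambda(p) = \frac{iH_\lambda}{\hbar\,(ip - \omega_0 + \tfrac12 i\gamma)(ip - \omega_\lambda)} . \tag{109.11}$$
From the transform (109.11) we find the original, the function $h_\lambda(t)$ (see
(109.4)). Closing the integration contour in the left half-plane of the complex
variable $p$ and determining the residues, we find
$$h_\lambda(t) = \frac{H_\lambda}{\hbar}\,\frac{e^{-i\omega_\lambda t} - e^{-i\omega_0 t - \frac12\gamma t}}{\omega_\lambda - \omega_0 + \tfrac12 i\gamma} . \tag{109.12}$$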
The expression $|h_\lambda(t)|^2$, taken after the lapse of a sufficiently large time
$t \gg \gamma^{-1}$, defines the probability of emission by the atom of a photon with
given polarization and given wave vector $\mathbf k$, i.e. defines the form of the
emission line
$$|h_\lambda(\infty)|^2 = \frac{|H_\lambda|^2}{\hbar^2}\,\frac{1}{(\omega_\lambda - \omega_0)^2 + \tfrac14\gamma^2} . \tag{109.13}$$
The intensity of emission by the atom of a photon with given frequency $\omega$,
$J(\omega)$, is obtained from (109.13) by multiplying $|h_\lambda(\infty)|^2$ by $\hbar\omega\,\rho(\omega)$ (see
(103.4)) and summing and integrating over the polarizations and the directions
of the vector $\mathbf k$ of the emitted photon. Taking into account (109.10) we
have
$$J(\omega)\,d\omega = \frac{\gamma}{2\pi}\,\frac{\hbar\omega\,d\omega}{(\omega - \omega_0)^2 + \tfrac14\gamma^2} . \tag{109.14}$$

The intensity distribution of the radiation is of a dispersion character, and
the distribution width $\gamma$ is equal to the total probability of emission per unit
time. The relation obtained between the distribution width and the transition
probability is in correspondence with the general Heisenberg uncertainty
relation for time and energy, $\Delta E\,\Delta t \gtrsim \hbar$ (see §34). Here $\Delta E$ is the uncertainty
in the energy of the excited state, $\Delta E \sim \hbar\,\Delta\omega$, $\Delta\omega$ is the distribution width
and $\Delta t$ is the mean lifetime of the atom in the excited state; $\Delta t \sim \gamma^{-1}$, since
the emission proceeds in a time of the order of $\gamma^{-1}$. Hence it follows that
$\Delta\omega \sim \gamma$ in correspondence with the result obtained.
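The two properties of the dispersion profile used above, its unit normalization and its width $\gamma$, can be checked with a small Python sketch. It treats only the frequency-dependent factor $(\gamma/2\pi)/[(\omega-\omega_0)^2 + \tfrac14\gamma^2]$ of (109.14); the function names, the units ($\gamma = 1$, $\omega_0 = 0$) and the integration range are arbitrary choices made for illustration.

```python
import math


def line_shape(omega, omega0, gamma):
    """Profile (gamma/2pi) / ((omega - omega0)^2 + gamma^2/4) taken from (109.14)."""
    return (gamma / (2.0 * math.pi)) / ((omega - omega0) ** 2 + 0.25 * gamma**2)


def integrate(f, a, b, n=200000):
    """Plain midpoint rule on [a, b]."""
    h = (b - a) / n
    return sum(f(a + (k + 0.5) * h) for k in range(n)) * h


omega0, gamma = 0.0, 1.0
area = integrate(lambda w: line_shape(w, omega0, gamma), -2000.0, 2000.0)
half_max_ratio = line_shape(omega0 + gamma / 2.0, omega0, gamma) / line_shape(omega0, omega0, gamma)
print(area, half_max_ratio)
```

The area is close to unity (the small deficit comes from the slowly decaying wings cut off by the finite range), and the profile falls to half its peak value at a detuning of $\gamma/2$, so that the full width at half maximum is $\gamma$.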

In the case where the transition takes place not between the first excited
and ground levels but between arbitrary $i$th and $k$th levels, the width of the
transition line $\gamma_{ik}$ is equal to the sum of the widths $\gamma_i$ and $\gamma_k$ of the levels:
$$\gamma_{ik} = \gamma_i + \gamma_k . \tag{109.15}$$
Each of the level widths $\gamma_i$ and $\gamma_k$ is equal to the sum of the probabilities of
transition from the given level to all lower levels.

The width considered here is called ‘natural’, since it is determined by the 
process of emission itself, by the radiative reaction. In addition there are 
other mechanisms of spectral-line broadening, which usually lead to more 
noticeable effects. Thus, for example, in a gaseous system the collision 
broadening and the Doppler broadening are considerable*. The collision 
broadening is due to collisions between the molecules. Indeed, collisions 
interrupt the process of emission. Hence, if $\tau$ is the lifetime of the atom with
respect to collisions (the mean time between collisions), then, as follows
from the uncertainty relation for time and energy, the linewidth is of the
order of $\tau^{-1}$.

By taking into account the linewidth the results obtained in the preceding
section can be extended to the region near resonance, where $\hbar\omega \approx E_n - E_1$,
and where $E_n$ is the energy of one of the atomic levels. In this case the basic
contribution in the summation is given only by states with the 'resonance'
energy $E_n$. Thus for the differential coherent scattering cross section we have,
instead of (108.18),
$$d\sigma = \frac{e^4\omega^4}{c^4}\,\frac{\left|\sum (\mathbf e_2\cdot\mathbf r)_{1n}(\mathbf e_1\cdot\mathbf r)_{n1}\right|^2}{(E_1 - E_n + \hbar\omega)^2 + \tfrac14\Gamma_n^2}\,d\Omega , \tag{109.16}$$
where $\Gamma_n$ is the total width of the $n$th level. The summation is carried out
over all states with energy $E_n$.

Carrying out the summation and averaging over the initial states and
summing over the final states of the system, we obtain
$$\sigma = \pi\bar\lambda^2\,\frac{2j_n + 1}{2j_1 + 1}\,\frac{\gamma_n^2}{(E_1 - E_n + \hbar\omega)^2 + \tfrac14\Gamma_n^2} , \tag{109.17}$$


where $\gamma_n$ is the natural width of the $n$th level, $\bar\lambda = c/\omega$, and $j_1$ and $j_n$ are the
total angular momenta of the initial and the $n$th atomic states. The cross
section reaches a maximum value equal to $4\pi\bar\lambda^2(2j_n+1)/(2j_1+1)$ for perfect
resonance and $\gamma_n = \Gamma_n$.

* This problem is considered in detail, for example, in the book of I.I.Sobelman,
Vvedenie v teoriyu atomnykh spektrov (Introduction to the theory of atomic spectra)
(Nauka, Moscow, 1963).
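The statement about the maximum of the resonance cross section can be checked directly from (109.17). In the sketch below the widths $\gamma_n$ and $\Gamma_n$ are assumed to be expressed in energy units, the same as the detuning; the function name, argument names and sample numbers are illustrative only.

```python
import math


def sigma_res(photon_energy, resonance_energy, gamma_n, Gamma_n, lambda_bar, j1, jn):
    """Coherent scattering cross section near resonance, formula (109.17).

    resonance_energy stands for E_n - E_1; all energies and widths share one unit,
    and the result has the units of lambda_bar squared.
    """
    weight = (2 * jn + 1) / (2 * j1 + 1)
    detuning = resonance_energy - photon_energy
    return math.pi * lambda_bar**2 * weight * gamma_n**2 / (detuning**2 + 0.25 * Gamma_n**2)


peak = sigma_res(10.0, 10.0, gamma_n=1e-3, Gamma_n=1e-3, lambda_bar=1.0, j1=0, jn=1)
print(peak, 4.0 * math.pi * 3.0)    # purely radiative width: peak = 4*pi*lambda_bar^2*(2jn+1)/(2j1+1)
```

When other broadening mechanisms make $\Gamma_n$ larger than $\gamma_n$, the peak value is reduced by the factor $(\gamma_n/\Gamma_n)^2$.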











13 





Relativistic Quantum Mechanics 


§110. The relativistic wave equation for a particle of zero spin 


So far we have restricted ourselves to the study of the properties of 
particles moving with velocities small in comparison with the velocity of light. 

Indeed, in obtaining the Schrödinger equation we have written the non-relativistic
Hamiltonian of a particle in an external potential field
$$H = \frac{p^2}{2m} + U(\mathbf r)$$
and have replaced the corresponding quantities in it by operators. To obtain
the relativistic theory, use should be made of the same scheme as that developed
in §27. Namely, to construct the wave equation one has to use the
relativistic expression for the Hamiltonian. For generality we shall immediately
assume that the particle moves in an external electromagnetic field.
Then its Hamiltonian has the form of (23.17) of Part II. Carrying out the
replacement of corresponding quantities by operators, i.e. $H \to i\hbar\,(\partial/\partial t)$,
$\mathbf p \to -i\hbar\nabla$, we obtain the equation
$$\left(i\hbar\frac{\partial}{\partial t} - e\varphi\right)^2\psi = c^2\left(-i\hbar\nabla - \frac{e}{c}\mathbf A\right)^2\psi + m^2c^4\psi . \tag{110.1}$$
Equation (110.1) is called the Klein–Gordon–Fock equation. The relativistic




invariance of this equation is evident. The Klein—Gordon—Fock equation is a 
second order wave equation. 

Since the relativistic Hamiltonian goes over in the limit into that of
classical mechanics, it is natural to assume that for $c \to \infty$ the Klein–Gordon–Fock
equation will go over into the Schrödinger equation. We shall show this.

Since the zero levels of energy in non-relativistic theory and in the theory of
relativity differ by $mc^2$, it is convenient to introduce a transformation of the
wave function $\psi$ by means of the following relation:
$$\psi(x,t) = \psi'(x,t)\,e^{-imc^2t/\hbar} .$$


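Substituting into (110.1) and calculating the derivatives with respect to
time, we obtain
$$2i\hbar mc^2\frac{\partial\psi'}{\partial t} - \hbar^2\frac{\partial^2\psi'}{\partial t^2} - 2e\varphi\left[mc^2\psi' + i\hbar\frac{\partial\psi'}{\partial t}\right] + e^2\varphi^2\psi' = c^2\left(\hat{\mathbf p} - \frac{e}{c}\mathbf A\right)^2\psi' . \tag{110.2}$$
Here a term $-i\hbar e\,(\partial\varphi/\partial t)\,\psi'$, which is not proportional to $c^2$, has been dropped.
We retain only terms proportional to $c^2$ in this equation. Upon dividing
both sides of the equation by $2mc^2$ we arrive at the ordinary Schrödinger
equation
$$i\hbar\frac{\partial\psi'}{\partial t} = \frac{1}{2m}\left(\hat{\mathbf p} - \frac{e}{c}\mathbf A\right)^2\psi' + e\varphi\,\psi' . \tag{110.3}$$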


Thus we have shown that the Klein–Gordon–Fock equation goes over into
the Schrödinger equation in the non-relativistic limit.

Equation (110.1), like the Schrödinger equation, defines the development
of a process in time. The state of the particle is, as before, characterized by
the wave function $\psi(x,y,z,t)$. This function depends on the coordinates
$x, y, z, t$ and contains no spin variables. Hence it is clear that the Klein–Gordon–Fock
equation defines the behaviour of a spin zero particle. In order
that this equation may describe particles of spin different from zero it must
somehow be modified.

Since the Klein–Gordon–Fock equation is relativistically invariant, the
wave function can be multiplied only by a certain constant phase factor in
Lorentz transformations. From normalization considerations it follows that
this factor must be equal to $+1$ (but not $-1$, since the Lorentz transformation
is continuous). Under space reflection of the coordinates the wave function
$\psi$ can be multiplied by $+1$ or $-1$. In other words, under the action of the
parity operator $\hat I$ the wave function can transform in two ways:
$$\hat I\psi(x,y,z,t) = +\psi(-x,-y,-z,t) , \qquad \hat I\psi(x,y,z,t) = -\psi(-x,-y,-z,t) .$$
Thus the wave function $\psi$ can be either a scalar or a pseudoscalar. For this
reason the Klein–Gordon–Fock equation is often said to be a scalar equation.

As an example of the integration of eq. (110.1) we shall consider the case
of a free particle. Then the Klein–Gordon–Fock equation can be written in
the form
$$-\hbar^2\frac{\partial^2\psi}{\partial t^2} = -c^2\hbar^2\nabla^2\psi + m^2c^4\psi . \tag{110.4}$$
We seek the solution of eq. (110.4) in the form
$$\psi = e^{-iEt/\hbar}\,\psi_1(x,y,z) .$$
Then for the function $\psi_1$ we find
$$E^2\psi_1 = -c^2\hbar^2\nabla^2\psi_1 + m^2c^4\psi_1 .$$
Rewriting this equation as follows
$$\nabla^2\psi_1 + \frac{E^2 - m^2c^4}{c^2\hbar^2}\,\psi_1 = 0 , \tag{110.5}$$
we easily find that its solution is a plane wave of the form
$$\psi_1 = a\,e^{i(\mathbf p\cdot\mathbf r)/\hbar} .$$
Substituting this value for $\psi_1$ into (110.5), we arrive at the relativistic relation
between the energy $E$ and momentum $p$ of a free particle
$$E^2 = c^2p^2 + m^2c^4 .$$
This relation is the same as the usual formula for energy in the theory of
relativity.


§111. The charge density and probability current for particles of zero spin 


We now turn to finding the charge density and probability current for
particles described by the scalar wave equation (110.1). The derivation of
expressions for these quantities is carried out according to the same scheme
as for the Schrödinger equation. Namely, we multiply the Klein–Gordon–Fock
equation
$$\left(i\hbar\frac{\partial}{\partial t} - e\varphi\right)^2\psi - c^2\left(-i\hbar\nabla - \frac{e}{c}\mathbf A\right)^2\psi - m^2c^4\psi = 0 \tag{111.1}$$
by the adjoint wave function $\psi^*$.
We multiply the equation conjugate to eq. (111.1) by the wave function $\psi$.
We subtract the second equation from the first. As a result we find


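$$-\hbar^2\left[\psi^*\frac{\partial^2\psi}{\partial t^2} - \psi\frac{\partial^2\psi^*}{\partial t^2}\right] - 2i\hbar e\varphi\left(\psi^*\frac{\partial\psi}{\partial t} + \psi\frac{\partial\psi^*}{\partial t}\right) - \hbar^2c^2\left[\psi\nabla^2\psi^* - \psi^*\nabla^2\psi\right] - i\hbar ec\left[\psi^*(\nabla\mathbf A + \mathbf A\nabla)\psi + \psi(\nabla\mathbf A + \mathbf A\nabla)\psi^*\right] = 0 . \tag{111.2}$$
The first expression in brackets is easily transformed into the form
$$\psi^*\frac{\partial^2\psi}{\partial t^2} - \psi\frac{\partial^2\psi^*}{\partial t^2} = \frac{\partial}{\partial t}\left(\psi^*\frac{\partial\psi}{\partial t} - \psi\frac{\partial\psi^*}{\partial t}\right) .$$
The second bracket is simply the derivative with respect to time of the
product $\psi^*\psi$. By means of the vector relation
$$\psi\,\nabla\cdot\nabla\psi^* = \psi\nabla^2\psi^* = \nabla\cdot(\psi\nabla\psi^*) - \nabla\psi\cdot\nabla\psi^*$$
we find that
$$\psi\nabla^2\psi^* - \psi^*\nabla^2\psi = \nabla\cdot\left[\psi\nabla\psi^* - \psi^*\nabla\psi\right] .$$
Finally, the last expression in brackets in formula (111.2) is brought by
means of the relation
$$\psi^*(\nabla\mathbf A)\psi = \psi^*\,\nabla\cdot(\psi\mathbf A) = \nabla\cdot(\psi^*\psi\mathbf A) - \psi\mathbf A\cdot\nabla\psi^*$$
to the form
$$\psi^*(\nabla\mathbf A + \mathbf A\nabla)\psi + \psi(\nabla\mathbf A + \mathbf A\nabla)\psi^* = 2\nabla\cdot(\psi\psi^*\mathbf A) .$$
If one multiplies all terms of eq. (111.2) by the quantity $e/2i\hbar mc^2$ and
makes use of the transformations shown, then eq. (111.2) can be written in
the form
$$\frac{\partial\rho}{\partial t} + \operatorname{div}\mathbf j = 0 ,$$
where the charge density $\rho$ is equal to
$$\rho = \frac{i\hbar e}{2mc^2}\left[\psi^*\frac{\partial\psi}{\partial t} - \psi\frac{\partial\psi^*}{\partial t}\right] - \frac{e^2}{mc^2}\,\varphi\,\psi^*\psi . \tag{111.3}$$
Then we have for the current density the expression
$$\mathbf j = \frac{i\hbar e}{2m}\left[\psi\nabla\psi^* - \psi^*\nabla\psi\right] - \frac{e^2}{mc}\,\mathbf A\,\psi^*\psi . \tag{111.4}$$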
Let us dwell on the meaning of the results obtained. In non-relativistic
theory the charge density $\rho$ can be written in the form
$$\rho(x,y,z,t) = eW(x,y,z,t) ,$$
where $W(x,y,z,t)$ is the probability density, which is in essence a positive
quantity. Clearly, relation (111.3) cannot be interpreted in such a way. The
expression for $\rho$ can be made negative by a proper choice of the function $\psi$
at the initial instant of time.

Indeed, since the Klein–Gordon–Fock equation is a second order equation
with respect to time, arbitrary values of the $\psi$-function itself and of its
derivative with respect to time can be given at the initial instant of time.
Choosing different $\psi$ and $\partial\psi/\partial t$ it is possible to arrive at positive as well as
negative values of the quantity $\rho$.

We shall now show that the quantity $\rho/e$ has indeed as its non-relativistic limit the product $\psi^*\psi$. Let the following relation be fulfilled for the derivative of the wave function $\psi$ with respect to time:

$$i\hbar\frac{\partial\psi}{\partial t}=E\psi.$$

In this case the expression for the charge density $\rho$ can be written in the form

$$\rho=\frac{eE}{mc^{2}}\,\psi^*\psi-\frac{e^{2}\varphi}{mc^{2}}\,\psi^*\psi.$$

If we separate the rest energy from the energy $E$, i.e. if we set $E=mc^{2}+E'$, then we easily obtain

$$\frac{\rho}{e}=\psi^*\psi\left[1+\frac{E'-e\varphi}{mc^{2}}\right].$$

If the quantity $E'-e\varphi\ll mc^{2}$, then we have the correct expression for the non-relativistic limit of the quantity $\rho/e$. We see that in the case of the Klein-Gordon-Fock equation one cannot introduce a positive-definite probability density. This fact was the reason why, for a long time, the Klein-Gordon-Fock equation was not applied to real objects.
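The sign problem can be made concrete with a minimal numerical sketch. It evaluates the stationary-state expression $\rho/e=\psi^*\psi\,(E-e\varphi)/mc^{2}$ derived above for a free particle ($\varphi=0$) with either sign of the energy. The function names, the units ($mc^{2}=1$) and the sample momentum are assumptions chosen for illustration only.

```python
import math

def free_energy(pc, rest_energy, sign=+1):
    """E = +/- sqrt((pc)^2 + (mc^2)^2) for a free particle."""
    return sign * math.sqrt(pc ** 2 + rest_energy ** 2)

def kg_density_over_e(energy, rest_energy, e_phi=0.0, psi_sq=1.0):
    """rho/e = |psi|^2 * (E - e*phi) / (m c^2) for a stationary state."""
    return psi_sq * (energy - e_phi) / rest_energy

mc2 = 1.0   # rest energy in arbitrary units (assumed value)
pc = 0.01   # slow particle, pc << mc^2 (assumed value)

rho_plus = kg_density_over_e(free_energy(pc, mc2, +1), mc2)
rho_minus = kg_density_over_e(free_energy(pc, mc2, -1), mc2)

# Positive energy: rho/e is close to |psi|^2 = 1 (non-relativistic limit).
# Negative energy: rho/e is negative, so it cannot be a probability density.
print(rho_plus, rho_minus)
```

For the positive root the result is close to unity, as the non-relativistic limit requires, while the negative root gives the same magnitude with the opposite sign.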


§112. The concept of the nuclear force field 


The Klein—Gordon—Fock equation was subsequently given a new, com- 
pletely different physical interpretation. 

We already know that in addition to electrical interactions other forms of interaction occur in nature. In particular, one such interaction, which does not depend on the electric charge $e$, is the strong nuclear interaction. It seemed natural to assume that the nuclear interaction could be associated with the presence of a special nucleon charge $g$ in nucleons. One can then try to describe the nuclear interaction by analogy with the interaction of electric charges, introducing the concept of a nuclear force field. This field should be described by a potential similar to the potential $\varphi$ of the electric field. The attempt was made to give up the interpretation of the Klein-Gordon-Fock equation as the equation for the wave function of one particle. Instead it was proposed to consider the function $\psi$ as the potential of the nuclear field produced by nucleons. Just as photons are the quantum particles corresponding to the electromagnetic field, $\pi$-mesons correspond to the nuclear field.

In §67 we have already dwelt on the obvious interpretation of $\pi$-meson exchange and photon exchange as the sources, respectively, of the strong nucleon interaction and the electromagnetic interaction.

Carrying this analogy further, we can go on to write the equation for the nuclear field potential. As such an equation we take the Klein-Gordon-Fock equation in the form

$$\nabla^{2}\psi-\frac{1}{c^{2}}\frac{\partial^{2}\psi}{\partial t^{2}}-\kappa^{2}\psi=0,\qquad \kappa^{2}=\frac{m^{2}c^{2}}{\hbar^{2}}, \tag{112.1}$$

where $m$ is the mass of the $\pi$-meson.

We shall not consider the quantum theory of nuclear forces, in which mesons are the elementary excitations of a certain field, like photons in the quantum theory of the electromagnetic field.

Since we are interested only in the qualitative aspect of the subject, we shall carry out our reasoning by analogy with the classical theory of the electrostatic field. Assuming that the nuclear field does not depend on time, we write for its potential $\psi$ the equation

$$\nabla^{2}\psi-\kappa^{2}\psi=0. \tag{112.2}$$








This equation is a certain analogue of the equation of the electrostatic field and goes over into the latter for $m\to 0$ (see below). As is well known, in the presence of point charges the equation of the electrostatic field is of the form

$$\nabla^{2}\varphi=-4\pi e\,\delta(\mathbf{r}).$$

Hence in the presence of a nucleon at the point $r=0$ it is natural to give eq. (112.2) the form

$$\nabla^{2}\psi-\kappa^{2}\psi=-4\pi g\,\delta(\mathbf{r}). \tag{112.3}$$

We shall find the solution of this equation satisfying the condition $\psi\to 0$ for $r\to\infty$. We seek $\psi$ in the form

$$\psi(\mathbf{r})=\int\psi_{\mathbf{k}}\,e^{i\mathbf{k}\cdot\mathbf{r}}\,d\mathbf{k}.$$

Then, using the expansion of the $\delta$-function in a Fourier integral (see Vol. 1, Appendix III) and eq. (112.3), we find for $\psi_{\mathbf{k}}$ the value

$$\psi_{\mathbf{k}}=\frac{g}{2\pi^{2}}\,\frac{1}{k^{2}+\kappa^{2}}.$$
For the field $\psi$ we obtain an expression which is conveniently written in the form

$$\psi=\frac{g}{2\pi^{2}}\int\frac{e^{ikr\cos\theta}}{k^{2}+\kappa^{2}}\,k^{2}\,dk\,\sin\theta\,d\theta\,d\varphi.$$

The integration over the angles $\varphi$ and $\theta$ gives

$$\psi=\frac{2g}{\pi r}\int_{0}^{\infty}\frac{k\sin kr}{k^{2}+\kappa^{2}}\,dk. \tag{112.4}$$

In integrating this expression it is convenient to extend the range of integration over $k$ from $-\infty$ to $+\infty$. In this case we have

$$\psi=\frac{g}{i\pi r}\int_{-\infty}^{\infty}\frac{k\,e^{ikr}}{k^{2}+\kappa^{2}}\,dk.$$

This integral is easily calculated by means of the theory of residues

$$\int_{-\infty}^{\infty}\frac{k\,e^{ikr}}{k^{2}+\kappa^{2}}\,dk=2\pi i\,\mathrm{Res}\,(k=i\kappa)=\pi i\,e^{-\kappa r},$$

whence we obtain the expression for the nuclear field potential

$$\psi=\frac{g}{r}\,e^{-\kappa r}, \tag{112.5}$$

called the Yukawa potential.
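As a hedged numerical cross-check, one can confirm that (112.5) satisfies the static equation (112.2) away from the origin. For a spherically symmetric function the Laplacian reduces to $(1/r)\,d^{2}(r\psi)/dr^{2}$, which the sketch below approximates by a central difference. The helper names, the step size `h` and the sample values of $\kappa$ and $r$ are illustrative assumptions.

```python
import math

def yukawa(r, g=1.0, kappa=1.0):
    """Yukawa potential psi = g * exp(-kappa * r) / r, eq. (112.5)."""
    return g * math.exp(-kappa * r) / r

def radial_laplacian(f, r, h=1e-3):
    """(1/r) d^2(r f)/dr^2 for a spherically symmetric f, central difference."""
    u = lambda x: x * f(x)
    return (u(r + h) - 2.0 * u(r) + u(r - h)) / (h * h * r)

kappa = 0.7  # inverse range in arbitrary units (assumed value)
psi = lambda r: yukawa(r, kappa=kappa)

# Relative residual of  laplacian(psi) - kappa^2 * psi = 0  at a few radii r > 0.
residuals = [
    abs(radial_laplacian(psi, r) - kappa ** 2 * psi(r)) / abs(psi(r))
    for r in (0.5, 1.0, 2.0, 5.0)
]
print(residuals)
```

The residuals are limited only by the finite-difference error, which supports the result of the contour integration without repeating it.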

Formula (112.5) shows that the potential of nuclear forces decreases exponentially with increasing distance. The effective region in which $\psi$ is different from zero has the size

$$R\approx\kappa^{-1}=\hbar/mc.$$

The size of this region is in order of magnitude the same as the range of nuclear forces determined experimentally.
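A short order-of-magnitude estimate of this range, written as a sketch. The numerical constants ($\hbar c\approx 197.327$ MeV fm and the charged-pion rest energy $\approx 139.57$ MeV) are standard values supplied here as assumptions; they are not quoted in the text.

```python
HBAR_C_MEV_FM = 197.327        # hbar * c in MeV * fm (assumed standard value)
PION_REST_ENERGY_MEV = 139.57  # charged pion m c^2 in MeV (assumed standard value)

def yukawa_range_fm(rest_energy_mev):
    """R = hbar / (m c) = (hbar c) / (m c^2), returned in femtometres."""
    return HBAR_C_MEV_FM / rest_energy_mev

R = yukawa_range_fm(PION_REST_ENERGY_MEV)
print(f"R = {R:.2f} fm")  # prints: R = 1.41 fm
```

A range of about $1.4\times10^{-13}$ cm is indeed of the order of the experimentally observed range of nuclear forces.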


For $m=0$ the potential $\psi$ goes over into the potential of the electrostatic field

$$\psi=\frac{g}{r}.$$

Thus the quantity $g$ indeed plays in the Yukawa potential the same role as the charge $e$ in the electrostatic potential, and can rightfully be called the nucleon charge. It should be stressed that the calculation carried out above can by no means claim to be a quantitative description of the nuclear force field.

In reality the interaction between nucleons is not of a static character. For a correct treatment of the processes of virtual $\pi$-meson exchange it is necessary to quantize the $\pi$-meson field $\psi$ defined by the Klein-Gordon-Fock equation. This means that the function $\psi$ and the adjoint function $\psi^{\dagger}$ must be considered as quantum-mechanical operators in the space of occupation numbers. These operators have matrix elements different from zero for the processes of absorption and emission of $\pi$-mesons. The interaction between nucleons should be calculated by methods analogous to those applied in radiation theory.

We have seen in ch. 12 that the mathematical apparatus of radiation theory is based on the application of perturbation theory. The dimensionless interaction constant $e^{2}/\hbar c$, made up of the charge of the particle and of the universal constants $\hbar$ and $c$, figures as the small parameter. The strong nuclear interaction can also be characterized by the interaction constant $g^{2}/\hbar c$. However, and here lies the profound difference from the electromagnetic interaction, the quantity $g^{2}/\hbar c$ is of the order of ten. Thus the effectiveness of the nuclear interaction exceeds that of the electromagnetic interaction by a factor of more than a thousand. The term 'strong nuclear interaction' is associated with this fact. The large value of the interaction constant $g^{2}/\hbar c$ makes it impossible to use the apparatus of perturbation theory for the calculation of nuclear interactions.

This fact reflects the change in the physical nature of the interaction in passing from charged particles to nucleons. The smallness of the electromagnetic interaction constant means that the probability of emission of $N$ particles in one act is proportional to $(e^{2}/\hbar c)^{N}\ll 1$. In other words, the probability of emission of one (actual or virtual) photon is considerably higher than that of the simultaneous emission of two, three and so on photons.
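The contrast between the two interaction constants can be illustrated with a few lines of arithmetic. This is only a sketch: the value $e^{2}/\hbar c\approx 1/137.036$ is the standard fine-structure constant, supplied here as an assumption, and $g^{2}/\hbar c=10$ is the order-of-magnitude figure quoted in the text, not a measured number.

```python
ALPHA_EM = 1.0 / 137.036  # e^2 / (hbar c), standard value (assumed)
G2_STRONG = 10.0          # g^2 / (hbar c), "of the order of ten" per the text

def n_particle_weight(coupling, n):
    """Factor coupling**n that multiplies the N-particle emission probability."""
    return coupling ** n

em_weights = [n_particle_weight(ALPHA_EM, n) for n in (1, 2, 3)]
strong_weights = [n_particle_weight(G2_STRONG, n) for n in (1, 2, 3)]
ratio = G2_STRONG / ALPHA_EM

print(em_weights)      # each extra photon costs a factor of about 1/137
print(strong_weights)  # successive terms do not decrease at all
print(ratio)           # more than a thousand
```

The electromagnetic factors fall off rapidly with $N$, which is what makes perturbation theory usable there. For the strong interaction the successive factors do not decrease, so the expansion loses its meaning.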

The situation is different in the case of the strong nuclear interaction. 
The probability of simultaneous emission of a large number of mesons is of 
the same order of magnitude as the probability of emission of one meson. 

Hence each nucleon should be considered as a particle surrounded by a cloud of virtual $\pi$-mesons.

The validity of such a picture is confirmed by the phenomena of multiple production of $\pi$-mesons in collisions of high-energy nucleons.

Thus the picture of the $\pi$-meson interaction of nucleons turns out to be much more complex than that of the photon interaction of charges. The interaction between two nucleons necessarily involves a multitude of $\pi$-mesons and its consideration must be based on the solution of the many-body problem. No consistent quantitative theory of the strong nuclear interaction has as yet been developed.


§113. The Dirac equation 


In the preceding sections we have considered the relativistically invariant wave equation valid for spin zero particles. We have seen that the quantity $\rho/e$, which should be interpreted as the probability density, can take on negative as well as positive values.

As can be seen from formula (111.3), this is associated with the fact that the value of $\rho/e$ is determined not only by the initial value of the $\psi$-function but also by the initial value of the derivative $\partial\psi/\partial t$, defined arbitrarily. It is clear that in order to eliminate this difficulty it is necessary to eliminate the possibility of an arbitrary choice of the derivative $\partial\psi/\partial t$. In other words, it is necessary that the relativistic generalization of the Schrödinger equation contain only the first derivative with respect to time, as does the Schrödinger equation itself. Since, however, all relativistically invariant expressions must involve coordinates and time in the same way, the relativistic generalization of the Schrödinger equation should also involve first derivatives with respect to the coordinates.

The principle of superposition requires that the relativistic wave equation 
be linear. On the basis of these considerations Dirac formulated the following 
equation for the description of the motion of a free particle: 





Expression (113.1) represents the most general linear form containing only the first derivatives of the function sought. This equation is conveniently rewritten in a somewhat different form by redefining the quantities $\beta'$. Namely, we write it in the form

$$i\hbar\frac{\partial\psi}{\partial t}=\left(\hat\beta_x\hat p_x+\hat\beta_y\hat p_y+\hat\beta_z\hat p_z+\hat\beta_0\right)\psi,$$

where the operators $\hat p_x$, $\hat p_y$, $\hat p_z$ are the ordinary operators of the components of the momentum along the coordinate axes, and the operators $\hat\beta_x$, $\hat\beta_y$, $\hat\beta_z$, $\hat\beta_0$ contain no coordinates. We determine the properties of these operators from the following reasoning. Introducing the notation

$$\hat H=\hat\beta_x\hat p_x+\hat\beta_y\hat p_y+\hat\beta_z\hat p_z+\hat\beta_0,$$

eq. (113.1) can be written in the form

$$i\hbar\frac{\partial\psi}{\partial t}=\hat H\psi, \tag{113.2}$$


which is completely, although as yet only formally, similar to the Schrödinger 
equation. 


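If it is assumed that the operator $\hat H$ indeed represents a Hamiltonian, then there must be the same relation between $\hat H$ and the momentum operators as between the energy and momentum in the theory of relativity, i.e.

$$\hat H^{2}=c^{2}\left(\hat p_x^{2}+\hat p_y^{2}+\hat p_z^{2}\right)+m^{2}c^{4}. \tag{113.3}$$

This requirement allows one to define the operators $\hat\beta_x$, $\hat\beta_y$, $\hat\beta_z$, $\hat\beta_0$. Indeed, squaring the operator $\hat H$ we obtain

$$\begin{aligned}\hat H^{2}={}&\hat\beta_x^{2}\hat p_x^{2}+\hat\beta_y^{2}\hat p_y^{2}+\hat\beta_z^{2}\hat p_z^{2}+\hat\beta_0^{2}\\&+(\hat\beta_x\hat\beta_y+\hat\beta_y\hat\beta_x)\hat p_x\hat p_y+(\hat\beta_y\hat\beta_z+\hat\beta_z\hat\beta_y)\hat p_y\hat p_z+(\hat\beta_z\hat\beta_x+\hat\beta_x\hat\beta_z)\hat p_z\hat p_x\\&+(\hat\beta_x\hat\beta_0+\hat\beta_0\hat\beta_x)\hat p_x+(\hat\beta_y\hat\beta_0+\hat\beta_0\hat\beta_y)\hat p_y+(\hat\beta_z\hat\beta_0+\hat\beta_0\hat\beta_z)\hat p_z.\end{aligned} \tag{113.4}$$

The operator $\hat H^{2}$ will have the form of (113.3) if the following relations are fulfilled:

$$\hat\beta_x^{2}=\hat\beta_y^{2}=\hat\beta_z^{2}=c^{2},\qquad \hat\beta_0^{2}=m^{2}c^{4},$$

$$\hat\beta_i\hat\beta_k+\hat\beta_k\hat\beta_i=0\quad(i\neq k),\qquad \hat\beta_i\hat\beta_0+\hat\beta_0\hat\beta_i=0.$$

Here $i$ and $k$ take on the values $x$, $y$, $z$.
Usually in place of the operators $\hat\beta_i$ one introduces operators $\hat\alpha_i$ which differ from the former by constant factors:

$$\hat\beta_x=c\hat\alpha_x,\qquad \hat\beta_y=c\hat\alpha_y,\qquad \hat\beta_z=c\hat\alpha_z,\qquad \hat\beta_0=mc^{2}\hat\beta.$$

The following relations obviously hold for the operators $\hat\alpha$ and $\hat\beta$:

$$\hat\alpha_x^{2}=\hat\alpha_y^{2}=\hat\alpha_z^{2}=\hat\beta^{2}=1,\qquad \hat\alpha_i\hat\beta+\hat\beta\hat\alpha_i=0,$$

$$\hat\alpha_i\hat\alpha_k+\hat\alpha_k\hat\alpha_i=0\quad(i\neq k). \tag{113.5}$$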


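By means of these operators eq. (113.1) can be written in the form

$$i\hbar\frac{\partial\psi}{\partial t}=\left[c\left(\hat\alpha_x\hat p_x+\hat\alpha_y\hat p_y+\hat\alpha_z\hat p_z\right)+mc^{2}\hat\beta\right]\psi. \tag{113.6}$$

This equation is called the Dirac equation.
If a vector operator is introduced by the equality

$$\hat{\boldsymbol\alpha}=\hat\alpha_x\mathbf{i}+\hat\alpha_y\mathbf{j}+\hat\alpha_z\mathbf{k},$$

then the Dirac equation can be written in a still more compact form

$$i\hbar\frac{\partial\psi}{\partial t}=\hat H\psi,\qquad \hat H=c\,\hat{\boldsymbol\alpha}\cdot\hat{\mathbf p}+mc^{2}\hat\beta. \tag{113.7}$$

We now seek the explicit form of the operators $\hat\alpha_x$, $\hat\alpha_y$, $\hat\alpha_z$, $\hat\beta$. We note, first of all, that the actions of these operators cannot reduce to the multiplication of the wave function by constant numbers. By means of such operators it would be impossible to satisfy relations (113.5).

Let us try to seek the operators $\hat\alpha_x$, $\hat\alpha_y$, $\hat\alpha_z$, $\hat\beta$ in the form of a set of constant, in general complex, numbers, i.e. in the form of the square matrices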


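$$\alpha_x=\begin{pmatrix}a_{11}&a_{12}&\dots&a_{1n}\\a_{21}&a_{22}&\dots&a_{2n}\\\vdots&\vdots&&\vdots\\a_{n1}&a_{n2}&\dots&a_{nn}\end{pmatrix}.$$

We first define the number $n$, which we assume to be the same for the matrices $\alpha$ and $\beta$. For this we make the matrices $\alpha$ and $\beta$ correspond to the determinants

$$\det\alpha_x=\begin{vmatrix}a_{11}&a_{12}&\dots&a_{1n}\\a_{21}&a_{22}&\dots&a_{2n}\\\vdots&\vdots&&\vdots\\a_{n1}&a_{n2}&\dots&a_{nn}\end{vmatrix}.$$

Before proceeding to further investigation of the matrices we note that the following relation must be fulfilled for the determinant of the product of the matrices:

$$\det(\alpha_x\beta)=\det\alpha_x\,\det\beta. \tag{113.8}$$

From the commutation rules it follows further that

$$\alpha_x\beta=-\beta\alpha_x=(-I)\,\beta\alpha_x.$$

Here $I$ is the unit matrix. Making then use of relation (113.8) we find

$$\det(\alpha_x\beta)=\det\alpha_x\,\det\beta=\det(-I)\,\det\beta\,\det\alpha_x.$$

Since the determinants are ordinary numbers, and are different from zero because $\alpha_x^{2}=\beta^{2}=1$, we find that

$$\det(-I)=1$$

and, consequently,

$$(-1)^{n}=1. \tag{113.9}$$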


Thus the number $n$ must be even. If the number $n$ were equal to two, then the matrices sought would be two-by-two matrices. We have already encountered such matrices in §59, where it was shown that there are 4 linearly independent two-by-two numerical matrices: 3 Pauli matrices and a unit matrix. This last commutes with all the Pauli matrices and, consequently, does not satisfy the condition of anti-commutation (113.5). On the other hand, in the case of four-by-four matrices it turns out to be possible to construct matrices with the properties required. Namely, by a simple check it can be seen that the matrices


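$$\alpha_x=\begin{pmatrix}0&0&0&1\\0&0&1&0\\0&1&0&0\\1&0&0&0\end{pmatrix},\qquad \alpha_y=\begin{pmatrix}0&0&0&-i\\0&0&i&0\\0&-i&0&0\\i&0&0&0\end{pmatrix},$$

$$\alpha_z=\begin{pmatrix}0&0&1&0\\0&0&0&-1\\1&0&0&0\\0&-1&0&0\end{pmatrix},\qquad \beta=\begin{pmatrix}1&0&0&0\\0&1&0&0\\0&0&-1&0\\0&0&0&-1\end{pmatrix} \tag{113.10}$$

satisfy all the requirements formulated above. Matrices (113.10) can be written in an abbreviated form by making use of the Pauli matrices. Indeed, from definitions (60.14) and (60.15) and (113.10) it is clear that we have the relations

$$\alpha_x=\begin{pmatrix}0&\sigma_x\\\sigma_x&0\end{pmatrix},\qquad \alpha_y=\begin{pmatrix}0&\sigma_y\\\sigma_y&0\end{pmatrix},\qquad \alpha_z=\begin{pmatrix}0&\sigma_z\\\sigma_z&0\end{pmatrix},\qquad \beta=\begin{pmatrix}1&0\\0&-1\end{pmatrix}. \tag{113.11}$$

The matrices $\alpha$ and $\beta$ are Hermitian matrices. This can be established by a simple check. If we transpose the matrices and carry out complex conjugation, then the matrices obtained will be the same as the original ones. Hence for these matrices one can write that $\alpha_x^{\dagger}=\alpha_x$, $\alpha_y^{\dagger}=\alpha_y$, $\alpha_z^{\dagger}=\alpha_z$ and $\beta^{\dagger}=\beta$.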
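The "simple check" mentioned above can be carried out mechanically. The sketch below verifies relations (113.5) and the Hermiticity of the matrices (113.10) using plain nested lists; the helper names (`matmul`, `anticommutator`, `dagger`) are assumptions introduced for this illustration.

```python
def matmul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def anticommutator(a, b):
    """Return ab + ba."""
    ab, ba = matmul(a, b), matmul(b, a)
    return [[x + y for x, y in zip(r1, r2)] for r1, r2 in zip(ab, ba)]

def dagger(a):
    """Hermitian conjugate: transposition plus complex conjugation."""
    n = len(a)
    return [[complex(a[j][i]).conjugate() for j in range(n)] for i in range(n)]

IDENTITY = [[1 if i == j else 0 for j in range(4)] for i in range(4)]
ZERO = [[0] * 4 for _ in range(4)]

# Matrices (113.10)
alpha_x = [[0, 0, 0, 1], [0, 0, 1, 0], [0, 1, 0, 0], [1, 0, 0, 0]]
alpha_y = [[0, 0, 0, -1j], [0, 0, 1j, 0], [0, -1j, 0, 0], [1j, 0, 0, 0]]
alpha_z = [[0, 0, 1, 0], [0, 0, 0, -1], [1, 0, 0, 0], [0, -1, 0, 0]]
beta = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]]

matrices = {"alpha_x": alpha_x, "alpha_y": alpha_y,
            "alpha_z": alpha_z, "beta": beta}
names = list(matrices)

# Each matrix squares to unity, distinct matrices anticommute, all are Hermitian.
squares_ok = all(matmul(m, m) == IDENTITY for m in matrices.values())
anticommute_ok = all(
    anticommutator(matrices[a], matrices[b]) == ZERO
    for i, a in enumerate(names) for b in names[i + 1:]
)
hermitian_ok = all(dagger(m) == m for m in matrices.values())
print(squares_ok, anticommute_ok, hermitian_ok)  # prints: True True True
```

All entries are small integers or multiples of $i$, so the comparisons are exact and no numerical tolerance is needed.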

If in place of the four-by-four matrices we had introduced matrices of a 
higher rank, then the formal set-up of the theory would not be violated. 
However, as will be clear from what follows, when the four-by-four matrices 
are introduced the general Dirac equation describes the properties of spin 
one-half particles. 

Taking for $\alpha_x$, $\alpha_y$, $\alpha_z$ and $\beta$ the matrix expression (113.10) in the form of four-by-four matrices, we have to assign four components to the wave function $\psi$. Indeed, only in this case do the 4 equations, into which the general expression (113.7) resolves when four-by-four matrices are substituted into it, contain four unknown functions. The four-component function $\psi$ (called the Dirac bispinor) can be written in the form of the matrix

$$\psi=\begin{pmatrix}\psi_1\\\psi_2\\\psi_3\\\psi_4\end{pmatrix}.$$

We write down these equations in explicit form, making use of the rule of multiplication of matrices (see (45.6)):

$$\begin{aligned}i\hbar\frac{\partial\psi_1}{\partial t}&=c(\hat p_x-i\hat p_y)\psi_4+c\hat p_z\psi_3+mc^{2}\psi_1,\\i\hbar\frac{\partial\psi_2}{\partial t}&=c(\hat p_x+i\hat p_y)\psi_3-c\hat p_z\psi_4+mc^{2}\psi_2,\\i\hbar\frac{\partial\psi_3}{\partial t}&=c(\hat p_x-i\hat p_y)\psi_2+c\hat p_z\psi_1-mc^{2}\psi_3,\\i\hbar\frac{\partial\psi_4}{\partial t}&=c(\hat p_x+i\hat p_y)\psi_1-c\hat p_z\psi_2-mc^{2}\psi_4.\end{aligned}$$


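The Dirac equation is easily generalized to the case of the motion of a charged particle in an external electromagnetic field. Namely, replacing the momentum operator $\hat{\mathbf p}$ by the operator $\hat{\mathbf p}-(e/c)\mathbf A$ according to the usual scheme and adding the operator $e\varphi$ to the operator $\hat H$, where $\mathbf A$ and $\varphi$ are the vector potential and scalar potential of the electromagnetic field, we obtain the Dirac equation

$$i\hbar\frac{\partial\psi}{\partial t}=\left[c\,\boldsymbol\alpha\cdot\left(\hat{\mathbf p}-\frac{e}{c}\mathbf A\right)+e\varphi+mc^{2}\beta\right]\psi. \tag{113.12}$$

Let us bring the Dirac equation into a more symmetrical form. Multiplying eq. (113.7) by the operator $\beta$ from the left-hand side we have

$$i\hbar\beta\frac{\partial\psi}{\partial t}=\left(c\,\beta\boldsymbol\alpha\cdot\hat{\mathbf p}+mc^{2}\right)\psi. \tag{113.13}$$

We now introduce the following system of matrices:

$$\gamma_1=-i\beta\alpha_x,\qquad \gamma_2=-i\beta\alpha_y,\qquad \gamma_3=-i\beta\alpha_z,\qquad \gamma_4=\beta. \tag{113.14}$$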








It is easy to verify that the commutation rules for the matrices $\gamma_i$ are the same as for the matrices $\alpha$ and $\beta$, viz.

$$\gamma_i\gamma_k+\gamma_k\gamma_i=2\delta_{ik}.$$
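The anticommutation rule for the $\gamma$ matrices can be checked in the same mechanical way. The sketch below assumes the convention $\gamma_k=-i\beta\alpha_k$ ($k=1,2,3$), $\gamma_4=\beta$ together with the standard explicit matrices (113.10); the helper names are illustrative assumptions.

```python
def matmul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def scale(s, a):
    """Multiply every matrix element by the number s."""
    return [[s * x for x in row] for row in a]

def anticommutator(a, b):
    """Return ab + ba."""
    ab, ba = matmul(a, b), matmul(b, a)
    return [[x + y for x, y in zip(r1, r2)] for r1, r2 in zip(ab, ba)]

def two_delta(mu, nu):
    """The matrix 2 * delta_{mu nu} * I expected on the right-hand side."""
    return [[2 if (mu == nu and i == j) else 0 for j in range(4)]
            for i in range(4)]

alpha_x = [[0, 0, 0, 1], [0, 0, 1, 0], [0, 1, 0, 0], [1, 0, 0, 0]]
alpha_y = [[0, 0, 0, -1j], [0, 0, 1j, 0], [0, -1j, 0, 0], [1j, 0, 0, 0]]
alpha_z = [[0, 0, 1, 0], [0, 0, 0, -1], [1, 0, 0, 0], [0, -1, 0, 0]]
beta = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]]

# gamma_1..gamma_3 = -i * beta * alpha_k, gamma_4 = beta (list index 0..3)
gammas = [scale(-1j, matmul(beta, a)) for a in (alpha_x, alpha_y, alpha_z)]
gammas.append(beta)

clifford_ok = all(
    anticommutator(gammas[mu], gammas[nu]) == two_delta(mu, nu)
    for mu in range(4) for nu in range(4)
)
print(clifford_ok)  # prints: True
```

The check covers all sixteen index pairs, including the diagonal ones, where it confirms $\gamma_\mu^{2}=1$.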
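By means of the matrices $\gamma_i$ we can rewrite the Dirac equation in the form:

$$\sum_{k=1}^{3}\gamma_k\frac{\partial\psi}{\partial x_k}+\frac{\gamma_4}{ic}\frac{\partial\psi}{\partial t}+\frac{mc}{\hbar}\psi=0. \tag{113.15}$$

If we now introduce the coordinate $x_4=ict$ we can transform the Dirac equation into a highly symmetrical form:

$$\gamma_\mu\frac{\partial\psi}{\partial x_\mu}+\frac{mc}{\hbar}\psi=0. \tag{113.16}$$

With the aid of the operators $\hat p_\mu$ and $A_\mu$ the last equation can be rewritten in the form

$$\left[\gamma_\mu\left(\hat p_\mu-\frac{e}{c}A_\mu\right)-imc\right]\psi=0. \tag{113.17}$$
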
§114. The probability density and probability current in Dirac’s theory 


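We shall show, first of all, that the difficulty of interpreting the probability density $\rho/e$ which we have encountered in discussing the Klein-Gordon-Fock equation is absent in the Dirac equation. Following the usual scheme, we write in addition to the Dirac equation

$$i\hbar\frac{\partial\psi}{\partial t}=-i\hbar c\,\boldsymbol\alpha\cdot\nabla\psi+mc^{2}\beta\psi \tag{114.1}$$

the adjoint equation

$$-i\hbar\frac{\partial\psi^{\dagger}}{\partial t}=i\hbar c\,\nabla\psi^{\dagger}\cdot\boldsymbol\alpha^{\dagger}+mc^{2}\psi^{\dagger}\beta^{\dagger}. \tag{114.2}$$

Here we have made use of the rule of conjugation of the product of matrices

$$(ab)^{\dagger}=b^{\dagger}a^{\dagger}.$$

Since the operators $\alpha$ and $\beta$ are Hermitian, $\alpha^{\dagger}=\alpha$, $\beta^{\dagger}=\beta$, and we obtain

$$-i\hbar\frac{\partial\psi^{\dagger}}{\partial t}=i\hbar c\,\nabla\psi^{\dagger}\cdot\boldsymbol\alpha+mc^{2}\psi^{\dagger}\beta. \tag{114.3}$$

We multiply eq. (114.1) by $\psi^{\dagger}$ on the left, and eq. (114.3) by $\psi$ on the right and subtract the second equation from the first. We have

$$i\hbar\left(\psi^{\dagger}\frac{\partial\psi}{\partial t}+\frac{\partial\psi^{\dagger}}{\partial t}\psi\right)=-i\hbar c\left[\psi^{\dagger}\boldsymbol\alpha\cdot\nabla\psi+(\nabla\psi^{\dagger}\cdot\boldsymbol\alpha)\psi\right]. \tag{114.4}$$

The parenthesis on the right-hand side of eq. (114.4) means that the gradient acts only on the function $\psi^{\dagger}$. The expression standing in the bracket can easily be transformed by means of the formula

$$\psi^{\dagger}(\boldsymbol\alpha\cdot\nabla)\psi+(\nabla\psi^{\dagger}\cdot\boldsymbol\alpha)\psi=\nabla\cdot(\psi^{\dagger}\boldsymbol\alpha\psi).$$

Equation (114.4) is then written in the form

$$\frac{\partial(\psi^{\dagger}\psi)}{\partial t}=-c\,\nabla\cdot(\psi^{\dagger}\boldsymbol\alpha\psi). \tag{114.5}$$

Comparing the expression obtained with the general formula (7.5) we see that the essentially positive quantity $\psi^{\dagger}\psi=\psi_1^*\psi_1+\psi_2^*\psi_2+\psi_3^*\psi_3+\psi_4^*\psi_4$ represents the probability density. The vector defined by the equality $\mathbf j=c\,\psi^{\dagger}\boldsymbol\alpha\psi$ gives the probability current for a particle with the wave function $\psi$.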
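A small numerical sketch may help to fix these definitions. It evaluates $\psi^{\dagger}\psi$ and the $z$-component of $c\,\psi^{\dagger}\boldsymbol\alpha\psi$ for a positive-energy plane-wave amplitude with momentum along $z$, taken in the standard form $u=(A,\,B,\,kA,\,-kB)$ with $k=cp/(E+mc^{2})$, and compares their ratio with the relativistic velocity $c^{2}p/E$. The units ($m=c=1$), the amplitudes $A$, $B$ and the function names are assumptions made for illustration.

```python
import math

def density(u):
    """psi^dagger psi: a sum of squared moduli, hence never negative."""
    return sum(abs(complex(x)) ** 2 for x in u)

def current_z(u, c=1.0):
    """c * psi^dagger alpha_z psi, with alpha_z taken from (113.10)."""
    u = [complex(x) for x in u]
    # alpha_z couples components 1<->3 with sign +1 and 2<->4 with sign -1.
    return 2.0 * c * ((u[0].conjugate() * u[2]).real
                      - (u[1].conjugate() * u[3]).real)

m, c, p = 1.0, 1.0, 0.75                        # assumed units and momentum
E = math.sqrt((c * p) ** 2 + (m * c * c) ** 2)  # positive root of E^2 = c^2p^2 + m^2c^4
k = c * p / (E + m * c * c)

A, B = 0.6 + 0.2j, -0.3j                        # arbitrary amplitudes (assumed)
u = [A, B, k * A, -k * B]

rho = density(u)
jz = current_z(u, c)
velocity = jz / rho

print(rho > 0, velocity, c * c * p / E)
```

The ratio $j_z/\rho$ reproduces $c^{2}p/E$, i.e. the particle velocity, which is what one expects of a probability current.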





Thus, as in the Schrödinger theory, the wave function allows the usual probabilistic interpretation. From the linearity of the Dirac equation and the probabilistic interpretation of the function $\psi$ it follows that the basic propositions of quantum mechanics remain valid: (1) the interpretation of the quantity $|c_n(t)|^{2}$, where $c_n(t)$ is the coefficient of the expansion

$$\psi=\sum_n c_n\psi_n$$

and $\psi_n$ is the eigenfunction of a certain operator, as the probability of measuring the corresponding eigenvalue, (2) the definition of the mean value

$$\bar L=\int\psi^{\dagger}\hat L\psi\,dV.$$

Consequently, the entire structure of quantum mechanics also remains valid.


§115. The solution of the Dirac equation for a free particle 


As the simplest example of the solution of the Dirac equation let us consider the motion of a free particle. We shall seek, in the usual way, the solution of the Dirac equation for a freely moving particle

$$i\hbar\frac{\partial\psi}{\partial t}=\left(c\,\boldsymbol\alpha\cdot\hat{\mathbf p}+mc^{2}\beta\right)\psi. \tag{115.1}$$

Substituting the wave function $\psi=\psi_0\,e^{-iEt/\hbar}$ into (115.1) we obtain the equation for the time-independent wave function $\psi_0$

$$E\psi_0=\left(c\,\boldsymbol\alpha\cdot\hat{\mathbf p}+mc^{2}\beta\right)\psi_0. \tag{115.2}$$

We consider further states with a definite momentum and seek the solution of eq. (115.2) in the form of a plane wave

$$\psi_0=u\,e^{i(\mathbf p\cdot\mathbf r)/\hbar}.$$

Then for the function $u$ we obtain the equation

$$Eu=\left(c\,\boldsymbol\alpha\cdot\mathbf p+mc^{2}\beta\right)u. \tag{115.3}$$

We write $u$ in the form

$$u=\begin{pmatrix}u_1\\u_2\\u_3\\u_4\end{pmatrix}=\begin{pmatrix}w\\w'\end{pmatrix},\qquad w=\begin{pmatrix}u_1\\u_2\end{pmatrix},\qquad w'=\begin{pmatrix}u_3\\u_4\end{pmatrix}. \tag{115.4}$$

Substituting (115.4) into (115.3) and taking into account the representations of matrices $\alpha$ and $\beta$ (113.11), we find

$$Ew=c\,\boldsymbol\sigma\cdot\mathbf p\,w'+mc^{2}w, \tag{115.5}$$

$$Ew'=c\,\boldsymbol\sigma\cdot\mathbf p\,w-mc^{2}w'. \tag{115.6}$$

Each of the functions $w$ and $w'$ has two components.
For the system of linear equations obtained to have a solution it is necessary that its determinant reduce to zero

$$\begin{vmatrix}E-mc^{2}&-c\,\boldsymbol\sigma\cdot\mathbf p\\-c\,\boldsymbol\sigma\cdot\mathbf p&E+mc^{2}\end{vmatrix}=0. \tag{115.7}$$

Evaluating the determinant we obtain

$$E^{2}-m^{2}c^{4}=c^{2}(\boldsymbol\sigma\cdot\mathbf p)^{2}.$$

The expression $(\boldsymbol\sigma\cdot\mathbf p)^{2}$ can easily be transformed by means of the known properties of the Pauli matrices. According to (60.17) we have

$$(\boldsymbol\sigma\cdot\mathbf p)^{2}=p^{2}. \tag{115.8}$$

As was to be expected, we arrive at the relation already known between the energy and momentum of a particle

$$E^{2}=c^{2}p^{2}+m^{2}c^{4}. \tag{115.9}$$
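Relation (115.8), on which the dispersion law rests, is easy to confirm numerically. The sketch below builds $\boldsymbol\sigma\cdot\mathbf p$ from the standard Pauli matrices for an arbitrarily chosen momentum and squares it; the function names and the sample momentum are assumptions for illustration.

```python
def sigma_dot_p(px, py, pz):
    """2x2 matrix sigma_x*px + sigma_y*py + sigma_z*pz (standard Pauli matrices)."""
    return [[pz, px - 1j * py],
            [px + 1j * py, -pz]]

def matmul2(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

px, py, pz = 0.3, -1.2, 2.0          # sample momentum components (assumed)
p_squared = px ** 2 + py ** 2 + pz ** 2

sp = sigma_dot_p(px, py, pz)
sp2 = matmul2(sp, sp)

# Largest deviation of (sigma.p)^2 from p^2 times the unit matrix.
deviation = max(
    abs(sp2[i][j] - (p_squared if i == j else 0.0))
    for i in range(2) for j in range(2)
)
print(deviation)
```

The deviation is at the level of floating-point rounding, so $(\boldsymbol\sigma\cdot\mathbf p)^{2}$ acts as the number $p^{2}$ and (115.7) reduces to (115.9).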










The energy of a particle can take on positive as well as negative values. We 
have already discussed this problem in the theory of relativity and have seen 
that within the framework of classical mechanics this fact did not lead to any 
difficulties, since the energy range of width 2mc? is forbidden. Indeed, in 
classical mechanics all variables change continuously and a particle has either 
a positive or a negative energy. Continuous transition from one region into 
the other is impossible. 

In relativistic quantum mechanics there are no grounds for rejecting the 
negative sign. We shall discuss in detail the meaning of the negative sign of the 
energy later. 

Choosing the plus or minus sign for the energy we can solve the system of eqs. (115.5) and (115.6). By virtue of the homogeneity of the system of equations one of the quantities, either $w$ or $w'$, remains arbitrary.

Let $w$ be an arbitrary quantity. Then

$$w'=\frac{c\,\boldsymbol\sigma\cdot\mathbf p}{E+mc^{2}}\,w. \tag{115.10}$$

If, on the contrary, the quantity $w'$ is assumed to be arbitrary, then we have

$$w=\frac{c\,\boldsymbol\sigma\cdot\mathbf p}{E-mc^{2}}\,w'. \tag{115.11}$$

The corresponding wave functions are of the form (for simplicity the direction of the momentum vector is here taken to be the z-axis)

$$u=\begin{pmatrix}A\\B\\\dfrac{cp_zA}{E+mc^{2}}\\-\dfrac{cp_zB}{E+mc^{2}}\end{pmatrix},\qquad v=\begin{pmatrix}\dfrac{cp_zD}{E-mc^{2}}\\-\dfrac{cp_zF}{E-mc^{2}}\\D\\F\end{pmatrix}. \tag{115.12}$$


Here $A$, $B$, $D$ and $F$ are arbitrary constants. The character of these expressions will become clearer if we pass to the non-relativistic limit, setting $E\approx mc^{2}$ or $E\approx-mc^{2}$ respectively. Then from (115.10) it is seen that in the first case

$$w'\approx\frac{c\,\boldsymbol\sigma\cdot\mathbf p}{2mc^{2}}\,w\sim\frac{v}{c}\,w.$$

The spinor $w'$ is less than $w$ in the ratio $v/c$, so that

$$u\approx\begin{pmatrix}A\\B\\0\\0\end{pmatrix}.$$

For the negative value of energy (115.11) gives

$$w\approx-\frac{c\,\boldsymbol\sigma\cdot\mathbf p}{2mc^{2}}\,w'\sim\frac{v}{c}\,w'\qquad\text{and}\qquad v\approx\begin{pmatrix}0\\0\\D\\F\end{pmatrix}.$$

Thus in the transition to the non-relativistic approximation two components of the wave function turn out to be small in comparison with two other components. In this case for positive energies $w$ is large in comparison with $w'$, and for negative energies the reverse is true.
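The plane-wave amplitudes can be tested directly against the eigenvalue equation $Eu=(c\,\boldsymbol\alpha\cdot\mathbf p+mc^{2}\beta)u$. The sketch below does this for momentum along $z$, using the standard matrices $\alpha_z$ and $\beta$ and the amplitudes $u=(A,B,kA,-kB)$, $v=(k'D,-k'F,D,F)$ with $k=cp/(E+mc^{2})$, $k'=cp/(E-mc^{2})$. The units ($m=c=1$), the small momentum and the constants $A$, $B$, $D$, $F$ are assumptions chosen for illustration.

```python
import math

def matvec(mat, vec):
    """Matrix times column vector, both stored as plain lists."""
    return [sum(mat[i][j] * vec[j] for j in range(len(vec)))
            for i in range(len(mat))]

def dirac_hamiltonian_z(p, m=1.0, c=1.0):
    """H = c*alpha_z*p + m*c^2*beta for momentum p along the z-axis."""
    cp, mc2 = c * p, m * c * c
    return [[mc2, 0.0, cp, 0.0],
            [0.0, mc2, 0.0, -cp],
            [cp, 0.0, -mc2, 0.0],
            [0.0, -cp, 0.0, -mc2]]

def residual(h, vec, energy):
    """Largest component of H*vec - energy*vec."""
    return max(abs(a - energy * b) for a, b in zip(matvec(h, vec), vec))

m, c, p = 1.0, 1.0, 0.01        # slow particle: p << m c (assumed values)
mc2 = m * c * c
E_pos = math.sqrt((c * p) ** 2 + mc2 ** 2)
E_neg = -E_pos

A, B, D, F = 1.0, 0.5, -0.7, 0.2   # arbitrary constants (assumed)
k_plus = c * p / (E_pos + mc2)     # ratio w'/w for E > 0
k_minus = c * p / (E_neg - mc2)    # ratio w/w' for E < 0

u = [A, B, k_plus * A, -k_plus * B]
v = [k_minus * D, -k_minus * F, D, F]

h = dirac_hamiltonian_z(p, m, c)
res_u = residual(h, u, E_pos)
res_v = residual(h, v, E_neg)
print(res_u, res_v, k_plus, k_minus)
```

Both residuals vanish to rounding accuracy, and for $p\ll mc$ the ratios are close to $\pm p/2mc$, which illustrates the smallness of two of the four components in the non-relativistic limit.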

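The general solution of the Dirac equation for the motion of a free particle can be written in the form of a superposition of wave functions of the type (115.12), i.e. as a Fourier integral of the form

$$\psi(x,y,z,t)=\int u(A,B)\,e^{-iEt/\hbar}\,e^{i(\mathbf p\cdot\mathbf r)/\hbar}\,d\mathbf p+\int v(D,F)\,e^{+i|E|t/\hbar}\,e^{i(\mathbf p\cdot\mathbf r)/\hbar}\,d\mathbf p,$$

where $d\mathbf p=dp_x\,dp_y\,dp_z$.

If at the initial instant of time $t=0$ the following wave function is given

$$\psi(x,y,z,0)=\int\varphi(\mathbf p)\,e^{i(\mathbf p\cdot\mathbf r)/\hbar}\,d\mathbf p;\qquad \varphi(\mathbf p)=\begin{pmatrix}\varphi_1(\mathbf p)\\\varphi_2(\mathbf p)\\\varphi_3(\mathbf p)\\\varphi_4(\mathbf p)\end{pmatrix},$$

then the given set of quantities $\varphi_1$, $\varphi_2$, $\varphi_3$ and $\varphi_4$ can be expressed unambiguously in terms of the four arbitrary coefficients involved in $u$ and $v$.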

Thus a set of two waves, one of which corresponds to the positive energy 
and the other to the negative energy, forms the total solution of the Dirac 
equation. It is clear that if the particular solution corresponding to the 
negative energy were rejected and only the solution with the positive energy 
were retained, then the system of functions found in this case would be 
incomplete. 

The initial conditions contain four given quantities, whereas u involves 
only two indeterminate constants A and B. Thus irrespective of other 
considerations the necessity of taking into account solutions with negative 
energy follows from the general foundations of quantum mechanics. 

In the next section we shall come back to the discussion of the funda- 
mental conclusions which have been drawn from the existence of solutions of 
the Dirac equation which correspond to a negative energy of the particle. 








§116. The concept of the positron 


We now turn to the discussion of the formula

$$E=\pm\left(p^{2}c^{2}+m^{2}c^{4}\right)^{1/2}. \tag{116.1}$$


As has already been pointed out, from the point of view of classical 
mechanics the negative energy of a free particle has no physical meaning. 

In quantum mechanics the situation is different. Namely, discontinuous 
transitions are possible from states with a positive energy into states with a 
negative energy. In other words, these two classes of states are no longer 
separated by an impenetrable barrier. We have already seen that the exclusion 
of states with a negative energy contradicts the general propositions of 
quantum mechanics, since the wave functions of states with a positive energy 
do not form a complete system of functions. 

On the other hand, it is impossible to assume the existence of particles with a negative energy. Such particles would possess properties which differ fundamentally from those of all particles observed in nature. As an example we can point to the following: a particle with a negative energy $-|E_1|$ could make a transition into a state with a lower negative energy $-|E_2|$, $|E_2|>|E_1|$. Then the difference $|E_2|-|E_1|$ could be converted into useful work. Such a transition could be carried out continuously, since $|E|$ is in no way limited, and a particle with a negative energy could serve as an infinitely large source of work.

In order to avoid difficulties associated with the introduction of observable 
particles with negative energy into the theory, Dirac introduced the concept 
of the vacuum as that state of space in which all states with negative energy 
are occupied by electrons and all states with positive energy are free. Accord- 
ing to the Pauli principle, there is one electron in each state with a negative 
energy. 

We assume further that under the influence of an external action one of 
the electrons is removed from a state with a negative energy. The vacant state 
with negative energy manifests itself as ‘something’ with a positive energy, 
since for the destruction of such a state, i.e. for its occupation it is necessary 
to add to it an electron with a negative energy. Thus the vacant state with a 
negative energy should be treated as a particle having a positive energy. 

It should be noted that Dirac at first incorrectly assumed that this state 
corresponded to the proton. Subsequently it was shown theoretically that 
the particle corresponding to a vacant state with a negative energy must have 
a mass equal to the mass of the electron and, consequently, it could not be 
the proton. 










Let us consider in more detail the considerations of Dirac about the occupied background of negative energies. Let $N_\alpha^{(-)}(\mathbf p)$ and $N_\alpha^{(+)}(\mathbf p)$ denote the numbers of electrons which are respectively in states with a negative and a positive energy and have momentum $\mathbf p$ and a definite spin orientation. The index $\alpha$ can take on two values according to the direction of spin. These numbers, in accordance with the Pauli principle, can take on only the values 0 or 1. In the vacuum state (the index v) we have

$$N_{v\alpha}^{(-)}(\mathbf p)=1,\qquad N_{v\alpha}^{(+)}(\mathbf p)=0$$

for all momentum values.

Indeed, all states with negative energy are then occupied, and all states with positive energy are vacant. The energy $E_v$ and charge $q_v$ in the vacuum are defined by the relations

$$E_v=-\sum_{\alpha,\mathbf p}E_p\,N_{v\alpha}^{(-)}(\mathbf p), \tag{116.2}$$

$$q_v=-|e|\sum_{\alpha,\mathbf p}N_{v\alpha}^{(-)}(\mathbf p). \tag{116.3}$$

Here $e$ is the charge of the electron.

Since the momentum and energy of free particles are in no way limited, the values of $E_v$ and $q_v$ are infinitely large. However, according to Dirac, these quantities are in principle not observable. Only those quantities which characterize the departure from the vacuum state are observable.

Further, we write the total energy $E$ of the system and the charge $q$ of the system in the case where there are in space electrons in states with positive energy while there are vacancies in states with negative energy

$$E=\sum_{\alpha,\mathbf p}\left[N_\alpha^{(+)}(\mathbf p)-N_\alpha^{(-)}(\mathbf p)\right]E_p, \tag{116.4}$$

$$q=-|e|\sum_{\alpha,\mathbf p}\left[N_\alpha^{(+)}(\mathbf p)+N_\alpha^{(-)}(\mathbf p)\right]. \tag{116.5}$$

In correspondence with the above only the following differences are observable:

$$E-E_v=\sum_{\alpha,\mathbf p}\left[N_\alpha^{(+)}(\mathbf p)+N_{v\alpha}^{(-)}(\mathbf p)-N_\alpha^{(-)}(\mathbf p)\right]E_p, \tag{116.6}$$

$$q-q_v=|e|\sum_{\alpha,\mathbf p}\left[N_{v\alpha}^{(-)}(\mathbf p)-N_\alpha^{(-)}(\mathbf p)-N_\alpha^{(+)}(\mathbf p)\right]. \tag{116.7}$$

From formulae (116.6) and (116.7) we see that if a certain state with a negative energy is vacant, i.e. $N_\alpha^{(-)}(\mathbf p)=0$, then it corresponds to a positive contribution to the observed values of energy and charge. Indeed, formulae (116.6) and (116.7) involve the expressions $N_{v\alpha}^{(-)}(\mathbf p)-N_\alpha^{(-)}(\mathbf p)$. If a state with a negative energy is vacant, $N_\alpha^{(-)}(\mathbf p)=0$, then $N_{v\alpha}^{(-)}(\mathbf p)-N_\alpha^{(-)}(\mathbf p)=1$. In this case a positive contribution to the energy and charge of the system arises, equal respectively to $E_p$ and $|e|$. Thus we see that the absence of an electron with momentum $\mathbf p$ in the continuous background of occupied negative states is equivalent to the appearance of an observable particle with a positive charge, positive energy and momentum $-\mathbf p$. Such a particle, with charge $+|e|$ and a mass equal to the mass of the electron, was called a positron. It was discovered by Anderson in cosmic rays a few years after the appearance of Dirac's theory.
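The bookkeeping behind the observable differences $E-E_v$ and $q-q_v$ can be followed on a toy model with a finite number of modes. In the sketch below the set of modes, the units ($m=c=|e|=1$) and all function names are assumptions introduced for illustration; the true sums run over infinitely many states.

```python
import math

E_CHARGE = 1.0  # |e| in units of the elementary charge (assumed)

def energy(p, m=1.0, c=1.0):
    """E_p = +sqrt(p^2 c^2 + m^2 c^4), always taken positive."""
    return math.sqrt((p * c) ** 2 + (m * c * c) ** 2)

# A finite toy set of modes (spin index, |p|); purely illustrative.
modes = [(spin, p) for spin in (+1, -1) for p in (0.0, 0.5, 1.0)]

def vacuum():
    """Vacuum occupations: negative-energy states full, positive ones empty."""
    return {mo: 0 for mo in modes}, {mo: 1 for mo in modes}

def observed_energy(n_plus, n_minus):
    """E - E_v: sum of [N(+) + N_v(-) - N(-)] * E_p with N_v(-) = 1."""
    return sum((n_plus[mo] + 1 - n_minus[mo]) * energy(mo[1]) for mo in modes)

def observed_charge(n_plus, n_minus):
    """q - q_v: |e| * sum of [N_v(-) - N(-) - N(+)] with N_v(-) = 1."""
    return E_CHARGE * sum(1 - n_minus[mo] - n_plus[mo] for mo in modes)

n_plus, n_minus = vacuum()
vac = (observed_energy(n_plus, n_minus), observed_charge(n_plus, n_minus))

# Remove one electron from the negative-energy background: a hole appears.
n_minus[(+1, 0.5)] = 0
hole = (observed_energy(n_plus, n_minus), observed_charge(n_plus, n_minus))

# Put an electron into a positive-energy state as well: an electron-positron pair.
n_plus[(-1, 1.0)] = 1
pair = (observed_energy(n_plus, n_minus), observed_charge(n_plus, n_minus))

print(vac, hole, pair)
```

The vacuum contributes nothing observable, a single hole shows up with energy $+E_p$ and charge $+|e|$, and a hole together with an ordinary electron has total charge zero and energy equal to the sum of the two positive energies.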

Proceeding from Dirac's concepts it turns out to be possible to account for a number of known physical effects. For example, it is obvious that an electromagnetic field can produce an electron-positron pair if the energy of the photon $\hbar\omega$ is greater than $2mc^{2}$. This energy is necessary for bringing an electron from a state with a negative energy into a state with a positive energy.

The laws of conservation of energy and momentum restrict the possibility 
of the reaction of electron—positron pair production by a photon. In fact this 
reaction can take place only in the presence of a third body — for example a 
nucleus, which takes a part of the momentum. In addition to electron—posi- 
tron pair production the inverse reaction, positron annihilation, is possible. 
In the annihilation an electron with a positive energy makes a transition into 
a vacant state with negative energy. The difference between the energies is
emitted in the form of a $\gamma$-quantum.

Dirac's theory made it possible not only to predict these phenomena but also to calculate the cross sections for both processes. The excellent agreement of the results of the calculations with experimental data was a strong confirmation of the validity of Dirac's ideas. However, the last decade has been marked by very important theoretical and experimental achievements, which will in part be elucidated in what follows. These successes allow one, on the one hand, to show the reality of the existence of the vacuum in the Dirac sense, and, on the other hand, to extend the regions of applicability of relativistic quantum mechanics. As has already been pointed out, the vacuum represents a system of charged particles occupying all possible states. When an external electric charge is brought into the vacuum, or when an electromagnetic field arises, the vacuum begins to interact with the external fields. For example, Lamb discovered in 1947 that the levels $2S_{1/2}$ and $2P_{1/2}$ of the hydrogen atom have somewhat different energies (the Lamb shift). This effect can be accounted for only by an interaction with the vacuum (see §119 and §128).

Thus the concept of the vacuum developed by Dirac is confirmed by a 
number of diverse experiments. 

The symmetry of the theory with respect to electrons and positrons finds 
its expression in the fact that there exists a unitary operator $\hat C$, called the
charge conjugation operator, which transforms a particle into its antiparticle.
In other words, the action of the operator $\hat C$ exchanges an electron and a
positron (of the same spin and energy).

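If $\psi_e$ and $\psi_p$ denote respectively the wave functions of an electron and of
a positron, then by definition we can write for them (see (113.17))

$$\left[\gamma_\mu\left(\hat p_\mu - \frac{e}{c}A_\mu\right) - imc\right]\psi_e = 0, \tag{116.8}$$

$$\left[\gamma_\mu\left(\hat p_\mu + \frac{e}{c}A_\mu\right) - imc\right]\psi_p = 0. \tag{116.9}$$

Then the function $\psi_e^*$, which is complex-conjugate to $\psi_e$, satisfies the equation

$$\left[\gamma_\mu^*\left(\hat p_\mu^* - \frac{e}{c}A_\mu^*\right) + imc\right]\psi_e^* = 0$$

or, since

$$\hat p_k^* = -\hat p_k, \qquad \hat p_4^* = \hat p_4, \qquad A_k^* = A_k, \qquad A_4^* = -A_4 \qquad (k = 1, 2, 3),$$

we have

$$\left[\gamma_k^*\left(\hat p_k + \frac{e}{c}A_k\right) - \gamma_4^*\left(\hat p_4 + \frac{e}{c}A_4\right) - imc\right]\psi_e^* = 0. \tag{116.10}$$

From the comparison of (116.9) and (116.10) it is natural to assume that

$$\psi_p = \hat C\psi_e^*, \qquad \psi_e^* = \hat C^{-1}\psi_p. \tag{116.11}$$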








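Substituting (116.11) into (116.9) we have

$$\left[\gamma_k\left(\hat p_k + \frac{e}{c}A_k\right) + \gamma_4\left(\hat p_4 + \frac{e}{c}A_4\right) - imc\right]\hat C\psi_e^* = 0.$$

Multiplying this equation from the left by $\hat C^{-1}$, we have

$$\hat C^{-1}\left[\gamma_k\left(\hat p_k + \frac{e}{c}A_k\right) + \gamma_4\left(\hat p_4 + \frac{e}{c}A_4\right) - imc\right]\hat C\psi_e^* = 0,$$

or

$$\left[\left(\hat p_k + \frac{e}{c}A_k\right)\hat C^{-1}\gamma_k\hat C + \left(\hat p_4 + \frac{e}{c}A_4\right)\hat C^{-1}\gamma_4\hat C - imc\right]\psi_e^* = 0. \tag{116.12}$$

For (116.12) to be identical with (116.10) it is necessary that the following
equalities be fulfilled:

$$\hat C^{-1}\gamma_k\hat C = \gamma_k^*, \qquad \hat C^{-1}\gamma_4\hat C = -\gamma_4^*.$$

If $\gamma_k$ and $\gamma_4$ are defined by formulae (113.14), then

$$\hat C = \gamma_2 = \begin{pmatrix} 0 & 0 & 0 & -1 \\ 0 & 0 & 1 & 0 \\ 0 & 1 & 0 & 0 \\ -1 & 0 & 0 & 0 \end{pmatrix}.$$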


From definition (116.11) it is seen directly that the operator Ĉ commutes 
with the Hamiltonian. Thus one can introduce two wave functions $\psi_e$ and $\psi_p$
which are completely equivalent and interrelated by the following relations
conserved in time:

$$\psi_p = \hat C\psi_e^* = \gamma_2\psi_e^*, \qquad \psi_e = \hat C\psi_p^* = \gamma_2\psi_p^*.$$

The two wave functions describe particles of the same (positive) energy, mass 
and spin, but with different signs of the charge and of the magnetic moment. 
The introduction of charge-conjugate wave functions for equivalent particles 
removes to a certain extent the logical difficulties associated with the simpli- 
fied interpretation of the vacuum as a background filled by particles of 
negative energy. 
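
The conditions imposed on the charge conjugation matrix can be checked numerically. The following is only a sketch: it assumes that formulae (113.14) denote the standard representation $\gamma_k = -i\beta\alpha_k$, $\gamma_4 = \beta$, and the helper names (`block`, `mul`, `conj`, `close`) are introduced here purely for illustration.

```python
# Assumption: (113.14) is the standard representation
# gamma_k = -i*beta*alpha_k (k = 1, 2, 3), gamma_4 = beta.
Z2 = [[0, 0], [0, 0]]
I2 = [[1, 0], [0, 1]]
SIGMA = [
    [[0, 1], [1, 0]],       # sigma_x
    [[0, -1j], [1j, 0]],    # sigma_y
    [[1, 0], [0, -1]],      # sigma_z
]

def scale2(s, m):
    """Multiply a 2x2 matrix by a number."""
    return [[s * x for x in row] for row in m]

def block(a, b, c, d):
    """Assemble a 4x4 matrix from the 2x2 blocks [[a, b], [c, d]]."""
    return [a[0] + b[0], a[1] + b[1], c[0] + d[0], c[1] + d[1]]

def mul(A, B):
    """Product of two 4x4 matrices."""
    return [[sum(A[i][k] * B[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def conj(M):
    """Element-wise complex conjugate."""
    return [[complex(x).conjugate() for x in row] for row in M]

def minus(M):
    return [[-x for x in row] for row in M]

def close(A, B, tol=1e-12):
    return all(abs(A[i][j] - B[i][j]) < tol for i in range(4) for j in range(4))

UNIT = block(I2, Z2, Z2, I2)
GAMMA = [block(Z2, scale2(-1j, s), scale2(1j, s), Z2) for s in SIGMA]  # gamma_1..gamma_3
GAMMA.append(block(I2, Z2, Z2, scale2(-1, I2)))                        # gamma_4

C = GAMMA[1]                          # candidate: C = gamma_2
self_inverse = close(mul(C, C), UNIT) # if True, C is its own inverse
C_INV = C

# C^-1 gamma_k C = gamma_k^*  (k = 1, 2, 3)
spatial_ok = all(close(mul(mul(C_INV, GAMMA[k]), C), conj(GAMMA[k]))
                 for k in range(3))
# C^-1 gamma_4 C = -gamma_4^*
time_ok = close(mul(mul(C_INV, GAMMA[3]), C), minus(conj(GAMMA[3])))

print(self_inverse, spatial_ok, time_ok)
```

In this representation $\gamma_2$ is real while $\gamma_1$ and $\gamma_3$ are purely imaginary, which is why a matrix commuting with $\gamma_2$ and anticommuting with the other three satisfies both conditions.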

In following chapters we shall describe in more detail the modern theory 
of fields and of elementary particles. 








§117. The spin of particles described by the Dirac equation 


Although we have so far used the concept of the spin of a particle ex- 
tensively, the spin operator was introduced in a purely formal way, as a 
necessary tool for the description of experimental data. We shall now show 
that the existence of spin follows directly from the Dirac equation. For this 
we shall consider the conservation laws resulting from the Dirac equation. 

Since in Dirac’s theory all the general propositions of quantum mechanics 
are preserved, to find the conservation laws it is only necessary to find the 
commutator with the Hamiltonian. The difference from Schrödinger’s theory
lies in the fact that the Hamiltonian now has the form of (113.7). 

If the Hamiltonian does not depend on time (and for this it is necessary 
that the external field potentials be time-independent), then the energy 
conservation law holds. In this respect there is no difference between the 
Schrödinger and the Dirac theories. 

For a particle moving in the vacuum the total angular momentum must 
also be conserved. Hence there must exist a total angular momentum operator 
commuting with the Hamiltonian. 

An interesting result is obtained when the orbital angular momentum
operator $\hat{\mathbf L} = \mathbf r \times \hat{\mathbf p}$ is commuted with the Hamiltonian.

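For our purposes we restrict ourselves to the case of a free particle.
Choosing an arbitrarily oriented z-axis, we have

$$\hat H\hat L_z - \hat L_z\hat H = (c\boldsymbol\alpha\cdot\hat{\mathbf p} + mc^2\beta)\hat L_z - \hat L_z(c\boldsymbol\alpha\cdot\hat{\mathbf p} + mc^2\beta).$$

Since the operator

$$\hat L_z = -i\hbar\left(x\frac{\partial}{\partial y} - y\frac{\partial}{\partial x}\right)$$

commutes with the operators $\beta$ and $\alpha_z\hat p_z$, we then obtain

$$\hat H\hat L_z - \hat L_z\hat H = c\alpha_x(\hat p_x\hat L_z - \hat L_z\hat p_x) + c\alpha_y(\hat p_y\hat L_z - \hat L_z\hat p_y). \tag{117.1}$$

Making use of the property of commutation of the momentum components
with the angular momentum components, we find

$$\hat H\hat L_z - \hat L_z\hat H = -i\hbar c(\alpha_x\hat p_y - \alpha_y\hat p_x). \tag{117.2}$$

We obtain analogous results for the other angular momentum components.
Thus the orbital angular momentum is not a constant of the motion and
is not conserved. To find the quantity playing the role of the total angular
momentum we introduce the operator $\hat{\mathbf J} = \hat{\mathbf L} + \hat{\mathbf s}$, where $\hat{\mathbf s}$ is an unknown
operator. We require that the operator $\hat{\mathbf J}$ commute with the operator $\hat H$:

$$\hat H\hat J_i - \hat J_i\hat H = 0 \quad\text{or}\quad (\hat H\hat L_i - \hat L_i\hat H) + (\hat H\hat s_i - \hat s_i\hat H) = 0.$$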


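Substituting $i = z$ and using the value of the commutator (117.2), we have

$$\hat H\hat s_z - \hat s_z\hat H = i\hbar c(\alpha_x\hat p_y - \alpha_y\hat p_x). \tag{117.3}$$

We try to satisfy this equation, setting

$$\hat s_z = A\alpha_x\alpha_y, \tag{117.4}$$

where $A$ is an unknown constant.
We further calculate the commutator $\hat H\hat s_z - \hat s_z\hat H$. Making use of (113.5),
we obtain

$$\hat H A\alpha_x\alpha_y - A\alpha_x\alpha_y\hat H = Ac(\alpha_x\hat p_x + \alpha_y\hat p_y)\alpha_x\alpha_y - Ac\,\alpha_x\alpha_y(\alpha_x\hat p_x + \alpha_y\hat p_y) = 2Ac(\alpha_y\hat p_x - \alpha_x\hat p_y).$$

Comparing this expression with formula (117.3), we find the value $A = -\tfrac12 i\hbar$.
Thus the operator $\hat s_z$ is equal to

$$\hat s_z = -\frac{i\hbar}{2}\alpha_x\alpha_y = -\frac{i\hbar}{2}\begin{pmatrix}\sigma_x\sigma_y & 0\\ 0 & \sigma_x\sigma_y\end{pmatrix} = \frac{\hbar}{2}\begin{pmatrix}\sigma_z & 0\\ 0 & \sigma_z\end{pmatrix} = \frac{\hbar}{2}\begin{pmatrix}1 & 0 & 0 & 0\\ 0 & -1 & 0 & 0\\ 0 & 0 & 1 & 0\\ 0 & 0 & 0 & -1\end{pmatrix}. \tag{117.5}$$

The two other vector components $\hat s_x$ and $\hat s_y$ are obtained from analogous
calculations

$$\hat s_x = -\tfrac12 i\hbar\,\alpha_y\alpha_z, \qquad \hat s_y = -\tfrac12 i\hbar\,\alpha_z\alpha_x.$$

We now find the operator $\hat s^2 = \hat s_x^2 + \hat s_y^2 + \hat s_z^2$. Making use of the properties of
the operators $\alpha_x$, $\alpha_y$, $\alpha_z$, we find

$$\hat s^2 = \tfrac34\hbar^2 = \hbar^2\,\tfrac12\left(\tfrac12 + 1\right). \tag{117.6}$$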


We now turn to the discussion of the results obtained. It is evident that the 
quantity J, conserved in time, should be considered as the total angular 
momentum of the particle. In its turn the total angular momentum is the 
sum of the orbital and the intrinsic spin angular momenta of the particle. 
The operators $\hat s_z$ and $\hat s^2$ in formulae (117.5) and (117.6) are brought to
diagonal form. The spin component along the z-axis can then take on the two
values $\pm\tfrac12\hbar$. The eigenvalues of the operator $\hat s^2$ have the form of $\hbar^2 s(s+1)$,
where $s = \tfrac12$. Hence it is obvious that the particle has spin $\tfrac12\hbar$.
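
The algebra of this section can be verified with explicit matrices. The sketch below assumes the standard representation of $\alpha$ and $\beta$, sets $\hbar = c = 1$, and replaces the momentum components by ordinary numbers (legitimate here because only the matrix part of the commutator of $\hat H$ with $\hat s_z$ is being tested). The helper names and the numerical values of the momentum are illustrative, not taken from the text.

```python
# Assumptions: standard representation of alpha and beta, hbar = c = 1,
# momentum components treated as ordinary numbers.
Z2 = [[0, 0], [0, 0]]
I2 = [[1, 0], [0, 1]]
SIGMA = [
    [[0, 1], [1, 0]],       # sigma_x
    [[0, -1j], [1j, 0]],    # sigma_y
    [[1, 0], [0, -1]],      # sigma_z
]

def block(a, b, c, d):
    """Assemble a 4x4 matrix from the 2x2 blocks [[a, b], [c, d]]."""
    return [a[0] + b[0], a[1] + b[1], c[0] + d[0], c[1] + d[1]]

def mul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def lin(*terms):
    """Linear combination of 4x4 matrices: lin((c1, M1), (c2, M2), ...)."""
    return [[sum(c * M[i][j] for c, M in terms) for j in range(4)]
            for i in range(4)]

def close(A, B, tol=1e-12):
    return all(abs(A[i][j] - B[i][j]) < tol for i in range(4) for j in range(4))

ALPHA = [block(Z2, s, s, Z2) for s in SIGMA]
BETA = block(I2, Z2, Z2, [[-1, 0], [0, -1]])
UNIT = block(I2, Z2, Z2, I2)

# Spin components with A = -i*hbar/2
S_X = lin((-0.5j, mul(ALPHA[1], ALPHA[2])))
S_Y = lin((-0.5j, mul(ALPHA[2], ALPHA[0])))
S_Z = lin((-0.5j, mul(ALPHA[0], ALPHA[1])))

p = (0.3, -1.1, 0.7)   # illustrative momentum components
m = 1.0                # illustrative mass
H = lin((p[0], ALPHA[0]), (p[1], ALPHA[1]), (p[2], ALPHA[2]), (m, BETA))

# Left and right sides of (117.3)
commutator = lin((1, mul(H, S_Z)), (-1, mul(S_Z, H)))
expected = lin((1j * p[1], ALPHA[0]), (-1j * p[0], ALPHA[1]))

S_SQUARED = lin((1, mul(S_X, S_X)), (1, mul(S_Y, S_Y)), (1, mul(S_Z, S_Z)))

print(close(commutator, expected), close(S_SQUARED, lin((0.75, UNIT))))
```

The diagonal of `S_Z` is $(\tfrac12, -\tfrac12, \tfrac12, -\tfrac12)$ in these units, in agreement with (117.5).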


§118. The transition from the Dirac equation to the Pauli equation. The 
magnetic moment of a particle 


Let us now see how the Dirac equation transforms if the transition to the
non-relativistic approximation is carried out in it. We shall consider the 
general case where the particle moves in an external electromagnetic field, so 
that the Dirac equation has the form of (113.12). Just as in the limiting 
transition in the scalar relativistic equation, we first of all separate the rest 
energy, i.e. we carry out a transformation of the form 


$$\psi = \psi'\,e^{-imc^2t/\hbar}.$$


Then for the function y’ we obtain the equation 


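$$i\hbar\frac{\partial\psi'}{\partial t} = \left[c\boldsymbol\alpha\cdot\left(\hat{\mathbf p} - \frac{e}{c}\mathbf A\right) + mc^2(\beta - 1) + e\varphi\right]\psi'. \tag{118.1}$$

If the wave function is written in the form $\psi' = \begin{pmatrix} w \\ w' \end{pmatrix}$, then just as for a free
particle we obtain the equations for $w$ and $w'$:

$$i\hbar\frac{\partial w}{\partial t} = c\boldsymbol\sigma\cdot\left(\hat{\mathbf p} - \frac{e}{c}\mathbf A\right)w' + e\varphi w,$$

$$i\hbar\frac{\partial w'}{\partial t} = c\boldsymbol\sigma\cdot\left(\hat{\mathbf p} - \frac{e}{c}\mathbf A\right)w - 2mc^2w' + e\varphi w'. \tag{118.2}$$

As always, the limiting transition to the non-relativistic approximation
corresponds to a formal expansion in powers of $1/c$. We assume at first that in
the general case of the motion of a particle in a field, as well as for a free
particle, $w' \sim c^{-1}w$. Then in the second of eqs. (118.2) we can disregard the
terms $i\hbar(\partial w'/\partial t)$ and $e\varphi w'$, since they are small in comparison with the
quantities

$$c\boldsymbol\sigma\cdot\left(\hat{\mathbf p} - \frac{e}{c}\mathbf A\right)w \quad\text{and}\quad mc^2w',$$

which are proportional to $c$. We then obtain for the spinor $w'$ the expression

$$w' = \frac{1}{2mc}\,\boldsymbol\sigma\cdot\left(\hat{\mathbf p} - \frac{e}{c}\mathbf A\right)w, \tag{118.3}$$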









which is in agreement with our assumption. 
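Substituting (118.3) into the first of eqs. (118.2) we find

$$i\hbar\frac{\partial w}{\partial t} = \frac{1}{2m}\left[\boldsymbol\sigma\cdot\left(\hat{\mathbf p} - \frac{e}{c}\mathbf A\right)\right]^2 w + e\varphi w. \tag{118.4}$$

We evaluate the square of the operator in the explicit form

$$\left[\boldsymbol\sigma\cdot\left(\hat{\mathbf p} - \frac{e}{c}\mathbf A\right)\right]^2 = \left[\sigma_x\left(\hat p_x - \frac{e}{c}A_x\right) + \sigma_y\left(\hat p_y - \frac{e}{c}A_y\right) + \sigma_z\left(\hat p_z - \frac{e}{c}A_z\right)\right]^2.$$

In multiplying it should be recalled that the operators $\hat{\mathbf p}$ and $\mathbf A$ do not commute
with each other. Carrying out the multiplication we find

$$\begin{aligned}
\left[\boldsymbol\sigma\cdot\left(\hat{\mathbf p} - \frac{e}{c}\mathbf A\right)\right]^2 ={}& \sigma_x^2\left(\hat p_x - \frac{e}{c}A_x\right)^2 + \sigma_y^2\left(\hat p_y - \frac{e}{c}A_y\right)^2 + \sigma_z^2\left(\hat p_z - \frac{e}{c}A_z\right)^2 \\
&+ \sigma_x\sigma_y\left(\hat p_x - \frac{e}{c}A_x\right)\left(\hat p_y - \frac{e}{c}A_y\right) + \sigma_y\sigma_x\left(\hat p_y - \frac{e}{c}A_y\right)\left(\hat p_x - \frac{e}{c}A_x\right) \\
&+ \sigma_y\sigma_z\left(\hat p_y - \frac{e}{c}A_y\right)\left(\hat p_z - \frac{e}{c}A_z\right) + \sigma_z\sigma_y\left(\hat p_z - \frac{e}{c}A_z\right)\left(\hat p_y - \frac{e}{c}A_y\right) \\
&+ \sigma_z\sigma_x\left(\hat p_z - \frac{e}{c}A_z\right)\left(\hat p_x - \frac{e}{c}A_x\right) + \sigma_x\sigma_z\left(\hat p_x - \frac{e}{c}A_x\right)\left(\hat p_z - \frac{e}{c}A_z\right).
\end{aligned} \tag{118.5}$$

Since $\sigma_x^2 = \sigma_y^2 = \sigma_z^2 = 1$, the sum of the first three terms is brought to the form

$$\sigma_x^2\left(\hat p_x - \frac{e}{c}A_x\right)^2 + \sigma_y^2\left(\hat p_y - \frac{e}{c}A_y\right)^2 + \sigma_z^2\left(\hat p_z - \frac{e}{c}A_z\right)^2 = \left(\hat{\mathbf p} - \frac{e}{c}\mathbf A\right)^2.$$

We carry out the further transformation only with the terms

$$\sigma_x\sigma_y\left(\hat p_x - \frac{e}{c}A_x\right)\left(\hat p_y - \frac{e}{c}A_y\right) + \sigma_y\sigma_x\left(\hat p_y - \frac{e}{c}A_y\right)\left(\hat p_x - \frac{e}{c}A_x\right), \tag{118.6}$$

since the remaining expressions transform in a way analogous to (118.6). The
matrices $\sigma_x$ and $\sigma_y$ anticommute and, consequently, expression (118.6) can
be rewritten in the form

$$\frac{e}{c}\,\sigma_x\sigma_y\left[-\hat p_xA_y - A_x\hat p_y + \hat p_yA_x + A_y\hat p_x\right]. \tag{118.7}$$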







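Making use of the commutation properties of the operators $\hat p_x$ and $\hat p_y$ with
the operators depending on coordinates (26.10), we have

$$\frac{e}{c}\,\sigma_x\sigma_y\left[i\hbar\frac{\partial A_y}{\partial x} - i\hbar\frac{\partial A_x}{\partial y}\right] = \frac{ie\hbar}{c}\,\sigma_x\sigma_y\left(\frac{\partial A_y}{\partial x} - \frac{\partial A_x}{\partial y}\right) = \frac{ie\hbar}{c}\,\sigma_x\sigma_y(\nabla\times\mathbf A)_z = \frac{ie\hbar}{c}\,\sigma_x\sigma_y\mathcal H_z.$$

Since, according to (60.16), $\sigma_x\sigma_y = i\sigma_z$, then we finally have

$$\frac{ie\hbar}{c}\,\sigma_x\sigma_y\mathcal H_z = -\frac{e\hbar}{c}\,\sigma_z\mathcal H_z.$$

Carrying out analogous transformations with the remaining terms of (118.5)
we obtain

$$\left[\boldsymbol\sigma\cdot\left(\hat{\mathbf p} - \frac{e}{c}\mathbf A\right)\right]^2 = \left(\hat{\mathbf p} - \frac{e}{c}\mathbf A\right)^2 - \frac{e\hbar}{c}\,\boldsymbol\sigma\cdot\boldsymbol{\mathcal H}. \tag{118.8}$$

Substituting (118.8) into (118.4) we find

$$i\hbar\frac{\partial w}{\partial t} = \left[\frac{1}{2m}\left(\hat{\mathbf p} - \frac{e}{c}\mathbf A\right)^2 + e\varphi - \frac{e\hbar}{2mc}\,\boldsymbol\sigma\cdot\boldsymbol{\mathcal H}\right]w. \tag{118.9}$$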

We see that in the transition to the non-relativistic approximation the
Dirac equation automatically goes over into the Pauli equation. Hence from
Dirac’s theory it is seen that there results, not only the existence of the spin
of particles (equal to $\tfrac12\hbar$), but also the existence of the intrinsic magnetic
moment of particles

$$\mu = \frac{e\hbar}{2mc}. \tag{118.10}$$

We can now define more precisely the problem as to what are the particles
having spin $\tfrac12\hbar$ to which the Dirac equation can be applied. If m is understood
to be the mass of the electron, then good agreement is obtained between the
calculated and measured values of the magnetic moment.

Thus the Dirac equation describes the behaviour of electrons with a high 
degree of accuracy. The Dirac equation also makes it possible, apparently, to 
describe well the properties of the neutrino, a particle with rest mass m = 0 
(see § 123). 

However, attempts to apply the Dirac equation to heavy particles of spin
$\tfrac12$, the proton and the neutron, have not led to very satisfactory results. On
the other hand, it has also been possible to obtain some general and very
important conclusions from the Dirac equation for heavy particles.

It turns out that the behaviour of fast protons and neutrons also fits,
qualitatively, within the framework of the Dirac equation. Of particular
importance is the fact that the basic idea of Dirac’s theory, the existence of 
antiparticles, has received direct confirmation for mesons as well as for 
nucleons. 

The antiproton $\bar p$, a particle with negative elementary charge and a mass
equal to that of the proton, was discovered in 1955 in the reaction
$p + p \to p + (p + \bar p) + p$ using an accelerator. Somewhat later the reaction
$p + \bar p \to n + \bar n$ with the production of antineutrons was observed. The
antineutron differs from the neutron in the sign of its magnetic moment and in
its parity. When antiparticles are annihilated other particles are produced.
For example, when protons and antiprotons are annihilated π- and K-mesons
are produced.

In spite of all these facts, quantitative calculations and, in particular, 
calculations of the magnetic moment are in disagreement with experimental 
data. If m in formula (118.10) is assumed to be the mass of the proton, then 
a value differing from the experimental value by a factor of 2.7 is obtained 
for its magnetic moment. 
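
The size of this discrepancy is easily estimated. The following sketch evaluates formula (118.10) in Gaussian units for the electron and for the proton. The numerical constants and the measured proton moment are rounded standard values supplied here for illustration; they are not quoted in the text.

```python
# Rounded Gaussian (CGS) constants; supplied for illustration only.
E_CHARGE = 4.80320e-10       # elementary charge, esu
HBAR = 1.054572e-27          # erg s
C_LIGHT = 2.997925e10        # cm/s
M_ELECTRON = 9.10938e-28     # g
M_PROTON = 1.672622e-24      # g
MU_PROTON_MEASURED = 1.41061e-23   # measured proton moment, erg/gauss

def dirac_moment(mass):
    """Magnetic moment e*hbar/(2*m*c) of formula (118.10), in erg/gauss."""
    return E_CHARGE * HBAR / (2.0 * mass * C_LIGHT)

mu_electron = dirac_moment(M_ELECTRON)      # Bohr magneton
mu_proton_dirac = dirac_moment(M_PROTON)    # value (118.10) gives for the proton
ratio = MU_PROTON_MEASURED / mu_proton_dirac

print(f"electron: {mu_electron:.4e} erg/gauss")
print(f"proton (Dirac): {mu_proton_dirac:.4e} erg/gauss")
print(f"measured / Dirac for the proton: {ratio:.2f}")
```

With these inputs the ratio comes out a little below 2.8, consistent with the factor quoted above.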

This disagreement of theory with experiment is apparently associated with
the fact that heavy particles, protons and neutrons, interact strongly with the
meson field. Herein lies their difference from electrons, which interact
relatively weakly with the electromagnetic field*.

* For more detailed considerations on the possibility of applying the Dirac equation
to nucleons see A.I.Akhiezer and V.B.Berestetskii, Quantum electrodynamics (Inter-
science Publishers, New York, 1965).


§119. The hydrogen atom in Dirac’s theory


Although the motion of the electron in the hydrogen atom corresponds
to non-relativistic velocities, finding the relativistic corrections to the hydro-
gen energy levels was of great interest, since Schrödinger’s theory could not
account for the appearance of fine structure in the hydrogen spectrum.

In §38 it was found that the energy levels of the hydrogen atoms depend
only on the principal quantum number. However, experiment shows that the
principal quantum number characterizes the energy levels only approximately.
In reality the excited levels are split into close sub-levels. As a result a
splitting of the spectral lines, clearly observable in an ordinary spectrometer and
particularly accurately measured by means of modern radiospectroscopic
methods, was observed in the hydrogen spectrum. It turns out that this
splitting of levels is associated with the spin—orbit interaction and that it
follows from Dirac’s theory.

The Dirac equation for stationary state motion in the Coulomb field is of
the form

$$\left(c\boldsymbol\alpha\cdot\hat{\mathbf p} + mc^2\beta\right)\psi = \left(E + \frac{Ze^2}{r}\right)\psi.$$


The Dirac equation, as well as the Schrödinger equation, allows exact
solution for the Coulomb field. However, in contrast to the Schrödinger
equation, the Dirac equation does not lead to separate conservation laws
for the orbital and spin angular momenta; only the total angular momentum
is conserved (see §117). Calculations show that only in the
non-relativistic approximation can one speak of constant values of the orbital
and spin angular momenta. In this case it turns out that the Hamiltonian
assumes the form*

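$$\hat H = \frac{\hat{\mathbf p}^2}{2m} + U + \frac{1}{2m^2c^2}\,\frac{1}{r}\frac{\partial U}{\partial r}\,(\hat{\mathbf s}\cdot\hat{\mathbf L}) + O\!\left(\frac{1}{c^2}\right), \tag{119.1}$$
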
where the first two terms are the same as the Hamiltonian of the Schrödinger
equation, and the third term represents the spin—orbit interaction energy.
Terms of the order of $1/c^2$, which are not written out, contain relativistic
corrections to the kinetic and potential energies which do not have an
obvious interpretation.

Solving the Dirac equation leads to the following expression for the
energy of the electron:

$$E = -\frac{Z^2e^4m}{2\hbar^2n^2} - \left(\frac{Ze^2}{\hbar c}\right)^4\frac{mc^2}{2n^4}\left(\frac{n}{j + \frac12} - \frac34\right), \tag{119.2}$$

where $j = l \pm \tfrac12$ is the eigenvalue of the total angular momentum operator; the
other quantities have the same meaning as in formula (38.17). The energy
levels depend now not only on n but also on j. For convenience of comparison
with non-relativistic results the formula (119.2) was obtained from the
accurate formula by expansion in powers of $Ze^2/\hbar c$.

Accidental degeneracy (see §38) is removed, and energy levels with one 
and the same value of n but with different j have different values. However, 
this splitting of levels is very small in comparison with the spacing between 
neighbouring levels with different n. 
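
To see how small the splitting is, formula (119.2) can be evaluated directly. The sketch below rewrites it in terms of the fine-structure constant $e^2/\hbar c$ and the electron rest energy. The numerical values of these two constants are rounded standard values supplied for illustration, and the function name `dirac_level` is hypothetical.

```python
# Rounded constants, supplied for illustration only.
ALPHA_FS = 7.2973526e-3    # fine-structure constant e^2/(hbar*c)
MC2_EV = 510998.95         # electron rest energy m*c^2 in eV

def dirac_level(n, j, Z=1):
    """Energy (119.2) in eV: Bohr term plus the first relativistic correction."""
    za = Z * ALPHA_FS
    bohr = -0.5 * MC2_EV * za**2 / n**2
    fine = -0.5 * MC2_EV * za**4 / n**4 * (n / (j + 0.5) - 0.75)
    return bohr + fine

# Fine-structure splitting of the n = 2 level of hydrogen
splitting = dirac_level(2, 1.5) - dirac_level(2, 0.5)
# Spacing between the n = 2 and n = 3 levels, for comparison
spacing = dirac_level(3, 0.5) - dirac_level(2, 0.5)

print(f"E(n=1, j=1/2)    = {dirac_level(1, 0.5):.5f} eV")
print(f"n=2 splitting    = {splitting:.3e} eV")
print(f"splitting/spacing = {splitting / spacing:.1e}")
```

The splitting between the $j = \tfrac32$ and $j = \tfrac12$ sub-levels for $n = 2$ equals $mc^2(e^2/\hbar c)^4/32$, several orders of magnitude below the distance to the $n = 3$ level.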


* See L.Schiff, Quantum mechanics (McGraw-Hill Book Company, New York, 1949). 




The degeneracy of states with the same value of j is conserved. For
example, for n = 2 there are the following three states: $2S_{1/2}$, $2P_{1/2}$ and $2P_{3/2}$.
The first two states are degenerate, since they correspond to n = 2 and $j = \tfrac12$.

Up to relatively recent times it was assumed that Dirac’s theory gave the
fine structure of hydrogen levels with a very high degree of accuracy. The
distribution of terms, the selection rules and the intensities of lines given by
the theory were exactly the same as those found experimentally. It was only
in 1947 that Lamb, using radiospectroscopic methods for the measurement,
discovered that the $2S_{1/2}$ and $2P_{1/2}$ levels have slightly different energies.

This disagreement between formula (119.2) obtained from the Dirac 
theory and experiment is associated with a fundamental property of matter, 
the reality of the vacuum, and in the end not only does it not contradict 
Dirac’s theory but is one of its most brilliant confirmations. New mathemati- 
cal methods by means of which the Lamb shift was found from the Dirac 
theory will be described in ch. 14. 


§120. The invariance of the Dirac equation with respect to reflection, rota- 
tion and Lorentz transformation of coordinates 


In §113 we have considered some properties of the Dirac equation. Let us 
now show that this equation satisfies the conditions of invariance with respect 
to reflection, rotation and Lorentz transformations. The rotation of the 
spatial system of coordinates and the Lorentz transformation are linear and 
orthogonal transformations. We can write them in the form 


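$$x'_\mu = a_{\mu\nu}x_\nu, \qquad a_{\mu\nu}a_{\mu\lambda} = \delta_{\nu\lambda}, \qquad a_{\mu\nu}a_{\lambda\nu} = \delta_{\mu\lambda}. \tag{120.1}$$

We find the transformation of the wave function

$$\psi' = S\psi, \tag{120.2}$$

which leaves the Dirac equation invariant under linear transformations
(120.1). The transformed wave function satisfies the Dirac equation

$$\gamma_\mu\frac{\partial\psi'}{\partial x'_\mu} + \frac{mc}{\hbar}\psi' = 0. \tag{120.3}$$

The derivatives $\partial/\partial x'_\mu$ can be transformed by means of the relation

$$\frac{\partial}{\partial x'_\mu} = \frac{\partial x_\nu}{\partial x'_\mu}\frac{\partial}{\partial x_\nu} = a_{\mu\nu}\frac{\partial}{\partial x_\nu}. \tag{120.4}$$

Making use of (120.4) we transform eq. (120.3) into the form

$$a_{\mu\nu}\gamma_\mu S\frac{\partial\psi}{\partial x_\nu} + \frac{mc}{\hbar}S\psi = 0. \tag{120.5}$$

If there exists a matrix $S^{-1}$ for which the conditions

$$S^{-1}a_{\mu\nu}\gamma_\mu S = \gamma_\nu \quad\text{or}\quad S^{-1}\gamma_\mu S = a_{\mu\nu}\gamma_\nu, \qquad S^{-1}S = 1 \tag{120.6}$$

are fulfilled, then, multiplying eq. (120.5) by the matrix $S^{-1}$, we arrive at
eq. (113.16).

Let us now find the explicit form of the matrix of the linear transforma-
tion S for rotation of the spatial system of coordinates and Lorentz trans-
formations. In the case of the rotation of the system of coordinates in the
$(x_1x_2)$-plane the coefficients $a_{\mu\nu}$ are defined by the relations

$$x'_1 = x_1\cos\varphi + x_2\sin\varphi, \qquad x'_2 = -x_1\sin\varphi + x_2\cos\varphi. \tag{120.7}$$

We shall now show that if the matrix S is chosen in the form

$$S = e^{\frac12\varphi\gamma_1\gamma_2}, \tag{120.8}$$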


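then relations (120.6) are fulfilled. For this we expand the exponential in a
series

$$S = 1 + \frac{\varphi}{2}\gamma_1\gamma_2 + \frac{1}{2!}\left(\frac{\varphi}{2}\right)^2(\gamma_1\gamma_2)^2 + \frac{1}{3!}\left(\frac{\varphi}{2}\right)^3(\gamma_1\gamma_2)^3 + \frac{1}{4!}\left(\frac{\varphi}{2}\right)^4(\gamma_1\gamma_2)^4 + \ldots$$

Further, making use of the expressions

$$(\gamma_1\gamma_2)^2 = \gamma_1\gamma_2\gamma_1\gamma_2 = -\gamma_1^2\gamma_2^2 = -1,$$
$$(\gamma_1\gamma_2)^3 = (\gamma_1\gamma_2)^2\gamma_1\gamma_2 = -\gamma_1\gamma_2,$$
$$(\gamma_1\gamma_2)^4 = (\gamma_1\gamma_2)^2(\gamma_1\gamma_2)^2 = 1,$$

we find

$$S = \left(1 - \frac{\varphi^2}{2^2\,2!} + \frac{\varphi^4}{2^4\,4!} - \ldots\right) + \gamma_1\gamma_2\left(\frac{\varphi}{2} - \frac{\varphi^3}{2^3\,3!} + \ldots\right). \tag{120.9}$$

It is easily seen that the matrix S is equal to

$$S = \cos\tfrac12\varphi + \gamma_1\gamma_2\sin\tfrac12\varphi. \tag{120.10}$$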


Let us now check relations (120.6). The equality $S^{-1}S = 1$ is obvious. Let us
find, for example, the expression

$$S^{-1}\gamma_1S = \left(\cos\tfrac12\varphi - \gamma_1\gamma_2\sin\tfrac12\varphi\right)\gamma_1\left(\cos\tfrac12\varphi + \gamma_1\gamma_2\sin\tfrac12\varphi\right).$$

Using the properties of the γ-matrices and elementary trigonometric formulae,
we find

$$S^{-1}\gamma_1S = \gamma_1\cos\varphi + \gamma_2\sin\varphi = a_{11}\gamma_1 + a_{12}\gamma_2 = a_{1\nu}\gamma_\nu,$$


which is in complete agreement with (120.6). 

We now turn to Lorentz transformations. According to §10 of Part II,
the Lorentz transformation can be treated as a rotation through an imaginary
angle $\varphi = i\chi$ in the $(x_1x_4)$-plane:

$$x'_1 = x_1\cosh\chi - x_0\sinh\chi, \qquad \tanh\chi = v/c;$$

$$x'_0 = -x_1\sinh\chi + x_0\cosh\chi, \qquad \sinh\chi = \frac{v}{c\,(1 - v^2/c^2)^{1/2}}.$$

The matrix S can be found in analogy to (120.8), replacing the angle $\varphi$ by
$i\chi$. Then S assumes the form

$$S = e^{\frac12 i\chi\gamma_1\gamma_4} = \cosh\tfrac12\chi + i\gamma_1\gamma_4\sinh\tfrac12\chi. \tag{120.11}$$
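
Both the rotation matrix (120.10) and the Lorentz matrix (120.11) can be tested against condition (120.6) numerically. The sketch below assumes the representation $\gamma_k = -i\beta\alpha_k$, $\gamma_4 = \beta$; the function `spinor_matrix` and the chosen values of the angle and velocity are illustrative only. A single function serves both cases, since the Lorentz matrix follows from the rotation matrix with the imaginary angle $i\chi$.

```python
import cmath
import math

# Assumption: gamma_k = -i*beta*alpha_k (k = 1, 2, 3), gamma_4 = beta.
Z2 = [[0, 0], [0, 0]]
I2 = [[1, 0], [0, 1]]
SIGMA = [[[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]]

def scale2(s, m):
    return [[s * x for x in row] for row in m]

def block(a, b, c, d):
    """Assemble a 4x4 matrix from the 2x2 blocks [[a, b], [c, d]]."""
    return [a[0] + b[0], a[1] + b[1], c[0] + d[0], c[1] + d[1]]

def mul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def lin(*terms):
    """Linear combination of 4x4 matrices: lin((c1, M1), (c2, M2), ...)."""
    return [[sum(c * M[i][j] for c, M in terms) for j in range(4)]
            for i in range(4)]

def close(A, B, tol=1e-12):
    return all(abs(A[i][j] - B[i][j]) < tol for i in range(4) for j in range(4))

UNIT = block(I2, Z2, Z2, I2)
GAMMA = [block(Z2, scale2(-1j, s), scale2(1j, s), Z2) for s in SIGMA]
GAMMA.append(block(I2, Z2, Z2, scale2(-1, I2)))

def spinor_matrix(mu, nu, angle):
    """S = cos(angle/2) + gamma_mu*gamma_nu*sin(angle/2); angle may be complex."""
    product = mul(GAMMA[mu - 1], GAMMA[nu - 1])
    return lin((cmath.cos(angle / 2), UNIT), (cmath.sin(angle / 2), product))

def transformed(S, S_inv, mu):
    """S^-1 gamma_mu S."""
    return mul(mul(S_inv, GAMMA[mu - 1]), S)

# Spatial rotation through phi in the (x1, x2)-plane, eq. (120.10)
phi = 0.7
S_rot, S_rot_inv = spinor_matrix(1, 2, phi), spinor_matrix(1, 2, -phi)
rot_ok = close(transformed(S_rot, S_rot_inv, 1),
               lin((math.cos(phi), GAMMA[0]), (math.sin(phi), GAMMA[1])))

# Lorentz transformation: imaginary angle i*chi in the (x1, x4)-plane, eq. (120.11)
v_over_c = 0.6
chi = math.atanh(v_over_c)
S_boost, S_boost_inv = spinor_matrix(1, 4, 1j * chi), spinor_matrix(1, 4, -1j * chi)
boost_ok = close(transformed(S_boost, S_boost_inv, 1),
                 lin((math.cosh(chi), GAMMA[0]), (1j * math.sinh(chi), GAMMA[3])))

print(rot_ok, boost_ok)
```

The inverse matrix is obtained simply by reversing the sign of the angle, because $(\gamma_\mu\gamma_\nu)^2 = -1$ for $\mu \neq \nu$.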
Besides the rotation transformation and the Lorentz transformation it is 
necessary to consider the transformation of inversion in the origin. Under 


the inversion of coordinates the spatial coordinates change according to the 
formulae 


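$$x_1 \to -x_1, \qquad x_2 \to -x_2, \qquad x_3 \to -x_3, \qquad x_4 \to x_4. \tag{120.12}$$

We have to require that the equation

$$\sum_{i=1}^{3}\gamma_i\frac{\partial\psi}{\partial x_i} + \gamma_4\frac{\partial\psi}{\partial x_4} + \frac{mc}{\hbar}\psi = 0$$

remain invariant under the replacement (120.12), and that the wave function
undergo the transformation $\psi' = \hat T\psi$. It is easily seen that the requirement of
invariance will be fulfilled if the operator $\hat T$ is of the form

$$\hat T = a\beta, \tag{120.13}$$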


where $a$ is a certain number. Indeed, making use of (120.12) and (120.13) we
obtain

$$-\sum_{i=1}^{3}\gamma_i\,a\beta\frac{\partial\psi}{\partial x_i} + \gamma_4\,a\beta\frac{\partial\psi}{\partial x_4} + \frac{mc}{\hbar}\,a\beta\psi = 0.$$

Multiplying this equation from the left by $\beta$ and dividing it by $a$, we obtain
eq. (113.16).

A double inversion transformation brings the system back into its initial
state, i.e. it corresponds to a rotation through the angle $2\pi$, under which $\psi$ may
change sign. Hence we find the condition imposed upon the quantity $a$:

$$a^2 = \pm 1. \tag{120.14}$$
For what follows we shall need the laws of transformation of the function 
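$$\bar\psi = \psi^{+}\gamma_4. \tag{120.15}$$

They can be introduced if it is noted that the function $\bar\psi$ satisfies the equa-
tion

$$\sum_{i=1}^{3}\frac{\partial\bar\psi}{\partial x_i}\gamma_i + \frac{\partial\bar\psi}{\partial x_4}\gamma_4 - \frac{mc}{\hbar}\bar\psi = 0. \tag{120.16}$$

The requirements of invariance of this equation with respect to the rotation
of the spatial system of coordinates and the Lorentz transformation lead to
the condition

$$\bar\psi' = \bar\psi S^{-1}. \tag{120.17}$$

Under the inversion transformation we find $\bar\psi' = a^*\bar\psi\gamma_4$.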


§121. The laws of transformation of bilinear combinations made up of wave 
functions 


Later, in discussing one of the basic problems of modern physics, the 
problem of the interactions between elementary particles, we shall have to
make use of certain properties of bilinear combinations made up of the wave
function $\psi$ and the function $\bar\psi$ conjugate to it.

As we shall see in what follows, it is necessary for a relativistically invariant 
formulation of the laws of interaction of nuclear particles to know the laws 
of transformation of the bilinear combinations of the quantities mentioned 
under the Lorentz transformation, spatial rotation and inversion. A simple 
calculation shows that from the components of the wave function and the
γ-matrices one can construct certain bilinear combinations which possess the
following transformation properties:

    $\bar\psi\psi$                      one component (scalar),
    $\bar\psi\gamma_\mu\psi$            four components (4-vector),
    $\bar\psi\gamma_\mu\gamma_5\psi$    four components (pseudovector),
    $\bar\psi\gamma_i\gamma_k\psi$      six components, $i \neq k$ (4-tensor of the second rank),
    $\bar\psi\gamma_5\psi$              one component (pseudoscalar).

Here the following notation is introduced:

$$\gamma_5 = \gamma_1\gamma_2\gamma_3\gamma_4 = -\begin{pmatrix} 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \\ 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \end{pmatrix}. \tag{121.1}$$

The quantity $\gamma_5$ has the following properties:

$$\gamma_5^2 = 1, \qquad \gamma_5\gamma_\mu + \gamma_\mu\gamma_5 = 0 \qquad (\mu = 1, 2, 3, 4).$$


The validity of these relations is easily seen by direct check. We turn to
the proof of these transformation properties. When a Lorentz transforma-
tion and a spatial rotation are made, one can write by virtue of (120.2) and
(120.17)

$$\bar\psi'\psi' = \bar\psi S^{-1}S\psi = \bar\psi\psi.$$

Because $\bar\psi'\psi' = a^*\bar\psi\gamma_4\,a\gamma_4\psi = \bar\psi\psi$ the quantity $\bar\psi\psi$ also remains invariant
under reflection of the system of coordinates. Thus we see that the quantity
$\bar\psi\psi$ is invariant with respect to an orthogonal transformation.

Further, we shall show that the four quantities $\bar\psi\gamma_\mu\psi$ transform as the
components of a four-dimensional vector. When the rotation of the system
of coordinates and a Lorentz transformation are carried out we can write

$$\bar\psi'\gamma_\mu\psi' = \bar\psi S^{-1}\gamma_\mu S\psi.$$

In correspondence with formula (120.6) we find

$$\bar\psi'\gamma_\mu\psi' = a_{\mu\nu}\bar\psi\gamma_\nu\psi. \tag{121.2}$$

When inversion of coordinates is carried out we obtain

$$\bar\psi'\gamma_i\psi' = a^*a\,\bar\psi\gamma_4\gamma_i\gamma_4\psi = -\bar\psi\gamma_i\psi \qquad (i \neq 4). \tag{121.3}$$


Thus the quantity $\bar\psi\gamma_i\psi$ changes sign under the inversion. Formulae (121.2)
and (121.3) show that the four components indeed form a four-dimensional
vector.

We shall now show that the quantity $\bar\psi\gamma_5\psi$ represents a pseudoscalar.
Under inversion of coordinates we have

$$\bar\psi'\gamma_5\psi' = a^*a\,\bar\psi\gamma_4\gamma_1\gamma_2\gamma_3\gamma_4\gamma_4\psi.$$

Using the form of the matrix $\gamma_5$ and the commutation property of the
matrices $\gamma_i\gamma_k + \gamma_k\gamma_i = 2\delta_{ik}$, we easily find

$$\bar\psi'\gamma_5\psi' = -\bar\psi\gamma_5\psi,$$

which proves the statement which we have already formulated. The quanti-
ties $\bar\psi\gamma_\mu\gamma_5\psi$ do not change sign under reflection, and transform as the
components of a vector under the rotation and Lorentz transformations.
Consequently, we can state that these quantities are the components of a
four-dimensional axial vector or pseudovector.

We can convince ourselves of the tensor character of the quantities
$\bar\psi\gamma_i\gamma_k\psi$ in an analogous way:

$$\bar\psi'\gamma_i\gamma_k\psi' = \bar\psi S^{-1}\gamma_i\gamma_kS\psi = \bar\psi S^{-1}\gamma_iSS^{-1}\gamma_kS\psi = a_{il}a_{km}\bar\psi\gamma_l\gamma_m\psi,$$

which is the same as the definition of a tensor.
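
The properties of $\gamma_5$ and the behaviour of the bilinear combinations under inversion lend themselves to a direct numerical check. The sketch below again assumes the representation $\gamma_k = -i\beta\alpha_k$, $\gamma_4 = \beta$. The spinor components and the phase $a = i$ are arbitrary illustrative choices, and the helper names are hypothetical.

```python
# Assumption: gamma_k = -i*beta*alpha_k (k = 1, 2, 3), gamma_4 = beta.
Z2 = [[0, 0], [0, 0]]
I2 = [[1, 0], [0, 1]]
SIGMA = [[[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]]

def scale2(s, m):
    return [[s * x for x in row] for row in m]

def block(a, b, c, d):
    """Assemble a 4x4 matrix from the 2x2 blocks [[a, b], [c, d]]."""
    return [a[0] + b[0], a[1] + b[1], c[0] + d[0], c[1] + d[1]]

def mul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def add(A, B):
    return [[A[i][j] + B[i][j] for j in range(4)] for i in range(4)]

def close(A, B, tol=1e-12):
    return all(abs(A[i][j] - B[i][j]) < tol for i in range(4) for j in range(4))

UNIT = block(I2, Z2, Z2, I2)
ZERO = block(Z2, Z2, Z2, Z2)
GAMMA = [block(Z2, scale2(-1j, s), scale2(1j, s), Z2) for s in SIGMA]
GAMMA.append(block(I2, Z2, Z2, scale2(-1, I2)))

GAMMA5 = mul(mul(GAMMA[0], GAMMA[1]), mul(GAMMA[2], GAMMA[3]))
gamma5_squares_to_one = close(mul(GAMMA5, GAMMA5), UNIT)
gamma5_anticommutes = all(close(add(mul(GAMMA5, g), mul(g, GAMMA5)), ZERO)
                          for g in GAMMA)

def bilinear(psi, M):
    """psi-bar M psi, with psi-bar = psi^+ gamma_4."""
    G4M = mul(GAMMA[3], M)
    return sum(psi[i].conjugate() * G4M[i][j] * psi[j]
               for i in range(4) for j in range(4))

def invert(psi, a):
    """Inversion psi' = a*gamma_4*psi with |a| = 1."""
    return [a * sum(GAMMA[3][i][j] * psi[j] for j in range(4)) for i in range(4)]

psi = [0.3 + 0.1j, -0.7j, 1.2 + 0j, 0.4 - 0.5j]   # arbitrary spinor
psi_inv = invert(psi, 1j)

def sign_under_inversion(M):
    """+1 if psi-bar M psi is unchanged by inversion, -1 if it changes sign."""
    before, after = bilinear(psi, M), bilinear(psi_inv, M)
    return 1 if abs(after - before) < 1e-12 else (-1 if abs(after + before) < 1e-12 else 0)

signs = {
    "scalar": sign_under_inversion(UNIT),
    "vector, spatial": sign_under_inversion(GAMMA[0]),
    "vector, fourth": sign_under_inversion(GAMMA[3]),
    "pseudovector, spatial": sign_under_inversion(mul(GAMMA[0], GAMMA5)),
    "pseudoscalar": sign_under_inversion(GAMMA5),
}
print(gamma5_squares_to_one, gamma5_anticommutes, signs)
```

The spatial components of the vector change sign while those of the pseudovector do not, and the scalar and pseudoscalar differ in exactly the same way.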


§122. The concept of weak interactions. Parity non-conservation 


We have already seen that in addition to the electromagnetic interaction 
there is also another form of interaction: the strong interaction between
nucleons. 

It turns out that in addition to the strong interaction there is one more 
form of interaction which is also of non-electromagnetic character and is 
called the weak interaction (see below, § 130). 

Weak interactions, which cannot bind nucleons in the nucleus, play an 
important role in the physics of elementary and nuclear particles. They are 
responsible for the radioactive decay of nuclei with the emission of light 
particles, electrons and neutrinos. In other words, the weak interaction 
between elementary particles leads to β-decay.

The theory of weak interactions has recently achieved considerable
successes. However, a consideration of the relevant problems is possible only
within the framework of the quantum field theory and hence we shall confine
ourselves only to some comments. First of all we note that the Dirac equa-
tion (113.7) can be considered as the equation for a certain electron—positron
field $\psi$. We have already mentioned such a field approach in §112 where we
considered the Klein—Gordon—Fock equation. In the field description par-
ticles are considered as the excitation quanta of the corresponding field (for
example, photons are the excitation quanta of the electromagnetic field (see
§101 and §102)). Then the function $\psi$ should be considered as an operator
in the space of occupation numbers (see formula (99.26) of the theory of
second quantization). Of course, passing to the ‘field’ description we give up
the one-particle interpretation of the Dirac equation. The operator $\psi$ has non-
zero matrix elements corresponding to the absorption of an electron and the
production of a positron, whereas the operator $\psi^{+}$ has non-zero matrix ele-
ments corresponding to the production of an electron and the absorption of a
positron. Such considerations are general and apply also to other particles
(μ-mesons, neutrinos, nucleons and so on).

Let us now consider any process, for example the decay of a μ-meson, with
the emission of a neutrino and an antineutrino

$$\mu \to e + \nu + \bar\nu.$$

We recall that by definition the neutrino is understood to be the particle
emitted in the positron decay of the proton

$$p \to e^{+} + n + \nu,$$

and the antineutrino the particle emitted in the β-decay of the neutron

$$n \to p + e^{-} + \bar\nu.$$

Experimental data available at present show that these particles are not
identical.
The process of decay of the μ-meson involves four particles with spin $\tfrac12$,
four fermions.
For the description of the μ-meson, the electron, and the neutrino we
introduce respectively the operators $\psi_\mu$, $\psi_e$ and $\psi_\nu$, each of which satisfies
the corresponding Dirac equation. The basic problem consists now in
choosing the interaction leading to the decay. For this it is necessary to
formulate the interaction Hamiltonian

$$\hat H_{\mathrm{int}} = \int\hat H'\,dV, \tag{122.1}$$

where $\hat H'$ is the density of the interaction Hamiltonian.

Since the $\psi_i$ are operators in the space of occupation numbers, the
density of the interaction Hamiltonian must contain these operators, as in
non-relativistic physics (§99).

From the structure of expression (122.1) it is seen that the density of the
interaction Hamiltonian $\hat H'$ (we shall sometimes omit the word ‘density’)
must be a relativistic scalar (invariant with respect to the rotation and Lorentz
transformations). Until the mid-fifties there were no doubts as to the exis-
tence of symmetry with respect to ‘the right’ and to ‘the left’, i.e. it was
assumed that the parity conservation law holds for all interactions. Hence it
was assumed that the density $\hat H'$ must also be invariant with respect to the
inversion transformation. The requirement of relativistic invariance strongly
restricts the class of possible expressions for $\hat H'$. Namely, since in the theory
of relativity any interaction has the character of a short-range action, the
values of the characteristics of all particles (the operators $\psi$) must be taken
at one point of space and at one instant of time.
For the process of β-decay involving four fermions

$$A + B \to C + D, \tag{122.2}$$

Fermi proposed the simplest law of interaction in the form

$$\hat H' \sim G(\bar\psi_C\Gamma\psi_A)(\bar\psi_D\Gamma\psi_B) + \text{Herm. conj.}, \tag{122.3}$$

where Herm.conj. denotes the Hermitian conjugate expression, and the value
of all operators $\psi_i$ is taken at one point. The quantity $\Gamma$ can have the follow-
ing forms:

    $\Gamma_S = 1$                     scalar covariant,
    $\Gamma_V = \gamma_\mu$            vector covariant,
    $\Gamma_T = \sigma_{\mu\nu}$       tensor covariant,
    $\Gamma_A = \gamma_\mu\gamma_5$    pseudovector covariant,
    $\Gamma_P = \gamma_5$              pseudoscalar covariant,

where $\sigma_{\mu\nu} = \frac{1}{2i}(\gamma_\mu\gamma_\nu - \gamma_\nu\gamma_\mu)$; $\mu, \nu = 1, 2, 3, 4$, and the summation in (122.3)


is carried out from one to four over repeated vector indices. 

The Hamiltonian density (122.3) does not involve the derivatives of the
operators $\psi$ and $\bar\psi$. This form of the interaction Hamiltonian is called
‘coupling without derivatives’. We shall come back to the problem of the
absence of derivatives in the law of interaction below.

We have, in §121, established the transformation properties of bilinear
combinations of the type $(\bar\psi\Gamma\psi)$. Since the operator $\hat H'$ contains products
in which the quantities $\Gamma$ are involved twice, it is a scalar for all $\Gamma$. Thus, for
example, for the vector covariant of the interaction we have

$$\hat H' = g_V(\bar\psi_C\gamma_\mu\psi_A)(\bar\psi_D\gamma_\mu\psi_B) + \text{Herm. conj.},$$

where the constant $g_V$ is called the coupling constant or the interaction con-
stant of the vector covariant. The vector covariant is constructed as the scalar
product of two four-dimensional vectors (the summation from 1 to 4 is
carried out over $\mu$). The addition of the Hermitian conjugate terms makes the
operator Hermitian.

In the general case the Hamiltonian density represents the sum of all five
types of interaction. The expression written satisfies the requirements of the
theory of relativity and, besides the characteristics of the particles, the
operators $\psi$, contains only the interaction constant and the matrices involved
in the Dirac equation.

We now come back to the process of decay of the μ-meson with the emis-
sion of a neutrino and an antineutrino. Since the operator $\psi_\nu$ describes the
emission of an antineutrino as well as the absorption of a neutrino, the
process of decay of the μ-meson is equivalent to a process with the absorp-
tion of a neutrino

$$\mu + \nu \to e + \nu.$$

Correspondingly $\hat H'$ is of the form

$$\hat H' = \sum_{k=1}^{5} g_k(\bar\psi_e\Gamma_k\psi_\mu)(\bar\psi_\nu\Gamma_k\psi_\nu) + \text{Herm. conj.} \tag{122.4}$$

The use of the Hamiltonian (122.3) led to some success in the construction
of a theory of β-decay. As will be clear from what follows, the Hamiltonian
(122.4) became a basis for working out the modern theory of β-decay.

The further considerable development of the theory was associated with 
the discovery of parity non-conservation in weak interactions. The assump- 
tion of parity non-conservation in weak interactions was made by Lee and 
Yang* on the basis of available data on two types of K-meson decay. 

* T.D.Lee and C.N.Yang, New properties of symmetry of elementary particles, Phys.
Rev. 102 (1956) 290; 104 (1956) 254.

K-mesons represent a group of elementary particles (a positive, a negative
and two neutral ones) having zero spin and a mass of about 966 electron
masses. All K-mesons are unstable and decay with a lifetime of $1.2\times10^{-8}$ sec
for the charged mesons and $10^{-10}$ sec and $6\times10^{-8}$ sec for the two neutral
mesons. It turns out that in addition to the decay into a μ-meson and a
neutrino, K-mesons can decay according to the schemes

$$K^{+} \to \pi^{+} + \pi^{0} \qquad (\theta\text{-decay}),$$

$$K^{+} \to \begin{cases} \pi^{+} + \pi^{+} + \pi^{-} \\ \pi^{+} + \pi^{0} + \pi^{0} \end{cases} \qquad (\tau\text{-decay}).$$


The possibility of decay of the K-meson into two or three π-mesons directly
contradicts the parity conservation law. Indeed, the analysis of the properties
of π-mesons and of their angular distribution shows that the parity of a sys-
tem of two mesons differs from that of a system of three mesons.

Still more definite indications of parity non-conservation were obtained
subsequently in studying the β-decay of polarized nuclei of $^{60}$Co. The
nuclei of $^{60}$Co have a spin $\boldsymbol{\sigma}$ different from zero. This
fact imposes certain requirements upon the angular distribution of the
β-electrons emitted by them. Namely, it follows from the parity conservation
law that the distribution of the electrons must possess symmetry with respect
to the direction of the vector $\boldsymbol{\sigma}$. The number of electrons
emerging at angles $\theta$ and $180^\circ - \theta$ with respect to the
direction of $\boldsymbol{\sigma}$ must be the same. Indeed, if the number of
electrons entering solid angle $d\Omega$ is written in the form

$$dI = F(\theta)\, d\Omega ,$$

where F is a certain function of the angle $\theta$ between the vectors
$\mathbf{p}$ and $\boldsymbol{\sigma}$, then this relation should not be
violated under the inversion transformation.

The vector $\boldsymbol{\sigma}$, being an axial vector, does not change,
whereas the polar vector $\mathbf{p}$ changes sign under the inversion
transformation. Hence the angle $\theta$ transforms under inversion:
$\theta \to 180^\circ - \theta$. Thus the parity conservation law requires the
invariance of the distribution function

$$F(\theta) = F(180^\circ - \theta) .$$


Direct measurements showed that the angular distribution of the β-electrons
emitted by polarized nuclei of $^{60}$Co does not possess the symmetry
mentioned. On the contrary, the electrons emerge preferentially in the
direction opposite to the orientation of the spin of the nucleus. Thus the
β-decay of polarized nuclei demonstrates directly the violation of the parity
conservation law.
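The symmetry test described above can be illustrated with a small numerical sketch. The functional form F(θ) = 1 + a cos θ and the asymmetry parameter `a` are assumptions made only for illustration (the text does not specify F); a negative `a` mimics electrons emerging preferentially against the nuclear spin.

```python
import math

def distribution(theta, a):
    """Assumed illustrative angular distribution F(theta) = 1 + a*cos(theta)."""
    return 1.0 + a * math.cos(theta)

def is_reflection_symmetric(a, samples=90, tol=1e-12):
    """True if F(theta) = F(pi - theta) on a grid of angles (parity-conserving case)."""
    for k in range(samples + 1):
        theta = math.pi * k / samples
        if abs(distribution(theta, a) - distribution(math.pi - theta, a)) > tol:
            return False
    return True

def forward_backward_asymmetry(a, steps=2000):
    """(N_fwd - N_bwd)/(N_fwd + N_bwd), where N is the integral of F sin(theta) d(theta)."""
    h = (math.pi / 2) / steps
    forward = backward = 0.0
    for k in range(steps):
        theta = (k + 0.5) * h                  # midpoint in the forward hemisphere
        mirrored = math.pi - theta             # matching point in the backward hemisphere
        forward += distribution(theta, a) * math.sin(theta) * h
        backward += distribution(mirrored, a) * math.sin(mirrored) * h
    return (forward - backward) / (forward + backward)

if __name__ == "__main__":
    for a in (0.0, -0.4):                      # hypothetical asymmetry parameters
        print(a, is_reflection_symmetric(a), forward_backward_asymmetry(a))
```

For this assumed form the asymmetry equals a/2, so any non-zero `a` signals a distribution that violates F(θ) = F(180° − θ).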

The parity conservation law, as we have seen in §33, is associated with
symmetry properties of space. Violation of parity would mean that space
possesses no symmetry and that the notions of ‘right’ and ‘left’ in it are of
absolute character. Such an interpretation would lead to extremely grave
difficulties in interpreting all the laws of physics. It appeared to be
completely incomprehensible how space could be asymmetric while remaining
homogeneous and isotropic.

A way out of this difficulty was proposed by Landau. According to
Landau’s hypothesis, the particles themselves are asymmetric, not space.
Landau* proposed the principle of combined parity, according to which all
physical laws must remain invariant under combined inversion, i.e. space
inversion with the simultaneous replacement of particles by antiparticles
(so-called charge conjugation). As examples of the latter we can mention the
replacement of electrons by positrons, protons by antiprotons and so on.

Parity non-conservation in weak interactions leads to the fact that the
Hamiltonian H' must no longer necessarily be a scalar with respect to
reflection. Consequently, in the general case the Hamiltonian (122.3) must be
supplemented by introducing into it terms which change sign under the
reflection of coordinates:

$$H' = \sum_{k=1}^{5} \left[ g_k\,(\bar\psi_C \Gamma_k \psi_A)(\bar\psi_D \Gamma_k \psi_B) + g'_k\,(\bar\psi_C \Gamma_k \psi_A)(\bar\psi_D \Gamma_k \gamma_5 \psi_B) \right] + \text{Herm. conj.} \qquad (122.5)$$

The second component of each term of the sum is a pseudoscalar. The
constants $g'_k$, generally speaking, are not the same as the constants $g_k$.
One would think that the increase in the number of constants makes the
interpretation of available experimental data and its comparison with
conclusions from theory difficult. However, as a matter of fact, parity
non-conservation opened new possibilities and led to the formulation of the
universal law of the four-fermion interaction.


§123. Two-component neutrino theory. The universal four-fermion inter- 
action 


The discovery of parity non-conservation in weak interactions made it 
possible to formulate the theory of the longitudinal or two-component 
neutrino**. The theory of the two-component neutrino is based on the as- 
sumption that the mass of the neutrino is not simply small but exactly equal 
to zero. Since the neutrino has spin one-half, it is described by the Dirac 


* L.D.Landau, Soviet Physics JETP 5 (1957) 336. 
** L.D.Landau, Soviet Physics JETP 5 (1957) 336; Nuclear Physics 3 (1957) 127;
A.Salam, Nuovo Cimento 5 (1957) 299. 




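equation, which for $m = 0$ and for states with given momentum $\mathbf{p}$ is
of the form (see (115.3))

$$E u = (\boldsymbol{\alpha}\cdot\mathbf{p})\,u , \qquad E = \pm|\mathbf{p}| . \qquad (123.1)$$

(In this section and further on we use the system of units in which
$\hbar = 1$ and $c = 1$.) We can pass from eq. (123.1) for the four-component
function u to the equation for two-component functions. Setting
$u = \frac{1}{\sqrt{2}}\begin{pmatrix} w \\ w' \end{pmatrix}$ and taking into
account that
$\boldsymbol{\alpha} = \begin{pmatrix} 0 & \boldsymbol{\sigma} \\ \boldsymbol{\sigma} & 0 \end{pmatrix}$,
we rewrite eq. (123.1) in the form

$$E w = (\boldsymbol{\sigma}\cdot\mathbf{p})\,w' , \qquad E w' = (\boldsymbol{\sigma}\cdot\mathbf{p})\,w . \qquad (123.2)$$

Adding and subtracting eqs. (123.2) we obtain

$$E\varphi_+ = (\boldsymbol{\sigma}\cdot\mathbf{p})\,\varphi_+ , \qquad E\varphi_- = -(\boldsymbol{\sigma}\cdot\mathbf{p})\,\varphi_- , \qquad (123.3)$$

where

$$\varphi_+ = \tfrac{1}{2}(w + w') , \qquad \varphi_- = \tfrac{1}{2}(w - w') .$$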


We see that the two-component functions $\varphi_+$ and $\varphi_-$ satisfy
equations of the first order. Of course, if parity were conserved we could not
make use of the superposition of the functions w and w', since these functions
transform differently under the inversion of the system of coordinates.
Indeed, since $\boldsymbol{\sigma}$ is an axial vector and $\mathbf{p}$ is a
polar vector, the product $(\boldsymbol{\sigma}\cdot\mathbf{p})$ is a
pseudoscalar. Then from (123.2) we see that if w transforms under the
reflection as a polar spinor, w' transforms as a pseudospinor, and vice versa.

Choosing the direction of the vector $\mathbf{p}$ to be the z-axis, we obtain
from (123.3)

$$\sigma_z\varphi_+ = \varphi_+ \ \ \text{for } E = |\mathbf{p}| , \qquad \sigma_z\varphi_+ = -\varphi_+ \ \ \text{for } E = -|\mathbf{p}| \qquad (123.4)$$

and

$$\sigma_z\varphi_- = -\varphi_- \ \ \text{for } E = |\mathbf{p}| , \qquad \sigma_z\varphi_- = \varphi_- \ \ \text{for } E = -|\mathbf{p}| . \qquad (123.5)$$

We see that the functions $\varphi_+$ and $\varphi_-$ describe states whose
polarization (the spin component along the z-axis) is unambiguously related to
the sign of the energy. Thus the function $\varphi_-$ describes the state
polarized against the direction of the momentum for $E = |\mathbf{p}|$ and
along the direction of the momentum for $E = -|\mathbf{p}|$, whereas the
function $\varphi_+$ describes the state polarized along the direction of the
momentum for $E = |\mathbf{p}|$ and against the direction of the momentum for
$E = -|\mathbf{p}|$.
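The link between polarization and the sign of the energy can be checked numerically for an arbitrary momentum. The following is a minimal sketch: the function names and the sample momentum are hypothetical, and the spinors are simply the two eigenvectors of σ·p.

```python
import math

def sigma_dot_p(p):
    """Return the 2x2 matrix sigma.p for a momentum p = (px, py, pz)."""
    px, py, pz = p
    return [[pz, px - 1j * py], [px + 1j * py, -pz]]

def apply(matrix, spinor):
    """Multiply a 2x2 matrix by a two-component spinor."""
    return [matrix[0][0] * spinor[0] + matrix[0][1] * spinor[1],
            matrix[1][0] * spinor[0] + matrix[1][1] * spinor[1]]

def polarized_spinors(p):
    """Unnormalized spinors polarized along and against p (valid unless p points along -z)."""
    px, py, pz = p
    norm = math.sqrt(px * px + py * py + pz * pz)
    along = [norm + pz, px + 1j * py]          # eigenvalue +|p| of sigma.p
    against = [-(px - 1j * py), norm + pz]     # eigenvalue -|p| of sigma.p
    return along, against

def helicity(p, spinor):
    """Expectation value of sigma.p/|p| in the given spinor."""
    norm = math.sqrt(sum(c * c for c in p))
    image = apply(sigma_dot_p(p), spinor)
    overlap = sum(complex(s).conjugate() * t for s, t in zip(spinor, image))
    weight = sum(abs(s) ** 2 for s in spinor)
    return (overlap / weight).real / norm

def energy_over_momentum(p, spinor, kind):
    """E/|p| from E*phi = kind*(sigma.p)*phi: kind=+1 for phi_+, kind=-1 for phi_-."""
    return kind * helicity(p, spinor)

if __name__ == "__main__":
    p = (1.0, 2.0, 2.0)                        # hypothetical momentum, |p| = 3
    along, against = polarized_spinors(p)
    print("phi_- polarized against p: E/|p| =", energy_over_momentum(p, against, -1))
    print("phi_- polarized along p:   E/|p| =", energy_over_momentum(p, along, -1))
```

With `kind = -1` the state polarized against the momentum has positive energy and the state polarized along it has negative energy, which is the pattern stated for the function that satisfies the second of equations (123.3).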







In the theory of the two-component neutrino it is assumed that the
neutrino ($E = |\mathbf{p}|$) and antineutrino ($E = -|\mathbf{p}|$) are
described by the function $\varphi_-$, i.e. that the neutrino is always
polarized against the direction of the momentum, and the antineutrino always
along the direction of the momentum. Of course, it might equally well be
assumed that the neutrino and antineutrino are described by the function
$\varphi_+$, but this would lead to conclusions which are in disagreement with
experimental data. If the energy of the antineutrino is also assumed to be
positive, $E = |\mathbf{p}|$, then the antineutrino will be described by the
function $\varphi_+$ (see the first equation of (123.4) and the second
equation of (123.5)).

Under the inversion of the system of coordinates the axial vector
$\boldsymbol{\sigma}$ does not change whereas the polar vector $\mathbf{p}$
reverses its direction. Consequently, the neutrino then goes over into the
antineutrino and vice versa, in accordance with the ideas put forward by
Landau (combined parity conservation).

The entire conclusion presented is based on the assumption that the mass 
of the neutrino is exactly equal to zero. This also follows immediately from 
the following obvious considerations. If the mass of the neutrino were not 
equal to zero, then it would move with a velocity less than the velocity of 
light. There would then exist an inertial system of coordinates, moving with 
respect to the laboratory system of coordinates, with a velocity larger than 
the velocity of the neutrino, in which the direction of the momentum of the 
neutrino would be reversed. Since the direction of the spin does not change 
under such a transformation, we would have in one inertial system of coor- 
dinates a neutrino, and in the other an antineutrino, i.e. we would arrive at a 
contradiction, since the neutrino and antineutrino by assumption are not 
identical. 

The theory of the two-component neutrino can easily be formulated
within the framework of the usual mathematical apparatus, i.e. by means of
four-component functions. Namely, it is easily seen (see (113.16)) that if the
bispinor $\psi$ is the solution of the Dirac equation with a rest mass equal
to zero

$$\gamma_\mu \frac{\partial\psi}{\partial x_\mu} = 0 ,$$

then the functions $\psi_+$ and $\psi_-$ will also be solutions of this
equation

$$\psi_+ = \tfrac{1}{2}(1 - \gamma_5)\,\psi , \qquad \psi_- = \tfrac{1}{2}(1 + \gamma_5)\,\psi , \qquad (123.6)$$

or respectively for states with a definite momentum

$$u_+ = \tfrac{1}{2}(1 - \gamma_5)\,u , \qquad u_- = \tfrac{1}{2}(1 + \gamma_5)\,u , \qquad (123.7)$$

where u satisfies eq. (123.1).

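It is easily seen that the functions $u_+$ and $u_-$ are expressed in terms of
$\varphi_+$ and $\varphi_-$. Indeed, setting
$u = \frac{1}{\sqrt{2}}\begin{pmatrix} w \\ w' \end{pmatrix}$ and taking into
account that $\gamma_5 = -\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$, where
the four-by-four matrix is written in terms of two-by-two matrices, we obtain

$$u_+ = \frac{1}{2\sqrt{2}}\begin{pmatrix} w + w' \\ w + w' \end{pmatrix} = \frac{1}{\sqrt{2}}\begin{pmatrix} \varphi_+ \\ \varphi_+ \end{pmatrix} , \qquad u_- = \frac{1}{2\sqrt{2}}\begin{pmatrix} w - w' \\ -(w - w') \end{pmatrix} = \frac{1}{\sqrt{2}}\begin{pmatrix} \varphi_- \\ -\varphi_- \end{pmatrix} . \qquad (123.8)$$

The function $\psi_-$,

$$\psi_- = u_-\, e^{i(\mathbf{p}\cdot\mathbf{r} - Et)} , \qquad (123.9)$$

describes the neutrino for $E = |\mathbf{p}|$. It is the eigenfunction of the
spin component operator $\Sigma_z$,

$$\Sigma_z = \begin{pmatrix} \sigma_z & 0 \\ 0 & \sigma_z \end{pmatrix} ,$$

corresponding to the eigenvalue −1,

$$\Sigma_z u_- = -u_- \quad \text{for } E = |\mathbf{p}| \ \text{(neutrino)} .$$

The antineutrino ($E = |\mathbf{p}|$) is described by the functions $u_+$ and
$\psi_+$ respectively. Under the action of the operator $\Sigma_z$ we have

$$\Sigma_z u_+ = u_+ \quad \text{(antineutrino)} .$$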


We also note that the functions $\psi_+$ and $\psi_-$ are eigenfunctions of
the operator $\gamma_5$. Indeed, since $\gamma_5^2 = 1$, it follows from
(123.6) that

$$\gamma_5\psi_+ = -\psi_+ , \qquad \gamma_5\psi_- = \psi_- . \qquad (123.10)$$
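The eigenvalue relations (123.10) rest only on the fact that the combinations ½(1 ∓ γ₅) are complementary projection operators. A minimal numerical sketch, assuming the block form γ₅ = −[[0, 1], [1, 0]] for the representation used in this section (the helper names are hypothetical):

```python
def matmul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def combine(a, b, ca, cb):
    """Return the matrix ca*a + cb*b."""
    n = len(a)
    return [[ca * a[i][j] + cb * b[i][j] for j in range(n)] for i in range(n)]

def equal(a, b, tol=1e-12):
    """Entry-wise comparison of two matrices."""
    n = len(a)
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(n) for j in range(n))

IDENTITY = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
ZERO = [[0.0] * 4 for _ in range(4)]
# Assumed representation: gamma_5 = -[[0, 1], [1, 0]] written in 2x2 blocks.
GAMMA5 = [[0.0, 0.0, -1.0, 0.0],
          [0.0, 0.0, 0.0, -1.0],
          [-1.0, 0.0, 0.0, 0.0],
          [0.0, -1.0, 0.0, 0.0]]

P_PLUS = combine(IDENTITY, GAMMA5, 0.5, -0.5)   # psi -> psi_+ = (1 - gamma_5) psi / 2
P_MINUS = combine(IDENTITY, GAMMA5, 0.5, 0.5)   # psi -> psi_- = (1 + gamma_5) psi / 2

checks = {
    "gamma5 squared is 1": equal(matmul(GAMMA5, GAMMA5), IDENTITY),
    "P_plus is a projector": equal(matmul(P_PLUS, P_PLUS), P_PLUS),
    "P_minus is a projector": equal(matmul(P_MINUS, P_MINUS), P_MINUS),
    "projectors are orthogonal": equal(matmul(P_PLUS, P_MINUS), ZERO),
    "projectors sum to 1": equal(combine(P_PLUS, P_MINUS, 1.0, 1.0), IDENTITY),
    "gamma5 psi_+ = -psi_+": equal(matmul(GAMMA5, P_PLUS), combine(P_PLUS, ZERO, -1.0, 0.0)),
    "gamma5 psi_- = +psi_-": equal(matmul(GAMMA5, P_MINUS), P_MINUS),
}

if __name__ == "__main__":
    for name, ok in checks.items():
        print(name, ok)
```

Because the checks act on the projectors themselves, they hold for any bispinor to which the projectors are applied.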


The operator $\gamma_5$ is called the helicity operator. To the eigenvalue
$\gamma_5 = +1$ there corresponds left-handed helicity, while to the eigenvalue
$\gamma_5 = -1$ there corresponds right-handed helicity. The above results can
be given an obvious interpretation in terms of the helicity operator: there is
a strong correlation between the direction of the momentum vector and the
direction of the spin vector of a particle. For the neutrino the spin
$\boldsymbol{\sigma}$ is antiparallel to $\mathbf{p}$ ($\gamma_5 = 1$),
whereas for the antineutrino it is parallel ($\gamma_5 = -1$). If the spin is
represented in an obvious way as the rotation of the particle, then the
neutrino rotates as a left-handed helix about the axis $\mathbf{p}$
(fig. V.29). Under the space inversion the direction of $\mathbf{p}$ reverses,
whereas the vector $\boldsymbol{\sigma}$ remains unchanged. There is no
neutrino having a spin of ‘irregular’ orientation. Hence under the reflection
of space coordinates it is necessary to allow for the transformation of a
neutrino into an antineutrino, in correspondence with the principle of
combined parity.

Fig. V.29

Gell-Mann and Feynman* put forward the hypothesis that the property
of helicity is of a general character and is a characteristic of all fermions
and not only of the neutrino. According to this hypothesis the transformation
(123.6) should hold for all four fermions involved in the process of weak
interaction (122.2). This means that in the general expression (122.5) the
operators $\psi_C$, $\psi_A$, $\psi_D$ and $\psi_B$ should be replaced
respectively by the operators

$$\chi_C = \tfrac{1}{\sqrt{2}}(1 + \gamma_5)\,\psi_C , \qquad \chi_A = \tfrac{1}{\sqrt{2}}(1 + \gamma_5)\,\psi_A ,$$

$$\chi_D = \tfrac{1}{\sqrt{2}}(1 + \gamma_5)\,\psi_D , \qquad \chi_B = \tfrac{1}{\sqrt{2}}(1 + \gamma_5)\,\psi_B .$$


Let us elucidate this assumption. We note, first of all, that the operators
$\chi$ are actually two-component operators. They are expressed analogously to
(123.8) in terms of the spinors w and w' involved in $\psi$:

$$\chi = \frac{1}{2}\begin{pmatrix} w - w' \\ -(w - w') \end{pmatrix} .$$

* R.Feynman and M.Gell-Mann, Phys. Rev. 109 (1958) 193.
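Since $\chi = \frac{1}{\sqrt{2}}(1 + \gamma_5)\,\psi$, then, acting on both
sides of the equation with the operator

$$-\frac{1}{\sqrt{2}\,m}\left[\gamma_\mu\left(\frac{\partial}{\partial x_\mu} - ieA_\mu\right) - m\right]$$

and taking into account the Dirac equation

$$\left[\gamma_\mu\left(\frac{\partial}{\partial x_\mu} - ieA_\mu\right) + m\right]\psi = 0 ,$$

we obtain

$$-\frac{1}{\sqrt{2}\,m}\left[\gamma_\mu\left(\frac{\partial}{\partial x_\mu} - ieA_\mu\right) - m\right]\chi = \psi . \qquad (123.11)$$

Substituting $\psi$ in the form of (123.11) into the Dirac equation we obtain
the equation of second order satisfied by the operator $\chi$

$$\left[\left(\frac{\partial}{\partial x_\mu} - ieA_\mu\right)^2 + \frac{e}{2}\,\sigma_{\mu\nu}F_{\mu\nu} - m^2\right]\chi = 0 , \qquad (123.12)$$

where

$$\sigma_{\mu\nu} = \frac{1}{2i}\,(\gamma_\mu\gamma_\nu - \gamma_\nu\gamma_\mu)$$

and

$$F_{\mu\nu} = \frac{\partial A_\nu}{\partial x_\mu} - \frac{\partial A_\mu}{\partial x_\nu} .$$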


Feynman and Gell-Mann assumed that the operator $\chi$ is more fundamental
than the operator $\psi$ and thus that the interaction Hamiltonian (122.3)
should not contain the derivatives of the operator $\chi$. Hence, in view of
(123.11) the interaction Hamiltonian should involve the operator $\chi$ and
not $\psi$.

Such an assumption leads immediately to the fact that of all the co- 
variants of interaction only the vector covariant and the axial-vector covariant 
(with the same constants) turn out to be possible, whereas all other covariants 
give zero. 

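Let us show, for example, that the pseudoscalar covariant reduces to zero.

$$H' = \tfrac{1}{4}\,g_5\,\big(\overline{(1+\gamma_5)\psi_C}\;\gamma_5(1+\gamma_5)\psi_A\big)\big(\overline{(1+\gamma_5)\psi_D}\;\gamma_5(1+\gamma_5)\psi_B\big)$$
$$\qquad + \tfrac{1}{4}\,g'_5\,\big(\overline{(1+\gamma_5)\psi_C}\;\gamma_5(1+\gamma_5)\psi_A\big)\big(\overline{(1+\gamma_5)\psi_D}\;\gamma_5\gamma_5(1+\gamma_5)\psi_B\big) + \text{Herm. conj.}$$

But

$$\gamma_5(1 + \gamma_5) = 1 + \gamma_5 .$$

Consequently the two terms are identical. Further,

$$\overline{(1+\gamma_5)\psi} = [(1+\gamma_5)\psi]^\dagger\gamma_4 = \psi^\dagger(1+\gamma_5)\gamma_4 = \bar\psi\,(1 - \gamma_5) .$$

Hence the following terms appear in the parentheses:

$$(1 - \gamma_5)(1 + \gamma_5) = 1 - \gamma_5^2 = 0 .$$

The scalar and tensor covariants also give zero.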
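Taking into account that

$$\gamma_5(1 + \gamma_5) = (1 + \gamma_5) , \qquad (1 + \gamma_5)^2 = 2(1 + \gamma_5) ,$$

we have for the vector and axial-vector covariants

$$H' = \tfrac{1}{4}(g_2 + g'_2)\,[\bar\psi_C(1-\gamma_5)\gamma_\lambda(1+\gamma_5)\psi_A]\,[\bar\psi_D(1-\gamma_5)\gamma_\lambda(1+\gamma_5)\psi_B]$$
$$\qquad + \tfrac{1}{4}(g_4 + g'_4)\,[\bar\psi_C(1-\gamma_5)\gamma_\lambda\gamma_5(1+\gamma_5)\psi_A] \times [\bar\psi_D(1-\gamma_5)\gamma_\lambda\gamma_5(1+\gamma_5)\psi_B] + \text{Herm. conj.}$$
$$\quad = \tfrac{1}{4}(g_2 + g'_2 + g_4 + g'_4) \times [\bar\psi_C\gamma_\lambda\,2(1+\gamma_5)\psi_A]\,[\bar\psi_D\gamma_\lambda\,2(1+\gamma_5)\psi_B] + \text{Herm. conj.}$$
$$\quad = f\,(\bar\psi_C\gamma_\lambda(1+\gamma_5)\psi_A)(\bar\psi_D\gamma_\lambda(1+\gamma_5)\psi_B) + \text{Herm. conj.} \qquad (123.13)$$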
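The statement that only the vector and axial-vector covariants survive can be verified by direct matrix multiplication. The sketch below assumes the representation γ_k = −iβα_k (k = 1, 2, 3), γ₄ = β, γ₅ = γ₁γ₂γ₃γ₄, which is the convention suggested by the use of x₄ = it in this text; the helper names are hypothetical.

```python
def matmul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def combine(a, b, ca, cb):
    """Return the matrix ca*a + cb*b."""
    n = len(a)
    return [[ca * a[i][j] + cb * b[i][j] for j in range(n)] for i in range(n)]

def is_zero(a, tol=1e-12):
    """True if every entry of the matrix vanishes within the tolerance."""
    return all(abs(x) < tol for row in a for x in row)

def block(tl, tr, bl, br):
    """Assemble a 4x4 matrix from four 2x2 blocks."""
    return [tl[0] + tr[0], tl[1] + tr[1], bl[0] + br[0], bl[1] + br[1]]

I2, Z2, MINUS_I2 = [[1, 0], [0, 1]], [[0, 0], [0, 0]], [[-1, 0], [0, -1]]
PAULI = [[[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]]

ONE = block(I2, Z2, Z2, I2)
BETA = block(I2, Z2, Z2, MINUS_I2)
ALPHA = [block(Z2, s, s, Z2) for s in PAULI]
# Assumed convention: gamma_k = -i beta alpha_k, gamma_4 = beta.
GAMMA = [combine(matmul(BETA, a), ONE, -1j, 0) for a in ALPHA] + [BETA]
GAMMA5 = matmul(matmul(GAMMA[0], GAMMA[1]), matmul(GAMMA[2], GAMMA[3]))

LEFT = combine(ONE, GAMMA5, 1, -1)    # (1 - gamma_5), coming from the conjugated operator
RIGHT = combine(ONE, GAMMA5, 1, 1)    # (1 + gamma_5)

def sandwich(middle):
    """Return (1 - gamma_5) * middle * (1 + gamma_5)."""
    return matmul(matmul(LEFT, middle), RIGHT)

def covariant_report():
    """Check which covariants vanish and which reduce to 2*gamma_mu*(1 + gamma_5)."""
    report = {"scalar vanishes": is_zero(sandwich(ONE)),
              "pseudoscalar vanishes": is_zero(sandwich(GAMMA5))}
    tensor_ok, vector_ok, axial_ok = True, True, True
    for mu in range(4):
        target = combine(matmul(GAMMA[mu], RIGHT), ONE, 2, 0)
        vector_ok &= is_zero(combine(sandwich(GAMMA[mu]), target, 1, -1))
        axial_ok &= is_zero(combine(sandwich(matmul(GAMMA[mu], GAMMA5)), target, 1, -1))
        for nu in range(mu + 1, 4):
            tensor_ok &= is_zero(sandwich(matmul(GAMMA[mu], GAMMA[nu])))
    report["tensor vanishes"] = tensor_ok
    report["vector gives 2 gamma (1 + gamma5)"] = vector_ok
    report["axial gives 2 gamma (1 + gamma5)"] = axial_ok
    return report

if __name__ == "__main__":
    for name, ok in covariant_report().items():
        print(name, ok)
```

The vector and axial-vector sandwiches give the same operator, which is why the corresponding constants enter the final Hamiltonian only through their sum.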


On the basis of a careful analysis of experimental data Sudarshan and 
Marshak arrived at exactly the same form of the Hamiltonian for the four- 
fermion interaction. 

The interaction Hamiltonian (123.13) gives the universal law of the
four-fermion interaction with only one coupling constant f. In analyzing
concrete processes by means of the Hamiltonian (123.13) it is also necessary
to take into account the so-called leptonic charge conservation law. Leptons
are light particles taking part in weak interaction processes, namely:
electrons $e^-$, $\mu^-$-mesons and the neutrino $\nu$. The particles $e^+$,
$\mu^+$ and $\bar\nu$ are called antileptons. Leptons are assigned the
leptonic charge +1, and antileptons −1. For other particles, for example
nucleons, the leptonic charge is assumed to be equal to zero. The total
leptonic charge (the algebraic sum of the leptonic charges) must be conserved
in the reaction (see ch. 15).





Let us consider, for example, the decay of the $\mu^-$-meson

$$\mu^- \to e^- + \nu + \bar\nu . \qquad (123.14)$$


A decay with the emission of two neutrinos (or antineutrinos) is evidently 
forbidden by the leptonic charge conservation law. 
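The bookkeeping behind this selection rule is simple enough to sketch in a few lines. The particle labels and function names below are hypothetical, the charge assignments are those stated in the text, and the check covers leptonic charge only (not electric charge, energy or any other conservation law).

```python
# Leptonic charges as assigned in the text: leptons +1, antileptons -1, nucleons 0.
LEPTONIC_CHARGE = {
    "e-": 1, "mu-": 1, "nu": 1,
    "e+": -1, "mu+": -1, "anti-nu": -1,
    "n": 0, "p": 0,
}

def total_leptonic_charge(particles):
    """Algebraic sum of the leptonic charges of a list of particle labels."""
    return sum(LEPTONIC_CHARGE[name] for name in particles)

def conserves_leptonic_charge(initial, final):
    """True if the total leptonic charge is the same before and after the reaction."""
    return total_leptonic_charge(initial) == total_leptonic_charge(final)

if __name__ == "__main__":
    print(conserves_leptonic_charge(["mu-"], ["e-", "nu", "anti-nu"]))   # decay (123.14)
    print(conserves_leptonic_charge(["mu-"], ["e-", "nu", "nu"]))        # two neutrinos
    print(conserves_leptonic_charge(["n"], ["p", "e-", "anti-nu"]))      # neutron beta-decay
```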

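The interaction Hamiltonian for the process (123.14), in correspondence with
(123.13), is of the form

$$H' = f\,(\bar\psi_e\gamma_\lambda(1+\gamma_5)\psi_\nu)(\bar\psi_\nu\gamma_\lambda(1+\gamma_5)\psi_\mu) + \text{Herm. conj.} \qquad (123.15)$$

For the process of β-decay of the neutron we have, correspondingly,

$$H' = f\,(\bar\psi_e\gamma_\lambda(1+\gamma_5)\psi_\nu)(\bar\psi_p\gamma_\lambda(1+\gamma_5)\psi_n) + \text{Herm. conj.} \qquad (123.16)$$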


Knowing the interaction Hamiltonian it is easy to determine the proba- 
bility of the corresponding process by ordinary methods of perturbation 
theory. The universal law of the four-fermion interaction, proposed by Gell- 
Mann and Feynman and by Sudarshan and Marshak is quantitatively con- 
firmed by a vast amount of experimental data. 





14 








Some Problems 
of Quantum Electrodynamics 


§ 124. The Green’s function of the Dirac equation 


The theory of the interaction of non-relativistic charged particles with the 
electromagnetic field, presented in ch. 12, is easily generalized to the case of 
relativistic particles. However, calculations of higher approximations of
perturbation theory (expansion in powers of $e^2/\hbar c$) led to diverging expressions
whose physical meaning was not clear. Thus, for example, the intrinsic energy 
of the electron turned out to be infinite, as in classical electrodynamics. 
Corrections to the scattering cross sections calculated in the second order 
approximation of perturbation theory also turned out to be infinitely large, 
and so on. All this pointed to a limited region of applicability of the mathe- 
matical apparatus of quantum electrodynamics. At the same time good agree- 
ment with experimental data on cross sections for different processes cal- 
culated in the first non-vanishing approximation of perturbation theory 
indicated the validity of the general ideas and methods of the theory. 

An increase in the accuracy of experimental methods of investigation led
recently to the establishment of new facts which had no explanation in
quantum electrodynamics. Namely, in 1947, in addition to the discovery by
Lamb of the shift of the $2^2S_{1/2}$ and $2^2P_{1/2}$ levels of the hydrogen
atom which, according to the Dirac theory, should coincide, Rabi established
that the value of the magnetic moment of the electron differs somewhat from a
Bohr magneton. The discovery of these phenomena led to a further intense
development of quantum electrodynamics. Very important roles in the
development of the theory were played by the studies of Bethe, Feynman, Dyson,
Schwinger, Tomonaga and others*. In particular, Feynman proposed a new
method of calculation which made it possible to simplify considerably all the
calculations and also to give them an obvious physical meaning**. Within the
framework of this book we can present only the most general outlines of
Feynman’s method. A detailed exposition of Feynman’s method as well as
numerous examples of its application to concrete problems can be found in
the articles and monographs cited below.

The method of Green’s function is the basis of the mathematical apparatus
of Feynman’s theory. We now turn directly to the exposition of the method
proposed by Feynman. First of all we write the Dirac equation in a form
which is more compact and convenient for these calculations. For this we
introduce the operator

$$\hat\nabla = \gamma_\mu \frac{\partial}{\partial x_\mu} ,$$

where $x_4 = ix_0 = it$. In this notation the Dirac equation has the form

$$(\hat\nabla + m)\,\psi = 0 . \qquad (124.1)$$


Analogously to what we did in non-relativistic theory for the Schrödinger
equation (see §29) we introduce the Green’s function K(2,1) of the Dirac
equation (124.1). The Green’s function K(2,1) by definition satisfies the
equation

$$(\hat\nabla_2 + m)\,K(2,1) = -i\,\delta^4(2,1) . \qquad (124.2)$$

Here and in what follows the numbers 1 and 2 denote the set of four
coordinates $x_1$ and $x_2$, and $\hat\nabla_2$ is the operator acting on the
variables $x_2$. The symbol $\delta^4(2,1)$ denotes a 4-dimensional δ-function
equal to

$$\delta^4(2,1) = \delta^4(x_2 - x_1) = \delta(\mathbf{r}_2 - \mathbf{r}_1)\,\delta(t_2 - t_1) . \qquad (124.3)$$

* A detailed bibliography is given in the book of S.Schweber, H.Bethe and F.de
Hoffman, Mesons and fields (Row, Peterson and Company, Evanston, Illinois and White
Plains, New York, 1956). For a detailed exposition of quantum electrodynamics see also
A.I.Akhiezer and V.B.Berestetskii, Quantum electrodynamics (Interscience Publ., New
York, 1965).

** R.P.Feynman, Phys. Rev. 76 (1949) 749. See also the monographs cited above.

† We use a notation somewhat different from that introduced by Feynman. A similar
notation is adopted, for example, in the book of A.I.Akhiezer and V.B.Berestetskii. We
also note (as on p. 510) that in this chapter we assume that $\hbar = 1$ and $c = 1$.


We seek the solution of eq. (124.2) in the momentum representation. In
other words, we expand the function K(2,1) in the Fourier integral

$$K(2,1) = \int_{-\infty}^{\infty} S(p)\,\exp[\,ip_\mu(x_{2\mu} - x_{1\mu})]\,d^4p , \qquad (124.4)$$

where

$$d^4p = d^3p\,dp_0 = dp_x\,dp_y\,dp_z\,dp_0 = dp_1\,dp_2\,dp_3\,dp_0 ,$$

$p_\mu$ is the 4-dimensional momentum vector, and $p_4 = ip_0$. The summation
from 1 to 4 is carried out over the index μ. In order not to overload the
formulae with indices, we shall in what follows omit the index μ if this
cannot lead to misunderstanding. Also expanding $\delta^4(2,1)$ in a Fourier
integral according to the formula (see Appendix III in Vol. 1)

$$\delta^4(2,1) = \frac{1}{(2\pi)^4}\int_{-\infty}^{\infty} e^{ip(x_2 - x_1)}\,d^4p \qquad (124.5)$$


and substituting expressions (124.4) and (124.5) into (124.2), we find

$$(\hat\nabla_2 + m)\int S(p)\,e^{ip(x_2 - x_1)}\,d^4p = \frac{1}{(2\pi)^4 i}\int e^{ip(x_2 - x_1)}\,d^4p . \qquad (124.6)$$

The action of the operator $(\hat\nabla_2 + m)$ gives

$$(\hat\nabla_2 + m)\,e^{ip(x_2 - x_1)} = (i\hat p + m)\,e^{ip(x_2 - x_1)} , \qquad (124.7)$$

where $\hat p = \gamma_\mu p_\mu$. (We stress that in this chapter the mark ^
has a meaning different from that in the preceding chapters of the book.)

Equating the Fourier components in (124.6) and taking into account
(124.7), we can write the formal solution for S(p)

$$S(p) = \frac{1}{(2\pi)^4 i}\,\frac{1}{i\hat p + m} = \frac{1}{(2\pi)^4 i}\,\frac{i\hat p - m}{(i\hat p + m)(i\hat p - m)} = \frac{i}{(2\pi)^4}\,\frac{i\hat p - m}{p^2 + m^2} , \qquad (124.8)$$

where

$$\hat p^2 = \sum_{\mu,\nu}\gamma_\mu\gamma_\nu\,p_\mu p_\nu = \frac{1}{2}\sum_{\mu,\nu}(\gamma_\mu\gamma_\nu + \gamma_\nu\gamma_\mu)\,p_\mu p_\nu = \sum_\mu p_\mu p_\mu = p^2 .$$

We have made use of the anticommutativity of the matrices γ. For K(2,1)
we have correspondingly

$$K(2,1) = \frac{i}{(2\pi)^4}\int \frac{i\hat p - m}{p^2 + m^2}\,e^{ip(x_2 - x_1)}\,d^4p . \qquad (124.9)$$
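The matrix algebra used to rationalize the denominator, namely that the product of (ip̂ + m) and (ip̂ − m) equals −(p² + m²), can be confirmed numerically. The sketch below assumes the representation γ_k = −iβα_k, γ₄ = β together with p₄ = ip₀; the sample momentum, energy and mass are arbitrary hypothetical values.

```python
def matmul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def combine(a, b, ca, cb):
    """Return the matrix ca*a + cb*b."""
    n = len(a)
    return [[ca * a[i][j] + cb * b[i][j] for j in range(n)] for i in range(n)]

def is_zero(a, tol=1e-12):
    """True if every entry of the matrix vanishes within the tolerance."""
    return all(abs(x) < tol for row in a for x in row)

def block(tl, tr, bl, br):
    """Assemble a 4x4 matrix from four 2x2 blocks."""
    return [tl[0] + tr[0], tl[1] + tr[1], bl[0] + br[0], bl[1] + br[1]]

I2, Z2, MINUS_I2 = [[1, 0], [0, 1]], [[0, 0], [0, 0]], [[-1, 0], [0, -1]]
PAULI = [[[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]]
ONE = block(I2, Z2, Z2, I2)
BETA = block(I2, Z2, Z2, MINUS_I2)
ALPHA = [block(Z2, s, s, Z2) for s in PAULI]
# Assumed convention: gamma_k = -i beta alpha_k, gamma_4 = beta.
GAMMA = [combine(matmul(BETA, a), ONE, -1j, 0) for a in ALPHA] + [BETA]

def slash(p_vec, p0):
    """p-hat = gamma_mu p_mu with the fourth component p_4 = i p_0."""
    components = list(p_vec) + [1j * p0]
    total = [[0j] * 4 for _ in range(4)]
    for gamma, component in zip(GAMMA, components):
        total = combine(total, gamma, 1, component)
    return total

def check_propagator_algebra(p_vec, p0, m):
    """Verify (i p-hat + m)(i p-hat - m) = -(p^2 + m^2) and the resulting inverse."""
    p_hat = slash(p_vec, p0)
    plus = combine(p_hat, ONE, 1j, m)                     # i p-hat + m
    minus = combine(p_hat, ONE, 1j, -m)                   # i p-hat - m
    p_squared = sum(c * c for c in p_vec) - p0 * p0       # p_mu p_mu
    product_ok = is_zero(combine(matmul(plus, minus), ONE, 1, p_squared + m * m))
    inverse = combine(minus, ONE, -1.0 / (p_squared + m * m), 0)
    inverse_ok = is_zero(combine(matmul(plus, inverse), ONE, 1, -1))
    return product_ok, inverse_ok

if __name__ == "__main__":
    print(check_propagator_algebra((0.3, -1.2, 0.5), 0.7, 1.0))
```

The second check confirms that −(ip̂ − m)/(p² + m²) is the inverse of (ip̂ + m); multiplying it by the prefactor 1/((2π)⁴ i) gives the last form of S(p).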


This expression is conveniently written in the form

$$K(2,1) = i(\hat\nabla_2 - m)\,I(2,1) , \qquad (124.10)$$

where I(2,1) is an integral which depends only on ordinary variables but not
on the Dirac matrices and is equal to

$$I(2,1) = \frac{1}{(2\pi)^4}\int \frac{e^{ip(x_2 - x_1)}}{p^2 + m^2}\,d^4p . \qquad (124.11)$$

We carry out the integration over the variable $p_0$:

$$I(2,1) = \frac{1}{(2\pi)^4}\int e^{i\mathbf{p}\cdot(\mathbf{r}_2 - \mathbf{r}_1)}\,d^3p \int_{-\infty}^{\infty}\frac{e^{-ip_0(t_2 - t_1)}}{E_p^2 - p_0^2}\,dp_0 ,$$

where $E_p = +\sqrt{\mathbf{p}^2 + m^2}$. If $p_0$ is considered as a certain
complex variable, then the integration in the plane of this complex variable
is carried out over the entire real axis. However, the integrand has poles on
this axis at the points $p_0 = E_p$ and $p_0 = -E_p$. Consequently, for the
integral (124.11) to have a definite meaning it is necessary to define the
rule of circumventing these poles. Feynman proposed the following rule: the
left pole is circumvented from below, and the right pole from above. To carry
this out one has to add to the mass m an infinitesimal negative imaginary part
which in the final result should be made to tend to zero:
$m \to m - i\delta$, $\delta > 0$.

As a matter of fact, $E_p$ then also receives an infinitesimal negative
imaginary part, and correspondingly the poles of the integrand are situated as
shown in fig. V.30. We can now carry out the integration, closing the contour
of integration with an infinitely large semicircle and calculating residues
at corresponding poles. Since an exponential function stands within the
integral sign, the contour of integration is closed below at $t_2 > t_1$ and
above at $t_2 < t_1$. Correspondingly for $t_2 > t_1$ the residue is taken at
the point $p_0 = E_p$, and for $t_2 < t_1$ at the point $p_0 = -E_p$. Thus we
obtain

$$I(2,1) = \frac{i}{16\pi^3}\int \frac{1}{E_p}\,\exp[\,i\mathbf{p}\cdot(\mathbf{r}_2 - \mathbf{r}_1) - iE_p(t_2 - t_1)]\,d^3p \qquad (t_2 > t_1) ,$$

$$I(2,1) = \frac{i}{16\pi^3}\int \frac{1}{E_p}\,\exp[\,i\mathbf{p}\cdot(\mathbf{r}_2 - \mathbf{r}_1) + iE_p(t_2 - t_1)]\,d^3p \qquad (t_2 < t_1) . \qquad (124.12)$$

Fig. V.30

We see that since $E_p > 0$, then for $t_2 > t_1$ only states with a positive
energy give a contribution, whereas for $t_2 < t_1$ correspondingly only states
with a negative energy contribute. We note that the result obtained differs
substantially from that obtained in non-relativistic theory. Indeed, in
non-relativistic theory Green’s function was assumed (see (29.3)) to be equal
to zero for $t_2 < t_1$. We would also obtain an analogous expression in the
relativistic case if both poles were circumvented from above, which
corresponds to the replacement $p_0 \to p_0 + i\delta$ in the integrand.

A calculation analogous to that which we have just carried out shows that
for such a replacement, there would correspond to a time $t_2 > t_1$ a
summation over positive as well as negative energies, and for $t_2 < t_1$ we
would obtain $I = 0$. However, from what follows it will be seen that the use
of the Green’s function proposed by Feynman and defined by formulae (124.12)
is much more convenient.
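The rule for circumventing the poles can be tested numerically on the one-dimensional integral over p₀. In the sketch below the infinitesimal shift is replaced by a small finite value applied directly to E_p (an assumption made so that the integral along the real axis can be evaluated by a simple quadrature); the cutoff, the step and the parameter values are likewise hypothetical choices.

```python
import cmath
import math

def p0_integral(t, energy=1.0, delta=0.1, cutoff=100.0, step=0.002):
    """Midpoint-rule estimate of the integral over real p0 of
    exp(-i p0 t) / (E^2 - p0^2), with E replaced by E - i*delta."""
    shifted = complex(energy, -delta)
    count = int(round(2 * cutoff / step))
    total = 0j
    for k in range(count):
        p0 = -cutoff + (k + 0.5) * step
        total += cmath.exp(-1j * p0 * t) / (shifted * shifted - p0 * p0)
    return total * step

def residue_result(t, energy=1.0, delta=0.1):
    """Value obtained by closing the contour: i*pi*exp(-i E |t|)/E with the shifted E."""
    shifted = complex(energy, -delta)
    return 1j * math.pi * cmath.exp(-1j * shifted * abs(t)) / shifted

if __name__ == "__main__":
    for t in (1.0, -1.0):   # t stands for t2 - t1
        print(t, p0_integral(t), residue_result(t))
```

For t₂ > t₁ only the pole near p₀ = E_p contributes and for t₂ < t₁ only the pole near p₀ = −E_p, so both cases reduce to the same expression in |t₂ − t₁|. Multiplying it by the remaining factors of the three-dimensional integral reproduces the structure of (124.12).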

By means of the Green’s function introduced it is possible to construct
the solution of the Dirac equation, i.e. to obtain a formula analogous to the
non-relativistic relation (29.2). For this purpose it is simplest to make use
of Gauss’ theorem in 4-space ($d^4x = d^3x\,dt$)

$$i\int \frac{\partial F_\mu}{\partial x_\mu}\,d^4x = \int_S F_\mu(x)\,n_\mu(x)\,d\sigma(x) , \qquad (124.13)$$

where $F_\mu$ is an arbitrary 4-vector, S is the surface bounding the given
4-dimensional volume, and n(x) is the external normal to this surface at point
x. Setting

$$F_\mu(x') = K(x - x')\,\gamma_\mu\,\psi(x') ,$$

we obtain

$$\frac{\partial F_\mu(x')}{\partial x'_\mu} = \frac{\partial K(x - x')}{\partial x'_\mu}\,\gamma_\mu\,\psi(x') + K(x - x')\,\gamma_\mu\,\frac{\partial\psi(x')}{\partial x'_\mu} .$$

From (124.10) it follows that

$$\frac{\partial K(x - x')}{\partial x'_\mu}\,\gamma_\mu = -\frac{\partial K(x - x')}{\partial x_\mu}\,\gamma_\mu = -\hat\nabla_x K(x - x') .$$

Making use of relations (124.1) and (124.2) we obtain

$$\frac{\partial F_\mu(x')}{\partial x'_\mu} = -(\hat\nabla_x + m)\,K(x - x')\,\psi(x') = i\,\delta^4(x - x')\,\psi(x') .$$

Substituting this expression into the 4-dimensional integral written above and
using (124.13) we obtain

$$\psi(x) = -\int_S K(x - x')\,\gamma_\mu n_\mu(x')\,\psi(x')\,d\sigma(x') .$$


Denoting point x in terms of point 2, and point x' in terms of point 1, we
can rewrite the relation obtained in the form

$$\psi(2) = \int_{t_1} K(2,1)\,\beta\,\psi(1)\,d^3x_1 - \int_{t_{1'}} K(2,1')\,\beta\,\psi(1')\,d^3x_{1'} . \qquad (124.14)$$

Here two infinite space-like planes $t = t_1$ and $t = t_{1'}$, where
$t_1 < t_2 < t_{1'}$, are chosen as the surface of integration. The integration
over time-like surfaces can be dropped, since they are as spatially distant
from point 2 as one wishes, and the function K(2,1), as can be shown,
decreases exponentially to zero in space-like directions as spatial distances
increase indefinitely*.

The function K(2,1) contains a summation only over states with a positive
energy, and the function K(2,1'), for $t_2 < t_{1'}$, only over states with a
negative energy. Hence the first integral in (124.14) differs from zero for
the components $\psi(1)$ corresponding to particles with a positive energy,
and the second integral is correspondingly not equal to zero for the
components $\psi(1')$ corresponding to particles with a negative energy.

* See R.P. Feynman, Phys. Rev. 76 (1949) 749.

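We see that the wave function of a particle at point 2 of the 4-dimensional
space is defined by the Green’s function and by the values of $\psi(1)$ and
$\psi(1')$. Analogously, setting
$F_\mu = \bar\psi(x')\,\gamma_\mu\,K(x' - x)$, it is easy to find the expression
for the function $\bar\psi(2)$:

$$\bar\psi(2) = \int_{t_{1'} > t_2} \bar\psi(1')\,\beta\,K(1',2)\,d^3x_{1'} - \int_{t_1 < t_2} \bar\psi(1)\,\beta\,K(1,2)\,d^3x_1 , \qquad (124.15)$$

where the function $\bar\psi(x) = \psi^\dagger(x)\,\gamma_4$ satisfies the
equation

$$\frac{\partial\bar\psi}{\partial x_\mu}\,\gamma_\mu - m\bar\psi = 0$$

or

$$\bar\psi\,(\hat\nabla - m) = 0 .$$

In this form of notation the operator $\hat\nabla$ acts on functions standing
on its left.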

The components of the wave function which correspond to negative
energies $-E$ are interpreted in Feynman’s theory as the amplitudes of
probability of finding the particle in the positron state, i.e. in a state
with positive energy $+E$ and charge $+e$. Thus the function $\psi(2)$ is given
if the amplitude of the electron state $\psi(1)$, at instant of time
$t_1 < t_2$, and the amplitude of the positron state $\psi(1')$, at instant of
time $t_{1'} > t_2$, are known.

The phase factor involved in the function K(2,1) for $t_2 < t_1$ depends on
time according to the law

$$\exp[\,iE_p(t_2 - t_1)] = \exp[-iE_p\,|t_2 - t_1|\,]$$

(see (124.12)). In other words, the time factor of the function K(2,1)
depends on $|t_2 - t_1|$ in the same way as the phase factor of the wave
function of particles with a positive energy. In accordance with this,
positron states can be considered as states of a particle which has a positive
energy but moves in the opposite direction along the time axis. To this there
corresponds the fact that the state of a positron must be given at an instant
of time $t_{1'} > t_2$ (the second integral in formula (124.14)).

We now assume that there is an external electromagnetic field. In our
notation the Dirac equation in this case is written in the form

$$(\hat\nabla - ie\hat A + m)\,\psi = 0 , \qquad (124.16)$$

where $\hat A = \gamma_\mu A_\mu$ and $A_4 = i\varphi$ ($\varphi$ is the scalar
potential). Green’s function is as usual defined by the equation

$$(\hat\nabla_x - ie\hat A + m)\,K^A(x, x') = -i\,\delta(x - x') .$$

The function $K^A(2,1)$, as well as the function K(2,1), contains in its
expansion only components corresponding to positive energies for $t_2 > t_1$
and to negative energies for $t_2 < t_1$. Relations (124.14) and (124.15)
remain valid, provided that K(2,1) in them is replaced by $K^A(2,1)$.

Just as in the non-relativistic case, an integral equation satisfied by the
function $K^A(2,1)$ can be formulated. Namely

$$K^A(2,1) = K(2,1) - e\int K(2,3)\,\hat A(3)\,K^A(3,1)\,d^4x_3 . \qquad (124.17)$$

The derivation of integral equation (124.17) does not differ from that of
integral equation (29.17). As we have shown in §58, an equation of such a
type is conveniently solved by a method of successive approximations (see
(58.5)).
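The method of successive approximations can be illustrated on a toy version of an equation with the same structure as (124.17), in which the integration over the intermediate point is replaced by a product of small matrices. Everything numerical here (the 2x2 matrices `K` and `A`, the coupling `E`) is a hypothetical stand-in chosen so that the iteration converges; it is a sketch of the iteration scheme only, not of the actual Green's function.

```python
def matmul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def combine(a, b, ca, cb):
    """Return the matrix ca*a + cb*b."""
    n = len(a)
    return [[ca * a[i][j] + cb * b[i][j] for j in range(n)] for i in range(n)]

def successive_approximations(k, a, e, orders):
    """Iterate K_A <- K - e*K*A*K_A, starting from the zero-order guess K_A = K."""
    kernel = matmul(k, a)
    ka = k
    for _ in range(orders):
        ka = combine(k, matmul(kernel, ka), 1.0, -e)
    return ka

def residual(k, a, e, ka):
    """Largest entry of K_A - (K - e*K*A*K_A); it vanishes for an exact solution."""
    rhs = combine(k, matmul(matmul(k, a), ka), 1.0, -e)
    n = len(k)
    return max(abs(ka[i][j] - rhs[i][j]) for i in range(n) for j in range(n))

# Hypothetical toy data standing in for the free function, the field and the charge.
K = [[0.5, 0.1], [0.0, 0.4]]
A = [[1.0, 0.2], [0.3, -0.5]]
E = 0.3

if __name__ == "__main__":
    for orders in (0, 1, 2, 5, 40):
        approx = successive_approximations(K, A, E, orders)
        print(orders, residual(K, A, E, approx))
```

Each pass of the loop adds one more power of the coupling, so stopping after n passes corresponds to keeping terms up to order n in the interaction.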


§ 125. Green’s function for a system of two particles 


The expression found above for the wave function of one particle must be
generalized to the case of a system of interacting particles. The simplest
example of such a system is one consisting of two particles interconnected by
an interaction of electromagnetic character. We note, first of all, that the
Green’s function of a system of two non-interacting particles is equal to the
product of the Green’s functions of each of the particles:

$$K(3,4;1,2) = K_a(3,1)\,K_b(4,2) . \qquad (125.1)$$

Here $K_a(3,1)$ is the Green’s function of the free particle a moving from
point 1 to point 3. The quantity $K_b(4,2)$ for the particle b has an
analogous meaning.

In the case of two interacting particles the Green’s function given by
formula (125.1) can be considered as the zero order approximation with
respect to the interaction. Let us now find the Green’s function
$K^{(1)}(3,4;1,2)$ in the first approximation with respect to the interaction.
That is, we consider two charged particles which are described by the Dirac
equation. In order to write the interaction operator it is convenient,
following Feynman, to consider at first the non-relativistic approximation and
then to carry out the corresponding generalization to the case of relativistic
particles. In the non-relativistic approximation the interaction between
particles is described by the Coulomb law, and the function
$K^{(1)}(3,4;1,2)$ by analogy with formula (58.3) is defined by the relation

$$K^{(1)}(3,4;1,2) = -ie^2\int\!\!\int K_a(3,5)\,K_b(4,6)\,\frac{1}{r_{56}}\,\delta(t_{56})\,K_a(5,1)\,K_b(6,2)\,d^4x_5\,d^4x_6 , \qquad (125.2)$$


where $r_{56} = |\mathbf{r}_5 - \mathbf{r}_6|$. The meaning of the expression
for $K^{(1)}(3,4;1,2)$ is easily understood if it is compared with a diagram
(fig. V.31) which is interpreted as follows. Particle a moves from point 1 to
point 3, passing through the intermediate point 5. The line 2-6-4 describes
the motion of particle b. To the line 1-5 of the diagram there corresponds
the function of motion $K_a(5,1)$, and to the line 2-6 the function
$K_b(6,2)$. The interaction between the particles takes place at points 5 and
6. The dotted line corresponds to the expression
$(e^2/r_{56})\,\delta(t_{56})$, where $r_{56} = |\mathbf{r}_5 - \mathbf{r}_6|$
is the spatial distance between points 5 and 6, and $t_{56} = t_5 - t_6$,
where $t_5$ and $t_6$ are the instants of time at which particles a and b
arrive at points 5 and 6. The δ-function of the time argument means that in
the non-relativistic approximation one has to disregard the time lag and to
consider particles at points 5 and 6 at one and the same instant of time
$t = t_5 = t_6$. The lines 5-3 and 6-4 correspond to the motion of free
particles after the interaction (the functions $K_a(3,5)$ and $K_b(4,6)$ in
(125.2)).


Fig. V.31 


The generalization of expression (125.2) to the relativistic case involves
first of all taking into account the interaction lag. At first sight it may
seem that $\delta(t_{56})$ should be replaced by $\delta(t_{56} - r_{56})$,
where $r_{56}$ defines the time lag (in our notation the velocity of
propagation of interaction c = 1). However, such a replacement would be
incorrect. Indeed, the electromagnetic interaction represents the exchange of
photons which have a positive energy. The expansion of the δ-function in a
Fourier integral, however, contains positive as well as negative frequencies.
Hence for the transition from the Coulomb interaction to the relativistic
generalization taking into account the time lag, the δ-function should be
replaced by the function $\delta_+$ defined by the relation

$$\delta_+(x) = \frac{1}{\pi}\int_0^\infty e^{-i\omega x}\,d\omega = \lim_{\varepsilon\to 0}\frac{1}{\pi i}\,\frac{1}{x - i\varepsilon} . \qquad (125.3)$$

The analogue of the δ-function defined in such a way contains the expansion
only in terms of positive frequencies. Since $t_{56}$ takes on positive as
well as negative values, one takes the symmetrized combination

$$\frac{1}{2r_{56}}\left[\delta_+(t_{56} - r_{56}) + \delta_+(-t_{56} - r_{56})\right] = \delta_+(t_{56}^2 - r_{56}^2) = \delta_+(-x_{56}^2) .$$

This equality is immediately found by means of formula (125.3).

Furthermore, it is necessary to take into account that if the particles are
moving, then in addition to the Coulomb interaction there is an electromagnetic
interaction (see (25.27) of Part II). This leads to the fact that the
interaction is defined by the expression $(1-\mathbf{v}_5\cdot\mathbf{v}_6)\,e^2\,\delta_+(-x_{56}^2)$.

We shall obtain the interaction operator if we replace the velocity vectors
$\mathbf{v}_5$ and $\mathbf{v}_6$ by the operators $\boldsymbol{\alpha}_a$ and $\boldsymbol{\alpha}_b$; each of the operators acts
respectively on the variables of particle a and particle b. (Indeed, the
velocity operator can easily be found by the formula $\hat{\mathbf{v}} = i[\hat{H},\mathbf{r}]$. But $\hat{H}$
(see (113.7)) is equal to $\hat{H} = i^{-1}\boldsymbol{\alpha}\cdot\nabla+\beta m$, and, commuting, we obtain
$i[\hat{H},\mathbf{r}] = \boldsymbol{\alpha}$.) Then, by virtue of (113.14), one can write for the interaction
operator in the relativistic case

$$(1-\boldsymbol{\alpha}_a\cdot\boldsymbol{\alpha}_b)\,e^2\,\delta_+(-x_{56}^2) = e^2\,\beta_a\beta_b\,\gamma_{a\mu}\gamma_{b\mu}\,\delta_+(-x_{56}^2).$$
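The matrix identity behind this last step can be checked numerically. The sketch below is only an illustration: it assumes the standard (Dirac) representation of $\boldsymbol{\alpha}$ and $\beta$ and the convention $\gamma_k = -i\beta\alpha_k$, $\gamma_4 = \beta$, and it models "acting on the variables of particle a or b" by a Kronecker product. The variable names are ours.

```python
import numpy as np

# Pauli matrices; the standard (Dirac) representation is an assumption here.
sigma = [np.array([[0, 1], [1, 0]], dtype=complex),
         np.array([[0, -1j], [1j, 0]], dtype=complex),
         np.array([[1, 0], [0, -1]], dtype=complex)]
zero2 = np.zeros((2, 2), dtype=complex)
alpha = [np.block([[zero2, s], [s, zero2]]) for s in sigma]
beta = np.diag([1, 1, -1, -1]).astype(complex)

# Assumed convention: gamma_k = -i beta alpha_k (k = 1, 2, 3), gamma_4 = beta.
gamma = [-1j * beta @ a for a in alpha] + [beta]

# Particle a acts on the first tensor factor, particle b on the second.
lhs = np.eye(16) - sum(np.kron(a, a) for a in alpha)            # 1 - alpha_a . alpha_b
rhs = np.kron(beta, beta) @ sum(np.kron(g, g) for g in gamma)   # beta_a beta_b gamma_a gamma_b
identity_holds = bool(np.allclose(lhs, rhs))
print(identity_holds)
```

The check rests only on $\beta^2 = 1$ and $\beta\gamma_k = -i\alpha_k$, so any representation obeying these relations would serve equally well.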

To obtain the final expression for the function $K^{(1)}(3,4;1,2)$ we have to
establish the connection between relativistic and non-relativistic Green's
functions. For this we compare formulae (29.3) and (124.14). We see that
the following correspondence holds:

$$K_{\text{nonrel}} \leftrightarrow K_{\text{rel}}\,\beta.$$

Thus for particles described by the Dirac equation we have

$$K^{(1)}(3,4;1,2)\,\beta_a\beta_b = -ie^2\iint K_a(3,5)K_b(4,6)\,\gamma_{a\mu}\gamma_{b\mu}\,\delta_+(-x_{56}^2)\,K_a(5,1)K_b(6,2)\,\beta_a\beta_b\,d^4x_5\,d^4x_6. \tag{125.4}$$

Multiplying (125.4) on the right by $\beta_a\beta_b$ and using $\beta^2 = 1$, we find finally


$$K^{(1)}(3,4;1,2) = -ie^2\iint K_a(3,5)K_b(4,6)\,\gamma_{a\mu}\gamma_{b\mu}\,\delta_+(-x_{56}^2)\,K_a(5,1)K_b(6,2)\,d^4x_5\,d^4x_6 =$$

$$= e^2\iint K_a(3,5)K_b(4,6)\,\gamma_{a\mu}\,D(-x_{56}^2)\,\gamma_{b\mu}\,K_a(5,1)K_b(6,2)\,d^4x_5\,d^4x_6. \tag{125.5}$$


The function $D(-x_{56}^2) = -i\delta_+(-x_{56}^2)$ is usually called the propagation
function of the virtual photon. Thus we see that, taking into account
relativistic effects, the diagram in fig. V.31 can be interpreted as follows. The
functions K correspond to solid (electron) lines, the function D corresponds
to the dotted line, and to the vertices there correspond the matrices $e\gamma_{a\mu}$ and
$e\gamma_{b\mu}$.

All calculations in Feynman's theory are substantially simplified if they
are carried out in the momentum representation. The form of the function K
in the p-representation is given by formula (124.8). There remains to be
determined the Fourier component of the function $\delta_+$. We shall show that
the following relation holds:

$$\delta_+(-x^2) = \frac{1}{4\pi^3}\int\frac{e^{ikx}}{k^2-i\varepsilon}\,d^4k, \tag{125.6}$$

where ε is an infinitesimal quantity. It defines the rule for circumventing
poles. We can convince ourselves of the validity of relation (125.6) by
calculating the integral on the right directly:

$$\int\frac{e^{ikx}}{k^2-i\varepsilon}\,d^4k = \int e^{i\mathbf{k}\cdot\mathbf{r}}\,d^3k\int\frac{e^{-ik_0x_0}}{\mathbf{k}^2-k_0^2-i\varepsilon}\,dk_0.$$

We assume, for example, that $x_0 > 0$. Closing the contour of integration
below and finding the residues, we obtain

$$\pi i\int\frac{e^{i\mathbf{k}\cdot\mathbf{r}}}{|\mathbf{k}|}\,e^{-i|\mathbf{k}||x_0|}\,d^3k.$$

Writing $d^3k$ in the form $k^2\,dk\,d\Omega$, integrating over angles and taking into
account (125.3) we obtain the relation sought.
§ 126. Feynman diagrams 
We shall now consider the rules for calculation of the probabilities of
transition from one state into another by means of the mathematical apparatus
presented in the preceding sections. For simplicity we shall first consider
one particle (for example, an electron) which makes a transition from
one state into another under the action of an external electromagnetic field.
Let the electron at the initial instant of time $t = t_1$ be in the state
$\psi(\mathbf{r}_1,t_1) = \psi(1)$, and at the instant of time $t = t_2$ let it be in the state
$\psi(\mathbf{r}_2,t_2) = \psi(2)$ corresponding to a positive energy. The probability of
transition into a particular state $\psi_n(\mathbf{r}_2,t_2)$ is, as always, defined by the
square of the modulus of the corresponding amplitude of the expansion of the
function $\psi(\mathbf{r}_2,t_2)$ in terms of the functions $\psi_n(\mathbf{r}_2,t_2)$:

$$M = \int\psi_n^*(\mathbf{r}_2,t_2)\,\psi(\mathbf{r}_2,t_2)\,d^3x_2. \tag{126.1}$$

Expressing the function $\psi(\mathbf{r}_2,t_2)$ in terms of Green's function according
to formula (124.14), we obtain

$$M = \iint\psi_n^*(\mathbf{r}_2,t_2)\,K^A(\mathbf{r}_2,t_2;\mathbf{r}_1,t_1)\,\beta\,\psi(\mathbf{r}_1,t_1)\,d^3x_1\,d^3x_2.$$

In place of the Green's function $K^A$ we can write its expansion in a series of
successive approximations. We then obtain an expression for the transition
amplitude M in the form of a perturbation theory series. Thus, for example,
the transition amplitude in the first approximation of perturbation theory is
equal to

$$M^{(1)} = -e\iiint\psi_n^*(2)\,K(2,3)\,\hat{A}(3)\,K(3,1)\,\beta\,\psi(1)\,d^3x_1\,d^3x_2\,d^4x_3. \tag{126.2}$$

This expression can be written in a more compact form, using the relations

$$\psi(3) = \int K(3,1)\,\beta\,\psi(1)\,d^3x_1,$$

$$\bar{\psi}_n(3) = \int\bar{\psi}_n(2)\,\beta\,K(2,3)\,d^3x_2.$$

Then for the transition amplitude we have

$$M^{(1)} = -e\int\bar{\psi}_n(3)\,\hat{A}(3)\,\psi(3)\,d^4x_3. \tag{126.3}$$

It is easy to obtain, in an analogous way, the second approximation of the
transition amplitude

$$M^{(2)} = (-e)^2\iint\bar{\psi}_n(3)\,\hat{A}(3)\,K(3,4)\,\hat{A}(4)\,\psi(4)\,d^4x_3\,d^4x_4. \tag{126.4}$$


If the initial and final states are described by plane waves, then formulae
(126.3) and (126.4) are conveniently rewritten in the momentum representation.
Setting

$$\bar{\psi}_n(3) = \bar{u}(p_2)\,e^{-ip_2x_3},\qquad \psi(3) = u(p_1)\,e^{ip_1x_3}$$

and using the Fourier representation of the operator $\hat{A}$

$$\hat{A}(3) = \int\hat{a}(k)\,e^{ikx_3}\,d^4k, \tag{126.5}$$

we obtain for the transition amplitude of the first order (126.3)

$$M^{(1)} = -e\int d^4x_3\int e^{-ip_2x_3+ikx_3+ip_1x_3}\,\bar{u}(p_2)\,\hat{a}(k)\,u(p_1)\,d^4k =$$

$$= -e(2\pi)^4\int\bar{u}(p_2)\,\hat{a}(k)\,\delta^4(k+p_1-p_2)\,u(p_1)\,d^4k = -e(2\pi)^4\,\bar{u}(p_2)\,\hat{a}(p_2-p_1)\,u(p_1). \tag{126.6}$$

For the transition amplitude of the second order we have, correspondingly,

$$M^{(2)} = (-e)^2\iiint\bar{u}(p_2)\,\hat{a}(k_1)\,S(p)\,\hat{a}(k)\,u(p_1)\,d^4p\,d^4k\,d^4k_1\iint e^{-ip_2x_3+ik_1x_3+ip(x_3-x_4)+ikx_4+ip_1x_4}\,d^4x_3\,d^4x_4 =$$

$$= e^2(2\pi)^8\iiint\bar{u}(p_2)\,\hat{a}(k_1)\,\delta^4(k_1+p-p_2)\,S(p)\,\hat{a}(k)\,\delta^4(p_1+k-p)\,u(p_1)\,d^4p\,d^4k\,d^4k_1 =$$

$$= e^2(2\pi)^8\int\bar{u}(p_2)\,\hat{a}(p_2-p_1-k)\,\frac{1}{(2\pi)^4\,i\,[i(\hat{p}_1+\hat{k})+m]}\,\hat{a}(k)\,u(p_1)\,d^4k. \tag{126.7}$$

Formulae (126.6) and (126.7) can be associated with pictorial representations, 
called Feynman diagrams. As will be shown below, to each line and to each 
crossing of lines (called a vertex) in the Feynman diagram there corresponds a 
definite factor in the transition amplitude. In the case of complex processes 
such diagrams make it possible to simplify the construction of expressions for 
the transition amplitudes. In a Feynman diagram we represent the states of 
electrons and positrons by solid lines, and the states of the electromagnetic 
field by dotted lines. The arrows on the lines show the order of writing the 
terms of the transition amplitude. To an increase in time there corresponds 
the motion of the particle from the right to the left. 

Let us consider the simplest Feynman diagram (fig. V.32) corresponding
to the following process: an electron with momentum $p_1$ was scattered by an
external electromagnetic field and made a transition into a new state with
momentum $p_2$. The probability amplitude of this transition is given by
formula (126.6). In fig. V.32 the free electron with momentum $p_1$ is
represented by the solid line AB. This straight line corresponds to the first
factor in the transition amplitude $M^{(1)}$ (the factors are numbered from the
right to the left), the bispinor $u(p_1)$. At point B the electron is scattered by
the electromagnetic field represented by the dotted line. The crossing of the
solid and dotted lines (the vertex) in the Feynman diagram corresponds to the
operator $-e\hat{a}(k)$ in the transition amplitude $M^{(1)}$ multiplied by the δ-function
of the momenta of all three particles. The electron with momentum $p_2$ is
represented by the straight line BD. In the amplitude $M^{(1)}$ the bispinor $\bar{u}(p_2)$
corresponds to it.

Fig. V.32 (electron line A-B-D with momenta $p_1$, $p_2$; external field $\hat{a}(p_2-p_1)$ at vertex B)

Fig. V.33 (electron line A-B-C-D with momenta $p_1$, $p_2$; external fields $\hat{a}(k)$ at vertex B and $\hat{a}(p_2-p_1-k)$ at vertex C)

We see that the order of the process with respect to the charge e is defined
by the number of vertices in the Feynman diagram. This is particularly
clearly seen from the consideration of the Feynman diagram for a process of
the second order (fig. V.33). This diagram corresponds to the process of
electron scattering in the second approximation of perturbation theory. The
line AB (called the external line) represents the motion of the free electron.
To it there corresponds the bispinor $u(p_1)$ in the transition amplitude $M^{(2)}$.
The scattering of the electron takes place at vertex B. In the transition
amplitude $M^{(2)}$ there corresponds to vertex B the factor $-e\hat{a}(k)$ and the
momentum δ-function $\delta^4(p_1+k-p)$. The line BC joining the two vertices is
called the internal line. To it there corresponds in $M^{(2)}$ the factor S, the
Fourier component of the Green's function defined by formula (124.8). The
external field acts at vertex C. To the vertex C there corresponds in $M^{(2)}$ the
operator $-e\hat{a}(k_1)$ and the δ-function $\delta^4(k_1+p-p_2)$. The external line CD
represents the motion of the electron with momentum $p_2$. To the line CD in
$M^{(2)}$ there corresponds the bispinor $\bar{u}(p_2)$. Since the law of conservation of
4-momentum in the transition from state $p_1$ into state $p_2$ is fulfilled for
arbitrary values of the wave vector k, the integration is carried out over the
vector k (or $k_1$). The value of the numerical factor in the expression for the
amplitude is determined by the number of δ-functions involved in it, which is
equal to the number of vertices. Each vertex brings into $M^{(2)}$ a factor of
$(2\pi)^4$.

The Feynman diagram for processes involving positrons can be constructed
in exactly the same way. For example, the diagram in fig. V.33 also describes
positron scattering in the second approximation of perturbation theory. Since
in Feynman's theory the positron is considered as an electron moving backward
in time, this diagram defines the transition amplitude of the positron
from a state with momentum $-p_2$ into a state with momentum $-p_1$. For
the quantity $M^{(2)}$ in this case we have


l Hik eee 
MO) =- e2(2n)4 Jopa) BSE OO). (126.8) 


Here v is the Dirac bispinor corresponding to a state with a negative energy. 

The relations derived make it possible to consider processes associated
with the emission and absorption of free photons in addition to scattering
processes in an external electromagnetic field. For this the operator $\hat{A}$ in the
general formulae (126.3) and (126.4) must correspond to the field of one
emitted or one absorbed photon. In accordance with formulae (102.3) and
(102.5), there corresponds to the field of the emitted photon the vector
potential

$$A_k^* = (2\pi/\omega)^{1/2}\,e_k\,e^{-ikx}, \tag{126.9}$$

and to the absorbed photon

$$A_k = (2\pi/\omega)^{1/2}\,e_k\,e^{ikx}. \tag{126.10}$$

Here $e_k$ is the polarization vector, and k denotes a four-dimensional wave
vector. Since only one photon is absorbed or emitted, the matrix elements of
the operators $\hat{a}$ and $\hat{a}^\dagger$ are equal to one. The diagrams shown in fig. V.34
describe processes involving two free photons. Thus, for example, these
diagrams describe the process of Compton scattering, i.e. the scattering of a
photon by a free electron. In this process the photon before scattering had
wave vector $k_1$, and after scattering $k_2$. In the diagram shown in fig. V.34a
this corresponds to the fact that the photon is first absorbed and then emitted
by the electron. The diagram shown in fig. V.34b also corresponds to the
same process, where first the photon $k_2$ is emitted and then the photon $k_1$ is
absorbed. Of course, the words 'first' and 'then' refer only to the order of
writing the factors in the transition amplitude and have no other, physical
meaning.

The Feynman diagrams shown in fig. V.34 make it possible to write
immediately the transition amplitude without having to carry out each time
special calculations of the type carried out in deriving formulae (126.6) and
(126.7). The total transition amplitude is defined by the sum of the amplitudes
corresponding to diagrams V.34a and V.34b.

It has the form

$$M = -ie^2(2\pi)^4\left[\bar{u}(p_2)\left(\frac{2\pi}{\omega_2}\right)^{1/2}\hat{e}_2\,\frac{1}{i(\hat{p}_1+\hat{k}_1)+m}\left(\frac{2\pi}{\omega_1}\right)^{1/2}\hat{e}_1\,u(p_1)\;+\right.$$

$$\left.+\;\bar{u}(p_2)\left(\frac{2\pi}{\omega_1}\right)^{1/2}\hat{e}_1\,\frac{1}{i(\hat{p}_1-\hat{k}_2)+m}\left(\frac{2\pi}{\omega_2}\right)^{1/2}\hat{e}_2\,u(p_1)\right]\delta^4(p_1+k_1-p_2-k_2), \tag{126.11}$$

where $\hat{e}_1 = e_{1\mu}\gamma_\mu$, $\hat{e}_2 = e_{2\mu}\gamma_\mu$, and $e_1$ and $e_2$ are the polarization vectors of
the photon before and after scattering.

The appearance of the δ-function is easily understood if it is taken into
account that in the given case expressions of the type (126.9) and (126.10),
containing no integration over k, are substituted for the external field
operator of the form of (126.5) in the expression of the type (126.4). Thus
in an expression of the type (126.7) there will be no integration over k and
$k_1$. After the integration over p there will remain one δ-function, expressing
the law of conservation of energy and momentum in the Compton process.

Diagrams of such a type describe, for example, the process of annihilation
of an electron with momentum $p_1$ and of a positron with momentum $-p_2$.
Two photons with momenta $k_1$ and $k_2$ are produced in the annihilation
(fig. V.35). The amplitude of the two-photon annihilation of the pair,
according to the same rules, is written in the form
Meleag 1 

=~ e*(27) ao 
: (w 02)? 


= y 1 Aala l A 
x [o (A i@, ky) +m e iP Ep) met) uip1)| X 
X ôf; -p2—kı—=k2) . (126.12) 





Fig. V.35 (a), (b)

As another example let us consider the electron bremsstrahlung, i.e. the
radiation arising when a fast electron traverses the field of a nucleus. The
diagram corresponding to this process is shown in fig. V.36. An electron
with momentum $p_1$ scattered by an external field $\hat{a}$ emits a photon of
momentum k and polarization e and makes a transition into a state with
momentum $p_2$. In this case two processes, shown in the diagrams of fig. V.36a
and b, are possible. The total transition amplitude, in accordance with the
rules presented, is of the form

Fig. V.36 (a), (b)


Ma oe ū(p2) [apa =v1 +k) D 





175) +m © X 
l 
5 a $ 5 
+ eo Gt yas pi)] u(p;). (126.13) 


Let us now consider processes associated with the interaction of two
particles, for example the scattering of an electron by a μ-meson. The
transition amplitude is again calculated by expanding the wave function of the
system in terms of the products of the wave functions of the free particles.
The wave function of a system of two particles is defined by the Green's
function (125.5). These calculations lead to graphs which are plotted according
to the same principles as for one particle. In the first approximation of
perturbation theory there corresponds to the scattering process the Feynman
diagram shown in fig. V.37.

Here the solid lines AB and CD correspond to the motion of free particles
with momenta $p_1$ and $p_2$. The electromagnetic interaction between particles
amounts to photon exchange. A virtual photon, which is emitted by the
second particle (vertex D), is absorbed at vertex B. Lines BE and DF correspond
to the motion of the particles after the interaction with momenta $p_3$
and $p_4$. The transition amplitude is constructed according to the usual rules
and has the form


$$M = e^2(2\pi)^8\,\bar{u}_a(p_3)\,\bar{u}_b(p_4)\,\gamma_{a\mu}\,D_f(p_1-p_3)\,\gamma_{b\mu}\,u_a(p_1)\,u_b(p_2)\,\delta^4(p_3+p_4-p_1-p_2). \tag{126.14}$$

Here $D_f$ is the Fourier component of the propagation function of the virtual
photon D, which, according to (125.5) and (125.6), is given by the formula

$$D_f(k) = -\frac{i}{4\pi^3}\,\frac{1}{k^2}. \tag{126.15}$$
Fig. V.37

Let us dwell briefly on corrections which arise in higher approximations of
perturbation theory. The diagrams corresponding to these corrections must,
naturally, contain a larger number of vertices in comparison with the
corresponding basic diagram (to each vertex there corresponds the smallness
parameter e). Namely, it is the number of vertices that determines the order
of smallness of the correction considered.

Let us consider, for example, diagrams corresponding to the next
approximation in the theory of the Compton effect. As is easily understood, to
the diagram in fig. V.34a there correspond the corrections in fig. V.38, and
analogously for the diagram in fig. V.34b. All these diagrams differ from the
initial diagram (fig. V.34) by the presence of an internal photon line. This
line corresponds, as we have already seen, to the emission and absorption of a
virtual quantum. Hence these corrections are usually called radiative
corrections. The calculation of these corrections involves particular
difficulties and requires the so-called renormalization method. We shall not
dwell on these problems*.


§ 127. The Compton effect 


To illustrate the technique of calculation of cross sections in Feynman's
theory we shall consider the theory of the Compton effect in more detail. The
transition amplitude for this process has already been obtained by means of a
Feynman diagram (fig. V.34) in the form of (126.11). Separating the δ-function,
we write the transition amplitude in the form

$$M = M_{21}\,\delta^4(p_1+k_1-p_2-k_2), \tag{127.1}$$

where

$$M_{21} = \frac{ie^2(2\pi)^5}{(\omega_1\omega_2)^{1/2}}\,\bar{u}(p_2)\left[\hat{e}_2\,\frac{i(\hat{p}_1+\hat{k}_1)-m}{(p_1+k_1)^2+m^2}\,\hat{e}_1+\hat{e}_1\,\frac{i(\hat{p}_1-\hat{k}_2)-m}{(p_1-k_2)^2+m^2}\,\hat{e}_2\right]u(p_1).$$

The probability of the Compton effect is given by the formula

$$P_{21} = |M_{21}|^2\,\bigl(\delta^4(p_1+k_1-p_2-k_2)\bigr)^2. \tag{127.2}$$

In order to eliminate the square of the δ-function it is convenient to make use
of the definition of the 4-dimensional δ-function

$$\delta^4(p) = \frac{1}{(2\pi)^4}\int e^{ipx}\,d^4x.$$





* See, for example, A.I.Akhiezer and V.B.Berestetskii, Quantum electrodynamics 
(Interscience Publ., New York, 1965). 










At the point p = 0, which, as is seen from (127.2), is the only one playing
an important role, the 4-dimensional integral is equal to $VT/(2\pi)^4$, where V
is the normalization volume, and T is the duration of the process. Choosing V
to be unity, we have for the transition probability $P_{21}$ per unit time

$$P_{21} = \frac{1}{(2\pi)^4}\,|M_{21}|^2\,\delta^4(p_1+k_1-p_2-k_2). \tag{127.3}$$

The final state of the system is defined by the momentum of the electron $\mathbf{p}_2$
and of the scattered photon $\mathbf{k}_2$. The number of final states in the interval of
momenta $d\mathbf{p}_2$ and $d\mathbf{k}_2$ is given by the usual relation $d\mathbf{p}_2\,d\mathbf{k}_2/(2\pi)^6$. The
probability of transition into the interval of final states $d\mathbf{p}_2\,d\mathbf{k}_2$ is written in
the form

$$dW_{21} = \frac{1}{(2\pi)^4}\,|M_{21}|^2\,\delta^4(p_1+k_1-p_2-k_2)\,\frac{d\mathbf{p}_2\,d\mathbf{k}_2}{(2\pi)^6}. \tag{127.4}$$

The 4-dimensional δ-function expresses the law of conservation of energy
and momentum

$$p_1+k_1 = p_2+k_2. \tag{127.5}$$

From this relation it is easy to define the frequency of the scattered
photon as a function of the scattering angle θ, i.e. the angle between the
vectors $\mathbf{k}_1$ and $\mathbf{k}_2$. Squaring (127.5) gives

$$p_1^2+k_1^2+2p_1k_1 = p_2^2+k_2^2+2p_2k_2,$$

but

$$p_1^2 = \mathbf{p}_1^2-E_1^2 = -m^2 = p_2^2,\qquad k_1^2 = \mathbf{k}_1^2-\omega_1^2 = 0 = k_2^2,$$

and, consequently, $p_1k_1 = p_2k_2$. Making use of (127.5) we have

$$p_1k_1 = p_1k_2+k_1k_2.$$

We assume for simplicity that the electron was initially at rest: $\mathbf{p}_1 = 0$,
$E_1 = m$. After a simple calculation (see (17.11) of Part II) we find

$$\omega_2 = \frac{\omega_1}{1+m^{-1}\omega_1(1-\cos\theta)}. \tag{127.6}$$
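Relation (127.6) is easily checked numerically. The short sketch below assumes units with c = ħ = 1 and an electron initially at rest; the function names are ours and serve only as an illustration. It evaluates the scattered frequency and verifies that energy is conserved when the recoil momentum is $\mathbf{p}_2 = \mathbf{k}_1-\mathbf{k}_2$.

```python
import math

def scattered_frequency(omega1, theta, m=1.0):
    """Frequency of the scattered photon according to (127.6)."""
    return omega1 / (1.0 + (omega1 / m) * (1.0 - math.cos(theta)))

def energy_imbalance(omega1, theta, m=1.0):
    """(m + omega1) - (E2 + omega2); should vanish if (127.6) is correct."""
    omega2 = scattered_frequency(omega1, theta, m)
    # Momentum conservation with the electron at rest: p2 = k1 - k2.
    p2_squared = omega1**2 + omega2**2 - 2.0 * omega1 * omega2 * math.cos(theta)
    E2 = math.sqrt(m * m + p2_squared)
    return (m + omega1) - (E2 + omega2)

if __name__ == "__main__":
    for theta in (0.0, 0.5, math.pi / 2, math.pi):
        print(theta, scattered_frequency(2.0, theta), energy_imbalance(2.0, theta))
```

For backward scattering (θ = π) of a photon with $\omega_1 = m$ the function returns $\omega_2 = m/3$, and for forward scattering the frequency is unchanged.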
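Integrating (127.4) over the three momentum components we obtain

$$dW_{21} = \frac{1}{(2\pi)^{10}}\,|M_{21}|^2\,\delta(m+\omega_1-E_2-\omega_2)\,\omega_2^2\,d\omega_2\,d\Omega. \tag{127.7}$$

We integrate this expression over frequency $\omega_2$. It should then be recalled
that the energy $E_2$ is also a function of $\omega_2$,

$$E_2 = [m^2+\mathbf{p}_2^2]^{1/2} = [m^2+(\mathbf{k}_1-\mathbf{k}_2)^2]^{1/2} = [m^2+\omega_1^2+\omega_2^2-2\omega_1\omega_2\cos\theta]^{1/2}.$$

We introduce the new variable $y = E_2+\omega_2$:

$$dW_{21} = \frac{1}{(2\pi)^{10}}\,|M_{21}|^2\,\delta(m+\omega_1-y)\,\omega_2^2\,\frac{d\omega_2}{dy}\,dy\,d\Omega. \tag{127.8}$$

Integrating with respect to y we have

$$dW_{21} = \frac{1}{(2\pi)^{10}}\,|M_{21}|^2\,\omega_2^2\left.\frac{1}{dy/d\omega_2}\right|_{\omega_2+E_2=m+\omega_1}d\Omega, \tag{127.9}$$

but

$$\frac{dy}{d\omega_2} = \frac{d(E_2+\omega_2)}{d\omega_2} = 1+\frac{1}{E_2}(\omega_2-\omega_1\cos\theta). \tag{127.10}$$

Since the value of $\omega_2$ satisfying the energy conservation law is used, then
substituting (127.6) into (127.10), we find for the transition probability
(127.9)

$$dW_{21} = \frac{1}{(2\pi)^{10}}\,\frac{E_2\,\omega_2^3}{m\,\omega_1}\,|M_{21}|^2\,d\Omega. \tag{127.11}$$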
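The step from (127.10) to (127.11) rests on the fact that, at the frequency fixed by (127.6), the derivative $dy/d\omega_2$ reduces to $m\omega_1/(E_2\omega_2)$. A minimal numerical check, assuming c = ħ = 1 and an electron initially at rest (the function name is ours), may be sketched as follows.

```python
import math

def jacobian_check(omega1, theta, m=1.0):
    """Return dy/d(omega2) from (127.10) and the closed form m*omega1/(E2*omega2)."""
    c = math.cos(theta)
    # Scattered frequency from (127.6).
    omega2 = omega1 / (1.0 + (omega1 / m) * (1.0 - c))
    # Electron energy after scattering, with p2 = k1 - k2.
    E2 = math.sqrt(m * m + omega1**2 + omega2**2 - 2.0 * omega1 * omega2 * c)
    dy_domega2 = 1.0 + (omega2 - omega1 * c) / E2
    closed_form = m * omega1 / (E2 * omega2)
    return dy_domega2, closed_form

if __name__ == "__main__":
    print(jacobian_check(1.5, 0.8))
```

Both returned numbers coincide, so that $\omega_2^2/(dy/d\omega_2) = E_2\omega_2^3/(m\omega_1)$, which is the factor appearing in (127.11).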

We shall obtain the cross section for the process by dividing the transition
probability by the incident photon flux density. For a normalization such
that there is 1 photon in volume V = 1 the flux density is numerically equal
to the velocity of light c. In the system of units that we have chosen c = 1 and
the cross section turns out to be numerically equal to $dW_{21}$. We also set (see
(127.1))

$$M_{21} = ie^2\,\frac{(2\pi)^5}{2m(\omega_1\omega_2)^{1/2}}\,\bigl(\bar{u}(p_2)\,\hat{Q}\,u(p_1)\bigr),$$

where

$$\hat{Q} = \left[\hat{e}_2\,\frac{i(\hat{p}_1+\hat{k}_1)-m}{(p_1+k_1)^2+m^2}\,\hat{e}_1+\hat{e}_1\,\frac{i(\hat{p}_1-\hat{k}_2)-m}{(p_1-k_2)^2+m^2}\,\hat{e}_2\right]2m.$$

The expression for the operator $\hat{Q}$ can be somewhat simplified if it is
taken into account that

$$(p_1+k_1)^2+m^2 = p_1^2+k_1^2+2p_1k_1+m^2 = 2p_1k_1 = -2\omega_1m.$$

Analogously for the denominator of the second fraction one can write

$$(p_1-k_2)^2+m^2 = p_1^2+k_2^2-2p_1k_2+m^2 = -2p_1k_2 = 2\omega_2m.$$

Then for $\hat{Q}$ we have

$$\hat{Q} = \hat{e}_1\,\frac{i(\hat{p}_1-\hat{k}_2)-m}{\omega_2}\,\hat{e}_2-\hat{e}_2\,\frac{i(\hat{p}_1+\hat{k}_1)-m}{\omega_1}\,\hat{e}_1. \tag{127.12}$$

Correspondingly the differential cross section is equal to

$$d\sigma = \frac{r_0^2}{4}\,\frac{\omega_2^2E_2}{m\,\omega_1^2}\,|\bar{u}(p_2)\,\hat{Q}\,u(p_1)|^2\,d\Omega, \tag{127.13}$$

where $r_0 = e^2/m$ is the classical radius of the electron.

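The expression obtained describes a process in which the electron and
photon in the initial and final states have definite polarizations. If the
electrons in the initial state are not polarized and we are not interested in the
polarization in the final state, then the cross section must be averaged over
the spin states of the electron in the initial state and summed over the final
spin states. Consequently we have to determine the quantity

$$d\bar{\sigma} = \frac{1}{2}\sum_{\sigma_1,\sigma_2}\frac{r_0^2}{4}\,\frac{\omega_2^2E_2}{m\,\omega_1^2}\,|\bar{u}_{\sigma_2}(p_2)\,\hat{Q}\,u_{\sigma_1}(p_1)|^2\,d\Omega = \frac{r_0^2}{8}\,\frac{\omega_2^2E_2}{m\,\omega_1^2}\sum_{\sigma_1,\sigma_2}(u_{\sigma_2}^*\beta\hat{Q}u_{\sigma_1})(u_{\sigma_1}^*\hat{Q}^\dagger\beta u_{\sigma_2})\,d\Omega, \tag{127.14}$$

where $u_{\sigma_1}$ and $u_{\sigma_2}$ are states with definite polarizations. For example,
$(\hat{\mathbf{S}}\cdot\mathbf{p}_2)\,u_{\sigma_2} = \sigma_2|\mathbf{p}_2|\,u_{\sigma_2}$, i.e. $\sigma_2$ is the spin component along the direction of
motion, $\sigma_2 = \pm\tfrac{1}{2}$. The operator $\hat{\mathbf{S}}$ is defined in §117.

The sum over $\sigma_1$ involved in (127.14) is conveniently rewritten in the form

$$\sum_{\sigma_1}(u_{\sigma_2}^*\beta\hat{Q}u_{\sigma_1})(u_{\sigma_1}^*\hat{Q}^\dagger\beta u_{\sigma_2}) = \sum_{\sigma_1}\langle\sigma_2|\beta\hat{Q}|\sigma_1\rangle\langle\sigma_1|\hat{Q}^\dagger\beta|\sigma_2\rangle. \tag{127.15}$$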

The summation is carried out over two spin states $\sigma_1$ with a positive
energy. If the summation were also carried out over states with a negative
energy, i.e. over all possible states (for given momentum), then expression
(127.15) would be considerably simplified and would represent the matrix
element $\langle\sigma_2|\beta\hat{Q}\hat{Q}^\dagger\beta|\sigma_2\rangle$ (in correspondence with the rule of multiplication
of matrices (45.6)).

In order to extend the summation to all four intermediate states, the
following method is used in calculations of such a type: the auxiliary
operator $\hat{R}$, called the projection operator, is introduced

$$\hat{R} = \frac{\hat{H}+|E|}{2|E|} = \frac{\boldsymbol{\alpha}\cdot\mathbf{p}+\beta m+|E|}{2|E|}. \tag{127.16}$$

The action of this operator on the functions u and v which correspond to
positive and negative energies is defined by the equations

$$\hat{R}u = \frac{|E|+|E|}{2|E|}\,u = u,\qquad \hat{R}v = \frac{-|E|+|E|}{2|E|}\,v = 0. \tag{127.17}$$

Replacing the function $u_{\sigma_1}$ in (127.15) by the function

$$\hat{R}_1u_{\sigma_1} = \frac{\boldsymbol{\alpha}\cdot\mathbf{p}_1+\beta m+|E_1|}{2|E_1|}\,u_{\sigma_1}$$

we can formally extend the summation also to states with a negative energy,
since in the presence of the projection operator, in correspondence with
(127.17), they give no contribution to the result.

In place of the matrices $\alpha_k$ and β we introduce the matrices $\gamma_\mu$:

$$\gamma_k = -i\beta\alpha_k\quad(k = 1, 2, 3),\qquad \gamma_4 = \beta.$$

Substituting the matrices $\gamma_\mu$ into (127.16) we obtain

$$\hat{R} = -\frac{1}{2E}\,(i\gamma_\mu p_\mu-m)\,\gamma_4 = -\frac{1}{2E}\,(i\hat{p}-m)\,\gamma_4. \tag{127.18}$$
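The equivalence of the two forms of the projection operator can be verified directly with explicit matrices. The following is a sketch under stated assumptions: the standard (Dirac) representation of $\boldsymbol{\alpha}$ and β, the convention $\gamma_k = -i\beta\alpha_k$, $\gamma_4 = \beta$, a 4-momentum with imaginary fourth component $p_4 = iE$, and arbitrarily chosen numerical values of $\mathbf{p}$ and m.

```python
import numpy as np

sigma = [np.array([[0, 1], [1, 0]], dtype=complex),
         np.array([[0, -1j], [1j, 0]], dtype=complex),
         np.array([[1, 0], [0, -1]], dtype=complex)]
zero2 = np.zeros((2, 2), dtype=complex)
alpha = [np.block([[zero2, s], [s, zero2]]) for s in sigma]
beta = np.diag([1, 1, -1, -1]).astype(complex)
gamma = [-1j * beta @ a for a in alpha] + [beta]   # assumed convention
one = np.eye(4, dtype=complex)

m = 1.0
p = np.array([0.3, -0.4, 1.2])          # illustrative 3-momentum
E = float(np.sqrt(p @ p + m * m))       # positive energy

# Form (127.16): R = (alpha.p + beta m + E) / (2E)
H = sum(pk * ak for pk, ak in zip(p, alpha)) + m * beta
R_alpha = (H + E * one) / (2.0 * E)

# Form (127.18): R = -(i p_slash - m) gamma_4 / (2E), with p_4 = iE
p_slash = sum(pk * gk for pk, gk in zip(p, gamma[:3])) + 1j * E * gamma[3]
R_gamma = -(1j * p_slash - m * one) @ gamma[3] / (2.0 * E)

forms_agree = bool(np.allclose(R_alpha, R_gamma))
is_projector = bool(np.allclose(R_alpha @ R_alpha, R_alpha))
rank = float(np.trace(R_alpha).real)    # number of positive-energy states
print(forms_agree, is_projector, rank)
```

The trace equal to 2 reflects the two spin states with positive energy onto which the operator projects.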
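The sum (127.15) can be rewritten in the form

$$\sum_{\sigma_1,E_1}(u_{\sigma_2}^*\gamma_4\hat{Q}\hat{R}_1u_{\sigma_1})(u_{\sigma_1}^*\hat{Q}^\dagger\gamma_4u_{\sigma_2}) = (u_{\sigma_2}^*\gamma_4\hat{Q}\hat{R}_1\hat{Q}^\dagger\gamma_4u_{\sigma_2}). \tag{127.19}$$

Further, we calculate the sum over the components of the spin in the final
state of the electron $\sigma_2$. Here it is also convenient to pass to the summation
over all four states, introducing the projection operator $\hat{R}_2$:

$$S = \sum_{\sigma_2,E_2}(u_{\sigma_2}^*\gamma_4\hat{Q}\hat{R}_1\hat{Q}^\dagger\gamma_4\hat{R}_2u_{\sigma_2}). \tag{127.20}$$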


The summation is carried out over states with a positive energy as well as
over states with a negative energy. We see that expression (127.20) represents
the sum of diagonal matrix elements, i.e.

$$S = \mathrm{Tr}\,(\gamma_4\hat{Q}\hat{R}_1\hat{Q}^\dagger\gamma_4\hat{R}_2) = \mathrm{Tr}\,(\hat{Q}\hat{R}_1\hat{Q}^\dagger\gamma_4\hat{R}_2\gamma_4), \tag{127.21}$$

since a cyclic permutation of matrices can be carried out under the sign Tr,
as can easily be checked directly. Substituting this expression into (127.14)
and taking into account that $\gamma_4^2 = 1$, $\mathbf{p}_1 = 0$, $E_1 = m$, we obtain

$$d\sigma = \frac{r_0^2\,\omega_2^2}{32\,m^2\omega_1^2}\,\mathrm{Tr}\,[\hat{Q}\,(i\hat{p}_1-m)\,\gamma_4\hat{Q}^\dagger\gamma_4\,(i\hat{p}_2-m)]\,d\Omega. \tag{127.22}$$


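For any operator of the form of $\hat{A} = \gamma_\mu A_\mu$, the fourth component of which is
imaginary while three components are real, the following equality holds:

$$\gamma_4\hat{A}^\dagger\gamma_4 = -\hat{A}. \tag{127.23}$$

For the product of operators we have correspondingly

$$\gamma_4\hat{A}^\dagger\hat{B}^\dagger\hat{C}^\dagger\gamma_4 = \gamma_4\hat{A}^\dagger\gamma_4\,\gamma_4\hat{B}^\dagger\gamma_4\,\gamma_4\hat{C}^\dagger\gamma_4 = (-\hat{A})(-\hat{B})(-\hat{C}). \tag{127.24}$$

Then the expression $\gamma_4\hat{Q}^\dagger\gamma_4$ is rewritten in the form

$$\gamma_4\left\{\omega_1^{-1}\hat{e}_1^\dagger[i(\hat{p}_1+\hat{k}_1)^\dagger+m]\hat{e}_2^\dagger-\omega_2^{-1}\hat{e}_2^\dagger[i(\hat{p}_1-\hat{k}_2)^\dagger+m]\hat{e}_1^\dagger\right\}\gamma_4 =$$

$$= \left\{\omega_1^{-1}\hat{e}_1[-i(\hat{p}_1+\hat{k}_1)+m]\hat{e}_2+\omega_2^{-1}\hat{e}_2[i(\hat{p}_1-\hat{k}_2)-m]\hat{e}_1\right\}. \tag{127.25}$$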


We now sum the cross section over the final states of the photon and
average over the initial states. In order to calculate the cross section for the
case where the incident photon is polarized along axis 1 and the scattered
photon along axis 2, one has to substitute the values $\hat{e}_1 = \gamma_1$ and $\hat{e}_2 = \gamma_2$ in
the expression (127.12) for $\hat{Q}$. Since, however, the incident photons are not
polarized and we are not interested in the polarization of the scattered
photons, we have to substitute $\gamma_\nu$ for $\hat{e}_1$ and $\gamma_\mu$ for $\hat{e}_2$ and to sum over all
values of the indices ν and μ. We then have

row 
do = D 
SUMS Os (127.26) 


X Tr {L03 7, GB -ky )-m) y,- 7 YGER) -my Gm) X 
X [03 u GP —Rp)—m) 7, -w7! 7,4 +R- 7, ipm) 


The summation is carried out over the twice repeated indices μ and ν.
Although a free photon can be polarized in two directions perpendicular
to the direction of motion, the summation over μ and ν can actually be
carried out over all four values. This is associated with the fact that there are
no real photons polarized in the direction of motion and along the time axis,
and taking them into account formally does not change the final result*.

Relation (127.26) is conveniently rewritten in the form
2 2 


rpw > 
hej Ss (HN MALY) (127.27) 
32m* wy ~ 


where 
Fy = (w3!7,09—m) Yu -wT Yu Gi Y, MB, =m) X 
X [Y 03 (iG2-m)y, iP 2-m) , (127.28) 
qq, =P, +k,, 42 FP- k2. 
The expression for F is obtained from F} by the replacement 
IT ALG REY = PCD i ALL Mi) ee CFA LED 


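For further calculations it is convenient to use Feynman's formulae, which
are easily checked by direct calculation:

$$\gamma_\mu\gamma_\mu = 4,$$
$$\gamma_\mu\hat{A}\gamma_\mu = -2\hat{A},$$
$$\gamma_\mu\hat{A}_1\hat{A}_2\gamma_\mu = 4(A_1A_2),$$
$$\gamma_\mu\hat{A}_1\hat{A}_2\hat{A}_3\gamma_\mu = -2\hat{A}_3\hat{A}_2\hat{A}_1. \tag{127.29}$$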
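The "direct calculation" mentioned for formulae (127.29) can be delegated to a few lines of code. The sketch below assumes the standard (Dirac) representation, the convention $\gamma_k = -i\beta\alpha_k$, $\gamma_4 = \beta$ (so that $\gamma_\mu\gamma_\nu+\gamma_\nu\gamma_\mu = 2\delta_{\mu\nu}$), and randomly chosen 4-vectors with an imaginary fourth component; the helper names are ours.

```python
import numpy as np

sigma = [np.array([[0, 1], [1, 0]], dtype=complex),
         np.array([[0, -1j], [1j, 0]], dtype=complex),
         np.array([[1, 0], [0, -1]], dtype=complex)]
zero2 = np.zeros((2, 2), dtype=complex)
alpha = [np.block([[zero2, s], [s, zero2]]) for s in sigma]
beta = np.diag([1, 1, -1, -1]).astype(complex)
gamma = [-1j * beta @ a for a in alpha] + [beta]   # assumed convention
one = np.eye(4, dtype=complex)
rng = np.random.default_rng(0)

def four_vector():
    """Random 4-vector: three real components, imaginary fourth component."""
    v = rng.normal(size=4).astype(complex)
    v[3] *= 1j
    return v

def slash(a):
    """A-hat = gamma_mu A_mu."""
    return sum(a[mu] * gamma[mu] for mu in range(4))

def dot(a, b):
    """Scalar product (A B) = A_mu B_mu, without complex conjugation."""
    return np.sum(a * b)

def contract(matrix):
    """gamma_mu M gamma_mu, summed over mu."""
    return sum(g @ matrix @ g for g in gamma)

A1, A2, A3 = four_vector(), four_vector(), four_vector()
S1, S2, S3 = slash(A1), slash(A2), slash(A3)

checks = [
    bool(np.allclose(contract(one), 4 * one)),
    bool(np.allclose(contract(S1), -2 * S1)),
    bool(np.allclose(contract(S1 @ S2), 4 * dot(A1, A2) * one)),
    bool(np.allclose(contract(S1 @ S2 @ S3), -2 * S3 @ S2 @ S1)),
]
print(checks)
```

All four entries are expected to be true; the identities follow from the anticommutation relations alone and do not depend on the representation chosen.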


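Making use of these expressions it is easy to carry out the summation over
μ and ν in (127.27).
Thus setting $F_1 = F_1'+F_1''$ we find

$$\mathrm{Tr}\,F_1' = \mathrm{Tr}\,[\gamma_\nu\omega_2^{-1}(i\hat{q}_2-m)\gamma_\mu(i\hat{p}_1-m)\gamma_\mu\omega_2^{-1}(i\hat{q}_2-m)\gamma_\nu(i\hat{p}_2-m)] =$$

$$= \mathrm{Tr}\,[\omega_2^{-1}(i\hat{q}_2-m)\gamma_\mu(i\hat{p}_1-m)\gamma_\mu\omega_2^{-1}(i\hat{q}_2-m)\gamma_\nu(i\hat{p}_2-m)\gamma_\nu] =$$

$$= 4\omega_2^{-2}\,\mathrm{Tr}\,[(i\hat{q}_2-m)(i\hat{p}_1+2m)(i\hat{q}_2-m)(i\hat{p}_2+2m)], \tag{127.30}$$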
Tr Fy =—Tr [rwr (iG —m)y, P mM) Yy w3 (iG. —m) 7, (iP>—m)] = 
=— 4w! w7! Tr {[2i(q4q2 P +mG ap | —imGy (ig, —m)— 
— imp (ig ;—m) +m? (ig, +2m)|(iP.—m)} - (127.31) 


In calculating the traces it is necessary to use the following rules:
(1) the trace of the product of an odd number of vectors $\hat{A}$ is equal to zero;
(2) the trace of a scalar quantity is equal to its quadrupled value;
(3) $\mathrm{Tr}\,\hat{A}_1\hat{A}_2 = 4(A_1A_2)$;
(4) $\mathrm{Tr}\,\hat{A}_1\hat{A}_2\hat{A}_3\hat{A}_4 = 4[(A_1A_2)(A_3A_4)+(A_1A_4)(A_2A_3)-(A_1A_3)(A_2A_4)]$.
(127.32)

* See, for example, A.I.Akhiezer and V.B.Berestetskii, Quantum electrodynamics
(Interscience Publ., New York, 1965).

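The first two rules are trivial. Rules (3) and (4) are easily proved, making use
of the identity

$$\hat{A}_1\hat{A}_2+\hat{A}_2\hat{A}_1 = 2(A_1A_2). \tag{127.33}$$

If we take the trace of the left-hand and right-hand sides and carry out the
cyclic permutation of the vectors $\hat{A}_1$ and $\hat{A}_2$ under the sign of trace, then we
immediately obtain rule (3). We find the trace of $\hat{A}_1\hat{A}_2\hat{A}_3\hat{A}_4$. Making use of
the identity (127.33) we have

$$\mathrm{Tr}\,\hat{A}_1\hat{A}_2\hat{A}_3\hat{A}_4 = -\mathrm{Tr}\,\hat{A}_2\hat{A}_1\hat{A}_3\hat{A}_4+2\,\mathrm{Tr}\,\hat{A}_3\hat{A}_4\,(A_1A_2) =$$

$$= \mathrm{Tr}\,\hat{A}_2\hat{A}_3\hat{A}_1\hat{A}_4-2\,\mathrm{Tr}\,(\hat{A}_2\hat{A}_4)(A_1A_3)+8(A_1A_2)(A_3A_4) =$$

$$= -\mathrm{Tr}\,\hat{A}_2\hat{A}_3\hat{A}_4\hat{A}_1+2\,\mathrm{Tr}\,\hat{A}_2\hat{A}_3\,(A_1A_4)-8(A_2A_4)(A_1A_3)+8(A_1A_2)(A_3A_4) =$$

$$= -\mathrm{Tr}\,\hat{A}_1\hat{A}_2\hat{A}_3\hat{A}_4+8(A_2A_3)(A_1A_4)-8(A_2A_4)(A_1A_3)+8(A_1A_2)(A_3A_4).$$

From this equality we obtain rule (4).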
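The trace rules (1), (3) and (4) can also be confirmed numerically. As before, this is only a sketch: it assumes the standard (Dirac) representation with $\gamma_k = -i\beta\alpha_k$, $\gamma_4 = \beta$, uses random 4-vectors with an imaginary fourth component, and the helper names are ours.

```python
import numpy as np

sigma = [np.array([[0, 1], [1, 0]], dtype=complex),
         np.array([[0, -1j], [1j, 0]], dtype=complex),
         np.array([[1, 0], [0, -1]], dtype=complex)]
zero2 = np.zeros((2, 2), dtype=complex)
alpha = [np.block([[zero2, s], [s, zero2]]) for s in sigma]
beta = np.diag([1, 1, -1, -1]).astype(complex)
gamma = [-1j * beta @ a for a in alpha] + [beta]   # assumed convention
rng = np.random.default_rng(1)

def four_vector():
    v = rng.normal(size=4).astype(complex)
    v[3] *= 1j          # imaginary fourth component
    return v

def slash(a):
    return sum(a[mu] * gamma[mu] for mu in range(4))

def dot(a, b):
    return np.sum(a * b)   # (A B) = A_mu B_mu, no conjugation

A = [four_vector() for _ in range(4)]
S = [slash(a) for a in A]

# Rule (1): an odd number of factors gives zero trace.
rule1 = bool(abs(np.trace(S[0] @ S[1] @ S[2])) < 1e-12)
# Rule (3): Tr A1 A2 = 4 (A1 A2).
rule3 = bool(np.isclose(np.trace(S[0] @ S[1]), 4 * dot(A[0], A[1])))
# Rule (4): trace of four factors.
expected = 4 * (dot(A[0], A[1]) * dot(A[2], A[3])
                + dot(A[0], A[3]) * dot(A[1], A[2])
                - dot(A[0], A[2]) * dot(A[1], A[3]))
rule4 = bool(np.isclose(np.trace(S[0] @ S[1] @ S[2] @ S[3]), expected))
print(rule1, rule3, rule4)
```

Rule (2) is simply the statement that the trace of the 4 × 4 unit matrix equals 4.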
By means of the above rules we find 
TrF! = w 2 pea +4m2 à 
rF] [2@ oP a2p2)—(45 tM Np p2)+4m" (qo?) 
%2 
T E E ], 
T = wey 24142) Pa)? (aap)? (Q241)? a2P2) + 
E AONE " (127.34) 


Carrying out the replacement mentioned, we obtain from these expressions
$\mathrm{Tr}\,F_2$.

Performing the necessary transformations, after several long but simple
calculations, we arrive at the well-known Klein-Nishina formula

$$d\sigma = \frac{1}{2}\,r_0^2\left(\frac{\omega_2}{\omega_1}\right)^2\left(\frac{\omega_1}{\omega_2}+\frac{\omega_2}{\omega_1}-\sin^2\theta\right)d\Omega, \tag{127.35}$$

which plays an important role in applications.

For low photon energies $\omega_1 \ll m$, $\omega_2 \approx \omega_1$ (see (127.6)) and formula
(127.35) in the limit reduces to the classical Thomson formula

$$d\sigma = \frac{1}{2}\,r_0^2\,(1+\cos^2\theta)\,d\Omega,$$

obtained in §36 of Part I.
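The passage from the Klein-Nishina formula to the Thomson limit can be illustrated numerically. The sketch below assumes units with c = ħ = 1, measures frequencies in units of m and cross sections in units of $r_0^2$, and integrates over angles by a simple midpoint rule; the function names and the number of integration points are our own choices.

```python
import math

def klein_nishina(omega1, theta, m=1.0, r0=1.0):
    """Differential cross section (127.35) per unit solid angle."""
    omega2 = omega1 / (1.0 + (omega1 / m) * (1.0 - math.cos(theta)))  # (127.6)
    ratio = omega2 / omega1
    return 0.5 * r0**2 * ratio**2 * (1.0 / ratio + ratio - math.sin(theta)**2)

def thomson(theta, r0=1.0):
    """Classical Thomson cross section per unit solid angle."""
    return 0.5 * r0**2 * (1.0 + math.cos(theta)**2)

def total_cross_section(omega1, m=1.0, r0=1.0, n=2000):
    """Integrate (127.35) over the full solid angle with the midpoint rule."""
    h = math.pi / n
    total = 0.0
    for j in range(n):
        theta = (j + 0.5) * h
        total += 2.0 * math.pi * klein_nishina(omega1, theta, m, r0) * math.sin(theta) * h
    return total

if __name__ == "__main__":
    thomson_total = 8.0 * math.pi / 3.0   # in units of r0**2
    for omega1 in (1e-6, 0.1, 1.0, 10.0):
        print(omega1, total_cross_section(omega1) / thomson_total)
```

For $\omega_1 \ll m$ the ratio printed tends to unity, i.e. the total cross section tends to the Thomson value $8\pi r_0^2/3$, while for $\omega_1 \gtrsim m$ it falls off markedly.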


§128. The shift of the terms of the hydrogen atom under the action of the 
vacuum field (the Lamb shift) 


The importance of Feynman’s method of calculation does not, of course, 
reduce to a simplification and standardization of calculations. 

As we have already pointed out in §124, the Feynman formalism made it 
possible to obtain in an obvious form the solution of a number of important 
problems of quantum electrodynamics. They include, in particular, the Lamb 
shift of atomic terms already mentioned. 

The phenomenon of the Lamb shift yields a very obvious illustration of 
the validity of those concepts which were assumed as the basis of the quan- 
tum theory of radiation and the theory of the positron. In the quantum 
theory of radiation it was assumed that in a vacuum there is an electromag- 
netic field. This is the field which corresponds to the zero-point oscillations 
of field oscillators. The set of electromagnetic field oscillators in states with 
zero energy is often said to represent the ‘electromagnetic vacuum’. In the 
electromagnetic vacuum, corresponding to the field state with the lowest 
energy, there is a field strength different from zero. More precisely, the
mean (over time) values of the squares of field strengths $\overline{\mathcal{E}^2}$ and $\overline{\mathcal{H}^2}$ are
different from zero.

The existence of the vacuum field had no effect on the phenomena of 
emission, absorption and scattering which were considered in ch. 12. All 
these phenomena were associated with the transitions of field oscillators 
from non-excited (zero) states into excited states and vice versa. Hence in 
the course of a number of years the properties of the electromagnetic vacuum 
were not related to directly observed phenomena. 

In the theory of positrons it is assumed that in addition to the electro- 
magnetic vacuum there is an electron—positron vacuum, or a background of 
occupied states with negative energies, which was considered in detail in 
§116. It turns out that the existence of the electromagnetic and electron— 
positron vacua is shown directly not only in processes occurring at large 
energies (for example, in the Compton effect or in the process of pair produc- 
tion) but also in features of the behaviour of particles at small energies, in 







particular in the phenomenon of the Lamb shift. The phenomenon of the 
Lamb shift can be studied rigorously by means of the Feynman formalism. 
It turns out, however, that this effect can also be discussed without using a 
relatively complicated mathematical apparatus, on the basis of simple and 
direct considerations*. 

For this we shall first of all discuss the problem as to what magnitude 
the mean square value of the field strength can have at an arbitrary point of 
vacuum. 

To calculate the mean square value of the field strength in vacuum we
consider the normalization volume $V_0$. The zero-point oscillation of frequency
ω has energy $\tfrac{1}{2}\hbar\omega$. One can write the obvious equality

$$\tfrac{1}{2}\hbar\omega = \frac{1}{8\pi}\int\left(\overline{\mathcal{E}_{0\omega}^2}+\overline{\mathcal{H}_{0\omega}^2}\right)dV = \frac{1}{4\pi}\int\overline{\mathcal{E}_{0\omega}^2}\,dV = \frac{\mathcal{E}_{0\omega}^2V_0}{8\pi}, \tag{128.1}$$

where $\mathcal{E}_{0\omega}$ and $\mathcal{H}_{0\omega}$ are the field strength amplitudes of the field in vacuum
corresponding to zero-point oscillations with frequency ω; the bar denotes
averaging over the oscillation period. Equality (128.1) makes it possible to
find the mean square value of the amplitude of the zero-point oscillations of
the field with frequency ω:

$$\mathcal{E}_{0\omega}^2 = \frac{4\pi\hbar\omega}{V_0}. \tag{128.2}$$

Let us consider the electron in the hydrogen atom. This electron is acted 
upon by the Coulomb field of the nucleus and by the fluctuations of the 
zero point field of the vacuum. Hence a random motion under the action of 
the vacuum field will be superposed on the orbital motion of the electron. 
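Let $U(\mathbf{r})$ denote the potential energy of the electron at point $\mathbf{r}$. We now
assume that the coordinate of the electron can be written as $\mathbf{r}=\mathbf{r}_0+\mathbf{r}'$,
where $\mathbf{r}_0$ is the ordinary value of the coordinate of the electron smoothly
varying in its orbital motion, and $\mathbf{r}'$ is its small displacement under the action
of a random force, the fluctuating field. Then the change in the mean
potential energy of the electron undergoing random displacements can be
written in the form

$$\langle\Delta U\rangle = \langle U(\mathbf{r}_0+\mathbf{r}')-U(\mathbf{r}_0)\rangle \approx \langle x_i'\rangle\frac{\partial U}{\partial x_i} + \frac{1}{2}\langle x_i'x_k'\rangle\frac{\partial^2U}{\partial x_i\,\partial x_k} = \frac{1}{2}\nabla^2U\,\langle(x_i')^2\rangle = \frac{1}{6}\nabla^2U\,\langle r'^2\rangle. \qquad (128.3)$$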


* See T.Welton, Phys. Rev. 74 (1948) 1157.

Here the bracket $\langle\;\rangle$ denotes the mean over all possible values of the random
quantity $\mathbf{r}'$. In averaging we have taken into account that $\langle x_i'\rangle = 0$ and that by
virtue of the spatial isotropy of random displacements

$$\langle x_i'x_k'\rangle = \tfrac{1}{3}\delta_{ik}\langle r'^2\rangle.$$

The value of the potential energy in the factor $\nabla^2U$ is evidently taken at the
value $\mathbf{r} = \mathbf{r}_0$.

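The potential energy of the electron in the atom without the perturbation
caused by the vacuum field does not depend on the state of the vacuum field,
so the averaging sign $\langle\;\rangle$ on it is dropped.

In the Coulomb field of the proton one can write for $\nabla^2U(\mathbf{r}_0)$

$$\nabla^2U(\mathbf{r}_0) = 4\pi e^2\delta(\mathbf{r}_0),$$

so that

$$\langle U\rangle = U(\mathbf{r}_0) + \tfrac{2}{3}\pi e^2\delta(\mathbf{r}_0)\langle r'^2\rangle. \qquad (128.4)$$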


To obtain the shift of the atomic term we have to take the mean value of
(128.4) over the state of the electron in the atom. We then have

$$\Delta E_{\text{Lamb}} = \langle U\rangle - U(\mathbf{r}_0) = \langle\Delta U\rangle = \tfrac{2}{3}\pi e^2\int|\psi_n(\mathbf{r}_0)|^2\,\delta(\mathbf{r}_0)\,\langle r'^2\rangle\,dV, \qquad (128.5)$$

where $\psi_n$ is the wave function of the electron in the atom. Making use of the
properties of the $\delta$-function, we find

$$\Delta E_{\text{Lamb}} = \langle\Delta U\rangle = \tfrac{2}{3}\pi e^2|\psi_n(0)|^2\langle r'^2\rangle. \qquad (128.6)$$

The calculation of $\langle r'^2\rangle$, the mean square displacement of the electron under
the action of the zero-point oscillations of the field, can be carried out
relatively simply if only relatively low oscillation frequencies of the field are
taken into account.

We shall assume that the displacement of the electron under the action of 
the field proceeds independently of the orbital motion. Neglecting relativistic 
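effects, one can write the equations of motion in the form

$$m\frac{d^2\mathbf{r}_\omega}{dt^2} = e\,\boldsymbol{\mathscr{E}}_{0\omega}\sin(\mathbf{k}\cdot\mathbf{r}-\omega t),$$

whence

$$\mathbf{r}_\omega = -\frac{e\,\boldsymbol{\mathscr{E}}_{0\omega}}{m\omega^2}\sin(\mathbf{k}\cdot\mathbf{r}-\omega t)$$

and, correspondingly,

$$\overline{r_\omega^2} = \frac{e^2\mathscr{E}_{0\omega}^2}{2m^2\omega^4} = \frac{2\pi e^2\hbar}{m^2\omega^3V_0}, \qquad (128.7)$$
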

where the bar denotes average over time. Here $\mathbf{r}_\omega$ denotes the displacement
under the action of the zero-point oscillations of the field with frequency $\omega$.
Since zero-point oscillations with different frequencies are independent,
their contribution to the total mean square displacement of the electron is
found by simple summation. Consequently, we can write for the total mean
square displacement the expression

$$\langle r'^2\rangle = \int\overline{r_\omega^2}\,\frac{\omega^2\,d\omega\,V_0}{\pi^2c^3} = \frac{2e^2\hbar}{\pi c^3m^2}\int_{\omega_{\min}}^{\omega_{\max}}\frac{d\omega}{\omega}, \qquad (128.7')$$

where the integration is carried out over all possible frequencies of zero-point 
oscillations. 

If there were no electron—positron vacuum, then the frequencies of the 
zero-point oscillations of the field could take on values as large as one wished 
and the formula obtained would not make sense. 

It turns out, however, that at frequencies larger than the minimum
frequency of pair production $\omega_1 = 2mc^2/\hbar$ an interaction arises between the
zero-point oscillations of the field and the occupied background of negative
energies (electromagnetic and electron–positron vacua). This interaction
can be pictured in an obvious way as the interaction between the fluctuations
of the ‘current’ associated with the random displacement of the electron with
positive energy and the fluctuations of ‘currents’ associated with random
displacements of the electrons of the background of occupied states, caused
by the action of the zero-point oscillations of the electromagnetic vacuum.
Since by virtue of the Pauli principle all electrons tend to avoid each other
(see §67), it turns out that the fluctuations of the electrons of the background
are in the opposite phase with respect to the fluctuations of the
electron with positive energy. As a result they mutually cancel and the
mean square displacement of the electron turns out to be considerably
smaller than that given by formula (128.7'). Simplifying the actual state of
affairs, it can be said that at frequencies larger than $\omega_1$ the mean square
displacement of the electron reduces to zero.

Hence as the upper limit of integration over the frequencies of the field
in (128.7') one has to choose the quantity $\omega_{\max} = \omega_1$. In defining the
minimum frequency $\omega_{\min}$ it is necessary to take into account that the
electron considered is not free but is bound in an atom.




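The frequency $\omega_{\min}$ is in order of magnitude equal to the Rydberg frequency
of the electron in the hydrogen atom, i.e.

$$\omega_{\min} = \omega_0 \approx 2\pi R = \frac{|E_0|}{\hbar},$$

where $R$ is the Rydberg constant equal to $|E_0|/2\pi\hbar$, and $E_0$ is the energy of
the ground state of the atom. Making use of these expressions for $\omega_{\min}$ and
$\omega_{\max}$ we find

$$\langle r'^2\rangle = \frac{2e^2\hbar}{\pi c^3m^2}\ln\frac{2mc^2}{\hbar\omega_0} = \frac{2e^2}{\pi\hbar c}\left(\frac{\hbar}{mc}\right)^2\ln\frac{2mc^2}{\hbar\omega_0}, \qquad (128.8)$$

and for the shift of the term we find

$$\Delta E_{\text{Lamb}} = \frac{4e^4}{3\hbar c}\left(\frac{\hbar}{mc}\right)^2|\psi_n(0)|^2\ln\frac{2mc^2}{\hbar\omega_0}. \qquad (128.9)$$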


Formula (128.9) shows that the shift of the levels of the electron in the
hydrogen atom under the action of the vacuum on the electron occurs only
in s-states. Indeed, only in s-states is the quantity $|\psi_n(0)|^2$ different from
zero. This shift is always positive: the level in an s-state must lie above that
defined by formula (119.2).
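
As a rough numerical illustration of (128.9), the sketch below evaluates the formula for an ns state of hydrogen. It rests on assumptions that are not part of the text: the non-relativistic value $|\psi_{ns}(0)|^2 = 1/(\pi n^3a_0^3)$ with $a_0$ the Bohr radius, the choice $\hbar\omega_0 = |E_0| = 13.6$ eV, and CODATA values for the constants. With these, (128.9) reduces to $(4\alpha^5mc^2/3\pi n^3)\ln(2mc^2/\hbar\omega_0)$. The function names are illustrative only.

```python
import math

# Physical constants (CODATA values; assumptions, not taken from the text).
ALPHA = 7.2973525693e-3        # fine structure constant e^2 / (hbar c)
MC2_EV = 510998.95             # electron rest energy m c^2 in eV
RYDBERG_EV = 13.605693         # |E_0|, hydrogen ground-state binding energy in eV
PLANCK_EV_S = 4.135667696e-15  # Planck constant h in eV * s


def lamb_shift_ev(n, hbar_omega0_ev=RYDBERG_EV):
    """Estimate (128.9) for the ns state of hydrogen, in eV.

    Uses |psi_ns(0)|^2 = 1 / (pi n^3 a0^3), which turns (128.9) into
    (4 alpha^5 m c^2 / (3 pi n^3)) * ln(2 m c^2 / (hbar omega_0)).
    """
    log_factor = math.log(2.0 * MC2_EV / hbar_omega0_ev)
    return 4.0 * ALPHA**5 * MC2_EV / (3.0 * math.pi * n**3) * log_factor


def ev_to_mhz(energy_ev):
    """Convert an energy in eV to a frequency in MHz via nu = E / h."""
    return energy_ev / PLANCK_EV_S / 1.0e6


shift_2s_mhz = ev_to_mhz(lamb_shift_ev(2))
print(f"Estimated 2s shift from (128.9): {shift_2s_mhz:.0f} MHz")
```

With these crude cutoffs the estimate comes out at the order of $10^3$ Mc, the same order as the measured shift; the precise figure requires the additional corrections that the simplified derivation omits.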

The calculation of $\Delta E_{\text{Lamb}}$ is of an absolute character, and the numerical
value of the shift (with certain additional corrections which are not taken
into account in the simplified derivation presented above) is equal to 1057.19
Mc. The experimental value of this quantity turned out to be equal to
$1057.77 \pm 0.1$ Mc. The close agreement between the calculated and measured
values of the Lamb shift is an obvious confirmation of the general
concepts of the reality of the ‘vacuum’.

Analogous calculations, on which we shall not dwell, made it possible to 
find the correction to the magnetic moment of the electron already men- 
tioned (see the article of Welton cited earlier). Of particular importance was 
the solution in quantum electrodynamics of a number of problems of prin- 
ciple. One succeeded in constructing a quantitative theory which made it 
possible to calculate with any degree of accuracy the probabilities of all 
possible processes associated with the interaction of electrons with each 
other and with the electromagnetic field. 

The difficulties of principle of the theory associated with diverging ex- 
pressions, for example the difficulty frequently mentioned that the intrinsic 
mass (or energy) of the electron goes to infinity, could, to a certain degree, 
be removed. In expressions for the intrinsic energy of a particle and in similar 
relations one succeeded in separating finite observable quantities, whereas
diverging expressions describe only quantities which are in principle not
observable. 

This procedure, called renormalization, cannot be presented here, and we 
refer the reader to the specialist literature, for example to the monographs of 
S.Schweber, H.Bethe and F.de Hoffman or A.J.Akhiezer and V.B.Berestetskii, 
which have frequently been cited before. 





15 





Fundamentals of the 
Theory of Elementary Particles 


§129. The classification and properties of elementary particles 


At the present time a large number (of the order of two hundred) of 
elementary particles have been discovered which can transform into each 
other but which do not consist, in the usual sense of the word, of smaller 
entities. They can be divided into two large groups: stable particles and 
short-lived particles (resonances). The first term is a relative one, since the 
group of stable particles contains, for example, both the electron which lives 
infinitely long, and the $\pi^0$-meson, whose lifetime is of the order of $10^{-16}$ sec.
Stability is understood only in the sense that the lifetime of these particles
is much longer than the characteristic time of $10^{-24}$–$10^{-23}$ sec for a light
signal to traverse a distance of $10^{-13}$ cm which is typical of the ‘size’ of
elementary particles. 
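
The characteristic time quoted above can be checked with one line of arithmetic. The sketch below divides the typical particle ‘size’ by the speed of light and compares the result with the lifetime of the shortest-lived ‘stable’ particle mentioned in the text. The value of $c$ and the function name are assumptions added for illustration.

```python
C_CM_PER_S = 2.99792458e10  # speed of light in cm/s (standard value, not from the text)


def light_crossing_time(distance_cm):
    """Time in seconds for a light signal to traverse the given distance."""
    return distance_cm / C_CM_PER_S


t_cross = light_crossing_time(1e-13)    # typical 'size' of an elementary particle
pi0_lifetime = 1e-16                    # order of magnitude quoted in the text, in sec
ratio = pi0_lifetime / t_cross          # how many crossing times the particle lives

print(f"crossing time: {t_cross:.1e} s, lifetime / crossing time: {ratio:.1e}")
```

Even the shortest-lived particle counted as stable therefore survives for many millions of crossing times, which is the sense in which the text calls it stable.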

The stable particles are divided into the following four classes: 

1. The class of photons, which comprises the quanta of classical fields.
This class includes the photon itself, a quantum of the electromagnetic field,
and sometimes the graviton, a quantum of the gravitational field.

2. The class of leptons (light particles) contains the electron, the muon,
which was earlier called the $\mu$-meson, two neutrinos (the electron neutrino
and the muon neutrino) and the four corresponding antiparticles. The electron
neutrino is produced in the decay of the neutron, and the muon neutrino
in the decay of the muon:

$$n \to p + e^- + \bar\nu_e, \qquad \mu^- \to e^- + \bar\nu_e + \nu_\mu$$

(the bar distinguishes antiparticles from the corresponding particles).

3. The class of mesons (particles of medium mass) comprises 3 $\pi$-mesons
($\pi^-$ is the antiparticle of $\pi^+$, while the particle and antiparticle for $\pi^0$ are
identical), 4 K-mesons (two particles and two antiparticles) and 1 $\eta$-meson
(which is its own antiparticle).

4. The class of baryons (heavy particles) contains two nucleons (the
proton and the neutron), 1 $\Lambda$-hyperon, 3 $\Sigma$-hyperons, 2 $\Xi$-hyperons, 1 $\Omega$-hyperon
and the corresponding antiparticles.

The term ‘resonance’ arose in connection with the fact that data on short- 
lived particles were initially obtained from scattering experiments. Charac- 
teristic resonance maxima were observed in total cross sections at definite 
values of the energy of the scattered particles. Thus, for example, in experiments
with $\pi$-meson scattering the existence of the $\rho$-meson was discovered.

The group of short-lived particles comprises only baryon resonances and
meson resonances, in all more than 150 particles. The most important
among them are the mesons $\omega$ (1), $\varphi$ (1), $\rho$ (3), K* (4) and the baryon
resonances $\Delta_{1236}$ (4), $\Sigma_{1385}$ (3), $\Xi_{1530}$ (2) along with their antiparticles.
Furthermore, the f-meson, which was theoretically predicted by Pomeranchuk
and subsequently discovered experimentally, is of interest.

The set of all mesons and baryons and their resonances forms a large 
group of particles which are at present called hadrons. Also the term 
hadenons, referring to leptons and to the photon, has begun to be used in 
Russian particle physics literature. 

Let us briefly enumerate the basic properties of elementary particles. 

1. Each particle possesses a rest mass, which is measured in MeV. The 
range of mass values of different particles is rather wide: from 0 (photon and 
neutrino) up to 3000 MeV and more (A339). The initial classification of 
particles (their division into the four classes mentioned above) was based only 
on the values of their masses. But it turned out that this classification, just 
as with the mass of atoms in the periodic system of elements, is rather loose. 
This is particularly clearly seen in examples of resonances. 

2. A very important characteristic of a particle is the value of its spin o. 
The photon has spin 1 (with certain reservations, since its rest mass is equal 
to zero). Leptons have spin $\tfrac{1}{2}$, stable mesons spin 0, baryons (except for $\Omega$)
spin $\tfrac{1}{2}$, and the $\Omega$-hyperon spin $\tfrac{3}{2}$. Among resonances there are particles with
spin values from 0 up to 2. The spin of the meson resonances listed above
(except for f) is equal to 1, the spin of the baryon resonances is equal to $\tfrac{3}{2}$,
and that of the f-meson is 2. According to the Pauli–Lüders theorem, proved
on the basis of most general principles which do not depend on actual
dynamics, the spin of a particle unambiguously defines the type of statistics:
particles with half-integer spin (leptons, baryons and baryon resonances) obey
Fermi–Dirac statistics, while those with integer spin (photon, mesons and
meson resonances) obey Bose–Einstein statistics. Furthermore, the spin of
the particle determines the transformation properties of its wave function
with respect to Lorentz transformations. Spin zero particles are described by
a (pseudo-)scalar wave function, spin one-half particles by a spinor wave
function, spin one particles by a (pseudo-)vector wave function and so on.

3. The parity P of a particle determines the transformation properties of 
its wave function with respect to the space inversion transformation. All 
stable mesons have odd parity and, having zero spin, are described by a 
pseudoscalar wave function. Meson resonances of spin 1 ($\omega$, $\varphi$, $\rho$, K*) also have
odd parity. Since under space inversion the vector components change sign,
the wave function of these particles is vectorial. The parity of the f-meson is
even. Baryons and their resonances can be assigned only relative parity. If it
is assumed by definition that the parities of the proton, neutron and $\Lambda$-hyperon
are even, then the parities of all the baryons enumerated above and
of their resonances will also be even, whereas those of the antibaryons will be 
odd. 

4. Each particle is characterized by the value of its electric charge, which 
(the charge of the electron being assumed to be equal to unity) can take on 
only integer values. At present a large number of neutral particles and parti- 
cles with a charge equal in absolute value to unity are known. The charge of 
six particles ($\Delta$-resonances) is equal to +2.

5. In order to characterize elementary particles, the lepton number $L$ and
baryon number $B$ are introduced. By definition, for leptons $L = +1$, $B = 0$;
for antileptons $L = -1$, $B = 0$; for baryons $L = 0$, $B = +1$; for antibaryons
$L = 0$, $B = -1$; for mesons and photons $L = B = 0$. In all reactions involving
elementary particles these quantum numbers are conserved, and their im- 
portance is due just to this fact. 

6. All hadrons are divided into small families whose members are denoted 
by one and the same symbol (for example $\pi$). These families are called isomultiplets.
The particles constituting an isomultiplet have about the same
mass but different charges. Each isomultiplet is assigned a definite value of
isospin $T$, which defines the number of members of the multiplet $N = 2T + 1$.
Thus the isospin of the nucleon and of the K-meson is equal to $\tfrac{1}{2}$, the isospin
of the $\Sigma$-hyperon and $\pi$-meson is equal to 1, that of the $\Delta$-resonance is equal
to $\tfrac{3}{2}$ and so on.







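Table 3

Stable elementary particles

Class    Particle            Mass (MeV)  Spin and parity  Lifetime (sec)          Basic decays                               S    T

Photon   $\gamma$            0           1                $\infty$                -                                          -    -

Leptons  $\nu_e$             0           1/2              $\infty$                -                                          -    -
         $\nu_\mu$           0           1/2              $\infty$                -                                          -    -
         $e$                 0.511       1/2              $\infty$                -                                          -    -
         $\mu$               105.66      1/2              [illegible]             $e\,\nu\,\bar\nu$                          -    -

Mesons   $\pi^+$             139.58      $0^-$            $2.6\times10^{-8}$      $\mu\nu$                                   0    1
         $\pi^0$             134.98      $0^-$            $0.89\times10^{-16}$    $\gamma\gamma$                             0    1
         $\pi^-$             139.58      $0^-$            $2.6\times10^{-8}$      $\mu\nu$                                   0    1
         $K^\pm$             493.8       $0^-$            $1.24\times10^{-8}$     $\mu\nu$; $\pi^\pm\pi^0$                   +1   1/2
         $K^0$ (short-lived) 497.9       $0^-$            $0.87\times10^{-10}$    $\pi^+\pi^-$; $\pi^0\pi^0$                 +1   1/2
         $K^0$ (long-lived)  497.9       $0^-$            $5.73\times10^{-8}$     $\pi e\nu$; $\pi\mu\nu$; $3\pi$            +1   1/2
         $\eta$              548         $0^-$            $\sim10^{-17}$          $\gamma\gamma$; $3\pi$; [remainder illegible]  0    0

Baryons  $p$                 938.25      $1/2^+$          $\infty$                -                                          0    1/2
         $n$                 939.55      $1/2^+$          $\sim10^{3}$            $p\,e^-\bar\nu_e$                          0    1/2
         $\Lambda$           1115.6      $1/2^+$          $2.54\times10^{-10}$    $p\pi^-$; $n\pi^0$                         -1   0
         $\Sigma^+$          1189.5      $1/2^+$          $0.8\times10^{-10}$     $p\pi^0$; $n\pi^+$                         -1   1
         $\Sigma^0$          1192.6      $1/2^+$          $<1\times10^{-14}$      $\Lambda\gamma$                            -1   1
         $\Sigma^-$          1197.4      $1/2^+$          $1.65\times10^{-10}$    $n\pi^-$                                   -1   1
         $\Xi^0$             1314.7      $1/2^+$          [illegible]             $\Lambda\pi^0$                             -2   1/2
         $\Xi^-$             1321.2      $1/2^+$          [illegible]             $\Lambda\pi^-$                             -2   1/2
         $\Omega^-$          1674        $3/2^+$          $\sim1\times10^{-10}$   $\Xi\pi$; $\Lambda K^-$                    -3   0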







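Table 4

The most important resonances

Class             Particle           Mass (MeV)  Spin and parity  Width (MeV) (a)  Basic decays                                  S    T

Meson resonances  $\omega$           783         $1^-$            12               $\pi^+\pi^-\pi^0$; $\pi^0\gamma$              0    0
                  $\varphi$          1019        $1^-$            4                $K^+K^-$; $K^0\bar K^0$; $\pi^+\pi^-\pi^0$    0    0
                  f                  1250        $2^+$            110              $\pi\pi$                                      0    0
                  $\rho^\pm$         774         $1^-$            125              $\pi\pi$                                      0    1
                  $\rho^0$           780         $1^-$            125              $\pi\pi$                                      0    1
                  K*                 892         $1^-$            50               $K\pi$                                        +1   1/2

Baryon resonances $\Delta_{1236}$    1236        $3/2^+$          120              $N\pi$                                        0    3/2
                  $\Sigma_{1385}$    1382        $3/2^+$          40               $\Lambda\pi$; $\Sigma\pi$                     -1   1
                  $\Xi_{1530}$       1530        $3/2^+$          7                $\Xi\pi$                                      -2   1/2

(a) The lifetime $\tau$ is connected with the width $\Gamma$ by the uncertainty relation $\Delta E\,\Delta t \sim \hbar$, i.e. $\tau = \hbar/\Gamma$.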
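
The relation $\tau = \hbar/\Gamma$ in the footnote to Table 4 can be turned into numbers directly. The sketch below is only an illustration: the value of $\hbar$ in MeV·sec is a CODATA figure that does not appear in the text, the widths are those entries of Table 4 that are clearly legible, and the particle labels are plain-text stand-ins for the symbols.

```python
HBAR_MEV_S = 6.582119569e-22  # reduced Planck constant in MeV * s (CODATA; not from the text)

# Widths in MeV as read from Table 4 (labels are illustrative plain-text names).
WIDTHS_MEV = {
    "omega": 12.0,
    "phi": 4.0,
    "f": 110.0,
    "K*": 50.0,
    "Delta(1236)": 120.0,
    "Sigma(1385)": 40.0,
    "Xi(1530)": 7.0,
}


def lifetime_from_width(width_mev):
    """Lifetime in seconds from the width in MeV, using tau = hbar / Gamma."""
    return HBAR_MEV_S / width_mev


lifetimes = {name: lifetime_from_width(w) for name, w in WIDTHS_MEV.items()}
for name, tau in lifetimes.items():
    print(f"{name:>12s}: {tau:.1e} s")
```

The resulting lifetimes fall between roughly $10^{-24}$ and $10^{-22}$ sec, comparable with the characteristic time of the strong interaction, which is why such particles are observed only as resonance maxima.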













7. The different particles constituting an isomultiplet differ from each
other by the isospin projection $T_3$ onto the third axis of the fictitious
isospace. Depending on the value of isospin $T$, its projection $T_3$ can take on
integer or half integer values. The concept of isospin was initially introduced
only for the nucleon and pion. In this case $T_3$ is related in the following way
to the value of the electric charge of the particle:

$$Q = T_3 + \tfrac{1}{2}B. \qquad (129.1)$$


8. After the discovery of the K-mesons and hyperons (the so-called
strange particles) it was necessary to modify the above formula:

$$Q = T_3 + \tfrac{1}{2}(B+S) \qquad (129.2)$$

(the Gell-Mann–Nishijima relation). The new quantum number $S$ was called
strangeness. In a wide range of phenomena, for example in hadron production
reactions, the strangeness is conserved, which immediately made it possible
to explain certain incomprehensible features of these processes (say,
the fact that strange particles are always produced in pairs). Recently, instead
of strangeness, physicists prefer to use another quantum number, the
hypercharge $Y$, which is closely related to $S$:

$$Y = B + S. \qquad (129.3)$$
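
The Gell-Mann–Nishijima relation (129.2) is easy to check against known charges. The sketch below does this for a few hadrons; the values of $T_3$ are the standard isospin-projection assignments, which are an assumption here because the tables list only $T$ and $S$, and the function names and plain-text labels are illustrative.

```python
from fractions import Fraction as F


def charge(t3, baryon_number, strangeness):
    """Electric charge from (129.2): Q = T3 + (B + S) / 2."""
    return t3 + F(baryon_number + strangeness, 2)


def hypercharge(baryon_number, strangeness):
    """Hypercharge from (129.3): Y = B + S."""
    return baryon_number + strangeness


# name: (T3, B, S, expected charge in units of the elementary charge)
HADRONS = {
    "p": (F(1, 2), 1, 0, 1),
    "n": (F(-1, 2), 1, 0, 0),
    "pi+": (F(1), 0, 0, 1),
    "pi0": (F(0), 0, 0, 0),
    "pi-": (F(-1), 0, 0, -1),
    "K+": (F(1, 2), 0, 1, 1),
    "K0": (F(-1, 2), 0, 1, 0),
    "Lambda": (F(0), 1, -1, 0),
    "Sigma+": (F(1), 1, -1, 1),
    "Sigma-": (F(-1), 1, -1, -1),
    "Xi0": (F(1, 2), 1, -2, 0),
    "Xi-": (F(-1, 2), 1, -2, -1),
    "Omega-": (F(0), 1, -3, -1),
}

mismatches = [name for name, (t3, b, s, q) in HADRONS.items()
              if charge(t3, b, s) != q]
print("particles violating (129.2):", mismatches)
```

For the nucleon and the pion $S = 0$, so the same function reproduces the earlier formula (129.1) as a special case.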


We shall not dwell on some other quantum numbers, which are introduced 
to characterize elementary particles (time parity and charge parity, G-parity, 
muonic charge and so on). 

A summary of properties is given in tables 3 and 4. 


§ 130. The types of interactions of elementary particles 


Elementary particles can take part in very different types of interaction: a 
particle may annihilate with its antiparticle, fast particles in collisions are 
scattered and new particles are produced, many particles are unstable and 
disintegrate and so on. At present four types of elementary particle inter- 
action, sharply differing from each other in strength and other properties, 
are known. 

1. The electromagnetic interaction: the interaction of charged particles 
with photons, and by means of them also with each other. Because of virtual 
processes neutral particles may also take part in the electromagnetic inter- 
action. Examples of reactions caused by the electromagnetic interaction are 
the transformations

$$e^- + e^+ \to 2\gamma, \qquad \gamma + e^- \to \gamma + e^- \ \ \text{(Compton effect)},$$
$$\pi^0 \to 2\gamma, \qquad \Sigma^0 \to \Lambda + \gamma,$$

and so on. The electromagnetic interaction is of infinite range; times of
$10^{-16}$–$10^{-14}$ sec are characteristic for reactions caused by it. The strength of
the electromagnetic interaction is determined by the charge of the particle
or by the dimensionless coupling constant which, in this particular case, is
the fine structure constant $\alpha = e^2/\hbar c \approx 1/137$. The smallness of this constant
allows the electromagnetic interaction to be considered as a perturbation,
which explains the successes of quantum electrodynamics (see ch. 14).

2. The strong interaction: the interaction of hadrons which is responsible 
for their scattering, for production reactions and for resonance decays. 
Typical examples are 


[display equations illegible in the source: examples of strong-interaction scattering, production and resonance-decay reactions]


and so on. The strong interaction is of short range, the radius of action being
of the order of $10^{-13}$ cm; times of $10^{-24}$–$10^{-23}$ sec are characteristic of it.
The intensity of the strong interaction is characterized by a parameter $g$,
which is an analogue of the electric charge. For the interaction of $\pi$-mesons
with nucleons the dimensionless coupling constant is equal to $g^2/\hbar c \approx 14$,
so that the strong interaction is three orders of magnitude greater than the
electromagnetic interaction. Therefore perturbation theory is inadequate for
its analysis. Up to now there is no complete theory of strong interactions.

3. The weak interaction: this is responsible for the slow decays of ele- 
mentary particles, for example 


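[first line of decay examples illegible in the source]

$$K^+ \to \pi^+ + \pi^0 \ \ (\theta\text{-decay}), \qquad K^+ \to \pi^+ + \pi^+ + \pi^- \ \ (\tau\text{-decay}).$$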


It can in principle also give rise to other reactions (for example, the scattering 
of neutrinos by electrons), but such processes have extremely small cross 
sections and have not been observed experimentally. The weak interaction is 
of an even shorter range than the strong interaction (its range may be of the
order of $10^{-17}$ cm), and its characteristic times are $10^{-10}$–$10^{-6}$ sec. To
characterize the intensity of the weak interaction one cannot introduce a
natural dimensionless constant as in the preceding cases. This is associated
with the fact that the analogue of electric charge, the so-called Fermi constant
$G$, has a dimensionality different from that of $e$. The dimensionless
constant of the weak interaction must involve a certain mass. If the mass $\mu$ of
the $\pi$-meson is taken, then $G^2(\hbar c)^{-2}(\hbar/\mu c)^{-4} \approx 5\times10^{-14}$, i.e. the weak interaction
is less strong than the electromagnetic interaction by about 11 orders of
magnitude. Nevertheless, strictly speaking, perturbation theory cannot be
used in this case, because the weak interaction is non-renormalizable (see
ch. 14).

4. The gravitational interaction: this is characterized by an extremely
small dimensionless coupling constant $kM^2/\hbar c \approx 2\times10^{-39}$ ($k$ is the gravitational
constant, and $M$ is the mass of the nucleon), which allows one to disregard
it at the present state of development of the theory of elementary
particles.
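
The statements about relative strength ("three orders of magnitude", "about 11 orders of magnitude") follow from the dimensionless constants quoted in the four items above. The sketch below simply takes those quoted values as inputs and compares them on a logarithmic scale; the dictionary keys and the helper name are illustrative, and no independent physical input is added.

```python
import math

# Dimensionless coupling constants as quoted in the text.
COUPLINGS = {
    "strong": 14.0,               # g^2 / (hbar c), pion-nucleon coupling
    "electromagnetic": 1 / 137,   # fine structure constant
    "weak": 5e-14,                # built from the Fermi constant and the pion mass
    "gravitational": 2e-39,       # built from the gravitational constant and the nucleon mass
}


def orders_of_magnitude(a, b):
    """Return log10(a / b): by how many powers of ten a exceeds b."""
    return math.log10(a / b)


alpha = COUPLINGS["electromagnetic"]
strong_vs_em = orders_of_magnitude(COUPLINGS["strong"], alpha)
em_vs_weak = orders_of_magnitude(alpha, COUPLINGS["weak"])
em_vs_gravity = orders_of_magnitude(alpha, COUPLINGS["gravitational"])

print(f"strong / electromagnetic: about 10^{strong_vs_em:.1f}")
print(f"electromagnetic / weak:   about 10^{em_vs_weak:.1f}")
print(f"electromagnetic / gravitational: about 10^{em_vs_gravity:.1f}")
```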

The interactions enumerated above differ not only in strength but also 
in their conservation laws, which is, apparently, even more essential. In all 
the interactions the energy, momentum, angular momentum, electric charge 
and baryon and lepton numbers are conserved. However, there are no universal
conservation laws for isospin $T$, its projection $T_3$, strangeness $S$ (or
hypercharge $Y$) and parity $P$.

1. The strong interaction is the most symmetric; in the reactions to which 
it gives rise all the quantum numbers mentioned above are conserved. In 
particular, the isospin conservation law is an expression of charge indepen- 
dence: all the terms of an isomultiplet behave in the same way with respect 
to strong interactions. 

2. ‘Switching on’ the electromagnetic interaction violates the equivalence 
of the particles contained in an isomultiplet, since they have different charges. 
Apparently, it is just this interaction which is responsible for the existence 
of the small differences between the masses of these particles. Since switching 
on the electromagnetic interaction specifies a definite direction in isospace, 
the total isospin T will no longer be conserved, but its projection T3 is still 
conserved. Also the laws of conservation of strangeness S and of parity P hold. 

3. The weak interaction is the least symmetric. None of the four quan- 
tum numbers mentioned above is conserved in the corresponding reactions. 
In particular, the parity conservation law is no longer valid (see ch. 13). 
Moreover, as shown by experiments with neutral K-mesons, the weak inter- 
action is apparently also not invariant under time reversal. 


§131. Symmetry groups in quantum mechanics 


We have already stressed that no consistent and complete theory of strong 
interactions has been developed up to now. In particular, dynamical equations 
describing the behaviour of particles under the strong interaction have not
been formulated. Therefore the study of the general symmetry properties of
strong interactions assumes a special role. It allows one to obtain a satis- 
factory classification of hadrons and to derive a number of quantitative rela- 
tions. 

In §129 it was pointed out that all hadrons are divided into small families,
isomultiplets, which can be assigned definite values of the isospin $T$. The
members of a given multiplet differ in the isospin projection $T_3$, which determines
the value of electric charge, and when the electromagnetic interaction
is ‘switched off’ they have strictly the same mass. In strong interactions the
quantum numbers $T$ and $T_3$ are conserved.

We have frequently encountered analogous situations already. For ex- 
ample, when a non-relativistic particle moves in a central field its possible 
states are also grouped into definite sets which are characterized by different 
values of the angular momentum $J$. The wave functions belonging to the
same set differ in the angular momentum component $J_z$ and correspond to
one and the same energy, i.e. form a degenerate energy level. In the motion
of the particle the quantum numbers $J$ and $J_z$ are conserved.

The invariance of theory with respect to a definite class of transformations 
in real or in a certain fictitious space (in our quantum-mechanical example 
with respect to space rotations) is characteristic of all similar cases. The set 
of transformations is closed, i.e. successive application of the allowed trans- 
formations again leads to an allowed transformation. Furthermore, there is an 
identity or unit transformation, and to each transformation there corresponds
an inverse transformation. Such invariance (symmetry) transformations
are said to form a group. Its elements can be denoted by $g$, and successive
application of two transformations $g_1$ and $g_2$ will be written in the form
of a product $g_2g_1$ (in just this order). It is easily seen that all space rotations
form a group, the three-dimensional rotation group, denoted by O(3).
Lorentz transformations are another example of transformations forming a
group.

The concept of the invariance of a theory with respect to a given group of
transformations involves two aspects: definite transformation properties of
the wave functions $\psi$ and definite transformation properties of the Hamiltonian
$\hat{H}$.

With respect to a group of transformations g the entire Hilbert space of the 
wave functions is broken into invariant sub-spaces, i.e. there are sets of wave 
functions which transform, according to a given law, only into each other:

$$\psi' = U(g)\psi. \qquad (131.1)$$

It is required that to the product $g_2g_1$ of the elements of the group there
correspond the product of the operators $U(g)$:

$$U(g_2g_1) = U(g_2)U(g_1). \qquad (131.2)$$


In this case the set of operators U(g) is said to form a representation of the 
given group. The dimensionality of the space (the maximum number of 
linearly independent wave functions) in which these operators act is said to 
be the dimension of the given representation. If the invariant sub-space does 
not contain any invariant sub-spaces of lower dimension, then one speaks of
an irreducible representation. Otherwise the representation is said to be 
reducible. In what follows the set of wave functions transformable according 
to an irreducible representation of a symmetry group will be called a multi- 
plet. 

From the theory of angular momentum it is known that in space rotations
the spherical functions $Y_{JJ_z}(\theta,\varphi)$ corresponding to a given angular momentum
$J$ and to all its possible projections $J_z$ transform only into each other. There
corresponds to them an irreducible representation with dimension $2J+1$ of
the group O(3), i.e. all spherical functions corresponding to the angular
momentum $J$ form a $(2J+1)$-dimensional multiplet of the rotation group.

Let us consider the requirements which, in the invariant theory, are
imposed upon the transformation properties of the Hamiltonian. We act on
the Schrödinger equation

$$i\hbar\frac{\partial\psi}{\partial t} = \hat{H}\psi \qquad (131.3)$$

with the operator $U(g)$ of the representation according to which the wave
function $\psi$ transforms. Assuming that $U(g)$ commutes with the operator
$\partial/\partial t$ (we shall not need the analysis of the more general case), we obtain

$$i\hbar\frac{\partial(U\psi)}{\partial t} = U\hat{H}U^{-1}\,U\psi$$

or

$$i\hbar\frac{\partial\psi'}{\partial t} = U\hat{H}U^{-1}\psi'. \qquad (131.4)$$

The invariance of the theory means the identity of the form of the
Schrödinger equation for the initial wave function $\psi$ and for the transformed
wave function $\psi'$. Hence

$$\hat{H} = U\hat{H}U^{-1}$$








or

$$[\hat{H}, U(g)] = 0. \qquad (131.5)$$


Thus the requirement of invariance of a theory with respect to transforma- 
tions of a certain group leads to the commutativity of the Hamiltonian with 
all the representation operators of this group. 

We shall consider below only the so-called Lie groups, whose elements are 
single-valued differentiable functions of a finite number of real parameters. 
The latter are chosen in such a way that a unit element corresponds to their 
zero values. Thus for a Lie group: 


$$g = g(\alpha_1,\ldots,\alpha_n), \qquad g(0,\ldots,0) = 1. \qquad (131.6)$$

The number $n$ of all independent real parameters of the Lie group is said to
be its dimension. If the transformation $g$ differs infinitely little from the
identity transformation, i.e. if it is infinitesimal, then there correspond to it
infinitesimal values of the parameters $\alpha_k$. In this case, taking into account
(131.6) one can write

$$g(\alpha_1,\ldots,\alpha_n) = 1 + \sum_{k=1}^{n}\alpha_k\left.\frac{\partial g(\alpha_1,\ldots,\alpha_n)}{\partial\alpha_k}\right|_{\alpha_1=\ldots=\alpha_n=0} = 1 + i\sum_{k=1}^{n}\alpha_kI_k. \qquad (131.7)$$

The quantities

$$I_k = -i\left.\frac{\partial g(\alpha_1,\ldots,\alpha_n)}{\partial\alpha_k}\right|_{\alpha_1=\ldots=\alpha_n=0} \qquad (131.8)$$

(the factor $-i$ is introduced for convenience), called group generators, represent
square matrices whose order is equal to the dimension of the space in
which the group transformations act. The rotation group generators are the
$3\times3$ matrices $J_x$, $J_y$, $J_z$ presented in (51.17). To see this it is sufficient to
take the matrices of finite rotations about the corresponding coordinate axes
and to make use of the definition (131.8).

It is obvious that the operators $U(g)$ of the representations of the Lie
group depend on the same parameters $\alpha_k$. Taking into account that to the
unit element of the group there corresponds the unit representation operator,
one can write

$$U(g) = U(\alpha_1,\ldots,\alpha_n), \qquad U(0,\ldots,0) = I. \qquad (131.9)$$


The $N$-dimensional representation operators corresponding to infinitesimal
transformations have the form

$$U(g) = I + \sum_{k=1}^{n}\alpha_k\left.\frac{\partial U(\alpha_1,\ldots,\alpha_n)}{\partial\alpha_k}\right|_{\alpha_1=\ldots=\alpha_n=0} = I + i\sum_{k=1}^{n}\alpha_kL_k. \qquad (131.10)$$

Here the quantities

$$L_k = -i\left.\frac{\partial U(\alpha_1,\ldots,\alpha_n)}{\partial\alpha_k}\right|_{\alpha_1=\ldots=\alpha_n=0}, \qquad (131.11)$$

called representation generators, are $N\times N$ matrices. For the rotation group
such operators are, to within a factor of $\hbar$, the operators $\hat{J}_x$, $\hat{J}_y$, $\hat{J}_z$ of the
angular momentum components. In particular, spinor representation generators
are the same as the Pauli matrices (see (61.13)). It can be shown that the
operators $U(g)$ of a given representation, which correspond to finite transformations,
are of the form

$$U(g) = \exp\left(i\sum_{k=1}^{n}\alpha_kL_k\right) \qquad (131.12)$$

(see, for example, formulae (61.13) and (61.12)).
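
The exponential form (131.12) can be checked numerically for a rotation about the $z$ axis. The sketch below is a minimal illustration under stated assumptions: the generator is taken in the standard form $(J_z)_{lm} = -i\varepsilon_{3lm}$, which may differ in sign convention from the matrices of (51.17) not reproduced here, and the exponential is summed as a truncated power series with small helper functions written for this example.

```python
import math


def identity(n):
    """Return the n x n identity matrix as nested lists of complex numbers."""
    return [[1 + 0j if i == j else 0j for j in range(n)] for i in range(n)]


def matmul(a, b):
    """Multiply two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def mat_exp(m, terms=40):
    """Matrix exponential from its power series, truncated after `terms` orders."""
    n = len(m)
    result = identity(n)
    term = identity(n)
    for order in range(1, terms):
        # term now holds m^order / order!
        term = [[x / order for x in row] for row in matmul(term, m)]
        result = [[result[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return result


def dagger(m):
    """Hermitian conjugate of a square matrix."""
    n = len(m)
    return [[m[j][i].conjugate() for j in range(n)] for i in range(n)]


def max_abs_diff(a, b):
    """Largest absolute difference between corresponding matrix elements."""
    n = len(a)
    return max(abs(a[i][j] - b[i][j]) for i in range(n) for j in range(n))


# Assumed form of the generator of rotations about z: (J_z)_{lm} = -i * eps_{3lm}.
J_Z = [[0j, -1j, 0j], [1j, 0j, 0j], [0j, 0j, 0j]]


def rotation_from_generator(alpha):
    """U = exp(i * alpha * J_z), the one-parameter case of (131.12)."""
    return mat_exp([[1j * alpha * x for x in row] for row in J_Z])


def rotation_closed_form(alpha):
    """Finite rotation matrix about z whose derivative at alpha = 0 gives J_Z via (131.8)."""
    c, s = math.cos(alpha), math.sin(alpha)
    return [[c, s, 0.0], [-s, c, 0.0], [0.0, 0.0, 1.0]]


alpha = 0.7
u = rotation_from_generator(alpha)
series_error = max_abs_diff(u, rotation_closed_form(alpha))
unitarity_error = max_abs_diff(matmul(dagger(u), u), identity(3))
print(series_error, unitarity_error)
```

Both errors stay at the level of floating-point rounding, so the exponentiated generator coincides with the finite rotation matrix and is unitary.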
In §46 we have seen that only in the case of unitary transformations of 


the wave functions does the physical content of a theory not change, which 
singles out of the multitude of all representations a very important class of 
unitary representations for which 


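$$U^\dagger U = UU^\dagger = I. \qquad (131.13)$$

From (131.13) and (131.10) we have

$$I = U^\dagger U = \left(I - i\sum_{k=1}^{n}\alpha_kL_k^\dagger\right)\left(I + i\sum_{k=1}^{n}\alpha_kL_k\right) \approx I + i\sum_{k=1}^{n}\alpha_k\left(L_k - L_k^\dagger\right).$$

Hence

$$L_k^\dagger = L_k, \qquad (131.14)$$

i.e. unitary representation generators are Hermitian matrices. Hence for
infinitesimal transformations

$$U(g) = I + i\sum_{k=1}^{n}\alpha_kL_k, \qquad U^\dagger(g) = U^{-1}(g) = I - i\sum_{k=1}^{n}\alpha_kL_k. \qquad (131.15)$$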
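
The link between Hermitian generators and unitarity can be seen numerically from the first-order operator $U = I + i\alpha L$. In the sketch below, which is only an illustration with hypothetical names, a Hermitian $2\times2$ generator (half of a Pauli matrix) violates $U^\dagger U = I$ only at order $\alpha^2$, while a non-Hermitian matrix chosen for contrast violates it already at order $\alpha$.

```python
def matmul(a, b):
    """Multiply two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def dagger(m):
    """Hermitian conjugate of a square matrix."""
    n = len(m)
    return [[complex(m[j][i]).conjugate() for j in range(n)] for i in range(n)]


def first_order_operator(generator, alpha):
    """U = I + i * alpha * L, the single-parameter form of the infinitesimal operator."""
    n = len(generator)
    return [[(1 if i == j else 0) + 1j * alpha * generator[i][j]
             for j in range(n)] for i in range(n)]


def unitarity_defect(generator, alpha):
    """Largest element of |U^dagger U - I| for the first-order operator."""
    u = first_order_operator(generator, alpha)
    product = matmul(dagger(u), u)
    n = len(generator)
    return max(abs(product[i][j] - (1 if i == j else 0))
               for i in range(n) for j in range(n))


HERMITIAN = [[0.0, 0.5], [0.5, 0.0]]      # half of the Pauli matrix sigma_x
NON_HERMITIAN = [[0.0, 1.0], [0.0, 0.0]]  # chosen only for contrast

alpha = 1e-3
defect_hermitian = unitarity_defect(HERMITIAN, alpha)
defect_non_hermitian = unitarity_defect(NON_HERMITIAN, alpha)
print(defect_hermitian, defect_non_hermitian)
```

For the Hermitian generator the defect equals $\alpha^2/4$, which is negligible for an infinitesimal transformation; for the non-Hermitian matrix it equals $\alpha$, so the first-order term in the product does not cancel.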
kl 1 


We also note that for the Lie groups the relation (131.5), valid in the
invariant theory, is evidently equivalent to the condition

$$[\hat{H}, L_k] = 0. \tag{131.16}$$







The importance of studying symmetry groups in quantum mechanics is due 
to the following facts: 

1. There exist Hermitian mutually commuting combinations of the
representation generators which also commute with all the generators. They are
called invariant operators, and their basic property is that on a given
multiplet these operators are simply multiples of the unit operator (the Schur
lemma). This means that all the wave functions of one multiplet are
eigenfunctions of any invariant operator with one and the same eigenvalue. Thus a
natural set of quantum numbers arises, equal in number to the invariant
operators and characterizing the multiplet as a whole. In the rotation group there
is one invariant operator $J^2 = J_x^2 + J_y^2 + J_z^2$, so that in this case each
multiplet is characterized by the value of the angular momentum $J$.

2. Among the representation generators there may exist several mutually 
commuting generators. Their number is determined by the properties of the 
group and is said to be its rank. The basis functions of a multiplet can be 
chosen in such a way that they are eigenfunctions of these generators. The 
corresponding eigenvalues are the quantum numbers classifying the wave 
functions belonging to the given multiplet. In the rotation group there are no
mutually commuting generators, i.e. its rank is equal to 1. Therefore wave
functions with a given angular momentum can be assigned only one more
quantum number, for example the value of the component $J_z$.

3. The generators and invariant operators described form a set of 
Hermitian operators which commute with each other and with the Hamil- 
tonian (see (131.16)). Hence there correspond to them conserved and at the 
same time measurable physical quantities. A consequence of the theory of 
invariance with respect to space rotations is the conservation of angular 
momentum $J$ and of its component $J_z$.

4. From the above it follows that if the wave function of the initial 
state of a system belongs to a certain multiplet, then as a result of a reaction 
(scattering or decay) the system will make a transition to a new state whose 
wave function belongs to the same multiplet. This establishes definite selec- 
tion rules for the reactions. 

5. From (131.16) and the Schur lemma it follows that the eigenvalues of
the Hamiltonian (the values of the energy or mass of the elementary particles)
are the same for the wave functions of a given multiplet. This accounts for
the presence of degeneracy and allows one to establish the multiplicity, which
is equal to the dimension of the multiplet. In a theory invariant with respect
to the rotation group there occurs a degeneracy in $J_z$, its multiplicity being
equal to $2J+1$.











§132. The isogroup SU(2) and its representations 


The existence of hadron multiplets with the properties already described 
suggests that the strong interaction of elementary particles is invariant with 
respect to a certain group of transformations. It turned out that this latter is 
the group SU(2), which we shall henceforth call the isogroup. From the 
mathematical point of view it is closely related (is almost equivalent) to the 
rotation group O(3). 

By the group SU(2) is meant the set of all unitary and unimodular
(determinant equal to 1) $2 \times 2$ matrices:

$$g^\dagger g = g g^\dagger = I, \qquad \det g = 1. \tag{132.1}$$

We can make a useful geometrical interpretation of unitary $2 \times 2$ matrices. We
take the two-dimensional complex space of vectors $x$, which we shall write in
the form of columns, and introduce the Hermitian conjugate vectors $x^\dagger$:

$$x = \begin{pmatrix} x_1 \\ x_2 \end{pmatrix}, \qquad x^\dagger = (x_1^*, x_2^*) = (x^1, x^2). \tag{132.2}$$

In this space we consider the linear transformation

$$x' = g x \quad \text{or} \quad x_i' = g_i^j x_j \tag{132.3}$$

(the superscript numbers the column, while the subscript numbers the row;
they run over the values 1, 2, repeated indices implying summation).
Second-order unitary matrices correspond to the transformation matrices (132.3),
which do not change the quadratic form

$$x^\dagger x = x^i x_i = x_1^* x_1 + x_2^* x_2. \tag{132.4}$$

According to §131 the infinitesimal matrix $g$ can be written as

$$g \approx I + i \varepsilon_a \tau_a. \tag{132.5}$$

Here $\tau_a$ are the generators of the group SU(2), and $\varepsilon_a$ are its parameters,
whose number $n$ (the dimension of the group) is to be defined. The
requirement of unitarity of the matrix $g$ leads to the hermiticity of its generators $\tau_a$.
Noting that, to terms of the second order of smallness in $\varepsilon_a$, for the
infinitesimal matrix (132.5)

$$1 = \det g \approx 1 + i \varepsilon_a \operatorname{Tr} \tau_a,$$

we arrive at the conclusion that the trace of the generators $\tau_a$ is equal to
zero. Thus they are $2 \times 2$ matrices with the properties








$$\tau_a^\dagger = \tau_a, \qquad \operatorname{Tr} \tau_a = 0. \tag{132.6}$$


These restrictions impose 5 (4 plus 1) conditions upon the 8 real parameters
of the complex $2 \times 2$ matrix. Hence there are 3 independent matrices with
the properties (132.6), which just determines the dimension of the group
SU(2). One can take as its generators the Pauli matrices, which are Hermitian
and have trace zero:

$$\tau_1 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \qquad \tau_2 = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}, \qquad \tau_3 = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}. \tag{132.7}$$

Among these there are no mutually commuting ones, so that the rank of
SU(2) is equal to 1. The sum of the squares of the Pauli matrices commutes
with each of them:

$$[\tau^2, \tau_a] = 0, \quad \text{where} \quad \tau^2 = \tau_1^2 + \tau_2^2 + \tau_3^2. \tag{132.8}$$
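
The properties of the Pauli matrices quoted above can be checked numerically. The following is only a minimal sketch in plain Python; the helper names (`matmul`, `dagger`, `su2_element`, and so on) are illustrative and not part of the text, and the finite group element is built from the standard closed form $\exp(i\varepsilon_a\tau_a) = \cos|\varepsilon|\, I + i \sin|\varepsilon|\,(n_a\tau_a)$, which is assumed here rather than derived.

```python
import math

# Pauli matrices (132.7) as nested lists.
TAU = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]
IDENTITY = [[1, 0], [0, 1]]
ZERO = [[0, 0], [0, 0]]


def matmul(a, b):
    """Product of two 2x2 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def dagger(a):
    """Hermitian conjugate of a 2x2 matrix."""
    return [[complex(a[j][i]).conjugate() for j in range(2)] for i in range(2)]


def close(a, b, tol=1e-12):
    """Element-wise comparison of two 2x2 matrices."""
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(2) for j in range(2))


def commutator(a, b):
    ab, ba = matmul(a, b), matmul(b, a)
    return [[ab[i][j] - ba[i][j] for j in range(2)] for i in range(2)]


def det(a):
    return a[0][0] * a[1][1] - a[0][1] * a[1][0]


def su2_element(eps):
    """exp(i eps_a tau_a) via cos|eps| I + i sin|eps| (n . tau) (assumed closed form)."""
    norm = math.sqrt(sum(e * e for e in eps))
    if norm == 0:
        return [row[:] for row in IDENTITY]
    n = [e / norm for e in eps]
    n_tau = [[sum(n[a] * TAU[a][i][j] for a in range(3)) for j in range(2)]
             for i in range(2)]
    return [[math.cos(norm) * IDENTITY[i][j] + 1j * math.sin(norm) * n_tau[i][j]
             for j in range(2)] for i in range(2)]


# tau^2 = tau_1^2 + tau_2^2 + tau_3^2, the invariant operator of (132.8).
tau_squared = [[sum(matmul(t, t)[i][j] for t in TAU) for j in range(2)]
               for i in range(2)]

# An arbitrary finite group element; the parameter values are illustrative.
g = su2_element([0.3, -1.1, 0.7])
```

With these definitions one can confirm that each $\tau_a$ is Hermitian and traceless, that no two of them commute, that $\tau^2 = 3I$ commutes with all of them, and that `g` satisfies (132.1).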


Let us now go on to the construction of the representations of the group 
SU(2). 

1. The trivial representation is the simplest one. Its multiplets are
one-dimensional, i.e. contain one wave function $\varphi$ each, which does not change
under the transformation (132.3):

$$\varphi' = \varphi. \tag{132.9}$$

The dimension of this representation is 1, and its generators are the numbers
0. Such a representation is called a scalar representation, and $\varphi$ is said to be
a scalar.

2. Let us consider the two-dimensional space of vectors $\varphi_i$, which
transform according to the same law as the vectors $x$:

$$\varphi_i' = g_i^j \varphi_j \quad \text{or} \quad \varphi_i' \approx (I + i\varepsilon_a\tau_a)_i^j\, \varphi_j. \tag{132.10}$$

As a result, two-dimensional multiplets arise whose members transform
according to the representation of the same dimension. From (132.10) it
follows that its generators are the matrices $\tau_a$ themselves:

$$(L_a)_i^j = (\tau_a)_i^j. \tag{132.11}$$

This representation is called the spinor representation, and the quantities $\varphi_i$
are said to be spinors (see the formalism presented in ch. 8).

3. Let us take two quantities $\varphi^i$ which transform in the same way as the
components of the vector $x^\dagger$:

$$(\varphi^i)' = (g^\dagger)_j^i\, \varphi^j \quad \text{or} \quad (\varphi^i)' \approx (I - i\varepsilon_a\tau_a)_j^i\, \varphi^j. \tag{132.12}$$










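As a result we obtain a two-dimensional representation, which is said to be
conjugate to the spinor representation. Its generators are the matrices

$$(L_a)_j^i = -(\tau_a)_j^i. \tag{132.13}$$

In fact this representation is equivalent to the spinor representation, i.e.
from the quantities $\varphi^i$ it is possible to form linear combinations which
transform according to the law (132.10). But its introduction is very
convenient from the formal point of view.

4. All other representations can be constructed from the spinor
representation and its conjugate. Let us consider the set of $2^{p+q}$ quantities
$\varphi_{i_1 \ldots i_p}^{j_1 \ldots j_q}$ which transform as the product of $p$ spinors and $q$
conjugate spinors, i.e. according to the law

$$(\varphi_{i_1 \ldots i_p}^{j_1 \ldots j_q})' = g_{i_1}^{k_1} \cdots g_{i_p}^{k_p}\, (g^\dagger)_{l_1}^{j_1} \cdots (g^\dagger)_{l_q}^{j_q}\, \varphi_{k_1 \ldots k_p}^{l_1 \ldots l_q}. \tag{132.14}$$

For the infinitesimal transformation we have

$$(\varphi_{i_1 \ldots i_p}^{j_1 \ldots j_q})' \approx (I + i\varepsilon_a\tau_a)_{i_1}^{k_1} \cdots (I + i\varepsilon_a\tau_a)_{i_p}^{k_p}\, (I - i\varepsilon_a\tau_a)_{l_1}^{j_1} \cdots (I - i\varepsilon_a\tau_a)_{l_q}^{j_q}\, \varphi_{k_1 \ldots k_p}^{l_1 \ldots l_q}, \tag{132.15}$$

so that the generators of this representation have the form

$$(L_a)_{i_1 \ldots i_p;\, l_1 \ldots l_q}^{k_1 \ldots k_p;\, j_1 \ldots j_q} = \sum_{s=1}^{p} \delta_{i_1}^{k_1} \cdots \delta_{i_{s-1}}^{k_{s-1}} (\tau_a)_{i_s}^{k_s} \delta_{i_{s+1}}^{k_{s+1}} \cdots \delta_{i_p}^{k_p}\, \delta_{l_1}^{j_1} \cdots \delta_{l_q}^{j_q} - \sum_{s=1}^{q} \delta_{i_1}^{k_1} \cdots \delta_{i_p}^{k_p}\, \delta_{l_1}^{j_1} \cdots \delta_{l_{s-1}}^{j_{s-1}} (\tau_a)_{l_s}^{j_s} \delta_{l_{s+1}}^{j_{s+1}} \cdots \delta_{l_q}^{j_q}. \tag{132.16}$$

The representation obtained in such a way is said to be the direct product of
$p$ spinor representations and $q$ conjugate representations. Symbolically

$$\underbrace{(1,0) \otimes \ldots \otimes (1,0)}_{p\ \text{times}} \otimes \underbrace{(0,1) \otimes \ldots \otimes (0,1)}_{q\ \text{times}}$$

(the notation is obvious).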


If the function $\varphi_{i_1 \ldots i_p}^{j_1 \ldots j_q}$ is symmetric with respect to a certain pair of
subscripts (or superscripts), then this property is conserved under the
transformation (132.14). Furthermore, the invariance of the quadratic form
(132.4) leads to the conservation of the trace with respect to any pair in
the subscripts and superscripts. Hence the set
of all functions $\varphi_{i_1 \ldots i_p}^{j_1 \ldots j_q}$ is divided into individual multiplets whose members
transform only into each other, i.e. the representation constructed is
reducible. In the given case the condition of irreducibility amounts to the fact
that $\varphi_{i_1 \ldots i_p}^{j_1 \ldots j_q}$ must be symmetric separately with respect to all the subscripts
and superscripts, the trace with respect to any pair in the subscripts and
superscripts being bound to reduce to zero. It is easy to count the number of
independent quantities $\varphi_{i_1 \ldots i_p}^{j_1 \ldots j_q}$ of such a type, i.e. to determine the dimension
of the corresponding irreducible representation. The total number of
subscripts (superscripts) which are equal to 1 can vary from 0 up to $p$ (up to $q$),
the other indices being equal to 2. Under the condition of symmetry the
order of succession of the indices does not matter, so that the number of
different completely symmetric quantities $\varphi_{i_1 \ldots i_p}^{j_1 \ldots j_q}$ is equal to $(p+1)(q+1)$.
The fact that the traces are equal to zero imposes upon them $pq$ conditions,
so that in all we have

$$N = (p+1)(q+1) - pq = p + q + 1$$

independent components. The irreducible representation described will be
denoted by the symbol $(p,q)$, and its dimension by $N(p,q)$. We have

$$N(p,q) = (p+q) + 1. \tag{132.17}$$


We stress once more that the superscripts and subscripts are equivalent, so 
that the irreducible representation is in essence defined by the total number 
of indices, which is equal to p+q. 
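
The counting argument above can be checked by brute force. The sketch below is illustrative only: it labels each completely symmetric component by the pair (number of subscripts equal to 1, number of superscripts equal to 1), writes the vanishing of every trace as a linear equation, and finds the number of independent conditions by exact Gaussian elimination. The function names are hypothetical.

```python
from fractions import Fraction


def rank(rows):
    """Rank of a matrix (list of rows) by exact Gaussian elimination."""
    rows = [[Fraction(x) for x in row] for row in rows]
    ncols = len(rows[0]) if rows else 0
    r = 0
    for col in range(ncols):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][col] / rows[r][col]
            if factor != 0:
                rows[i] = [x - factor * y for x, y in zip(rows[i], rows[r])]
        r += 1
    return r


def multiplet_dimension(p, q):
    """Independent components of a symmetric traceless tensor with
    p subscripts and q superscripts, each index taking the values 1, 2."""

    def index(a, b):
        # (a, b) = (subscripts equal to 1, superscripts equal to 1)
        return a * (q + 1) + b

    unknowns = (p + 1) * (q + 1)
    constraints = []
    # Contract one subscript with one superscript. The remaining p-1 subscripts
    # and q-1 superscripts are labelled by (a, b); the contracted pair is
    # either (1, 1) or (2, 2), and the two contributions must cancel.
    for a in range(p):
        for b in range(q):
            row = [0] * unknowns
            row[index(a + 1, b + 1)] = 1
            row[index(a, b)] = 1
            constraints.append(row)
    return unknowns - rank(constraints)
```

For example, `multiplet_dimension(1, 1)` gives the three components of the vector representation, in agreement with (132.17).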

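Let us consider the example (important for what follows) of the direct
product of the spinor representation by the conjugate representation, i.e. the
representation $(1,0) \otimes (0,1)$. Functions transforming according to this
representation form a square $2 \times 2$ matrix. In correspondence with (132.14)

$$(\varphi_i^j)' = g_i^k\, (g^\dagger)_l^j\, \varphi_k^l. \tag{132.18}$$

For infinitesimal transformations (132.15) gives

$$(\varphi_i^j)' \approx (I + i\varepsilon_a\tau_a)_i^k\, (I - i\varepsilon_a\tau_a)_l^j\, \varphi_k^l \approx \left\{\delta_i^k \delta_l^j + i\varepsilon_a \left[(\tau_a)_i^k \delta_l^j - \delta_i^k (\tau_a)_l^j\right]\right\} \varphi_k^l, \tag{132.19}$$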


so that the generators of this representation have the form 








$$(L_a)_{i;\,l}^{k;\,j} = (\tau_a)_i^k\, \delta_l^j - \delta_i^k\, (\tau_a)_l^j. \tag{132.20}$$


To decompose it into irreducible representations it suffices to separate from
the matrix $\varphi_i^j$ the non-zero trace, i.e. to rewrite it in the form

$$\varphi_i^j = \left(\varphi_i^j - \tfrac{1}{2} \delta_i^j \varphi_k^k\right) + \tfrac{1}{2} \delta_i^j \varphi_k^k. \tag{132.21}$$

The last term is invariant with respect to transformations from the group
SU(2), i.e. is a scalar. The trace of the matrix standing in parentheses is equal
to zero. It contains three independent components transforming according
to the three-dimensional irreducible representation which is called the vector
representation. Symbolically the expansion (132.21) is written in the form

$$(1,0) \otimes (0,1) = (1,1) \oplus (0,0). \tag{132.22}$$
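
The decomposition (132.21) can be illustrated numerically. In matrix form the law (132.18) reads $\varphi' = g \varphi g^\dagger$, so the trace part should be unchanged and the traceless part should stay traceless. The sketch below assumes the standard closed form $g = \cos\theta\, I + i \sin\theta\,(n_a\tau_a)$ for a unit vector $n$; the numerical values of `g` and `phi` are arbitrary illustrations, and all helper names are hypothetical.

```python
import math


def matmul(a, b):
    """Product of two 2x2 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def dagger(a):
    """Hermitian conjugate of a 2x2 matrix."""
    return [[complex(a[j][i]).conjugate() for j in range(2)] for i in range(2)]


def su2_element(theta, n):
    """cos(theta) I + i sin(theta) (n . tau) for a unit vector n."""
    nx, ny, nz = n
    c, s = math.cos(theta), math.sin(theta)
    return [[c + 1j * s * nz, 1j * s * (nx - 1j * ny)],
            [1j * s * (nx + 1j * ny), c - 1j * s * nz]]


def transform(g, phi):
    """phi -> g phi g^dagger, the matrix form of (132.18)."""
    return matmul(matmul(g, phi), dagger(g))


def split(phi):
    """Return (traceless part, half of the trace) as in (132.21)."""
    half_trace = (phi[0][0] + phi[1][1]) / 2
    traceless = [[phi[i][j] - (half_trace if i == j else 0) for j in range(2)]
                 for i in range(2)]
    return traceless, half_trace


g = su2_element(0.8, (2 / 3, -1 / 3, 2 / 3))   # (2/3, -1/3, 2/3) has unit length
phi = [[1 + 2j, 0.5], [-3j, 4.0]]              # arbitrary 2x2 matrix

vector_part, scalar_part = split(phi)
new_vector_part, new_scalar_part = split(transform(g, phi))
```

Here `new_scalar_part` coincides with `scalar_part`, while `new_vector_part` equals `transform(g, vector_part)`: the scalar and the vector parts do not mix, which is the content of (132.22).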


From (132.15) and (132.8) it follows that for the group SU(2) the quantity
$L^2$ is an invariant operator:

$$[L^2, L_a] = 0, \quad \text{where} \quad L^2 = L_1^2 + L_2^2 + L_3^2. \tag{132.23}$$

Making use of (132.16) and (132.7) it can be shown that any matrix $\varphi_{i_1 \ldots i_p}^{j_1 \ldots j_q}$
of the multiplet $(p,q)$ is an eigenfunction of the operator $L^2$, where

$$L^2 \varphi_{i_1 \ldots i_p}^{j_1 \ldots j_q} = (p+q)(p+q+2)\, \varphi_{i_1 \ldots i_p}^{j_1 \ldots j_q}. \tag{132.24}$$

The eigenvalues of the operator $L^2$ depend only on the type of multiplet
(more precisely, only on its dimension) and are its characteristics.
To classify the basis elements of a multiplet, which can be chosen to be
combinations of the matrices $\varphi_{i_1 \ldots i_p}^{j_1 \ldots j_q}$ with only one non-zero element, use can
be made of the eigenvalues of the diagonal operator $L_3$. Since the rank of the
group SU(2) is equal to 1, this quantum number is sufficient. From (132.16)
and (132.7) we conclude that if among $p$ subscripts there are $p_2$ and among
$q$ superscripts $q_2$ indices equal to 2, then

$$L_3 \varphi_{i_1 \ldots i_p}^{j_1 \ldots j_q} = [(p-q) - 2(p_2-q_2)]\, \varphi_{i_1 \ldots i_p}^{j_1 \ldots j_q}. \tag{132.25}$$

The neighbouring eigenvalues of the generator $L_3$, which correspond to the
basis elements of a given multiplet, differ by 2. From (132.25) it is seen that
the minimum of these is equal to $-(p+q)$ and the maximum to $+(p+q)$, so
that there are in all $p+q+1$ independent terms of the multiplet, and we again
arrive at formula (132.17) for the dimension of the irreducible representation.





§133. Isomultiplets of elementary particles 


We now suppose that the strong interaction of elementary particles is
invariant with respect to the group SU(2). This means, first of all, that the
wave functions of hadrons transform according to certain of its irreducible
representations, i.e. they are the products of the ordinary wave function,
depending on spatial coordinates and on the spin projection, and the
isomatrix:

$$\psi = \psi(\mathbf{r}, \sigma)\, \varphi_{i_1 \ldots i_p}^{j_1 \ldots j_q}. \tag{133.1}$$

In this case an individual hadron can conveniently be compared with a wave
function in which the matrix $\varphi_{i_1 \ldots i_p}^{j_1 \ldots j_q}$ is one of the basis elements. Thus all
hadrons turn out to be distributed in isomultiplets, which are characterized by
the eigenvalues of the operator $L^2$. Individual hadrons within an isomultiplet
are classified by the eigenvalues of the generator $L_3$. In consequence of the
assumed invariance of the strong interaction with respect to the isogroup,
these two quantum numbers will be conserved in all reactions due to it.
Furthermore, quantum-mechanical degeneracy must occur, i.e. the masses of
the particles belonging to one isomultiplet will be strictly the same. When
the electromagnetic interaction, which is not considered to be invariant with
respect to the isogroup, is switched on, this degeneracy is removed and the
isomultiplets split into individual particles with somewhat different masses.
An analogue of this is, for example, the Zeeman effect (§74), in which the
application of an external magnetic field violating the invariance with respect
to the rotation group removes the earlier degeneracy of levels with respect to
the angular momentum component $J_z$.

We note that for hadrons involved in the same isomultiplet all quantum
numbers (spin $J$, parity $P$, baryonic number $B$ and strangeness $S$ or
hypercharge $Y$) differing from the eigenvalues of $L_3$ must be the same. This follows
from the fact that the corresponding operators commute with the generators
$L_a$ (i.e. are invariant operators), and from the Schur lemma.

Instead of the generators $L_a$ it is convenient to introduce the operators

$$\hat{T}_a = \tfrac{1}{2} L_a, \tag{133.2}$$

which are called isospin component operators. Then as an invariant operator
it is natural to take the operator of the square of the isospin

$$\hat{T}^2 = \hat{T}_1^2 + \hat{T}_2^2 + \hat{T}_3^2 = \tfrac{1}{4} L^2. \tag{133.3}$$








Formulae (132.24) and (132.25) now assume the form

$$\hat{T}^2 \varphi_{i_1 \ldots i_p}^{j_1 \ldots j_q} = \frac{p+q}{2} \cdot \frac{p+q+2}{2}\, \varphi_{i_1 \ldots i_p}^{j_1 \ldots j_q} = T(T+1)\, \varphi_{i_1 \ldots i_p}^{j_1 \ldots j_q}, \tag{133.4}$$

$$\hat{T}_3 \varphi_{i_1 \ldots i_p}^{j_1 \ldots j_q} = \left[\tfrac{1}{2}(p-q) - (p_2-q_2)\right] \varphi_{i_1 \ldots i_p}^{j_1 \ldots j_q} = T_3\, \varphi_{i_1 \ldots i_p}^{j_1 \ldots j_q}. \tag{133.5}$$

The quantum number

$$T = \tfrac{1}{2}(p+q) \tag{133.6}$$

classifies the hadron multiplets and is called the isospin, and the quantity

$$T_3 = \tfrac{1}{2}(p-q) - (p_2-q_2) \tag{133.7}$$

classifies the basis elements of the isomultiplet, i.e. the individual hadrons
belonging to it. It can take on values from $-T$ up to $+T$, and is said to be the
third component of isospin. Formula (132.17) for the dimension of the
irreducible representation, i.e. for the number of different particles involved
in the isomultiplet, takes the form

$$N(p,q) = N(T) = 2T + 1. \tag{133.8}$$
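
Formulae (133.6)-(133.8) are easy to tabulate. The short sketch below (function names are illustrative, not from the text) enumerates the values of $T_3$ allowed by (133.7) for a multiplet $(p,q)$ and can be used to confirm that they run from $-T$ to $+T$ in unit steps, giving $2T+1$ members.

```python
from fractions import Fraction


def isospin(p, q):
    """T = (p + q) / 2, formula (133.6)."""
    return Fraction(p + q, 2)


def third_component(p, q, p2, q2):
    """T_3 from (133.7); p2 subscripts and q2 superscripts are equal to 2."""
    return Fraction(p - q, 2) - (p2 - q2)


def t3_values(p, q):
    """Sorted list of the distinct T_3 values occurring in the multiplet (p, q)."""
    return sorted({third_component(p, q, p2, q2)
                   for p2 in range(p + 1) for q2 in range(q + 1)})
```

For instance, `t3_values(1, 0)` lists the two values of the isodoublet, `t3_values(1, 1)` the three values of the isotriplet, and `t3_values(3, 0)` the four values of the isoquartet.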


For the wave functions which are the eigenfunctions of the operators $\hat{T}^2$
and $\hat{T}_3$ with quantum numbers $T$ and $T_3$ we shall henceforth sometimes use Dirac's
notation $|T, T_3\rangle$. Finally, in correspondence with the empirical Gell-Mann–Nishijima
relation (§129) we shall assume by definition that the charge
operator is

$$\hat{Q} = \hat{T}_3 + \tfrac{1}{2}(B+S)\hat{I} = \hat{T}_3 + \tfrac{1}{2} Y \hat{I}, \tag{133.9}$$

where $\hat{I}$ is the unit operator in the space of functions $\varphi_{i_1 \ldots i_p}^{j_1 \ldots j_q}$.

The above relations point to a close relation of the group SU(2) to the
rotation group O(3), the isospin $T$ being a complete analogue of the angular
momentum $J$. This relationship allows one to make use of the entire
mathematical apparatus of angular momentum theory presented in §§51 and 52.
In particular, the formalism of the Clebsch–Gordan coefficients (§52) is
applicable for the actual decomposition of the direct product of
representations into irreducible representations.

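Let us now consider actual isomultiplets of hadrons. The proton and
neutron have almost the same masses and are identical with respect to strong
interactions. They form an isodoublet, i.e. their wave functions are of the form

$$\psi_p = \psi_p \begin{pmatrix} 1 \\ 0 \end{pmatrix} = \begin{pmatrix} \psi_p \\ 0 \end{pmatrix} \equiv \begin{pmatrix} p \\ 0 \end{pmatrix}, \qquad \psi_n = \psi_n \begin{pmatrix} 0 \\ 1 \end{pmatrix} = \begin{pmatrix} 0 \\ \psi_n \end{pmatrix} \equiv \begin{pmatrix} 0 \\ n \end{pmatrix}. \tag{133.10}$$

The wave function of a nucleon is written as

$$N_i = \begin{pmatrix} p \\ n \end{pmatrix}. \tag{133.11}$$

Under the normalization condition

$$\sum_{\sigma = \pm 1/2} \int dV \left(|p|^2 + |n|^2\right) = 1 \tag{133.12}$$

the two terms of this expression corresponding to the proton and neutron are
interpreted as the probabilities of observing the nucleon in the proton state
and in the neutron state. The spinor representation has dimension 2, so that
the nucleon isodoublet is to be assigned isospin $\tfrac{1}{2}$. The generators of the
representation are the matrices $\tau_a$, so that the isospin components $T_3$ are
the eigenvalues of the operator $\tfrac{1}{2}\tau_3$. To find them use can be made of
formula (133.7), but in the given case they are easily determined directly:

$$\hat{T}_3 \begin{pmatrix} p \\ 0 \end{pmatrix} = \tfrac{1}{2} \tau_3 \begin{pmatrix} p \\ 0 \end{pmatrix} = +\tfrac{1}{2} \begin{pmatrix} p \\ 0 \end{pmatrix}, \qquad \hat{T}_3 \begin{pmatrix} 0 \\ n \end{pmatrix} = \tfrac{1}{2} \tau_3 \begin{pmatrix} 0 \\ n \end{pmatrix} = -\tfrac{1}{2} \begin{pmatrix} 0 \\ n \end{pmatrix}, \tag{133.13}$$

so that for the proton $T_3 = +\tfrac{1}{2}$, and for the neutron $T_3 = -\tfrac{1}{2}$. In
correspondence with (133.9), taking into account that for the nucleon $B = 1$,
$S = 0$, we obtain

$$\hat{Q}_N = \hat{T}_3 + \tfrac{1}{2} \hat{I} = \tfrac{1}{2}(\tau_3 + I) = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}, \tag{133.14}$$

so that

$$\hat{Q}_N \begin{pmatrix} p \\ 0 \end{pmatrix} = \begin{pmatrix} p \\ 0 \end{pmatrix}, \qquad \hat{Q}_N \begin{pmatrix} 0 \\ n \end{pmatrix} = 0. \tag{133.15}$$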








Thus the charges of the proton and neutron are equal respectively to +1 and
0, as is to be expected.

The $\Xi$-hyperon, the $\Xi_{1530}$ resonance, the K-meson and the K*-meson
resonance are also isodoublets. The relations written above are valid for them,
with the only difference that for the $\Xi$-hyperon and its resonance the
hypercharge is $Y = -1$ ($B = 1$, $S = -2$), so that for these particles the charge operator
is equal to

$$\hat{Q}_\Xi = \hat{T}_3 - \tfrac{1}{2} \hat{I} = \tfrac{1}{2}(\tau_3 - I) = \begin{pmatrix} 0 & 0 \\ 0 & -1 \end{pmatrix}. \tag{133.16}$$

$$\Xi_i = \begin{pmatrix} \Xi^0 \\ \Xi^- \end{pmatrix}, \qquad (\Xi_{1530})_i = \begin{pmatrix} \Xi^0_{1530} \\ \Xi^-_{1530} \end{pmatrix}, \qquad K_i = \begin{pmatrix} K^+ \\ K^0 \end{pmatrix}, \qquad K^*_i = \begin{pmatrix} K^{*+} \\ K^{*0} \end{pmatrix}.$$

It is natural to assume that the wave functions of the antiproton and
antineutron transform according to the representation conjugate to the nucleon
representation, i.e.

$$\psi_{\bar{p}} = \psi_{\bar{p}}\,(1, 0) = (\psi_{\bar{p}}, 0) \equiv (\bar{p}, 0), \tag{133.17}$$

$$\psi_{\bar{n}} = \psi_{\bar{n}}\,(0, 1) = (0, \psi_{\bar{n}}) \equiv (0, \bar{n}). \tag{133.18}$$

Hence the wave function of the antinucleon is written in the form

$$\bar{N}^i = (\bar{p}, \bar{n}). \tag{133.19}$$

The isospin of the antinucleon is also equal to $\tfrac{1}{2}$, but, because all the
generators have now changed sign, $\hat{T}_3 = -\tfrac{1}{2}\tau_3$ and hence for the antiproton
$T_3 = -\tfrac{1}{2}$, and for the antineutron $T_3 = +\tfrac{1}{2}$ (it should be noted that, according
to the rules of matrix multiplication by a wave function having the form of a
row, the operators which are square matrices act from the right and not from
the left). It is obvious that the antinucleon charge operator $\hat{Q}_{\bar{N}}$ is equal to
$-\hat{Q}_N$, so that the antiproton has $Q = -1$ and the antineutron $Q = 0$. All these
statements may also be applied in an obvious way to the isodoublets of other
antiparticles.

There exist three $\pi$-mesons: $\pi^+$, $\pi^0$ and $\pi^-$, with about the same masses,
identical with respect to the strong interaction. It is natural to assume that
they form an isotriplet, i.e. their wave functions transform according to the
vector representation, forming a second-order matrix with trace zero:

$$\pi_i^j = \begin{pmatrix} \pi_1^1 & \pi_1^2 \\ \pi_2^1 & \pi_2^2 \end{pmatrix}. \tag{133.20}$$

The condition $\operatorname{Tr} \pi_i^j = 0$ requires that

$$\pi_1^1 = -\pi_2^2. \tag{133.21}$$

To the isotriplet there corresponds the isospin $T = 1$. Making use of formula
(133.7) we find that to the components $\pi_1^1$, $\pi_1^2$ and $\pi_2^1$ there correspond
respectively $T_3 = 0$, $T_3 = +1$ and $T_3 = -1$. Since for $\pi$-mesons $B = S = 0$, then

$$\hat{Q}_\pi = \hat{T}_3, \tag{133.22}$$

i.e. their charges are the same as the values of the component $T_3$. Hence

$$\pi_1^1 = -\pi_2^2 \sim \pi^0, \qquad \pi_1^2 \sim \pi^+, \qquad \pi_2^1 \sim \pi^-. \tag{133.23}$$


By requiring that the isomatrices involved in the wave function of each
$\pi$-meson be normalized to one, i.e. that they form the orthonormalized basis of
the isotriplet, and taking into account (133.20) and (133.23), we finally
obtain for the total wave function of the $\pi$-meson

$$\pi_i^j = \begin{pmatrix} \pi^0/\sqrt{2} & \pi^+ \\ \pi^- & -\pi^0/\sqrt{2} \end{pmatrix}. \tag{133.24}$$

Taking into account that the components $\pi_i^j$ transform in terms of each
other according to the law (132.18) and making use of the properties of
unitary unimodular matrices, it is easily shown that the wave functions

$$\pi^+ = \pi_1^2, \qquad \pi^0 = (\pi_1^1 - \pi_2^2)/\sqrt{2}, \qquad \pi^- = \pi_2^1 \tag{133.25}$$

transform as the ordinary components of a three-dimensional vector. This
accounts for the name of the representation (1,1). Hence the wave
function of the $\pi$-meson can be written in the form

$$\pi = \begin{pmatrix} \pi^+ \\ \pi^0 \\ \pi^- \end{pmatrix}. \tag{133.26}$$


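In such a formalism the generators of the vector representation and the
isospin component operators $\hat{T}_a$ will be Hermitian $3 \times 3$ matrices with trace
zero. Making use of (51.17) we can write immediately

$$\hat{T}_1 = \frac{1}{\sqrt{2}} \begin{pmatrix} 0 & 1 & 0 \\ 1 & 0 & 1 \\ 0 & 1 & 0 \end{pmatrix}, \qquad \hat{T}_2 = \frac{1}{\sqrt{2}} \begin{pmatrix} 0 & -i & 0 \\ i & 0 & -i \\ 0 & i & 0 \end{pmatrix}, \qquad \hat{T}_3 = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & -1 \end{pmatrix}. \tag{133.27}$$

By direct check we see indeed that

$$\hat{T}_3 \begin{pmatrix} \pi^+ \\ 0 \\ 0 \end{pmatrix} = \begin{pmatrix} \pi^+ \\ 0 \\ 0 \end{pmatrix}, \qquad \hat{T}_3 \begin{pmatrix} 0 \\ \pi^0 \\ 0 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix}, \qquad \hat{T}_3 \begin{pmatrix} 0 \\ 0 \\ \pi^- \end{pmatrix} = -\begin{pmatrix} 0 \\ 0 \\ \pi^- \end{pmatrix}. \tag{133.28}$$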


The results obtained apply automatically also to other hadron isotriplets:
$\Sigma$, $\rho$, $\Sigma_{1385}$. Since in these cases $Y = 0$, the charge operator also has the
form (133.22).
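
That the $3 \times 3$ matrices (133.27) really behave as isospin-1 operators can be verified directly. The sketch below, with purely illustrative helper names, checks the commutation relations $[\hat{T}_1, \hat{T}_2] = i\hat{T}_3$ (and cyclic permutations), which are the standard angular momentum relations assumed from §51, together with $\hat{T}^2 = T(T+1)\hat{I} = 2\hat{I}$ from (133.4).

```python
import math

S = 1 / math.sqrt(2)

# Isospin-1 matrices as in (133.27).
T1 = [[0, S, 0], [S, 0, S], [0, S, 0]]
T2 = [[0, -1j * S, 0], [1j * S, 0, -1j * S], [0, 1j * S, 0]]
T3 = [[1, 0, 0], [0, 0, 0], [0, 0, -1]]
IDENTITY = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]


def matmul(a, b):
    """Product of two 3x3 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def combine(a, b, ca=1, cb=1):
    """Linear combination ca * a + cb * b of two 3x3 matrices."""
    return [[ca * a[i][j] + cb * b[i][j] for j in range(3)] for i in range(3)]


def commutator(a, b):
    return combine(matmul(a, b), matmul(b, a), 1, -1)


def dagger(a):
    return [[complex(a[j][i]).conjugate() for j in range(3)] for i in range(3)]


def is_zero(m, tol=1e-12):
    return all(abs(x) < tol for row in m for x in row)


# T^2 = T1^2 + T2^2 + T3^2
T_SQUARED = combine(combine(matmul(T1, T1), matmul(T2, T2)), matmul(T3, T3))
```

The diagonal entries of `T3` are the values $T_3 = +1, 0, -1$ that appear in (133.28).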

Since the representation conjugate to (1,1) coincides with itself, and
their generators are also the same, the structure of antiparticle isotriplets
is identical with that of the matrix (133.24). For example,

$$\bar{\Sigma}_i^j = \begin{pmatrix} \bar{\Sigma}^0/\sqrt{2} & \bar{\Sigma}^- \\ \bar{\Sigma}^+ & -\bar{\Sigma}^0/\sqrt{2} \end{pmatrix} \tag{133.29}$$

(we note that the charge of $\bar{\Sigma}^-$ is positive, and that of $\bar{\Sigma}^+$ is negative). For
mesons there are no quantum numbers, except for the component $T_3$, which
would distinguish between particles and antiparticles (for the $\Sigma$-hyperon such
quantum numbers are the baryonic number and strangeness). Hence the
antiparticle with respect to $\pi^+$ is $\pi^-$ and vice versa, while in the case of the
$\pi^0$-meson the particle is identical with the antiparticle. This accounts for the fact
that for $\pi$-mesons the particles and antiparticles are contained in the same
isomultiplet, whereas, say, $\Sigma$ and $\bar{\Sigma}$ form two different isomultiplets. An
analogous situation also occurs in the case of $\rho$-mesons:

$$\bar{\rho}_i^j = \begin{pmatrix} \bar{\rho}^0/\sqrt{2} & \bar{\rho}^- \\ \bar{\rho}^+ & -\bar{\rho}^0/\sqrt{2} \end{pmatrix}, \quad \text{where} \quad \bar{\rho}^+ = \rho^-, \quad \bar{\rho}^- = \rho^+, \quad \bar{\rho}^0 = \rho^0. \tag{133.30}$$

The four nucleon resonances $\Delta^{++}$, $\Delta^+$, $\Delta^0$ and $\Delta^-$ form an isoquartet, i.e.
their wave functions transform according to the irreducible representation of
dimension 4, forming the matrix $\Delta_{ijk}$ symmetric in any pair of indices. To
the isoquartet there corresponds the isospin $T = \tfrac{3}{2}$. From (133.7) it follows
that for the components $\Delta_{111}$, $\Delta_{112} = \Delta_{121} = \Delta_{211}$, $\Delta_{122} = \Delta_{212} = \Delta_{221}$ and
$\Delta_{222}$ the isospin component $T_3$ is equal respectively to $+\tfrac{3}{2}$, $+\tfrac{1}{2}$, $-\tfrac{1}{2}$ and $-\tfrac{3}{2}$.
Since for the $\Delta$-resonances $Y = 1$ ($B = 1$, $S = 0$), then

$$\hat{Q}_\Delta = \hat{T}_3 + \tfrac{1}{2} \hat{I}, \tag{133.31}$$

so that

$$\Delta_{111} \sim \Delta^{++}, \quad \Delta_{112} = \Delta_{121} = \Delta_{211} \sim \Delta^+, \quad \Delta_{122} = \Delta_{212} = \Delta_{221} \sim \Delta^0, \quad \Delta_{222} \sim \Delta^-. \tag{133.32}$$

From the normalization condition it follows that in the first and last relations
the factor of proportionality is equal to 1, while in the second and third
cases it is equal to $1/\sqrt{3}$. The isoquartet $\bar{\Delta}$ is filled in an obvious way.

There are hadrons, for example $\Lambda$- and $\Omega$-hyperons and $\eta$-, $\omega$- and
$\varphi$-mesons, which are isosinglets, i.e. their wave functions transform according
to the isoscalar representation. Since its generators are equal to zero,
$T = T_3 = 0$ for an isosinglet and $Q = \tfrac{1}{2}Y$. The antiparticles $\bar{\eta}$, $\bar{\omega}$, $\bar{\varphi}$ are the same as
the corresponding particles.
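
The charge assignments of this section all follow from the eigenvalue form of (133.9), $Q = T_3 + \tfrac{1}{2}(B+S)$. The sketch below is only an illustration with hypothetical function names. The strangeness values used in the examples for the K-meson ($S = +1$), the $\Sigma$- and $\Lambda$-hyperons ($S = -1$) and the $\Omega$-hyperon ($S = -3$) are the standard assignments and are assumed here, since they are not stated explicitly in this section.

```python
from fractions import Fraction


def charge(t3, baryon_number, strangeness):
    """Q = T_3 + (B + S) / 2, the eigenvalue form of (133.9)."""
    return Fraction(t3) + Fraction(baryon_number + strangeness, 2)


def multiplet_charges(isospin, baryon_number, strangeness):
    """Charges of all members of an isomultiplet, from T_3 = +T down to T_3 = -T."""
    isospin = Fraction(isospin)
    count = int(2 * isospin + 1)
    return [charge(isospin - k, baryon_number, strangeness) for k in range(count)]


HALF = Fraction(1, 2)

nucleon_charges = multiplet_charges(HALF, 1, 0)            # p, n
xi_charges = multiplet_charges(HALF, 1, -2)                # Xi0, Xi-
kaon_charges = multiplet_charges(HALF, 0, 1)               # K+, K0 (S = +1 assumed)
pion_charges = multiplet_charges(1, 0, 0)                  # pi+, pi0, pi-
sigma_charges = multiplet_charges(1, 1, -1)                # Sigma+, Sigma0, Sigma-
delta_charges = multiplet_charges(Fraction(3, 2), 1, 0)    # Delta++, Delta+, Delta0, Delta-
lambda_charges = multiplet_charges(0, 1, -1)               # Lambda (S = -1 assumed)
omega_charges = multiplet_charges(0, 1, -3)                # Omega (S = -3 assumed)
```

Each list reproduces the charges quoted in the text, for example +2, +1, 0, -1 for the $\Delta$ isoquartet.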


§ 134. The wave functions of a system of nucleons and π-mesons


Let us now consider the three simplest composite systems: $\bar{N}N$, $NN$ and
$\pi N$, which are of considerable interest.

1. We decompose the wave function of the system nucleon–antinucleon

$$\varphi_i^j = \bar{N}^j N_i \tag{134.1}$$

into the irreducible parts (§132):

$$\varphi_i^j = \left(\bar{N}^j N_i - \tfrac{1}{2} \delta_i^j \bar{N}^k N_k\right) + \tfrac{1}{2} \delta_i^j \bar{N}^k N_k = \chi_i^j + \tfrac{1}{2} \delta_i^j \chi. \tag{134.2}$$

The last term is an isoscalar, while the expression standing in parentheses is
an isovector. Recalling the analysis of the matrix (133.20) we obtain

$$\begin{aligned} |1,+1\rangle &= a\chi_1^2 = a\,\bar{n}p, & |1,0\rangle &= b\chi_1^1 = \tfrac{1}{2} b\,(\bar{p}p - \bar{n}n), \\ |1,-1\rangle &= c\chi_2^1 = c\,\bar{p}n, & |0,0\rangle &= \tfrac{1}{2} d\chi = \tfrac{1}{2} d\,(\bar{p}p + \bar{n}n), \end{aligned} \tag{134.3}$$

where $a$, $b$, $c$ and $d$ are normalization coefficients. Assuming that the
spatial-spin parts of all the wave functions $\bar{N}^j$ and $N_i$ are normalized to unity, from
the condition of orthonormalization of the basis states (134.3) we have
$a^2 = c^2 = 1$, $b^2 = d^2 = 2$. Thus, finally,











$$\begin{aligned} |1,+1\rangle &= \bar{n}p, & |1,0\rangle &= \tfrac{1}{\sqrt{2}}(\bar{p}p - \bar{n}n), \\ |1,-1\rangle &= \bar{p}n, & |0,0\rangle &= \tfrac{1}{\sqrt{2}}(\bar{p}p + \bar{n}n). \end{aligned} \tag{134.4}$$

If the system $\bar{N}N$ is in the $^1S_0$ state (spins antiparallel), then its total spin
is equal to 0, and the parity is odd (the relative parities of the particle and
antiparticle are opposite). Thus one can construct from the nucleon and
antinucleon a pseudoscalar isotriplet and a pseudoscalar isosinglet, i.e. a $\pi$-meson
and an $\eta$-meson. This result underlies the composite model of the $\pi$-meson,
proposed in 1949 by Fermi and Yang. The isotriplet state $^3S_1$ of the $\bar{N}N$ pair
can be compared to the $\rho$-meson.

2. Considering the wave function of a system of two nucleons

$$\varphi_{ij} = N_i' N_j'', \tag{134.5}$$

we can separate it into parts symmetric and antisymmetric with respect to
the indices:

$$\varphi_{ij} = N_i' N_j'' = \tfrac{1}{2}(N_i' N_j'' + N_j' N_i'') + \tfrac{1}{2}(N_i' N_j'' - N_j' N_i'') = \chi_{ij} + \chi_{\{ij\}}. \tag{134.6}$$

The second term contains one non-zero independent component ($i = 1$, $j = 2$),
i.e. an isoscalar, while the first term is an isovector. Making use of formula
(133.7), we obtain

$$\begin{aligned} |1,+1\rangle &= a\chi_{11} = a\,p'p'', & |1,0\rangle &= b\chi_{12} = \tfrac{1}{2} b\,(p'n'' + n'p''), \\ |1,-1\rangle &= c\chi_{22} = c\,n'n'', & |0,0\rangle &= d\chi_{\{12\}} = \tfrac{1}{2} d\,(p'n'' - n'p''). \end{aligned} \tag{134.7}$$

The normalization coefficients are defined as before. We have

$$\begin{aligned} |1,+1\rangle &= p'p'', & |1,0\rangle &= \tfrac{1}{\sqrt{2}}(p'n'' + n'p''), \\ |1,-1\rangle &= n'n'', & |0,0\rangle &= \tfrac{1}{\sqrt{2}}(p'n'' - n'p'') \end{aligned} \tag{134.8}$$

(see relations (66.4) and (66.5)). The first three functions, corresponding to
isospin 1, are symmetric, whereas the last function, corresponding to the
isoscalar system $pn$, is antisymmetric in the isovariables.

In the given formalism the proton and neutron are considered as two 
states of one particle, the nucleon. Hence the total wave function of a sys- 
tem of two nucleons considered as identical particles must possess definite 
symmetry properties with respect to their exchange. Since the type of
symmetry does not depend on what pair of particles is exchanged, we exchange
two protons. They obey the Pauli statistics, i.e. when their coordinates and
spins are exchanged the wave function changes sign. On the other hand,
from (134.8) it is seen that when the isovariables of two protons are
exchanged the wave function does not change. Thus nucleons obey the
generalized Pauli principle according to which the total wave function of a system
of nucleons is antisymmetric with respect to the exchange of any of their
pairs. Hence it follows, in particular, that the wave function of a system $NN$
with $T = 1$ describes a state whose angular momentum differs from that of
the isoscalar ($T = 0$) state.


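3. From the wave function of a system of a $\pi$-meson and a nucleon

$$\varphi_{ij}^k = N_i \pi_j^k \tag{134.9}$$

we separate the parts symmetric and antisymmetric with respect to
subscripts:

$$\varphi_{ij}^k = \tfrac{1}{2}(N_i \pi_j^k + N_j \pi_i^k) + \tfrac{1}{2}(N_i \pi_j^k - N_j \pi_i^k) = \varphi_{(ij)}^k + \varphi_{\{ij\}}^k. \tag{134.10}$$

The second term contains two non-zero independent components $\varphi_{\{12\}}^k$, i.e. it
is an isospinor ($T = \tfrac{1}{2}$). But the first matrix is still reducible, since its traces
are not equal to zero. Separating them, we get

$$\varphi_{(ij)}^k = \tfrac{1}{2}\left(N_i \pi_j^k + N_j \pi_i^k - \tfrac{1}{3} \delta_i^k N_m \pi_j^m - \tfrac{1}{3} \delta_j^k N_m \pi_i^m\right) + \tfrac{1}{6}\left(\delta_i^k N_m \pi_j^m + \delta_j^k N_m \pi_i^m\right) = \chi_{ij}^k + \tfrac{1}{6}\left(\delta_i^k \chi_j + \delta_j^k \chi_i\right). \tag{134.11}$$

The second term represents an isospinor. It is easily seen that its components
are the same as the two components $\varphi_{\{12\}}^k$. From (133.6) we conclude that
the first term corresponds to isospin $T = \tfrac{3}{2}$, i.e. that it contains four
independent components. Making use of (133.7), we have

$$|\tfrac{3}{2},+\tfrac{3}{2}\rangle \sim \chi_{11}^2, \qquad |\tfrac{3}{2},+\tfrac{1}{2}\rangle \sim \chi_{12}^2, \qquad |\tfrac{3}{2},-\tfrac{1}{2}\rangle \sim \chi_{22}^2, \qquad |\tfrac{3}{2},-\tfrac{3}{2}\rangle \sim \chi_{22}^1. \tag{134.12}$$

Writing the components explicitly and defining the factors of proportionality
from the normalization conditions, we finally obtain

$$\begin{aligned} |\tfrac{3}{2},+\tfrac{3}{2}\rangle &= p\pi^+, & |\tfrac{3}{2},+\tfrac{1}{2}\rangle &= \sqrt{\tfrac{2}{3}}\, p\pi^0 - \sqrt{\tfrac{1}{3}}\, n\pi^+, & |\tfrac{3}{2},-\tfrac{1}{2}\rangle &= \sqrt{\tfrac{1}{3}}\, p\pi^- + \sqrt{\tfrac{2}{3}}\, n\pi^0, \\ |\tfrac{3}{2},-\tfrac{3}{2}\rangle &= n\pi^-, & |\tfrac{1}{2},+\tfrac{1}{2}\rangle &= \sqrt{\tfrac{1}{3}}\, p\pi^0 + \sqrt{\tfrac{2}{3}}\, n\pi^+, & |\tfrac{1}{2},-\tfrac{1}{2}\rangle &= \sqrt{\tfrac{2}{3}}\, p\pi^- - \sqrt{\tfrac{1}{3}}\, n\pi^0. \end{aligned} \tag{134.13}$$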

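One can also arrive at this result by means of the formalism developed in
§§51 and 52. According to the rules of addition of angular momenta the
isospin of a system consisting of a $\pi$-meson ($T = 1$) and a nucleon ($T = \tfrac{1}{2}$) can
take on the values $\tfrac{3}{2}$ and $\tfrac{1}{2}$. To construct the wave functions of the system
corresponding to definite values of $T$ and $T_3$ use can be made of the general
formula (52.3), which in our notation is written as

$$|T, T_3\rangle = \sum_{t_3 = \pm 1/2} C^{T\,T_3}_{1,\,T_3 - t_3;\ \frac{1}{2},\,t_3}\ |1, T_3 - t_3\rangle\, |\tfrac{1}{2}, t_3\rangle. \tag{134.14}$$

Taking the Clebsch–Gordan coefficients from table 1, §52, we have

$$\begin{aligned} |\tfrac{3}{2},+\tfrac{3}{2}\rangle &= C^{\frac{3}{2},\frac{3}{2}}_{1,1;\,\frac{1}{2},\frac{1}{2}}\, |1,+1\rangle |\tfrac{1}{2},+\tfrac{1}{2}\rangle = p\pi^+, \\ |\tfrac{3}{2},+\tfrac{1}{2}\rangle &= C^{\frac{3}{2},\frac{1}{2}}_{1,1;\,\frac{1}{2},-\frac{1}{2}}\, |1,+1\rangle |\tfrac{1}{2},-\tfrac{1}{2}\rangle + C^{\frac{3}{2},\frac{1}{2}}_{1,0;\,\frac{1}{2},\frac{1}{2}}\, |1,0\rangle |\tfrac{1}{2},+\tfrac{1}{2}\rangle = \sqrt{\tfrac{2}{3}}\, p\pi^0 - \sqrt{\tfrac{1}{3}}\, n\pi^+, \\ |\tfrac{3}{2},-\tfrac{1}{2}\rangle &= C^{\frac{3}{2},-\frac{1}{2}}_{1,0;\,\frac{1}{2},-\frac{1}{2}}\, |1,0\rangle |\tfrac{1}{2},-\tfrac{1}{2}\rangle + C^{\frac{3}{2},-\frac{1}{2}}_{1,-1;\,\frac{1}{2},\frac{1}{2}}\, |1,-1\rangle |\tfrac{1}{2},+\tfrac{1}{2}\rangle = \sqrt{\tfrac{1}{3}}\, p\pi^- + \sqrt{\tfrac{2}{3}}\, n\pi^0, \\ |\tfrac{3}{2},-\tfrac{3}{2}\rangle &= C^{\frac{3}{2},-\frac{3}{2}}_{1,-1;\,\frac{1}{2},-\frac{1}{2}}\, |1,-1\rangle |\tfrac{1}{2},-\tfrac{1}{2}\rangle = n\pi^-, \\ |\tfrac{1}{2},+\tfrac{1}{2}\rangle &= C^{\frac{1}{2},\frac{1}{2}}_{1,1;\,\frac{1}{2},-\frac{1}{2}}\, |1,+1\rangle |\tfrac{1}{2},-\tfrac{1}{2}\rangle + C^{\frac{1}{2},\frac{1}{2}}_{1,0;\,\frac{1}{2},\frac{1}{2}}\, |1,0\rangle |\tfrac{1}{2},+\tfrac{1}{2}\rangle = \sqrt{\tfrac{1}{3}}\, p\pi^0 + \sqrt{\tfrac{2}{3}}\, n\pi^+, \\ |\tfrac{1}{2},-\tfrac{1}{2}\rangle &= C^{\frac{1}{2},-\frac{1}{2}}_{1,0;\,\frac{1}{2},-\frac{1}{2}}\, |1,0\rangle |\tfrac{1}{2},-\tfrac{1}{2}\rangle + C^{\frac{1}{2},-\frac{1}{2}}_{1,-1;\,\frac{1}{2},\frac{1}{2}}\, |1,-1\rangle |\tfrac{1}{2},+\tfrac{1}{2}\rangle = \sqrt{\tfrac{2}{3}}\, p\pi^- - \sqrt{\tfrac{1}{3}}\, n\pi^0. \end{aligned}$$

Thus we again arrive at relations (134.13).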
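One can easily reverse them and express the wave function of a system
consisting of a $\pi$-meson and a nucleon in terms of the functions $|T, T_3\rangle$:

$$\begin{aligned} p\pi^+ &= |\tfrac{3}{2},+\tfrac{3}{2}\rangle, \\ p\pi^- &= \sqrt{\tfrac{1}{3}}\, |\tfrac{3}{2},-\tfrac{1}{2}\rangle + \sqrt{\tfrac{2}{3}}\, |\tfrac{1}{2},-\tfrac{1}{2}\rangle, \\ p\pi^0 &= \sqrt{\tfrac{2}{3}}\, |\tfrac{3}{2},+\tfrac{1}{2}\rangle + \sqrt{\tfrac{1}{3}}\, |\tfrac{1}{2},+\tfrac{1}{2}\rangle, \\ n\pi^+ &= -\sqrt{\tfrac{1}{3}}\, |\tfrac{3}{2},+\tfrac{1}{2}\rangle + \sqrt{\tfrac{2}{3}}\, |\tfrac{1}{2},+\tfrac{1}{2}\rangle, \\ n\pi^- &= |\tfrac{3}{2},-\tfrac{3}{2}\rangle, \\ n\pi^0 &= \sqrt{\tfrac{2}{3}}\, |\tfrac{3}{2},-\tfrac{1}{2}\rangle - \sqrt{\tfrac{1}{3}}\, |\tfrac{1}{2},-\tfrac{1}{2}\rangle. \end{aligned} \tag{134.15}$$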
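
The consistency of (134.13) and (134.15) can be checked mechanically: the six states $|T, T_3\rangle$ must be orthonormal in the basis of product states, and because the coefficients are real, the inverse expansion is simply given by the overlaps. The sketch below is an illustration only; the string labels and function names are hypothetical.

```python
import math

A = math.sqrt(1 / 3)
B = math.sqrt(2 / 3)

# States |T, T3> of (134.13), written as coefficients over product states.
STATES = {
    "3/2,+3/2": {"p pi+": 1.0},
    "3/2,+1/2": {"p pi0": B, "n pi+": -A},
    "3/2,-1/2": {"p pi-": A, "n pi0": B},
    "3/2,-3/2": {"n pi-": 1.0},
    "1/2,+1/2": {"p pi0": A, "n pi+": B},
    "1/2,-1/2": {"p pi-": B, "n pi0": -A},
}


def overlap(u, v):
    """Scalar product of two states with real coefficients."""
    return sum(c * v.get(name, 0.0) for name, c in u.items())


def expand_product_state(name):
    """Coefficients of a product state such as 'n pi+' in the |T, T3> basis.

    For a real orthonormal basis these are just the overlaps with each |T, T3>.
    """
    return {label: state[name] for label, state in STATES.items() if name in state}
```

For example, `expand_product_state("n pi+")` returns the two coefficients of the fourth line of (134.15).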


§135. Isotopically invariant interaction 


In §67 we pointed out that nuclear forces, i.e. the forces acting between 
two nucleons of the atomic nucleus, possess the property of charge inde- 
pendence. This property is most simply described within the framework of 
the isospin formalism. The hypothesis of charge independence of nuclear
forces says that the interaction of two protons, two neutrons and of a proton
with a neutron in the same spatial-spin states is identical. From the
generalized Pauli principle it follows that if the two nucleons are in the isotriplet
state (symmetric with respect to isospin variables), then their spin-coordinate
wave function is antisymmetric. For the proton and neutron in the isosinglet
state, the spin-coordinate part of the wave function is symmetric. This means
that the interaction of two nucleons in states $|1,+1\rangle$, $|1,0\rangle$ and $|1,-1\rangle$, other
things being equal, will be the same, whereas the interaction in state $|0,0\rangle$ is,
in general, essentially different.

As an example let us consider the deuteron, which consists of a proton
and a neutron. The coordinate wave function of its ground state ($l = 0$) is
symmetric, and the spin wave function is also symmetric; the total angular
momentum of the deuteron is equal to 1, i.e. it is in the triplet spin state
$^3S_1$ with a small admixture of state $^3D_1$ (see §76). Hence the isospin part
of the wave function must be antisymmetric and form an isosinglet. It is
known from experiment that the binding energy of the deuteron is equal to
2.23 MeV. On the other hand, if the deuteron were in the state $^1S_0$, then
its isospin wave function would be symmetric, i.e. it would be an isovector.
In this case there are no bound states.

Thus, under the assumption of charge independence, nuclear forces are
determined by the total isospin $T$, rather than by its component $T_3$.
Consequently, in the phenomenological theory of nuclear forces, in which the
interaction between nucleons is described by a certain potential, the
Hamiltonian may contain only the invariant operator $\hat{T}^2$, the square of the total
isospin of the system of nucleons, which can be written in the form

$$\hat{T}^2 = (\hat{\mathbf{T}}' + \hat{\mathbf{T}}'')^2 = \tfrac{1}{4}(\boldsymbol{\tau}' + \boldsymbol{\tau}'')^2 = \tfrac{1}{4}\left(\boldsymbol{\tau}'^2 + \boldsymbol{\tau}''^2 + 2(\boldsymbol{\tau}' \cdot \boldsymbol{\tau}'')\right). \tag{135.1}$$

Here $\boldsymbol{\tau}' = \{\tau_1', \tau_2', \tau_3'\}$ and $\boldsymbol{\tau}'' = \{\tau_1'', \tau_2'', \tau_3''\}$ are Pauli matrices acting on the
isospin indices of the first and second nucleons respectively. Since the
operators $\boldsymbol{\tau}'^2$ and $\boldsymbol{\tau}''^2$ are multiples of the unit operator, the dependence of
the interaction Hamiltonian on the isospin variables is defined only by the
scalar product $(\boldsymbol{\tau}' \cdot \boldsymbol{\tau}'')$:

$$\hat{H}_{\text{int}} = U_1 + U_2\, (\boldsymbol{\tau}' \cdot \boldsymbol{\tau}''). \tag{135.2}$$

Here $U_1$ and $U_2$ depend only on the coordinates and on the ordinary spin;
they are written down in §76.
It is easily verified that the Hamiltonian (135.2) commutes with the
isospin component operators $\hat{T}_a$ and thus also with the operator $\hat{T}^2$:

$$[\hat{H}_{\text{int}}, \hat{T}_a] = [\hat{H}_{\text{int}}, \hat{T}^2] = 0. \tag{135.3}$$










Consequently, if the Coulomb interaction of protons and the small difference 
in the masses of the proton and neutron are disregarded, then in the system 
of interacting nucleons there holds not only the law of conservation of the 
isospin component, expressing the trivial fact of charge conservation, but also 
the law of conservation of the total isospin, which may serve as a formulation 
of charge independence. 
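The statements (135.1)-(135.3) can be checked numerically for a pair of nucleons. The following is a minimal sketch, not part of the original text: it assumes the pair basis order |pp>, |pn>, |np>, |nn>, uses only plain Python lists, and the numbers `U1`, `U2` are hypothetical stand-ins for the coordinate- and spin-dependent functions $U_1$, $U_2$.

```python
# Two-nucleon isospin algebra with plain Python lists.
# Pair basis order (an assumption of this sketch): |pp>, |pn>, |np>, |nn>.

I2 = [[1, 0], [0, 1]]
TAU = [
    [[0, 1], [1, 0]],       # tau_1
    [[0, -1j], [1j, 0]],    # tau_2
    [[1, 0], [0, -1]],      # tau_3
]


def kron(a, b):
    """Kronecker product of two square matrices given as nested lists."""
    m = len(b)
    size = len(a) * m
    return [[a[i // m][j // m] * b[i % m][j % m] for j in range(size)]
            for i in range(size)]


def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def lincomb(ca, a, cb, b):
    """Return ca*a + cb*b for equally sized square matrices."""
    n = len(a)
    return [[ca * a[i][j] + cb * b[i][j] for j in range(n)] for i in range(n)]


def max_abs(a):
    return max(abs(x) for row in a for x in row)


def apply(a, v):
    return [sum(a[i][k] * v[k] for k in range(len(v))) for i in range(len(a))]


I4 = kron(I2, I2)
ZERO4 = lincomb(0, I4, 0, I4)

tau_dot = ZERO4   # (tau' . tau'')
T = []            # total isospin components T_a = (tau'_a + tau''_a) / 2
for t in TAU:
    tau_dot = lincomb(1, tau_dot, 1, kron(t, t))
    T.append(lincomb(0.5, kron(t, I2), 0.5, kron(I2, t)))

T2 = ZERO4
for Ta in T:
    T2 = lincomb(1, T2, 1, matmul(Ta, Ta))

# T^2 = (3 + tau'.tau'')/2 follows from (135.1) with tau'^2 = tau''^2 = 3.
identity_error = max_abs(lincomb(1, T2, -1, lincomb(1.5, I4, 0.5, tau_dot)))

U1, U2 = -1.0, 0.5   # hypothetical numbers standing in for U_1 and U_2
H = lincomb(U1, I4, U2, tau_dot)
commutator_error = max(
    max_abs(lincomb(1, matmul(H, Ta), -1, matmul(Ta, H))) for Ta in T
)

s = 2 ** -0.5
pp = [1, 0, 0, 0]          # T = 1, T_3 = +1
singlet = [0, s, -s, 0]    # (|pn> - |np>) / sqrt(2), T = 0

print("T^2 on |pp>       :", apply(T2, pp))
print("T^2 on singlet    :", apply(T2, singlet))
print("tau.tau on singlet:", apply(tau_dot, singlet))
print("errors:", identity_error, commutator_error)
```

In this sketch the triplet state is an eigenvector of $T^2$ with eigenvalue $T(T+1)=2$, while the singlet is annihilated by $T^2$ and has $(\boldsymbol\tau'\cdot\boldsymbol\tau'') = -3$, so the Hamiltonian (135.2) takes the two values $U_1+U_2$ and $U_1-3U_2$.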

Let us now turn to the interaction of nucleons with π-mesons. As we 
know from §67, it is just this interaction which is responsible for the 
existence of nuclear forces, i.e. for the interaction of nucleons, which we 
have considered above purely phenomenologically. A logical discussion of 
the corresponding problems is possible only within the framework of the 
quantum field theory. Therefore we shall confine ourselves to some 
comments. Analogous to what we did in §122 in describing the weak interaction, 
we shall consider the nucleon function $N_i$ and the π-meson function $\pi_i^k$ as 
operators in the space of occupation numbers. The operator $N_i$ corresponds 
to the destruction of a nucleon and to the creation of an antinucleon, 
whereas the operator $\bar N^i$ creates a nucleon and destroys an antinucleon. The 
operator $\pi_i^k$ creates and destroys π-mesons. 

The basic processes of the interaction considered are those of virtual 
production and absorption of π-mesons by nucleons (see §67). Hence the 
interaction Hamiltonian density must have the general structure $\bar N N\pi$. Since 
the π-meson operator is a pseudoscalar, relativistic invariance and parity 
conservation require that it be multiplied by the pseudoscalar combination 
of $\bar N$ and $N$. Recalling the results of §121, we arrive at the expression 
$\bar N\gamma_5 N\pi$. In consequence of the charge independence of the πN interaction, 
the Hamiltonian density must be an isoscalar. One can form from the matrices 
$\bar N^i$, $N_i$ and $\pi_i^k$ only one combination of this type: $\bar N^i N_k\pi_i^k$. Thus, finally, 

$$H_{\rm int} = \sqrt2\,g\,\bar N^i\gamma_5 N_k\,\pi_i^k, \tag{135.4}$$


where $g$ is the strong πN interaction constant (an analogue of the electric 
charge). The factor $\sqrt2$ is introduced for historical reasons. 

Making use of the explicit form of the nucleon matrices and the π-meson 
matrices $\bar N^i$, $N_i$ and $\pi_i^k$ (see (133.17), (133.18) and (133.24)), from (135.4) 
we obtain 

$$H_{\rm int} = g\left[\sqrt2\,\bar p\gamma_5 n\,\pi^+ + \sqrt2\,\bar n\gamma_5 p\,\pi^- + (\bar p\gamma_5 p - \bar n\gamma_5 n)\,\pi^0\right]. \tag{135.5}$$


Consequently, on the assumption of charge independence the following 
relation holds between the constants of the πN interaction: 

$$g_{pn\pi^+} : g_{np\pi^-} : g_{pp\pi^0} : g_{nn\pi^0} = 1 : 1 : \frac{1}{\sqrt2} : -\frac{1}{\sqrt2}. \tag{135.6}$$
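The ratios of the coupling constants can be read off by expanding the isoscalar $\sqrt2\,g\,\bar N^i\gamma_5N_k\pi_i^k$ component by component. The sketch below does this with small coefficient matrices; the dictionary names and the ordering (p, n) of rows and columns are assumptions made for illustration only.

```python
import math

SQRT2 = math.sqrt(2.0)

# Coefficient matrices with which each pion field enters Nbar^i N_k pi_i^k;
# rows label the outgoing nucleon, columns the incoming one, ordered (p, n).
PION_MATRIX = {
    "pi+": [[0.0, 1.0], [0.0, 0.0]],
    "pi-": [[0.0, 0.0], [1.0, 0.0]],
    "pi0": [[1 / SQRT2, 0.0], [0.0, -1 / SQRT2]],
}
INDEX = {"p": 0, "n": 1}


def coupling(nucleon_out, nucleon_in, pion, g=1.0):
    """Coefficient of (Nbar_out gamma5 N_in) pi in sqrt(2) g Nbar^i gamma5 N_k pi_i^k."""
    return SQRT2 * g * PION_MATRIX[pion][INDEX[nucleon_out]][INDEX[nucleon_in]]


# Vertices in the order used in (135.6).
vertices = [("p", "n", "pi+"), ("n", "p", "pi-"), ("p", "p", "pi0"), ("n", "n", "pi0")]
reference = coupling("p", "n", "pi+")
ratios = [coupling(*v) / reference for v in vertices]
print(ratios)   # approximately 1, 1, 1/sqrt(2), -1/sqrt(2)
```

The absolute couplings produced by `coupling` are $\sqrt2 g$, $\sqrt2 g$, $g$ and $-g$, which are the coefficients of the four terms in (135.5).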


Taking into account the explicit form of the Pauli matrices and passing 
to the π-meson vector function $\boldsymbol\pi$ (133.26), it is easily verified that the 
Hamiltonian density (135.4) can be rewritten as 

$$H_{\rm int} = g\,\bar N\gamma_5\,\boldsymbol\tau N\,\boldsymbol\pi, \tag{135.7}$$

where it is assumed that the isovectors $\boldsymbol\tau$ and $\boldsymbol\pi$ are multiplied in the scalar 
way. In older studies only this form of notation was used, which accounts 
for the appearance of the factor $\sqrt2$ in expression (135.4). The interaction of 
other baryons with mesons is described in an analogous way. For example, 
for the system 7, 2, > it is easily found that 


Ais = V2g'Dhy5 Zin 3 (135.8) 


In conclusion we stress once more that the electromagnetic interaction of 
nucleons violates the invariance with respect to transformations of the 
isogroup SU(2), and the results formulated above are no longer valid. For 
this interaction the Hamiltonian density can be written on the basis of the 
same considerations as those which were used in obtaining (135.4). Taking 
into account that the operator of creation and destruction of photons is the 
vector potential $A_\mu$, we shall have 

$$H_{\rm int} = e\,\bar N\gamma_\mu\hat Q N A_\mu, \tag{135.9}$$

where $\hat Q$ is the nucleon charge operator given by formula (133.14). Hence 

$$H_{\rm int} = \tfrac12 e\,\bar N\gamma_\mu(1+\tau_3)NA_\mu, \tag{135.10}$$

i.e. the Hamiltonian density contains an isovector part (the term with $\tau_3$) in 
addition to the isoscalar part. The presence of the former violates the 
conservation of the total isospin $T$, although its component $T_3$ is conserved. But 
the intensity of the electromagnetic interaction is much smaller than that of 
the strong interaction, so that electromagnetic corrections can frequently be 
neglected, being considered at most as a small perturbation. 

Some quantitative consequences of the hypothesis of charge independence 
of the strong interaction, which is equivalent to total isospin conservation, are 
presented in the next section. 








§ 136. The scattering of nucleons and π-mesons 


We apply the isospin formalism to the analysis of the processes of 
scattering of nucleons by nucleons and of π-mesons by nucleons. Generalization to 
the case of other hadrons presents no difficulty. 

Let several reactions 

$$a_i + b_i \to c_i + d_i \tag{136.1}$$

be considered, all particles of the type a, b, c and d belonging to one and the 
same isomultiplet. For the wave functions of the initial and final states we 
shall make use of the Dirac notation $|a_ib_i\rangle$ and $|c_id_i\rangle$. The scattering amplitude 
$f$ is proportional to the matrix element 

$$M^{(i)} = \langle c_id_i|a_ib_i\rangle, \tag{136.2}$$

the square of the modulus of which defines the differential and, after 
integrating over angles, total cross sections for the process. 

We first assume that the state of the particles before scattering has definite 
values of the isospin $T$ and of its component $T_3$, i.e. that its wave function 
is $|T,T_3\rangle$. If the part of the matrix element corresponding to Coulomb 
scattering is separated, then from charge independence it follows that in the 
reaction the isospin does not change: 

$$\langle T',T_3|T,T_3\rangle = 0 \quad\text{for}\quad T'\neq T, \tag{136.3}$$

i.e. the wave function of the final state is also a function of the type $|T,T_3\rangle$. 
Furthermore, the matrix element corresponding to the scattering due to the 
strong interaction cannot depend on the values of the component $T_3$, but 
is defined by the isospin $T$ (and by other quantum numbers), by momenta, 
by spins and so on. We denote 

$$\langle T,T_3|T,T_3\rangle = M^{(T)}. \tag{136.4}$$


The validity of the isospin formalism as applied to the class of problems 
considered lies in the fact that the matrix element of any real process of the 
set (136.1) can be expressed in terms of a small number of (in most cases 
two) matrix elements $M^{(T)}$ corresponding to the scattering in a definite 
isospin state. For this it suffices to expand the wave functions $|a_ib_i\rangle$ and 
$|c_id_i\rangle$ in terms of the wave functions $|T,T_3\rangle$, substitute these expansions into 
(136.2) and make use of formulae (136.3)-(136.4). Thus one can establish 
a number of relations between the cross sections for different processes 
corresponding to the same initial and the same final spatial-spin states of the 
particles involved in the scattering. 










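1. As the first example let us consider the scattering of protons by 
protons and of neutrons by protons. First of all, from (134.8) we express 
the wave functions of the initial and final states, i.e. of the systems pp or np, 
in terms of the basis functions of the isotriplet and isosinglet: 

$$|p'p''\rangle = |1,+1\rangle, \quad |n'p''\rangle = \frac{1}{\sqrt2}\left(|1,0\rangle - |0,0\rangle\right), \quad |p'n''\rangle = \frac{1}{\sqrt2}\left(|1,0\rangle + |0,0\rangle\right). \tag{136.5}$$

Then for the process of scattering $p' + p'' \to p' + p''$ we find that 

$$M^{(pp)} = \langle p'p''|p'p''\rangle = \langle 1,+1|1,+1\rangle = M^{(1)}. \tag{136.6}$$

In neutron-proton scattering the following two processes are possible: 
ordinary elastic scattering 

$$n' + p'' \to n' + p''$$

and charge-exchange scattering 

$$n' + p'' \to p' + n''.$$

For the first of these 

$$M^{(np)}_{\rm el} = \langle n'p''|n'p''\rangle = \tfrac12\left[\left(\langle 1,0| - \langle 0,0|\right)\left(|1,0\rangle - |0,0\rangle\right)\right] = \tfrac12\left[M^{(1)} + M^{(0)}\right] \tag{136.7}$$

and for the second 

$$M^{(np)}_{\rm ch.ex} = \langle p'n''|n'p''\rangle = \tfrac12\left[\left(\langle 1,0| + \langle 0,0|\right)\left(|1,0\rangle - |0,0\rangle\right)\right] = \tfrac12\left[M^{(1)} - M^{(0)}\right]. \tag{136.8}$$

The elastic scattering cross sections are proportional to the squares of the 
moduli of the matrix elements, hence 

$$\frac{d\sigma^{(np)}_{\rm el}}{d\Omega} \sim \tfrac14\left|M^{(1)} + M^{(0)}\right|^2, \qquad \frac{d\sigma^{(np)}_{\rm ch.ex}}{d\Omega} \sim \tfrac14\left|M^{(1)} - M^{(0)}\right|^2. \tag{136.9}$$

Summing the last two expressions, we obtain the total neutron-proton 
scattering cross section, which is determined experimentally by the total 
number of protons and neutrons scattered at a given angle. Finally 

$$\frac{d\sigma^{(pp)}}{d\Omega} \sim \left|M^{(1)}\right|^2, \qquad \frac{d\sigma^{(np)}}{d\Omega} \sim \tfrac12\left|M^{(1)}\right|^2 + \tfrac12\left|M^{(0)}\right|^2. \tag{136.10}$$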


Since the angular dependence of the matrix elements $M^{(1)}$ and $M^{(0)}$ can 
be essentially different, then, in spite of charge independence, the behaviour 
of the proton-proton and neutron-proton scattering cross sections as 
functions of the angular variable can be different. Experiment shows that this is 
indeed so. In the energy range of 300-500 MeV in the centre-of-mass system 
the first cross section is almost independent of the scattering angle, whereas 
the second has a minimum at $\theta = \tfrac12\pi$, increasing sharply in the backward 
direction and to a lesser degree in the forward direction. 
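The bookkeeping of example 1 can be sketched in a few lines: each two-nucleon state is stored as its expansion (136.5) in the basis $|T,T_3\rangle$, and a matrix element is assembled from (136.3)-(136.4). The complex numbers used for $M^{(1)}$ and $M^{(0)}$ below are hypothetical and serve only to exercise the formulae.

```python
import math

S = 1 / math.sqrt(2)

# Expansions (136.5) of two-nucleon states in the basis |T, T_3>.
STATES = {
    "pp": {(1, 1): 1.0},
    "np": {(1, 0): S, (0, 0): -S},
    "pn": {(1, 0): S, (0, 0): S},
}


def amplitude(final, initial, m_by_isospin):
    """<final|initial> using (136.3)-(136.4): only equal (T, T_3) contribute."""
    total = 0
    for key, c_initial in STATES[initial].items():
        c_final = STATES[final].get(key, 0.0)
        total += c_final * c_initial * m_by_isospin[key[0]]
    return total


# Hypothetical complex values of M^(1) and M^(0) at some fixed angle and energy.
M = {1: 0.8 + 0.3j, 0: -0.2 + 1.1j}

m_pp = amplitude("pp", "pp", M)       # (136.6)
m_el = amplitude("np", "np", M)       # (136.7)
m_chex = amplitude("pn", "np", M)     # (136.8)

# Sum of elastic and charge-exchange contributions, as in (136.10).
np_total = abs(m_el) ** 2 + abs(m_chex) ** 2
print(abs(m_pp) ** 2, np_total)
```

The interference term between $M^{(1)}$ and $M^{(0)}$ cancels in `np_total`, which is why (136.10) contains only the squared moduli.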

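2. The example of the reaction of π-meson production with formation 
of a deuteron in nucleon-nucleon collisions is somewhat more interesting: 

$$p + p \to d + \pi^+,$$
$$n + p \to d + \pi^0.$$

Since the deuteron isospin is equal to zero (see the beginning of §135), then 

$$|\pi^+d\rangle = |1,+1\rangle, \qquad |\pi^0d\rangle = |1,0\rangle. \tag{136.11}$$

Therefore 

$$M^{(pp)} = \langle\pi^+d|pp\rangle = \langle 1,+1|1,+1\rangle = M^{(1)},$$
$$M^{(np)} = \langle\pi^0d|np\rangle = \frac{1}{\sqrt2}\left[\langle 1,0|\left(|1,0\rangle - |0,0\rangle\right)\right] = \frac{1}{\sqrt2}M^{(1)}. \tag{136.12}$$

From this follows the relation between the cross sections: 

$$\frac{d\sigma^{(pp)}/d\Omega}{d\sigma^{(np)}/d\Omega} = 2, \tag{136.13}$$

which has been confirmed experimentally. 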
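3. The scattering of charged π-mesons by protons is an even more 
interesting case: 

$$\pi^+ + p \to \pi^+ + p,$$
$$\pi^- + p \to \pi^- + p,$$
$$\pi^- + p \to \pi^0 + n.$$

Denoting the matrix elements and cross sections referring to these processes 
by the symbol of the π-meson in the final state and making use of formulae 
(134.15), we obtain 

$$M^{(+)} = \langle\pi^+p|\pi^+p\rangle = \langle\tfrac32,+\tfrac32|\tfrac32,+\tfrac32\rangle = M^{(3/2)},$$
$$M^{(-)} = \langle\pi^-p|\pi^-p\rangle = \left[\sqrt{\tfrac13}\,\langle\tfrac32,-\tfrac12| - \sqrt{\tfrac23}\,\langle\tfrac12,-\tfrac12|\right]\left[\sqrt{\tfrac13}\,|\tfrac32,-\tfrac12\rangle - \sqrt{\tfrac23}\,|\tfrac12,-\tfrac12\rangle\right] = \tfrac13M^{(3/2)} + \tfrac23M^{(1/2)},$$
$$M^{(0)} = \langle\pi^0n|\pi^-p\rangle = \left[\sqrt{\tfrac23}\,\langle\tfrac32,-\tfrac12| + \sqrt{\tfrac13}\,\langle\tfrac12,-\tfrac12|\right]\left[\sqrt{\tfrac13}\,|\tfrac32,-\tfrac12\rangle - \sqrt{\tfrac23}\,|\tfrac12,-\tfrac12\rangle\right] = \frac{\sqrt2}{3}M^{(3/2)} - \frac{\sqrt2}{3}M^{(1/2)},$$

hence 

$$f^{(+)} = f^{(3/2)}, \qquad f^{(-)} = \tfrac13f^{(3/2)} + \tfrac23f^{(1/2)}, \qquad f^{(0)} = \frac{\sqrt2}{3}f^{(3/2)} - \frac{\sqrt2}{3}f^{(1/2)}. \tag{136.14}$$

Eliminating the amplitudes $f^{(3/2)}$ and $f^{(1/2)}$, we have the relation 

$$f^{(+)} - f^{(-)} = \sqrt2\,f^{(0)}, \tag{136.15}$$

from which the so-called triangle relations follow: 

$$\left|\sqrt{\sigma^{(+)}} - \sqrt{\sigma^{(-)}}\right| \le \sqrt{2\sigma^{(0)}} \le \sqrt{\sigma^{(+)}} + \sqrt{\sigma^{(-)}}. \tag{136.16}$$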


Under some additional assumptions regarding the properties of the 
amplitudes, more interesting relations can be obtained between the cross sections. 
Thus in the case $f^{(1/2)} \approx 0$ 

$$\sigma^{(+)} : \sigma^{(-)} : \sigma^{(0)} = 9 : 1 : 2. \tag{136.17}$$

If it is assumed that $f^{(3/2)} \approx f^{(1/2)}$, then 

$$\sigma^{(+)} : \sigma^{(-)} : \sigma^{(0)} = 1 : 1 : 0. \tag{136.18}$$

Finally, if $f^{(3/2)} \approx 0$, then 

$$\sigma^{(+)} : \sigma^{(-)} : \sigma^{(0)} = 0 : 2 : 1. \tag{136.19}$$

Experiment shows that for an energy of 120 MeV of the incident π-meson 
the total cross sections are in the ratio 93 : 11 : 22 ≈ 9 : 1 : 2, i.e. in this energy 
range the scattering in the state with isospin $T=\tfrac32$ is dominating. For 
energies above 200 MeV the amplitude $f^{(1/2)}$ also begins to give a considerable 
contribution. 
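The three limiting ratios and the triangle relations can be reproduced directly from (136.14). The sketch below is illustrative only: the function names are assumptions, cross sections are taken simply as squared moduli of the amplitudes (a common factor is dropped), and `Fraction` inputs are used to keep the limiting cases exact.

```python
import math
from fractions import Fraction


def pion_nucleon_cross_sections(f32, f12):
    """Return (sigma+, sigma-, sigma0) up to a common factor, using (136.14)."""
    f_plus = f32
    f_minus = (f32 + 2 * f12) / 3
    sigma_zero = 2 * abs(f32 - f12) ** 2 / 9     # |sqrt(2)/3 (f32 - f12)|^2
    return abs(f_plus) ** 2, abs(f_minus) ** 2, sigma_zero


def triangle_holds(sigma_plus, sigma_minus, sigma_zero, tol=1e-12):
    """Check the triangle relations (136.16) within a small tolerance."""
    a, b = math.sqrt(sigma_plus), math.sqrt(sigma_minus)
    c = math.sqrt(2 * sigma_zero)
    return abs(a - b) <= c + tol and c <= a + b + tol


one, zero = Fraction(1), Fraction(0)
print(pion_nucleon_cross_sections(one, zero))   # only T = 3/2: ratio 9 : 1 : 2
print(pion_nucleon_cross_sections(one, one))    # equal amplitudes: ratio 1 : 1 : 0
print(pion_nucleon_cross_sections(zero, one))   # only T = 1/2: ratio 0 : 2 : 1

# Hypothetical complex amplitudes; the triangle relations hold for any values.
print(triangle_holds(*pion_nucleon_cross_sections(0.7 - 0.2j, -0.4 + 1.3j)))
```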

There is also another, simpler method of obtaining the relations between 
the cross sections, which does not require knowledge of the Clebsch—Gordan 
coefficients and is especially useful in those cases where their calculation is 
for any reason difficult. It is called the method of invariant amplitudes, and 
will be demonstrated by the example of the scattering of charged π-mesons 
by nucleons. 

Under the assumption of charge independence the isospin $T$ does not 
change in scattering. This means that the total amplitude of the scattering of 
one isomultiplet by another must be an isoscalar. In our case it is constructed 
of the wave functions $N_i$ and $\pi_i^k$ of the initial state and the wave functions $\bar N^i$ 
and $\bar\pi_i^k$ of the final state, which transform according to the conjugate 
representation (we denote by a bar the functions describing the final state; at the 
same time the bar is the symbol of the conjugate representation). The 
matrices of these wave functions are of the form 

$$N_i = \begin{pmatrix} p \\ n \end{pmatrix}, \qquad \bar N^i = (\bar p,\ \bar n),$$
$$\pi_i^k = \begin{pmatrix} \pi^0/\sqrt2 & \pi^+ \\ \pi^- & -\pi^0/\sqrt2 \end{pmatrix}, \qquad \bar\pi_i^k = \begin{pmatrix} \bar\pi^0/\sqrt2 & \bar\pi^- \\ \bar\pi^+ & -\bar\pi^0/\sqrt2 \end{pmatrix}. \tag{136.20}$$

One can form from them the two independent isoscalars 

$$(\bar N^iN_i)(\bar\pi_k^l\pi_l^k) \quad\text{and}\quad \bar N^i\bar\pi_i^k\pi_k^lN_l,$$

so that the amplitude is written in the form of a linear combination 

$$F = f_1\,(\bar N^iN_i)(\bar\pi_k^l\pi_l^k) + f_2\,(\bar N^i\bar\pi_i^k\pi_k^lN_l). \tag{136.21}$$


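From (136.20) we have for the isospin part of the wave functions of the 
particles of the initial and final states 

for the proton          $p \to N_1 = 1$,                              $\bar p \to \bar N^1 = 1$; 
for the neutron         $n \to N_2 = 1$,                              $\bar n \to \bar N^2 = 1$; 
for the π⁺-meson        $\pi^+ \to \pi_1^2 = 1$,                      $\bar\pi^+ \to \bar\pi_2^1 = 1$; 
for the π⁻-meson        $\pi^- \to \pi_2^1 = 1$,                      $\bar\pi^- \to \bar\pi_1^2 = 1$; 
for the π⁰-meson        $\pi^0 \to \pi_1^1 = -\pi_2^2 = 1/\sqrt2$,    $\bar\pi^0 \to \bar\pi_1^1 = -\bar\pi_2^2 = 1/\sqrt2$ 

(all other components are equal to zero). 
Substituting these wave functions into the amplitude (136.21), we obtain 

$$f^{(+)} = f_1, \qquad f^{(-)} = f_1 + f_2, \qquad f^{(0)} = -f_2/\sqrt2, \tag{136.22}$$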


where $f_1$ and $f_2$ are functions of spatial-spin variables, unknown but the same 
for all pion-nucleon processes. From (136.22) there follows, in particular, 
relation (136.15), and therefore also the triangle inequalities. It is known 
from experiment that in the region of high energies and small angles the cross 
section of the charge-exchange process $\pi^- + p \to \pi^0 + n$ is small compared to 
elastic cross sections. Hence assuming $f_2 \approx 0$ we obtain the approximate 
equality of the differential cross sections in the forward direction for the 
elastic scattering of π⁺- and π⁻-mesons by protons: 

$$\left.\frac{d\sigma^{(+)}}{d\Omega}\right|_{\theta=0} \approx \left.\frac{d\sigma^{(-)}}{d\Omega}\right|_{\theta=0}. \tag{136.23}$$


The amplitudes $f_i$ can be expressed in terms of the amplitudes $f^{(T)}$ and vice 
versa. For this it suffices to compare relations (136.22) and (136.14), hence 

$$f_1 = f^{(3/2)}, \qquad f_2 = -\tfrac23f^{(3/2)} + \tfrac23f^{(1/2)}. \tag{136.24}$$
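The method of invariant amplitudes amounts to contracting indices, which is easy to sketch. In the code below the index placement ($N_i$, $\bar N^i$, $\pi_i^k$, $\bar\pi_i^k$, with the first list index playing the role of the subscript) and the ordering of factors in the second isoscalar are assumptions chosen so that the result agrees with (136.22); the function names are illustrative.

```python
import math

S = 1 / math.sqrt(2)

# Isospin components, with the index placement N_i, Nbar^i, pi_i^k, pibar_i^k
# (first list index = subscript, second = superscript); an assumed convention.
NUCLEON = {"p": [1, 0], "n": [0, 1]}
PION_IN = {
    "pi+": [[0, 1], [0, 0]],     # pi_1^2 = 1
    "pi-": [[0, 0], [1, 0]],     # pi_2^1 = 1
    "pi0": [[S, 0], [0, -S]],    # pi_1^1 = -pi_2^2 = 1/sqrt(2)
}
PION_OUT = {
    "pi+": [[0, 0], [1, 0]],     # pibar_2^1 = 1
    "pi-": [[0, 1], [0, 0]],     # pibar_1^2 = 1
    "pi0": [[S, 0], [0, -S]],
}


def invariant_coefficients(pion_in, nucleon_in, pion_out, nucleon_out):
    """Coefficients (c1, c2) such that the amplitude (136.21) is c1*f1 + c2*f2."""
    n, nbar = NUCLEON[nucleon_in], NUCLEON[nucleon_out]
    pi, pibar = PION_IN[pion_in], PION_OUT[pion_out]
    r = range(2)
    c1 = (sum(nbar[i] * n[i] for i in r)
          * sum(pibar[k][l] * pi[l][k] for k in r for l in r))
    c2 = sum(nbar[i] * pibar[i][k] * pi[k][l] * n[l]
             for i in r for k in r for l in r)
    return c1, c2


def amplitude(process, f32, f12):
    """Evaluate c1*f1 + c2*f2 with f1, f2 taken from (136.24)."""
    f1 = f32
    f2 = -2 * f32 / 3 + 2 * f12 / 3
    c1, c2 = invariant_coefficients(*process)
    return c1 * f1 + c2 * f2


PROCESSES = {
    "+": ("pi+", "p", "pi+", "p"),
    "-": ("pi-", "p", "pi-", "p"),
    "0": ("pi-", "p", "pi0", "n"),
}

for label, process in PROCESSES.items():
    print(label, invariant_coefficients(*process))
```

With hypothetical values of $f^{(3/2)}$ and $f^{(1/2)}$, `amplitude` returns the same three numbers as the Clebsch-Gordan expressions (136.14), which is the content of (136.24).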


In concluding this section we indicate a simple method of determining 
the number of independent invariant amplitudes. According to the rule of 
addition of angular momenta we find possible values of the isospin T of the 
initial and final states. By virtue of charge independence only transitions 
with conservation of T are possible, hence the number of independent 
amplitudes is defined by the number of values of the isospin which occur in 
both the initial and final states. In our example the isospins of the initial and 
final states are equal to $\tfrac32$ and $\tfrac12$, for which there are two amplitudes: $f^{(3/2)}$ and 
$f^{(1/2)}$, or $f_1$ and $f_2$. 
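The counting rule just described is simple enough to automate. The helper names below are illustrative, not taken from the text; isospins are passed as exact fractions.

```python
from fractions import Fraction


def isospin_values(t_a, t_b):
    """Total isospins |t_a - t_b|, ..., t_a + t_b allowed by the addition rule."""
    t_a, t_b = Fraction(t_a), Fraction(t_b)
    values = []
    t = abs(t_a - t_b)
    while t <= t_a + t_b:
        values.append(t)
        t += 1
    return values


def shared_isospins(initial, final):
    """Isospin values present in both the initial and the final two-particle state."""
    return sorted(set(isospin_values(*initial)) & set(isospin_values(*final)))


half = Fraction(1, 2)
print(shared_isospins((1, half), (1, half)))          # pion-nucleon scattering
print(shared_isospins((half, half), (half, half)))    # nucleon-nucleon scattering
print(shared_isospins((half, half), (0, 1)))          # N + N -> d + pi
```

The number of independent amplitudes is the length of the returned list: two for pion-nucleon and nucleon-nucleon scattering, and one for deuteron production, in agreement with the three examples of this section.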


§ 137. The unitary group SU(3) and its representations 


In the preceding sections we studied some consequences of the isospin 
invariance of the strong interaction based on the group SU(2), the one most 
suitable for the description of the symmetry properties of nucleons and 
π-mesons. However, after the discovery of strange particles the framework of 
this group turned out to be too narrow, because its rank is equal to 1 and it 
gives only one conserved additive quantum number (the isospin component 
$T_3$) by means of which the terms of a given isomultiplet are classified. By an 
additive quantum number is meant a quantity whose value for a certain 
system is equal to the sum of its values for the subsystems. In this sense, T is 
not an additive characteristic. At the same time there is at least one more 
characteristic of this type, the hypercharge Y (or strangeness), so that it is 
natural to try to group several isomultiplets with different hypercharges into 
one supermultiplet. For this a group of rank 2 is necessary, the mutually 
commuting generators of which give two simultaneously measurable 
conserved quantum numbers characterizing the terms of the supermultiplet, so 
that they can be identified with $T_3$ and $Y$. On the other hand, from the 
requirement that the new theory contain the results of the old it follows that 
the isogroup must be a part (subgroup) of a new larger symmetry group. 
These requirements can most simply and naturally be satisfied if one 
postulates the approximate invariance of the strong interaction with respect to the 
group SU(3) which we shall henceforth call unitary (in the narrow sense of 
the word)*. The corresponding mathematical apparatus is very close to the 
formalism presented in §132, the results of which we shall frequently refer to. 

The group SU(3) is understood to be the set of all unitary and unimodular 
matrices of third order which correspond to linear transformations in a 
three-dimensional complex space conserving the quadratic form 
$x^{+}x = x_i^*x_i = x_1^*x_1 + x_2^*x_2 + x_3^*x_3$ (the indices $i$, $j$ and so on now run over the values 1, 2, 3). The 
generators of this group are Hermitian square 3×3 matrices $\lambda_\alpha$ with trace 
zero: 

$$\lambda_\alpha^{+} = \lambda_\alpha, \qquad {\rm Tr}\,\lambda_\alpha = 0. \tag{137.1}$$

Nine conditions of hermiticity and 1 condition of equality to zero of the 
trace are imposed upon 18 real parameters of the complex 3×3 matrices, so 
that there are 8 independent matrices with the properties (137.1), which 
defines the dimension of the group SU(3). Among the matrices $\lambda_\alpha$ there are 
two mutually commuting ones (the rank of the group SU(3) is equal to 2) 
which can simultaneously be diagonalized. We choose the representation in 
which $\lambda_3$ and $\lambda_8$ are diagonal: 

$$\lambda_1 = \begin{pmatrix} 0&1&0\\ 1&0&0\\ 0&0&0 \end{pmatrix}, \quad \lambda_2 = \begin{pmatrix} 0&-i&0\\ i&0&0\\ 0&0&0 \end{pmatrix}, \quad \lambda_3 = \begin{pmatrix} 1&0&0\\ 0&-1&0\\ 0&0&0 \end{pmatrix}, \quad \lambda_4 = \begin{pmatrix} 0&0&1\\ 0&0&0\\ 1&0&0 \end{pmatrix},$$
$$\lambda_5 = \begin{pmatrix} 0&0&-i\\ 0&0&0\\ i&0&0 \end{pmatrix}, \quad \lambda_6 = \begin{pmatrix} 0&0&0\\ 0&0&1\\ 0&1&0 \end{pmatrix}, \quad \lambda_7 = \begin{pmatrix} 0&0&0\\ 0&0&-i\\ 0&i&0 \end{pmatrix}, \quad \lambda_8 = \frac{1}{\sqrt3}\begin{pmatrix} 1&0&0\\ 0&1&0\\ 0&0&-2 \end{pmatrix}. \tag{137.2}$$

Hence it follows that if all parameters $\omega_\alpha$ of the group SU(3), except for the 
first three, are set equal to zero, then we obtain the isogroup SU(2). 
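The properties (137.1) and the statement about the rank can be checked directly for the matrices (137.2). The following sketch uses nested lists and hand-written helpers (their names are assumptions) so that no external library is needed.

```python
import math


def gell_mann_matrices():
    """The eight matrices of (137.2) as nested lists (index 0 holds lambda_1)."""
    s = 1 / math.sqrt(3)
    return [
        [[0, 1, 0], [1, 0, 0], [0, 0, 0]],
        [[0, -1j, 0], [1j, 0, 0], [0, 0, 0]],
        [[1, 0, 0], [0, -1, 0], [0, 0, 0]],
        [[0, 0, 1], [0, 0, 0], [1, 0, 0]],
        [[0, 0, -1j], [0, 0, 0], [1j, 0, 0]],
        [[0, 0, 0], [0, 0, 1], [0, 1, 0]],
        [[0, 0, 0], [0, 0, -1j], [0, 1j, 0]],
        [[s, 0, 0], [0, s, 0], [0, 0, -2 * s]],
    ]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def trace(a):
    return sum(a[i][i] for i in range(3))


def is_hermitian(a, tol=1e-12):
    return all(abs(a[i][j] - complex(a[j][i]).conjugate()) < tol
               for i in range(3) for j in range(3))


def commutator_norm(a, b):
    """Largest modulus among the elements of ab - ba."""
    ab, ba = matmul(a, b), matmul(b, a)
    return max(abs(ab[i][j] - ba[i][j]) for i in range(3) for j in range(3))


LAMBDA = gell_mann_matrices()
print(all(is_hermitian(m) for m in LAMBDA))      # hermiticity, (137.1)
print(max(abs(trace(m)) for m in LAMBDA))        # all traces vanish
print(commutator_norm(LAMBDA[2], LAMBDA[7]))     # lambda_3 and lambda_8 commute
print(commutator_norm(LAMBDA[0], LAMBDA[1]))     # lambda_1 and lambda_2 do not
```

The first three matrices have the Pauli matrices in their upper-left 2×2 block and zeros elsewhere, which is how the isogroup SU(2) sits inside SU(3).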

The representations of the group SU(3) are constructed in a completely 
analogous way to §132. However, the mutually conjugate representations 
will now not be equivalent, so that the given irreducible representation $(p,q)$ 
is defined by two numbers, $p$ and $q$ (rather than one number $p+q$). The 
definitions and relations (132.13)-(132.16) remain valid as before if $\tau_\alpha$ is 
replaced in them by $\lambda_\alpha$ and if it is assumed that the Roman indices run over 
values from 1 to 3, and Greek indices from 1 to 8. 

* It should be noted that in 1961-1964 the situation was not completely clear, 
since it was impossible to make an unambiguous choice between SU(3) and the 
so-called group G2; some preference even was given to the latter. The problem was finally 
solved in 1964 when the Ω⁻-hyperon was discovered (see §138). 

Let us find the number of independent components of the matrix $\varphi^{j_1\ldots j_q}_{i_1\ldots i_p}$ 
which is symmetric separately in all subscripts and superscripts and which has 
zero trace over any pair of superscript and subscript, i.e. let us determine the 
dimension $N(p,q)$ of the irreducible representation $(p,q)$. By virtue of 
symmetry only components differing in one of the numbers $p_1$, $p_2$, $p_3$ (the 
numbers of ones, twos and threes among the subscripts) are different. Since 
$p_1+p_2+p_3 = p$, then for a given $p_1$ the number $p_2$ can vary from 0 up to 
$p-p_1$, which gives $p-p_1+1$ different components. Now varying $p_1$ from 0 
to $p$, we obtain the total number of different components for fixed 
superscripts: 

$$N(p) = \sum_{p_1=0}^{p}(p-p_1+1) = \tfrac12(p+1)(p+2).$$

Analogously, the number of different components for fixed subscripts is 
equal to $N(q) = \tfrac12(q+1)(q+2)$; there are in all $N(p)N(q)$ components. But they 
are still not independent because they are related by the conditions that the 
traces are equal to zero. By virtue of symmetry it is sufficient that the trace 
with respect to one of the pairs of superscript and subscript reduce to zero; 
this trace will be a matrix with $p-1$ subscripts and $q-1$ superscripts, having 
thus $N(p-1)N(q-1)$ components. Thus $N(p,q) = N(p)N(q) - N(p-1)N(q-1)$ 
and, finally, 

$$N(p,q) = \tfrac12(p+1)(q+1)(p+q+2). \tag{137.3}$$
Let us enumerate the most important representations of the group SU(3): 

$\varphi$                                          $(0,0)=1$                                       unitary scalar or singlet ($N=1$), 
$\varphi_i$ and $\varphi^i$                        $(1,0)=3$ and $(0,1)=\bar 3$                    unitary spinors or triplets ($N=3$), 
$\varphi_{ij}$ and $\varphi^{ij}$                  $(2,0)=6$ and $(0,2)=\bar 6$                    sextets ($N=6$), 
$\varphi_i^j$                                      $(1,1)=8$                                       unitary vector or octet ($N=8$), 
$\varphi_{ijk}$ and $\varphi^{ijk}$                $(3,0)=10$ and $(0,3)=\overline{10}$            decuplets ($N=10$), 
$\varphi_{ij}^{k}$ and $\varphi_{k}^{ij}$          $(2,1)=15$ and $(1,2)=\overline{15}$            15-plets ($N=15$), 
$\varphi_{ij}^{kl}$                                $(2,2)=27$                                      ($N=27$), 

and so on. A very convenient notation is given here, which immediately 
indicates the dimension of the irreducible representation. Attention should be 
drawn, for example, to the representations (2,0) and (1,1); the corresponding 
wave functions have the same number of indices (namely 2), but the 
dimensions of the multiplets are different (6 and 8 respectively). Such a situation 
cannot be encountered in the group SU(2) by virtue of the equivalence of its 
mutually conjugate representations. 
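The counting argument that leads to (137.3) can be reproduced by brute force: symmetric index sets are enumerated explicitly and the trace conditions are subtracted. The function names in this sketch are illustrative.

```python
from itertools import combinations_with_replacement


def n_symmetric(p):
    """Number of distinct components of a tensor symmetric in p indices 1, 2, 3."""
    if p < 0:
        return 0
    return sum(1 for _ in combinations_with_replacement((1, 2, 3), p))


def dimension_by_counting(p, q):
    """N(p)N(q) - N(p-1)N(q-1): all components minus the trace conditions."""
    return n_symmetric(p) * n_symmetric(q) - n_symmetric(p - 1) * n_symmetric(q - 1)


def dimension(p, q):
    """Closed form (137.3)."""
    return (p + 1) * (q + 1) * (p + q + 2) // 2


for p, q in [(0, 0), (1, 0), (2, 0), (1, 1), (3, 0), (2, 1), (2, 2)]:
    print((p, q), dimension_by_counting(p, q), dimension(p, q))
```

Both columns of the printed table give the dimensions 1, 3, 6, 8, 10, 15 and 27 quoted in the list above.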

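Let us now write down the decompositions of some direct products of 
representations into irreducible representations: 

$$(1,0)\otimes(0,1) = (0,0)\oplus(1,1) \quad\text{or}\quad 3\otimes\bar 3 = 1\oplus 8, \tag{137.4a}$$
$$(1,0)\otimes(1,0)\otimes(1,0) = (0,0)\oplus(1,1)\oplus(1,1)\oplus(3,0) \quad\text{or}\quad 3\otimes 3\otimes 3 = 1\oplus 8\oplus 8\oplus 10, \tag{137.4b}$$
$$(1,0)\otimes(1,0)\otimes(0,1) = (1,0)\oplus(1,0)\oplus(0,2)\oplus(2,1) \quad\text{or}\quad 3\otimes 3\otimes\bar 3 = 3\oplus 3\oplus\bar 6\oplus 15, \tag{137.4c}$$
$$(1,1)\otimes(1,1) = (0,0)\oplus(1,1)\oplus(1,1)\oplus(3,0)\oplus(0,3)\oplus(2,2) \quad\text{or}\quad 8\otimes 8 = 1\oplus 8\oplus 8\oplus 10\oplus\overline{10}\oplus 27, \tag{137.4d}$$
$$(1,1)\otimes(3,0) = (1,1)\oplus(3,0)\oplus(2,2)\oplus(4,1) \quad\text{or}\quad 8\otimes 10 = 8\oplus 10\oplus 27\oplus 35. \tag{137.4e}$$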


Let us prove, for example, decompositions (137.4a) and (137.4d). In the first 
case the function $\varphi_i\chi^k$ contains the non-zero trace $\varphi_i\chi^i$ which is a scalar 
(representation (0,0)); the function $\varphi_i\chi^k - \tfrac13\delta_i^k\varphi_l\chi^l$, which remains after the 
separation of the non-zero trace, transforms according to the irreducible 
representation (1,1). The proof of formula (137.4d) is somewhat more 
complex. First of all, if we symmetrize the function $\varphi_i^j\chi_k^l$ with respect to the 
subscripts and superscripts and separate non-zero traces, then we obtain 
representation (2,2). Further, two different traces with respect to only one of 
the pairs of indices (we recall that $\varphi_i^i = \chi_i^i = 0$) after the separation of the 
non-zero traces with respect to the remaining pair of indices, i.e. the two 
functions 

$$\varphi_i^l\chi_l^k - \tfrac13\delta_i^k\varphi_m^l\chi_l^m \quad\text{and}\quad \varphi_l^k\chi_i^l - \tfrac13\delta_i^k\varphi_l^m\chi_m^l,$$

transform according to the two representations (1,1) and (1,1)'. The total 
trace $\varphi_i^k\chi_k^i$ is a unitary scalar, so that we have the representation (0,0). Finally, 
on separating symmetric parts there arise two functions of the type 

$$\psi_{[ik]}^{jl} = \varphi_i^j\chi_k^l + \varphi_i^l\chi_k^j - \varphi_k^j\chi_i^l - \varphi_k^l\chi_i^j \quad\text{and}\quad \psi_{ik}^{[jl]} = \varphi_i^j\chi_k^l + \varphi_k^j\chi_i^l - \varphi_i^l\chi_k^j - \varphi_k^l\chi_i^j,$$

containing 10 independent components each. It can be shown that the lower 
pair $[ik]$ is equivalent to one upper index and vice versa, hence we obtain 
the representations (3,0) and (0,3). We can convince ourselves of the validity 
of this decomposition by comparing the dimensions on the left-hand and 
right-hand sides of formula (137.4d): 8 × 8 = 1+8+8+10+10+27 = 64. Other 
formulae (137.4) are proved in an analogous way. 
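The dimension check made above for (137.4d) can be repeated for all five decompositions with the help of (137.3). This is only a consistency test on dimensions, not a proof of the decompositions; the data structure below is an illustrative assumption, and the 35-dimensional term in (137.4e) is taken to be (4,1), the representation for which (137.3) gives 35.

```python
from math import prod


def dimension(p, q):
    """Dimension (137.3) of the irreducible representation (p, q) of SU(3)."""
    return (p + 1) * (q + 1) * (p + q + 2) // 2


# label -> (factors of the direct product, terms of the direct sum)
DECOMPOSITIONS = {
    "137.4a": ([(1, 0), (0, 1)], [(0, 0), (1, 1)]),
    "137.4b": ([(1, 0)] * 3, [(0, 0), (1, 1), (1, 1), (3, 0)]),
    "137.4c": ([(1, 0), (1, 0), (0, 1)], [(1, 0), (1, 0), (0, 2), (2, 1)]),
    "137.4d": ([(1, 1), (1, 1)], [(0, 0), (1, 1), (1, 1), (3, 0), (0, 3), (2, 2)]),
    "137.4e": ([(1, 1), (3, 0)], [(1, 1), (3, 0), (2, 2), (4, 1)]),
}


def dimensions_match(factors, terms):
    """True if the product of the factor dimensions equals the sum over the terms."""
    left = prod(dimension(p, q) for p, q in factors)
    right = sum(dimension(p, q) for p, q in terms)
    return left == right


for label, (factors, terms) in DECOMPOSITIONS.items():
    print(label, dimensions_match(factors, terms))
```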

The group SU(3) has two invariant operators whose eigenvalues serve for 
the classification of irreducible representations. But we have already un- 
ambiguously characterized the representations also by two numbers p and q. 
Hence we shall not write down and analyse the invariant operators; we shall 
not need them in what follows. 

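For the classification of the basis elements of a multiplet, which can be 
chosen to be definite combinations of the matrices $\varphi^{j_1\ldots j_q}_{i_1\ldots i_p}$ with only one 
non-zero element, we shall make use of the eigenvalues of the diagonal 
generators $\Lambda_3$, $\Lambda_8$. From the explicit expressions of the type (132.15) for the 
generators $\Lambda_\alpha$ it follows that they act on the functions $\varphi^{j_1\ldots j_q}_{i_1\ldots i_p}$ with one 
non-zero element in the following way: 

$$\Lambda_\alpha\varphi^{j_1\ldots j_q}_{i_1\ldots i_p} = \sum_{s=1}^{p}(\lambda_\alpha)_{i_sk}\,\varphi^{j_1\ldots j_q}_{i_1\ldots i_{s-1}ki_{s+1}\ldots i_p} - \sum_{s=1}^{q}(\lambda_\alpha)_{kj_s}\,\varphi^{j_1\ldots j_{s-1}kj_{s+1}\ldots j_q}_{i_1\ldots i_p}. \tag{137.5}$$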

Let there be among $p$ subscripts $p_3$ indices 3 and $p-p_3$ indices 1 and 2, and 
among $q$ superscripts $q_3$ indices 3 and $q-q_3$ indices 1 and 2. Since the 
matrix $\lambda_8$ gives $1/\sqrt3$ in application to each superscript and subscript 1, 2, 
and $-2/\sqrt3$ in application to any index 3, then according to (137.5) the 
eigenvalues of the generator $\Lambda_8$ are equal to 

$$\Lambda_8 = \sqrt3\left[\tfrac13(p-q) - p_3 + q_3\right]. \tag{137.6}$$

Analogously, for the eigenvalues of the generator $\Lambda_3$ we have 

$$\Lambda_3 = p_1 - p_2 - q_1 + q_2. \tag{137.7}$$

If no threes were present among the indices, i.e. if the group SU(2) were 
considered, then the relations $p_1+p_2 = p$ and $q_1+q_2 = q$ would be fulfilled. 
In this case formula (137.7) would go over into (132.25). 
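Formulae (137.6) and (137.7) can be compared with a direct summation of the diagonal elements of $\lambda_3$ and $\lambda_8$ over the indices of a basis tensor. The sketch assumes, in line with the sign structure of (137.6) and (137.7), that subscripts contribute with a plus sign and superscripts with a minus sign; to stay in exact arithmetic it works with $\Lambda_8/\sqrt3$ rather than $\Lambda_8$.

```python
from fractions import Fraction
from itertools import combinations_with_replacement

LAMBDA3_DIAGONAL = {1: 1, 2: -1, 3: 0}
SQRT3_LAMBDA8_DIAGONAL = {1: 1, 2: 1, 3: -2}     # diagonal of sqrt(3) * lambda_8


def direct_sum(subscripts, superscripts):
    """(Lambda_3, Lambda_8 / sqrt(3)): subscripts add, superscripts subtract."""
    l3 = (sum(LAMBDA3_DIAGONAL[i] for i in subscripts)
          - sum(LAMBDA3_DIAGONAL[j] for j in superscripts))
    l8 = (sum(SQRT3_LAMBDA8_DIAGONAL[i] for i in subscripts)
          - sum(SQRT3_LAMBDA8_DIAGONAL[j] for j in superscripts))
    return l3, Fraction(l8, 3)


def closed_form(subscripts, superscripts):
    """The same pair from (137.7) and (137.6)."""
    p, q = len(subscripts), len(superscripts)
    p1, p2, p3 = (subscripts.count(v) for v in (1, 2, 3))
    q1, q2, q3 = (superscripts.count(v) for v in (1, 2, 3))
    return p1 - p2 - q1 + q2, Fraction(p - q, 3) - p3 + q3


# Compare both evaluations for every index content with p, q <= 3.
mismatches = [
    (subs, sups)
    for p in range(4) for q in range(4)
    for subs in combinations_with_replacement((1, 2, 3), p)
    for sups in combinations_with_replacement((1, 2, 3), q)
    if direct_sum(subs, sups) != closed_form(subs, sups)
]
print(len(mismatches))
```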

Two quantum numbers $\Lambda_3$, $\Lambda_8$ are not sufficient, however, for the 
unambiguous classification of the basis elements of a given multiplet. They are 
complemented, say, by the eigenvalue of the operator $\Lambda^2 = \Lambda_1^2 + \Lambda_2^2 + \Lambda_3^2$. 
From (137.2) it follows that 

$$\lambda^2 = \lambda_1^2 + \lambda_2^2 + \lambda_3^2 = 3\begin{pmatrix} 1&0&0\\ 0&1&0\\ 0&0&0 \end{pmatrix}, \tag{137.8}$$

and hence the matrix $\lambda^2$ does not commute with all the matrices $\lambda_\alpha$. By virtue 
of the fact that the expressions for the generators are completely analogous 
to those given in §132 (see (132.15)) this statement is valid also for $\Lambda^2$. For 
this reason the situation with the eigenvalues of the operator $\Lambda^2$ is somewhat 
more complex, since there may correspond several eigenvalues of the operator 
$\Lambda^2$ to the component $\varphi^{j_1\ldots j_q}_{i_1\ldots i_p}$ referring to the eigenvalues $\Lambda_3 = \Lambda_8 = 0$, i.e. a 
peculiar degeneracy occurs. It is easily shown that 

$$\Lambda^2\varphi^{j_1\ldots j_q}_{i_1\ldots i_p} = (p_1+p_2+q_1+q_2)(p_1+p_2+q_1+q_2+2)\,\varphi^{j_1\ldots j_q}_{i_1\ldots i_p}, \tag{137.9}$$

but this statement will be valid generally only if $\Lambda_3 \neq 0$ or $\Lambda_8 \neq 0$. In the case 
$\Lambda_3 = \Lambda_8 = 0$ the factor in the right-hand side of (137.9) gives, however, the 
maximum eigenvalue of the operator $\Lambda^2$. 


§138. The eightfold way formalism and unitary multiplets 


Suppose that in nature there were a superstrong interaction of elementary 
particles invariant with respect to the group SU(3). Then the wave functions 
of hadrons would transform according to some of its irreducible 
representations, i.e. they would be the products of ordinary spatial-spin wave functions with 
the unitary matrices $\varphi^{j_1\ldots j_q}_{i_1\ldots i_p}$. As a result all hadrons would be distributed in 
unitary multiplets, characterized by a pair of numbers $p$ and $q$, spin, parity, 
baryonic number and other quantities not associated with the group SU(3) 
(see §133). Individual hadrons within a multiplet are classified by the 
eigenvalues of the generators $\Lambda_3$ and $\Lambda_8$ and of the operator $\Lambda^2$, which will be 
identified below with $T_3$, $Y$ and $T^2$. 

If it is assumed that there is only the superstrong interaction, then these 
quantum numbers must be conserved, the transitions occurring as a result of 
the reactions being possible only within one unitary multiplet, as in the case 
of isospin invariance. Furthermore, quantum-mechanical degeneracy must 
result, and the masses of the particles contained in one unitary multiplet must 
all be strictly the same. On the other hand, experiment shows that the masses 
of the particles having different hypercharge (for the same spin, parity and 
baryonic number) differ sharply: for example, the mass difference between 
the nucleon and Ξ-hyperon amounts to about 30% of the mass of the latter. 
This means that the ordinary strong interaction must already essentially 
violate the SU(3) invariance, and to a much larger degree than the 
electromagnetic interaction violates the isospin symmetry. But in the strong 
interaction $Y$, $T$ and $T_3$ are still conserved, and this allows one to obtain the 
transformation properties of the Hamiltonian of the interaction violating the 
symmetry with respect to unitary transformations from the group SU(3). 
This in its turn makes it possible to obtain definite relations between the 
masses of the particles contained in a given unitary multiplet. 

Instead of the generators $\Lambda_1$, $\Lambda_2$, $\Lambda_3$ and $\Lambda_8$ it is convenient to introduce 
the operators 

$$\hat T_1 = \tfrac12\Lambda_1, \qquad \hat T_2 = \tfrac12\Lambda_2, \qquad \hat T_3 = \tfrac12\Lambda_3 \tag{138.1}$$

and 

$$\hat Y = \frac{1}{\sqrt3}\Lambda_8. \tag{138.2}$$

It is natural to identify the operators $\hat T_1$, $\hat T_2$, $\hat T_3$ with the operators of the 
isospin components, and the operator 

$$\hat T^2 = \hat T_1^2 + \hat T_2^2 + \hat T_3^2 = \tfrac14\Lambda^2 \tag{138.3}$$

with the operator of the isospin squared. From (137.7) and (137.9) it follows 
that the eigenvalues of $\hat T_3$ and $\hat T^2$ are equal to (see however the remark at the 
end of §137) 

$$T_3 = \tfrac12(p_1 - p_2 - q_1 + q_2) \tag{138.4}$$

and 

$$T(T+1) = \tfrac14(p_1+p_2+q_1+q_2)(p_1+p_2+q_1+q_2+2). \tag{138.5}$$

The operator $\hat Y = \Lambda_8/\sqrt3$ can be identified with the hypercharge operator, 
so that, according to (137.6), 

$$Y = \tfrac13(p-q) - p_3 + q_3. \tag{138.6}$$

Such an identification is not unambiguous and corresponds to the so-called 
eightfold formalism or eightfold way proposed in 1961 by Gell-Mann and 
independently by Ne'eman. Its suitability is justified a posteriori, since within 
the framework of the eightfold way the real elementary particles are 
described in the best of all possible ways. Another choice of the operator $\hat Y$ is 
considered in §141. 










From (138.6) and the requirement that the hypercharge has an integer 
value it follows that it is necessary to consider only those representations of 
the group SU(3) for which the difference $p-q$ is a multiple of 3, 

$$p - q = 3n, \tag{138.7}$$

i.e. the representations 

$$(0,0)=1, \quad (1,1)=8, \quad (3,0)=10, \quad (0,3)=\overline{10}, \quad (2,2)=27, \tag{138.8}$$

and so on. In view of the importance of the representation (1,1) = 8 in the 
approach considered, it is called the eightfold formalism. Finally, we note 
that in correspondence with the Gell-Mann-Nishijima relation (§§129, 133) 
we assume by definition that the charge operator is 

$$\hat Q = \hat T_3 + \tfrac12\hat Y = \tfrac12\Lambda_3 + \frac{1}{2\sqrt3}\Lambda_8. \tag{138.9}$$


Let us now consider concrete unitary multiplets of hadrons. We turn first of 
all to the octet with the wave function $\varphi_i^k$ and investigate its content in 
isospin $T$, its component $T_3$, hypercharge $Y$ and electric charge $Q$. Making use of 
relations (138.3)-(138.6) and (138.9), one can immediately draw up table 5. 
In the last three columns are shown the stable particles, pseudoscalar and 
vector mesons and baryons with spin $\tfrac12$, which have the corresponding 
quantum numbers. 

Table 5 

Component        $T$      $T_3$    $Y$    $Q$    Meson $0^-$       Meson $1^-$              Baryon $\tfrac12^+$ 
$\varphi_1^1$    1, 0*    0        0      0      $\pi^0, \eta$     $\rho^0, \omega, \varphi$    $\Sigma^0, \Lambda$ 
$\varphi_2^2$    1, 0*    0        0      0      $\pi^0, \eta$     $\rho^0, \omega, \varphi$    $\Sigma^0, \Lambda$ 
$\varphi_1^2$    1        +1       0      +1     $\pi^+$           $\rho^+$                 $\Sigma^+$ 
$\varphi_2^1$    1        −1       0      −1     $\pi^-$           $\rho^-$                 $\Sigma^-$ 
$\varphi_1^3$    1/2      +1/2     +1     +1     $K^+$             $K^{*+}$                 $p$ 
$\varphi_2^3$    1/2      −1/2     +1     0      $K^0$             $K^{*0}$                 $n$ 
$\varphi_3^1$    1/2      −1/2     −1     −1     $K^-$             $K^{*-}$                 $\Xi^-$ 
$\varphi_3^2$    1/2      +1/2     −1     0      $\bar K^0$        $\bar K^{*0}$            $\Xi^0$ 
$\varphi_3^3$    0        0        0      0      $\eta$            $\omega, \varphi$        $\Lambda$ 

It is remarkable that all stable pseudoscalar mesons, whose 
number is just 8, and all 8 stable baryons (except for $\Omega^-$ whose spin is equal 
to $\tfrac32$) are involved here. Hence it is natural to assume that these two groups of 
particles just form unitary octets. The matrices of their wave functions have 
the following form: 


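$$B_i^k = \begin{pmatrix} \Sigma^0/\sqrt2 + \Lambda/\sqrt6 & \Sigma^+ & p \\ \Sigma^- & -\Sigma^0/\sqrt2 + \Lambda/\sqrt6 & n \\ \Xi^- & \Xi^0 & -2\Lambda/\sqrt6 \end{pmatrix},$$
$$M_i^k = \begin{pmatrix} \pi^0/\sqrt2 + \eta/\sqrt6 & \pi^+ & K^+ \\ \pi^- & -\pi^0/\sqrt2 + \eta/\sqrt6 & K^0 \\ K^- & \bar K^0 & -2\eta/\sqrt6 \end{pmatrix}. \tag{138.10}$$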


The choice of just such coefficients is dictated by the requirement that the 
trace of the matrix be equal to zero and by the normalization considerations 
(see §133). The meson octet contains particles together with antiparticles. 
This is accounted for by the fact that there now remain no quantum numbers 
which are not involved in the group SU(3) by means of which one could 
distinguish, say, between $K^+$ and $K^-$ (see §133). For baryons, however, such 
a number exists (the baryonic number $B$), so that the corresponding 
antiparticles form an independent octet: 

$$\bar B_i^k = \begin{pmatrix} \bar\Sigma^0/\sqrt2 + \bar\Lambda/\sqrt6 & \bar\Sigma^- & \bar\Xi^- \\ \bar\Sigma^+ & -\bar\Sigma^0/\sqrt2 + \bar\Lambda/\sqrt6 & \bar\Xi^0 \\ \bar p & \bar n & -2\bar\Lambda/\sqrt6 \end{pmatrix}. \tag{138.11}$$
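The assignments in table 5 and the positions of the baryons in the matrix (138.10) can be cross-checked with formulae (138.4), (138.6) and (138.9). In this sketch the row number of the matrix is treated as the subscript and the column number as the superscript, which is an assumption about the index convention; the particle labels are plain strings chosen for illustration.

```python
from fractions import Fraction


def quantum_numbers(subscripts, superscripts):
    """(T_3, Y, Q) of a basis tensor from (138.4), (138.6) and (138.9)."""
    p, q = len(subscripts), len(superscripts)
    t3 = Fraction(subscripts.count(1) - subscripts.count(2)
                  - superscripts.count(1) + superscripts.count(2), 2)
    y = Fraction(p - q, 3) - subscripts.count(3) + superscripts.count(3)
    return t3, y, t3 + y / 2


# Off-diagonal entries of the baryon matrix (138.10): (row i, column k) -> particle.
BARYONS = {
    (1, 2): "Sigma+", (1, 3): "p",
    (2, 1): "Sigma-", (2, 3): "n",
    (3, 1): "Xi-", (3, 2): "Xi0",
}

for (i, k), name in BARYONS.items():
    t3, y, charge = quantum_numbers((i,), (k,))
    print(f"{name:7s} T3={t3!s:>4} Y={y!s:>2} Q={charge!s:>2}")
```

The diagonal components $\varphi_1^1$, $\varphi_2^2$, $\varphi_3^3$ all give $T_3 = Y = Q = 0$, which is why the neutral particles appear there only as linear combinations.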




Unitary multiplets are conveniently presented in diagrams called weight 
diagrams. In this case an orthogonal reference frame is chosen, the 
hypercharge being plotted on one axis and the isospin component on the other axis. 
Thus, for example, for the baryon octet we have fig. V.39. The situation with 
vector mesons is somewhat more complex, because at a place analogous to 
that occupied by the η-meson or Λ-hyperon there are two pretenders, ω and 
φ. We shall discuss it at the end of this section. 

Let us now draw up the table of quantum numbers of the particles which 
can be contained in the decuplet whose wave function is $\varphi_{ijk}$: table 6. 








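Table 6 

Component          $T$      $T_3$    $Y$    $Q$    Baryon $\tfrac32^+$ 
$\varphi_{111}$    3/2      +3/2     +1     +2     $\Delta^{++}_{1236}$ 
$\varphi_{112}$    3/2      +1/2     +1     +1     $\Delta^{+}_{1236}$ 
$\varphi_{122}$    3/2      −1/2     +1     0      $\Delta^{0}_{1236}$ 
$\varphi_{222}$    3/2      −3/2     +1     −1     $\Delta^{-}_{1236}$ 
$\varphi_{113}$    1        +1       0      +1     $\Sigma^{+}_{1385}$ 
$\varphi_{123}$    1        0        0      0      $\Sigma^{0}_{1385}$ 
$\varphi_{223}$    1        −1       0      −1     $\Sigma^{-}_{1385}$ 
$\varphi_{133}$    1/2      +1/2     −1     0      $\Xi^{0}_{1530}$ 
$\varphi_{233}$    1/2      −1/2     −1     −1     $\Xi^{-}_{1530}$ 
$\varphi_{333}$    0        0        −2     −1     ? 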
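The quantum numbers in table 6 follow from (138.4)-(138.6) and (138.9) applied to the ten symmetric index triples. The sketch below regenerates them; the function name is illustrative, and the isospin is obtained from (138.5) with $q = 0$, which gives $T = \tfrac12(p_1+p_2)$.

```python
from fractions import Fraction
from itertools import combinations_with_replacement


def decuplet_row(indices):
    """(T, T_3, Y, Q) for the symmetric component with the given three subscripts."""
    p1, p2, p3 = (indices.count(v) for v in (1, 2, 3))
    t = Fraction(p1 + p2, 2)              # from (138.5) with q = 0
    t3 = Fraction(p1 - p2, 2)             # (138.4)
    y = Fraction(len(indices), 3) - p3    # (138.6)
    return t, t3, y, t3 + y / 2           # charge from (138.9)


DECUPLET = {
    idx: decuplet_row(idx)
    for idx in combinations_with_replacement((1, 2, 3), 3)
}

for idx, row in DECUPLET.items():
    print("".join(map(str, idx)), *row)
```

The component 333 comes out with $T = 0$, $Y = -2$ and $Q = -1$, the set of quantum numbers standing next to the question mark in the table.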





At the time when an analogous table was drawn up (1961-1962) the
place at which the question mark stands was unoccupied, because no
particle with hypercharge $-2$ was known. Thus the eightfold way predicted
the existence of a new hyperon with spin $3/2^+$ and with the quantum numbers
indicated in the table. Moreover, the mass of this particle (about 1680 MeV)
was also known approximately (see §140). At the beginning of 1964 such a
particle was indeed discovered: this is the rather stable $\Omega^-$-hyperon
(see the table of elementary particles given in §129). This fact eliminated
any doubt as to the validity of the eightfold way and, in general, of unitary
symmetry, which is now as classical as isospin symmetry.
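
The entries of table 6 can be reproduced by simple index counting. The following is a minimal sketch, not the book's own procedure: it assumes that each index value 1, 2, 3 contributes additively the $(T_3, Y)$ values $(+1/2, 1/3)$, $(-1/2, 1/3)$, $(0, -2/3)$ of the fundamental triplet, and that the charge follows from $Q = T_3 + Y/2$. The helper name `decuplet_row` is purely illustrative.

```python
from fractions import Fraction
from itertools import combinations_with_replacement

# Assumed per-index contributions to T3 and Y for index values 1, 2, 3.
T3 = {1: Fraction(1, 2), 2: Fraction(-1, 2), 3: Fraction(0)}
Y = {1: Fraction(1, 3), 2: Fraction(1, 3), 3: Fraction(-2, 3)}


def decuplet_row(indices):
    """Return (T3, Y, Q) for the symmetric component with these indices."""
    t3 = sum(T3[i] for i in indices)
    y = sum(Y[i] for i in indices)
    q = t3 + y / 2  # assumed relation Q = T3 + Y/2
    return t3, y, q


# A fully symmetric rank-3 tensor has one independent component
# per unordered triple of index values.
rows = {idx: decuplet_row(idx)
        for idx in combinations_with_replacement((1, 2, 3), 3)}

for idx, (t3, y, q) in rows.items():
    print(idx, "T3 =", t3, "Y =", y, "Q =", q)
```

The component with all three indices equal to 3 is the one that was unoccupied in 1961-1962; the counting gives it $Y = -2$ and $Q = -1$.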

The weight diagram for the decuplet is given in fig. V.40. 










Fig. V.40 


Formula (138.7) also allows for the existence of unitary singlets. Since
in this case all generators reduce to zero, for a unitary scalar particle
$T = T_3 = Y = Q = 0$. However, among the multitude of resonance states there
is up to now none whose wave function could be reliably considered a unitary
scalar.

Unitary singlets nevertheless play a decisive role in resolving the difficulty
mentioned above with vector mesons. It can be assumed that the real $\omega$-
and $\varphi$-mesons represent different superpositions of the unitary singlet
state $\varphi'$ and the octet state $\omega'$ analogous to the $\eta$-meson.
In the case of strict unitary symmetry such a mixing of components of
multiplets of different nature is forbidden, but as a result of the violation
of the symmetry by the real strong interaction it is no longer impossible.
Thus vector mesons form a nonet and not an octet. On this point we confine
ourselves to the general remarks made above, referring the reader to the
literature.

Let us now enumerate, without comment, some unitary multiplets in
which resonances are distributed. It turns out that there exists a family of
9 mesons $2^+$ (it contains, in particular, the $f^0$-meson mentioned in §129),
which also form a nonet. The nonet of mesons 1+, octet of baryon resonances 
3- and octets of baryon resonances $+ and 3* are somewhat more doubtful. 
At present insufficient data are available for a final solution of the problem. 


§ 139. Some consequences of strict unitary symmetry 


In this section some physical consequences of the hypothesis of strict
unitary symmetry for hadrons are described briefly. It should be stressed
at once that they cannot claim good agreement with experimental data, since
the real strong interaction already violates the SU(3) invariance of the
theory to a considerable degree. A more realistic scheme is outlined in the
next section.









Let us consider first of all the interaction of stable baryons of spin $1/2^+$
(octet $B_i^j$) with pseudoscalar mesons (octet $M_i^j$). In quantum field
theory the baryon and meson functions are considered to be operators in the
space of occupation numbers. Analogously to what was done in §135, the
interaction Hamiltonian density is to be constructed in the form of an
invariant combination of the functions $B_i^j$, $\bar{B}^i_j$ and $M_i^j$. We
first of all form unitary scalars from these functions. Noting that the two
octets

$$
\bar{B}^i_k B_j^k - \tfrac{1}{3}\delta^i_j \bar{B}^l_k B_l^k
\quad\text{and}\quad
\bar{B}^k_j B_k^i - \tfrac{1}{3}\delta^i_j \bar{B}^k_l B_k^l
$$

can be constructed from $B_i^j$ and $\bar{B}^i_j$, we contract these matrices
with the meson function $M_i^j$. Taking into account that
$\delta^i_j M_i^j = M_i^i = 0$, we arrive at the two unitary scalars
$\bar{B}^i_k B_j^k M_i^j$ and $\bar{B}^k_j B_k^i M_i^j$, which exhaust the
possible invariants. Usually their sum and difference are chosen, so that the
baryon-meson interaction is described by the following Hamiltonian density:


a a l ei kos P > 
Hin = T [BiysB} -Bf y5 BİIM}, + 
l aak =k 5 A 

+z?) lBiysBf +B; 758] Mi (139.1) 


(the matrix $\gamma_5$ is introduced on the basis of the same considerations
as in §135) containing two independent coupling constants $g^{(F)}$ and
$g^{(D)}$. For definite reasons the first term of the Hamiltonian (139.1) is
called F-coupling, and the second D-coupling.

Making use of (138.10) and (138.11) we write the components of the 
baryon and meson matrices in terms of the wave functions of isomultiplets: 


B=- >A, Be =+- A, B32 =N,, Bz, 
v6 v6 


D 2 — P = l = B3 -= PALTAR 

Dg= gA Bae fo ee Ba Fas BR 

mM =- M = 7 +94 M3 =K MY, =K" 
3 aie ll? b Th Jo Sb > a A 3 A, 


(a, b=1, 2 are isospin indices). Substituting them into (139.1), we obtain 





A ee 1 = 
Him = — V2 Ene Eby; z$ tA EDE) naN Ys No + 
1 -i = l = — 
tA (gg) nE ys! + WE gn [Ays 22+22y5A] + 


- 1 > — 
+ V3g Eb 752a taya 8-8) nN YSN, — 


1 ES m IA 
- 773 Ba +8 ) nEg1s=* —V3gPnAysA + 


1 D) F) g Fan =b -L O) EF) paz b 
t EP K Eers +5 12) K pr + 

R O pnb L P)y Ken? 
EO BO) KN 7525 + Ba KAZ AySNy + 

L Ge)-g®)) K, Ays z? +e (3gF-g0)) KIZ yA — 
+ NE (3g g ) K,AYs= + WB (3g g )K ZaYs A 


1 z = 1 eas 
-a3 Ge K N ysA — 573 Be PKIAYSN,. (139.2) 


Thus the 12 constants of the couplings $\Sigma\Sigma\pi$, $NN\pi$,
$\Sigma\Lambda\pi$, $\Xi\Xi\pi$, $\Sigma\Sigma\eta$, $NN\eta$, $\Xi\Xi\eta$,
$\Lambda\Lambda\eta$, $\Sigma\Xi K$, $N\Sigma K$, $\Lambda\Xi K$ and
$N\Lambda K$ are expressed in terms of only two parameters $g^{(F)}$ and
$g^{(D)}$. Each term contained in (139.2) is invariant with respect to isospin
transformations, and can be written in explicit form (see §135), as a result
of which 64 terms arise, each corresponding to the interaction of actual
baryons with a meson.

Let us now consider how the relations between the cross sections for
different processes are obtained in unitary invariant theory. Let there be
several reactions

$$
a_i + b_i \to c_i + d_i ,
$$

the particles $a_i$, $b_i$, $c_i$ and $d_i$ belonging respectively to the
unitary multiplets $(p^{(a)},q^{(a)})$, $(p^{(b)},q^{(b)})$,
$(p^{(c)},q^{(c)})$ and $(p^{(d)},q^{(d)})$. In accordance with the general
scheme described in §136, in order to obtain the relations between the
amplitudes of these reactions it is necessary to proceed in the following way.

1. We decompose the direct products of the representations

$$
(p^{(a)},q^{(a)}) \otimes (p^{(b)},q^{(b)})
\quad\text{and}\quad
(p^{(c)},q^{(c)}) \otimes (p^{(d)},q^{(d)})
$$

into irreducible representations.










2. We write down the independent amplitudes corresponding to the
transition from a certain multiplet of the first decomposition to the same
multiplet of the second decomposition.

3. Making use of the table of Clebsch-Gordan coefficients of the group
SU(3) we expand the wave functions $|a_i b_i\rangle$ and $|c_i d_i\rangle$ of
the initial and final states in terms of the basis functions of the multiplets
involved in the corresponding expansions. These functions are defined by the
type of representation and by the eigenvalues of the isospin operator, isospin
component operator and hypercharge operator, so that they should be written
in the form $|p,q;T,T_3,Y\rangle$.

4. We substitute the expansions of wave functions into the matrix element
of the transition $\langle c_i d_i|a_i b_i\rangle$, as a result of which it
turns out to be expressed in terms of a relatively small number of matrix
elements of transitions between multiplets of the same type as in item 2.

As an example let us consider the scattering of pseudoscalar mesons by
stable baryons of spin $1/2^+$ (it can easily be calculated that there are in
all 27 such processes). In this case all the particles of both initial and
final states belong to octets. Making use of the decomposition (137.4d) we
conclude that there are 8 independent amplitudes corresponding to the
transitions

$$
1\to 1,\quad 10\to 10,\quad \overline{10}\to\overline{10},\quad 27\to 27,
$$
$$
8\to 8,\quad 8'\to 8',\quad 8\to 8',\quad 8'\to 8.
$$


However, making use of the invariance of theory with respect to time reversal, 
it can be shown that the last two amplitudes are expressed in terms of each 
other, so that there are 7 independent amplitudes in terms of which the 27 
amplitudes of real processes are expressed. 

In the case of the scattering of pseudoscalar mesons by $1/2^+$ baryons with
the production of a pseudoscalar meson and a baryon resonance $3/2^+$, for
example


nt +p>nt+Att 
we have respectively for the initial and final states the decompositions
(137.4d) and (137.4e), so that the amplitudes of the real processes are
expressed in terms of only 4 independent transition amplitudes

$$
8\to 8,\quad 8'\to 8,\quad 10\to 10,\quad 27\to 27.
$$
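
The counting of independent amplitudes in the two examples above can be sketched in a few lines. This is only an illustration under stated assumptions: the decompositions $8\otimes 8 = 1 + 8 + 8' + 10 + \overline{10} + 27$ and $8\otimes 10 = 8 + 10 + 27 + 35$ are taken as the standard SU(3) results (presumably what (137.4d) and (137.4e) state), and an irreducible representation occurring $m$ times on one side and $m'$ times on the other is taken to contribute $m\,m'$ amplitudes. The names below are hypothetical.

```python
from collections import Counter

# Assumed decompositions, labelled by dimension ("10b" = conjugate decuplet).
# The octet occurs twice in 8 x 8, hence multiplicity 2.
EIGHT_X_EIGHT = Counter({"1": 1, "8": 2, "10": 1, "10b": 1, "27": 1})
EIGHT_X_TEN = Counter({"8": 1, "10": 1, "27": 1, "35": 1})
DIM = {"1": 1, "8": 8, "10": 10, "10b": 10, "27": 27, "35": 35}


def total_dim(decomposition):
    """Total dimension; must equal the product of the factor dimensions."""
    return sum(DIM[rep] * mult for rep, mult in decomposition.items())


def n_amplitudes(initial, final):
    """Number of transitions between equivalent irreducible representations."""
    return sum(mult * final[rep] for rep, mult in initial.items())


print(total_dim(EIGHT_X_EIGHT), total_dim(EIGHT_X_TEN))
print(n_amplitudes(EIGHT_X_EIGHT, EIGHT_X_EIGHT))
print(n_amplitudes(EIGHT_X_EIGHT, EIGHT_X_TEN))
```

The dimension check guards against a mistyped decomposition; the two amplitude counts correspond to the octet-octet and octet-decuplet final states discussed in the text, before time reversal is taken into account.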


However, the tables of Clebsch-Gordan coefficients of the group SU(3) are
very cumbersome, and in each actual case it is much more convenient to
make use of the method of invariant amplitudes described at the end of
§136. Let us again consider the scattering of pseudoscalar mesons by $1/2^+$
baryons. The total scattering amplitude, which in our case must be constructed
from the wave functions $B_i^j$ and $M_i^j$ of the initial state and the
wave functions $\bar{B}^i_j$ and $\bar{M}^i_j$ of the final state, is a unitary
invariant of the most general form. It can be written in the form of the
following combination of nine scalar terms:

$$
\begin{aligned}
f ={}& f_1(\bar{B}B)(\bar{M}M) + f_2(\bar{B}\bar{M})(MB) + f_3(\bar{B}M)(\bar{M}B)
 + f_4(\bar{B}BM\bar{M}) + f_5(\bar{B}B\bar{M}M) \\
 &+ f_6(\bar{B}\bar{M}MB) + f_7(\bar{B}M\bar{M}B) + f_8(\bar{B}MB\bar{M})
 + f_9(\bar{B}\bar{M}BM) .
\end{aligned} \tag{139.3}
$$

For brevity, the traces of the products of matrices of the type $B$ and $M$
are here denoted by parentheses, so that, for example,

$$
(\bar{B}B)(\bar{M}M) = \bar{B}^i_j B_i^j\, \bar{M}^k_l M_k^l , \qquad
(\bar{B}MB\bar{M}) = \bar{B}^i_j M_k^j B_l^k \bar{M}^l_i
$$

and so on. The quantities $f_i$ involved in the amplitude (139.3) are functions
of spatial-spin variables, which are unknown but the same for all
meson-baryon processes.
We already know that the processes considered are described by eight
independent amplitudes (without taking into account the invariance with
respect to time reversal). Hence a relation must exist between the nine
invariants involved in (139.3). Indeed, it does:

$$
\begin{aligned}
&(\bar{B}B)(\bar{M}M) + (\bar{B}\bar{M})(MB) + (\bar{B}M)(\bar{M}B) = \\
&\quad = (\bar{B}BM\bar{M}) + (\bar{B}B\bar{M}M) + (\bar{B}\bar{M}MB)
 + (\bar{B}M\bar{M}B) + (\bar{B}MB\bar{M}) + (\bar{B}\bar{M}BM)
\end{aligned} \tag{139.4}
$$

as can easily be seen by direct calculation. In general, the establishment of
relations of the type (139.4) in practice turns out to be difficult, but this
is not important, because if all the 9 terms are considered to be formally
independent, the final results will automatically involve the functions $f_i$
in the form of just eight independent combinations.
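
The "direct calculation" behind (139.4) can be spot-checked numerically. The sketch below is not a proof, and it rests on one assumption about the garbled original: that (139.4) is the familiar identity for four traceless $3\times 3$ matrices, in which the sum of the three products of pair traces equals the sum of the six distinct four-matrix traces. The matrices are random complex traceless matrices, and all function names are illustrative.

```python
import random


def mul(*ms):
    """Product of 3x3 matrices given as nested lists."""
    out = ms[0]
    for m in ms[1:]:
        out = [[sum(out[i][k] * m[k][j] for k in range(3)) for j in range(3)]
               for i in range(3)]
    return out


def tr(m):
    """Trace of a 3x3 matrix."""
    return sum(m[i][i] for i in range(3))


def random_traceless(rng):
    """Random complex 3x3 matrix with its trace subtracted out."""
    a = [[complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(3)]
         for _ in range(3)]
    shift = tr(a) / 3
    for i in range(3):
        a[i][i] -= shift
    return a


def both_sides(Bb, B, Mb, M):
    """Left- and right-hand sides of the assumed form of (139.4)."""
    lhs = (tr(mul(Bb, B)) * tr(mul(Mb, M))
           + tr(mul(Bb, Mb)) * tr(mul(M, B))
           + tr(mul(Bb, M)) * tr(mul(Mb, B)))
    orderings = [(B, M, Mb), (B, Mb, M), (Mb, M, B),
                 (M, Mb, B), (M, B, Mb), (Mb, B, M)]
    rhs = sum(tr(mul(Bb, x, y, z)) for x, y, z in orderings)
    return lhs, rhs


rng = random.Random(0)
lhs, rhs = both_sides(*(random_traceless(rng) for _ in range(4)))
print(abs(lhs - rhs))  # should vanish up to rounding error
```

The tracelessness is essential: the identity follows from the Cayley-Hamilton theorem for $3\times 3$ matrices with zero trace and does not hold for arbitrary matrices.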

As is known, under the operation of time reversal every wave function
goes over into its complex conjugate, the initial and final states being
exchanged*; in our case

$$
B_i^j,\ \bar{B}^i_j,\ M_i^j,\ \bar{M}^i_j \;\to\;
\bar{B}^i_j,\ B_i^j,\ \bar{M}^i_j,\ M_i^j . \tag{139.5}
$$

When such an exchange is made the first seven terms of (139.3) do not
change, while the last two go over into each other, so that from the invariance

* See, for example, L.D.Landau and E.M.Lifshitz, Quantum mechanics (Pergamon 
Press, Oxford, 1965). 










with respect to time reversal it follows that $f_8 = f_9$; taking into account
relation (139.4) they can be set equal to zero and one can operate with only
the first seven invariant amplitudes.

Now, by means of a procedure quite analogous to that described at the
end of §136 the amplitudes of all the 27 real processes can be expressed in
terms of 7 independent functions $f_i$, for example

$$
\begin{aligned}
f(\pi^- p \to K^+\Sigma^-) &= f_3 , \\
f(\bar{K}^0 p \to K^+\Xi^0) &= f_3 , \\
f(K^- p \to K^- p) &= f_1 + f_2 + f_4 + f_6 , \\
f(\pi^- p \to \pi^- p) &= f_1 + f_6 , \\
f(K^- p \to \pi^-\Sigma^+) &= f_2 + f_4
\end{aligned} \tag{139.6}
$$

and so on. On eliminating the functions $f_i$ we shall have the relations
between the amplitudes of different processes. Thus from (139.6) it follows
that

$$
f(\pi^- p \to K^+\Sigma^-) = f(\bar{K}^0 p \to K^+\Xi^0) \tag{139.7}
$$

and

$$
f(K^- p \to K^- p) - f(\pi^- p \to \pi^- p) = f(K^- p \to \pi^-\Sigma^+) \tag{139.8}
$$

and hence for the cross sections we obtain

$$
\sigma(\pi^- p \to K^+\Sigma^-) = \sigma(\bar{K}^0 p \to K^+\Xi^0) \tag{139.9}
$$

and

$$
\sqrt{\sigma(\pi^- p \to \pi^- p)} - \sqrt{\sigma(K^- p \to K^- p)}
\le \sqrt{\sigma(K^- p \to \pi^-\Sigma^+)} . \tag{139.10}
$$


The cross section on the left-hand side of equality (139.9) has a large
value, while the cross section on the right-hand side is small, so that in
analysing these processes it is necessary to take into account the violation of
unitary symmetry by the strong interactions. On the other hand, inequality
(139.10) is fulfilled in the entire energy range. Moreover, for large energies,
when the cross section $\sigma(K^- p \to \pi^-\Sigma^+)$ is very small,
(139.10) goes over into the equality

$$
\sigma(\pi^- p \to \pi^- p) \approx \sigma(K^- p \to K^- p) \tag{139.10a}
$$

which is in good agreement with experimental data.
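
The way individual processes pick out particular invariants of (139.3) can be illustrated with matrix units. The sketch below is only a sketch under several assumptions that are not fixed by the text as it survives here: each initial particle is represented by a matrix with a single 1 at its position in the octet matrices (138.10), each final particle by the transposed position, and the nine invariants are numbered in the order $(\bar BB)(\bar MM)$, $(\bar B\bar M)(MB)$, $(\bar BM)(\bar MB)$, $(\bar BBM\bar M)$, $(\bar BB\bar MM)$, $(\bar B\bar MMB)$, $(\bar BM\bar MB)$, $(\bar BMB\bar M)$, $(\bar B\bar MBM)$. All names are hypothetical.

```python
def unit(row, col):
    """3x3 matrix with a single 1 at (row, col), counted from 1."""
    return [[1 if (r, c) == (row - 1, col - 1) else 0 for c in range(3)]
            for r in range(3)]


def mul(*ms):
    """Product of 3x3 matrices given as nested lists."""
    out = ms[0]
    for m in ms[1:]:
        out = [[sum(out[i][k] * m[k][j] for k in range(3)) for j in range(3)]
               for i in range(3)]
    return out


def tr(m):
    return sum(m[i][i] for i in range(3))


# (row, column) positions of single-particle entries in the octet matrices.
BARYON = {"p": (1, 3), "n": (2, 3), "Sigma+": (1, 2), "Sigma-": (2, 1),
          "Xi-": (3, 1), "Xi0": (3, 2)}
MESON = {"pi+": (1, 2), "pi-": (2, 1), "K+": (1, 3), "K0": (2, 3),
         "K-": (3, 1), "K0bar": (3, 2)}


def nonzero_invariants(m_in, b_in, m_out, b_out):
    """Indices i (1..9) of the invariants that contribute to a process."""
    M, B = unit(*MESON[m_in]), unit(*BARYON[b_in])
    row, col = MESON[m_out]
    Mb = unit(col, row)   # final meson: transposed position (assumption)
    row, col = BARYON[b_out]
    Bb = unit(col, row)   # final baryon: transposed position (assumption)
    values = [
        tr(mul(Bb, B)) * tr(mul(Mb, M)),
        tr(mul(Bb, Mb)) * tr(mul(M, B)),
        tr(mul(Bb, M)) * tr(mul(Mb, B)),
        tr(mul(Bb, B, M, Mb)),
        tr(mul(Bb, B, Mb, M)),
        tr(mul(Bb, Mb, M, B)),
        tr(mul(Bb, M, Mb, B)),
        tr(mul(Bb, M, B, Mb)),
        tr(mul(Bb, Mb, B, M)),
    ]
    return {i for i, v in enumerate(values, start=1) if v != 0}


for process in [("pi-", "p", "K+", "Sigma-"), ("K0bar", "p", "K+", "Xi0"),
                ("K-", "p", "K-", "p"), ("pi-", "p", "pi-", "p"),
                ("K-", "p", "pi-", "Sigma+")]:
    print(process, sorted(nonzero_invariants(*process)))
```

With this numbering the first two processes involve $f_3$ together with $f_8$ or $f_9$ respectively, so they coincide once $f_8 = f_9$; elastic $K^-p$ scattering involves $f_1, f_2, f_4, f_6$, elastic $\pi^-p$ scattering $f_1, f_6$, and $K^-p \to \pi^-\Sigma^+$ involves $f_2, f_4$, which is the pattern behind relations (139.7) and (139.8).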

It should be noted that the interpretation of the theoretical predictions of 
strict unitary symmetry and their comparison with experiment is a rather 
complex problem, since it is necessary to state beforehand the energy range 




in which the violation of SU(3) invariance can be disregarded. Furthermore, 
the cross sections for different processes involve kinematical factors which 
are different, because they contain the masses of the particles taking part in 
the reactions. On the other hand, the relations written above assume equality 
of the masses of the particles contained in the same multiplet. Hence from 
the relations between amplitudes one actually obtains relations not between 
cross sections but between their ratios to kinematical factors, so that one has 
to take from experiment certain ‘corrected’ values of the cross sections. 


§140. Some aspects of violated unitary symmetry 


We have frequently stressed above that the real strong interaction violates
strict unitary symmetry but, if electromagnetism is disregarded, the isospin
$T$ and hypercharge $Y$ are still conserved. Therefore the interaction
Hamiltonian density cannot be a unitary scalar, but must be an isoscalar
($T=0$) and must correspond to zero hypercharge ($Y=0$). Thus among the
components of unitary multiplets one has to find those for which $T = Y = 0$.
From formula (138.5) for the maximum value of the isospin, it is seen that all
the indices of these components must be threes, so that the equalities
$p_3 = p$ and $q_3 = q$ are fulfilled. Then on the basis of expression (138.6)
for hypercharge we arrive immediately at the condition $p - q = 0$. Thus zero
isospin and zero hypercharge are possessed only by the components of symmetric
multiplets of the type $(p,p)$, i.e. of the multiplets $(1,1) = 8$,
$(2,2) = 27$ and so on, each index of which is equal to 3. Hence the strong
interaction Hamiltonian density must have the following general structure

$$
\hat{H}_{\rm int} = g_0 H_0 + g_1 H^3_3 + g_2 H^{33}_{33} + \dots . \tag{140.1}
$$


As soon as the terms violating unitary symmetry are introduced the earlier
quantum-mechanical degeneracy will be removed, i.e. the masses of the
particles belonging to the same multiplet must split. For the derivation of
mass formulae we introduce the operator $\hat{M}$ whose eigenvalues are equal
to the masses of isomultiplets with definite hypercharge (in consequence of
isospin invariance they do not depend on the component $T_3$):

$$
\hat{M}\,|p,q;T,Y\rangle = m\,|p,q;T,Y\rangle . \tag{140.2}
$$


Under the assumption of strict unitary symmetry the mass operator will be 
an invariant of the group SU(3), and its eigenvalues in a given multiplet will 
be the same. It is natural to assume that, when unitary symmetry is violated, 










$\hat{M}$ acquires a structure analogous to (140.1), it being assumed that
$g_2 \ll g_1$; hence

$$
\hat{M} = \hat{M}_0 + \hat{M}^3_3 . \tag{140.3}
$$

It is convenient to introduce the matrix $M$ of the mass operator

$$
M = \langle p,q;T',Y'|\hat{M}|p,q;T,Y\rangle \tag{140.4}
$$


which from (140.2) is diagonal; its elements define the masses of the members
of the multiplet. This matrix represents a certain bilinear combination of the
wave functions $\varphi_{i_1\dots i_p}^{j_1\dots j_q}$ of a given multiplet
$(p,q)$ and the wave functions $\bar\varphi^{i_1\dots i_p}_{j_1\dots j_q}$
of the conjugate multiplet $(q,p)$. From (140.3) it follows that it must
contain an invariant part and a term corresponding to the 3-3 component of the
octet. From the wave functions mentioned a scalar can be constructed:

$$
\bar\varphi^{i_1\dots i_p}_{j_1\dots j_q}\,\varphi_{i_1\dots i_p}^{j_1\dots j_q}
$$

and two octets

$$
\bar\varphi^{i_1\dots i_{p-1}l}_{j_1\dots j_q}\,
\varphi_{i_1\dots i_{p-1}k}^{j_1\dots j_q}
- \tfrac{1}{3}\delta^l_k\,
\bar\varphi^{i_1\dots i_p}_{j_1\dots j_q}\,\varphi_{i_1\dots i_p}^{j_1\dots j_q}
$$

and

$$
\bar\varphi^{i_1\dots i_p}_{j_1\dots j_{q-1}k}\,
\varphi_{i_1\dots i_p}^{j_1\dots j_{q-1}l}
- \tfrac{1}{3}\delta^l_k\,
\bar\varphi^{i_1\dots i_p}_{j_1\dots j_q}\,\varphi_{i_1\dots i_p}^{j_1\dots j_q} ;
$$

hence

$$
M = a_0\,\bar\varphi^{i_1\dots i_p}_{j_1\dots j_q}\varphi_{i_1\dots i_p}^{j_1\dots j_q}
+ a_1\,\bar\varphi^{i_1\dots i_{p-1}3}_{j_1\dots j_q}\varphi_{i_1\dots i_{p-1}3}^{j_1\dots j_q}
+ a_2\,\bar\varphi^{i_1\dots i_p}_{j_1\dots j_{q-1}3}\varphi_{i_1\dots i_p}^{j_1\dots j_{q-1}3}
\tag{140.5}
$$

where $a_0 = m_0 - \tfrac{1}{3}(a_1 + a_2)$, and $m_0$, $a_1$ and $a_2$ are
certain parameters. Thus in the general case the mass formula contains at the
most 3 parameters (for multiplets of the type $(3n,0)$ and $(0,3n)$ only two,
since in these cases there are only superscripts or subscripts), of which $m_0$
corresponds to the mass of the members of the multiplet under the assumption
of strict unitary symmetry. We note that, according to Feynman, for boson
multiplets it is necessary to consider the matrix $M^2$ instead of the matrix
$M$, since bosons, in contrast to fermions, obey an equation of second order.
For the octet of baryons of spin $1/2^+$ formula (140.5) goes over into

$$
M = a_0\,\bar{B}^i_j B_i^j + a_1\,\bar{B}^3_j B_3^j + a_2\,\bar{B}^i_3 B_i^3 . \tag{140.6}
$$

Making use of (138.10) and calculating the matrix elements corresponding to
each baryon, we obtain

$$
m_N = a_0 + a_2 , \qquad m_\Xi = a_0 + a_1 , \qquad
m_\Sigma = a_0 , \qquad m_\Lambda = a_0 + \tfrac{2}{3}(a_1 + a_2) .
$$

From this follows the Gell-Mann mass formula

$$
3m_\Lambda + m_\Sigma = 2(m_N + m_\Xi) \tag{140.7}
$$

which is in good agreement with experiment: the right-hand side adds up to
4518 MeV, while the left-hand side adds up to 4535 MeV, i.e. the accuracy is
to within about 0.4%. Taking into account that in the pseudoscalar meson
octet there is a $K$-meson in place of the nucleon, and in place of the
$\Xi$-hyperon there is a $\bar{K}$-meson, the masses of the $K$ and $\bar{K}$
being equal, we have from an analogue of (140.6)

$$
3m_\eta^2 + m_\pi^2 = 4m_K^2 . \tag{140.7'}
$$


This relation agrees with experiment to within 5%. In the case of vector
mesons the situation is made more complex by the presence of
$\omega$-$\varphi$ mixing (see §138), and we shall not discuss it.
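
The step from the octet mass matrix to the Gell-Mann formula amounts to reading off, for each baryon, how much of its wave function sits in the third row and the third column of the octet matrix. A minimal sketch of that bookkeeping follows, under stated assumptions: the squared coefficients are those of the matrix (138.10), the parameter `a1` is attached to the third row and `a2` to the third column (which of the two goes with which is a convention and does not affect the final relation), and the names `WEIGHTS` and `mass` are illustrative.

```python
from fractions import Fraction as F

# Squared coefficients with which each baryon enters the octet matrix,
# keyed by (row, column); one representative per isomultiplet.
WEIGHTS = {
    "N": {(1, 3): F(1)},
    "Xi": {(3, 1): F(1)},
    "Sigma": {(1, 2): F(1)},
    "Lambda": {(1, 1): F(1, 6), (2, 2): F(1, 6), (3, 3): F(4, 6)},
}


def mass(name, a0, a1, a2):
    """Mass of an isomultiplet for given parameters a0, a1, a2."""
    w = WEIGHTS[name]
    row3 = sum(v for (r, _), v in w.items() if r == 3)  # weight in row 3
    col3 = sum(v for (_, c), v in w.items() if c == 3)  # weight in column 3
    return a0 + a1 * row3 + a2 * col3


# Arbitrary illustrative parameter values, not fitted to data.
a0, a1, a2 = F(7), F(3), F(-5)
lhs = 3 * mass("Lambda", a0, a1, a2) + mass("Sigma", a0, a1, a2)
rhs = 2 * (mass("N", a0, a1, a2) + mass("Xi", a0, a1, a2))
print(lhs, rhs)
```

Because the relation holds for every choice of the three parameters, it is a consequence of the octet structure alone; the parameters themselves have to be taken from experiment.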

For the decuplet of particles of spin $3/2^+$ formula (140.5) goes over into

$$
M = b_0\,\bar{B}^{ijk}B_{ijk} + b_1\,\bar{B}^{ij3}B_{ij3} . \tag{140.8}
$$

Hence, making use of the results of §138, we obtain

$$
m_\Delta = b_0 , \qquad m_{\Sigma^*} = b_0 + \tfrac{1}{3}b_1 , \qquad
m_{\Xi^*} = b_0 + \tfrac{2}{3}b_1 , \qquad m_\Omega = b_0 + b_1 .
$$

From these formulae the interval rule follows:

$$
m_{\Sigma^*} - m_\Delta = m_{\Xi^*} - m_{\Sigma^*} = m_\Omega - m_{\Xi^*} . \tag{140.9}
$$

For the first equality we have 147 and 145 MeV (accuracy of the order of
1%), and from the second equality the mass of the $\Omega^-$-hyperon can be
predicted to be 1676 MeV, which was brilliantly confirmed by experiment:
$m_\Omega = 1675$ MeV.

Okubo has derived a general mass formula valid for all unitary multiplets: 


m?-Bl = a(p,q) + b(p,q)BY +c(p,4)[T(T+1)-}Y?]. (140.10) 


In the case of violated SU(3) symmetry the amplitudes of the scattering 
of one unitary multiplet by another will no longer be invariant; in addition to 
the scalar part they will contain the 3—3 components of the octet. As an 










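example we write down the first term of formula (139.3), which now assumes
the form

$$
\begin{aligned}
f_1 ={}& f_{10}(\bar{B}B)(\bar{M}M) + f_{11}\,\bar{B}^3_j B_3^j\,(\bar{M}M)
 + f_{12}\,\bar{B}^i_3 B_i^3\,(\bar{M}M) \\
 &+ f_{13}(\bar{B}B)\,\bar{M}^3_j M_3^j + f_{14}(\bar{B}B)\,\bar{M}^i_3 M_i^3 .
\end{aligned} \tag{140.11}
$$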


In view of the enormous number of independent arbitrary parameters arising 
in such a scheme, its heuristic value decreases sharply, since the physical 
information that it provides becomes very small. 

However, if one adopts the reasonable and experimentally confirmed 
hypothesis of Okun and Pomeranchuk, that asymptotically at very high 
energies the cross sections for charge exchange inelastic processes are negli- 
gibly small in comparison with the cross sections for ordinary elastic scatter- 
ing, then in this energy range only the first term will remain in (139.3). Then 
under the assumption of strict unitary symmetry we obtain the asymptotic 
equality of the amplitudes of the elastic scattering of the $\pi$-, $\eta$-,
$K$- and $\bar{K}$-mesons by baryons:

$$
f_\pi = f_\eta = f_K = f_{\bar{K}} . \tag{140.12}
$$

In the case of violated symmetry it is necessary to make use of formula
(140.11); hence follows the relation between the amplitudes similar to the
mass relation:

$$
f_\pi + 3f_\eta = 2(f_K + f_{\bar{K}}) . \tag{140.13}
$$

In concluding this section we note that one can by the same procedure
investigate the isospin symmetry violated by the electromagnetic interaction
using the Hamiltonian density (135.10). We leave it to the reader to obtain
on his own, as a useful exercise, the following mass formulae for the
isotriplet $\Sigma$ and isoquartet $\Delta$:

$$
m_{\Sigma^0} = \tfrac{1}{2}(m_{\Sigma^+} + m_{\Sigma^-}) \tag{140.14}
$$

and

$$
m_{\Delta^{++}} - m_{\Delta^-} = 3(m_{\Delta^+} - m_{\Delta^0}) . \tag{140.15}
$$


§141. Composite models in the unitary symmetry scheme. Quarks 


At the beginning of §134 we pointed out that the $\pi$-meson (and generally
speaking, also all other non-strange particles) can be conceived of as a
particle consisting of a nucleon and an antinucleon. It is natural to try to
formulate an analogous ‘minimum’ model in which all hadrons are constructed
from a small number of particles said, in a certain sense, to be fundamental.
For this it would be necessary to add to the nucleon, which is a carrier of
baryonic number and isospin (and this means also of electric charge), at
least one particle which possesses strangeness. The most economical model of
this type was proposed in 1956 by Sakata, who chose as fundamental
particles $p$, $n$, $\Lambda$ and $\bar{p}$, $\bar{n}$, $\bar{\Lambda}$ and
assumed that there is an attraction between any fundamental baryon and
antibaryon, and a repulsion between two baryons or antibaryons. The wave
functions of the hadrons known at that time were constructed as follows:

$$
\begin{aligned}
&\pi^+ = p\bar{n}, \quad \pi^- = \bar{p}n, \quad
 \pi^0 = \tfrac{1}{\sqrt{2}}(p\bar{p} - n\bar{n}) ; \\
&K^+ = p\bar{\Lambda}, \quad K^0 = n\bar{\Lambda}, \quad
 K^- = \bar{p}\Lambda, \quad \bar{K}^0 = \bar{n}\Lambda ; \\
&\Sigma^+ = p\bar{n}\Lambda = \pi^+\Lambda, \quad
 \Sigma^- = \bar{p}n\Lambda = \pi^-\Lambda, \quad
 \Sigma^0 = \tfrac{1}{\sqrt{2}}(p\bar{p} - n\bar{n})\Lambda = \pi^0\Lambda ; \\
&\Xi^- = \bar{p}\Lambda\Lambda = K^-\Lambda, \quad
 \Xi^0 = \bar{n}\Lambda\Lambda = \bar{K}^0\Lambda .
\end{aligned} \tag{141.1}
$$


This model was developed by Markov, Okun and others, and made it pos- 
sible to obtain a large number of interesting physical results. 

It turns out that the Sakata model fits the scheme of unitary symmetry
very well, if the mass difference between the nucleon and $\Lambda$-hyperon is
neglected. For this it is sufficient to assume that these particles form the
triplet (1,0), and the corresponding antiparticles the conjugate triplet (0,1):

$$
S_i = \begin{pmatrix} p \\ n \\ \Lambda \end{pmatrix} , \qquad
\bar{S}^i = (\bar{p}, \bar{n}, \bar{\Lambda}) . \tag{141.2}
$$

The isospin component operator (138.1) for $p$, $n$ and $\Lambda$ gives the
correct values $+1/2$, $-1/2$ and 0, but the hypercharge operator (138.2) must
be modified in such a way that instead of leading to fractional values it
leads to $Y = +1, +1, 0$, respectively. We assume by definition

$$
\hat{Y}' = \hat{Y} + \tfrac{2}{3}\hat{B} , \tag{141.3}
$$

so that instead of (138.6) we shall now have

$$
Y = \tfrac{1}{3}(p - q) - p_3 + q_3 + \tfrac{2}{3}B . \tag{141.4}
$$










On the basis of the requirement that the hypercharge be an integer, and
taking into account that for mesons $B=0$, we arrive at the old relation
$p - q = 3n$, i.e. in the Sakata model these particles must also fill unitary
singlets, octuplets and so on. But for baryons $B=1$, and instead of (138.7)
we obtain

$$
p - q = 3n - 2 \tag{141.5}
$$

i.e. baryons must fill unitary triplets $(1,0) = 3$, sextets $(0,2) = 6$,
15-plets $(2,1) = 15$ and so on. Making use of formulae (138.5) and (141.4) we
immediately find the values of isospin and hypercharge for the components
of these multiplets (see table 7).








Table 7 
Components yi T Components Y T 
T 3 
a 1 2 Yab 2 1 
1 
P3 0 0 230 1 5 
33 c lic 3 3 
y 2 0 ab + 3155%3b+5 $%3a) l 2 
eo 1 : "33 0 0 
\ 
1 3 
CM 0 1 vba + 280033 0 1 
(a,b=1,2) 933 -1 } 





The distribution of hadrons over the multiplets mentioned corresponds to
their wave functions (141.1). Indeed, mesons are made up of a ‘sakaton’ $S_i$
and an ‘antisakaton’ $\bar{S}^i$, and their wave functions transform according
to the direct product $3\otimes\bar{3}$, which in its decomposition (137.4a)
contains just the representations 1 and 8. Baryons are made up of two sakatons
and an antisakaton, and the decomposition (137.4c) of the direct product
$3\otimes 3\otimes\bar{3}$ includes just the necessary representations 3, 6
and 15.

It is seen from the table that the $\Sigma$-hyperon must be placed at least in
a sextet which includes a nucleon-like particle and a particle with $Y = +2$,
$T=0$ which must have spin $1/2^+$. These particles have up to now not been
discovered, although there are no prohibitions imposed upon their existence.
The $\Xi$-hyperon must be contained in a 15-plet in which a large number of
unoccupied places remains, and the $\Omega$-hyperon cannot be included in any
of the lower multiplets. Thus the classification of hadrons based on the
Sakata model is much less satisfactory than in the eightfold way formalism.




Furthermore, it leads to a number of conclusions contradicting experiment: 
for example, in the Sakata model the observed process 


is forbidden. 

Wishing to preserve all the advantages of composite models on the one
hand, and those of the eightfold way formalism on the other, Gell-Mann
and independently Zweig in 1964 proposed to renounce the modification of
the hypercharge operator (138.2) and at the same time to assume that there is
a unitary triplet

$$
q_i = \begin{pmatrix} q_1 \\ q_2 \\ q_3 \end{pmatrix} , \qquad
\bar{q}^i = (\bar{q}_1, \bar{q}_2, \bar{q}_3) \tag{141.6}
$$

of particles possessing very unusual properties. From (138.4)-(138.6) and
(138.9) it follows that they have the quantum numbers given in table 8 (the
baryon number is by definition equal to 1/3; all quantum numbers of the
antiparticles, except for $T$, have the opposite sign), i.e. the electric
charge, baryonic number and hypercharge of these particles are fractional
numbers. Their name, quarks (something incomprehensible and mystical from one
of the novels of the Irish writer J. Joyce), is due to just these properties.

Table 8

Particle     Q       T      T_3     S      B       Y
$q_1$      +2/3    1/2    +1/2     0     1/3    +1/3
$q_2$      -1/3    1/2    -1/2     0     1/3    +1/3
$q_3$      -1/3     0       0     -1     1/3    -2/3


Mesons are made up of a quark and an antiquark, and from (137.4a) are
distributed over unitary singlets and octets. If the pair $q\bar{q}$ is in a
$^1S_0$-state, then its total spin is equal to zero and its parity is odd, and
we obtain pseudoscalar mesons with the wave functions

$$
\begin{aligned}
&\pi^+ = q_1\bar{q}_2, \quad \pi^- = \bar{q}_1 q_2, \quad
 \pi^0 = \tfrac{1}{\sqrt{2}}(q_1\bar{q}_1 - q_2\bar{q}_2) ; \\
&K^+ = q_1\bar{q}_3, \quad K^0 = q_2\bar{q}_3, \quad
 K^- = \bar{q}_1 q_3, \quad \bar{K}^0 = \bar{q}_2 q_3 ; \\
&\eta = \tfrac{1}{\sqrt{6}}(q_1\bar{q}_1 + q_2\bar{q}_2 - 2q_3\bar{q}_3) ; \\
&X^0 = \tfrac{1}{\sqrt{3}}(q_1\bar{q}_1 + q_2\bar{q}_2 + q_3\bar{q}_3)
\end{aligned} \tag{141.7}
$$

where $X^0$ is a resonance in the $\pi\eta$ system with a mass of 960 MeV, with
$T = Y = 0$, and which is a unitary singlet. If the pair $q\bar{q}$ is in a
$^3S_1$-state, then its spin and parity are $1^-$, and we arrive at the vector
mesons $\rho$, $K^*$, $\omega'$ and $\varphi'$ whose wave functions are
constructed analogously to (141.7). Since for the quark $B = 1/3$, a system of
three such particles will have a baryonic number equal to 1 and hence it is
natural to identify it with a baryon (we recall that in the Sakata model
baryons are made up of two particles and an antiparticle). The wave function
of the system $qqq$ transforms according to the representation
$3\otimes 3\otimes 3$ and therefore it follows from the decomposition (137.4b)
that in the quark model, as well as in the eightfold way formalism, baryons
fill unitary singlets, octets and decuplets. If the spins of two quarks are
parallel, then there are 9 $1/2^+$ states, of which one is a unitary singlet,
while the 8 remaining states belong to a unitary octet. If the spins of all
three quarks are parallel, then we obtain 10 $3/2^+$ states forming a decuplet.
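
That fractionally charged constituents nevertheless build only integrally charged hadrons can be checked by enumeration. The sketch below assumes the quark charges $+2/3$, $-1/3$, $-1/3$ for $q_1$, $q_2$, $q_3$ and opposite charges for the antiquarks; the variable names are illustrative.

```python
from fractions import Fraction
from itertools import combinations_with_replacement, product

# Assumed electric charges of the three quarks, in units of e.
CHARGE = {1: Fraction(2, 3), 2: Fraction(-1, 3), 3: Fraction(-1, 3)}

# Mesons: one quark q_i and one antiquark of type j (opposite charge).
meson_charges = {CHARGE[i] - CHARGE[j]
                 for i, j in product((1, 2, 3), repeat=2)}

# Baryons: three quarks; unordered triples label the symmetric flavour states.
triples = list(combinations_with_replacement((1, 2, 3), 3))
baryon_charges = {sum(CHARGE[i] for i in t) for t in triples}

print(sorted(meson_charges))
print(sorted(baryon_charges))
print(len(triples))
```

The number of unordered triples is the number of independent components of a fully symmetric three-index wave function, i.e. the dimension of the decuplet.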

We shall not consider the dynamic consequences of the quark model and 
some of its inherent difficulties, but refer the reader to the corresponding 
literature*. We shall dwell only briefly on the problem of the reality of the
existence of these unusual particles. If quarks indeed exist, then their world 
must be almost independent of the ordinary world: from the fact that their 
charge is fractional it follows that the lightest of the quarks must be 
absolutely stable; they can be produced only in the form of quark—antiquark 
pairs, for example as a result of the bombardment of ordinary matter by 
cosmic rays. Therefore in the course of time the total number of quarks 
contained in the Earth’s crust and in the waters of the oceans must increase 
progressively. However, numerous attempts to find these ‘relic’ quarks by 
means of precision apparatus have not yet been successful. Nor has anyone 
succeeded in discovering them in experiments with accelerators; only a lower 


* See, for example, the reviews of E.M.Levin and L.L.Frankfurt in Usp. Fiz. Nauk 94 
(1968) 243, and a rather popular review article of Ya.B.Zeldovich in Soviet Phys. Usp. 8 
(1965) 489. 




limit for their mass has been established: $m_q > 5$ GeV ($5\times 10^3$ MeV). This
points to the fact that if hadrons are actually made up of quarks, then their 
binding energy must be colossal. In this situation an ever increasing number 
of physicists (including Gell-Mann himself) are beginning to be inclined to the 
idea that even if quarks do exist they cannot be in a free state, but are similar 
to quasi-particles, for example to phonons, in a solid. Some outstanding 
scientists (Heisenberg, Chew and others) disapprove of the hypothesis of 
quarks. 

But in spite of everything the quark model is very attractive and even at 
the worst it is a very convenient mathematical tool for the formulation of 
unitary symmetry. The near future should give an answer to one of the cardinal
questions of the contemporary physics of elementary particles: are quarks 
real, and if so, in what sense, or are they a purely mathematical fiction? 


§ 142. General appraisal of unitary symmetry 


From the contents of the preceding sections it is seen that, owing to the 
hypothesis of the approximate invariance of the strong interaction with re- 
spect to the group SU(3), the physics of elementary particles has recently 
made much progress. The successes of unitary symmetry are numerous and 
impressive. 

1. All stable hadrons and low-lying resonances are distributed over
unitary multiplets whose members have one and the same spin, parity and
baryonic number: octet of $0^-$ mesons, octet of $1/2^+$ baryons, decuplet
$3/2^+$, nonets of $1^-$ and $2^+$ meson resonances and others. No particle
which in principle could not be placed in one of the unitary multiplets of not
too high dimension has been discovered.

2. The quark model, which makes it possible to construct all hadrons 
from three fundamental particles and their antiparticles, is very attractive. 

3. Different mass formulae have been obtained for isomultiplets within 
one unitary multiplet, which are in very good agreement with empirical data. 

4. A number of relations between the coupling constants of baryons 
with mesons have been established, which for the most part need experi- 
mental verification. 

5. The relations between the cross sections for different processes have 
been derived; when some subsidiary facts are accurately taken into account, 
none of them is in sharp disagreement with experiment. 

6. Taking into account the electromagnetic interaction, mass formulae 










have been obtained for individual members of the isomultiplets constituting a
unitary multiplet. For example, on the basis of the theoretical relation

$$
(m_n - m_p) - (m_{\Xi^0} - m_{\Xi^-}) = m_{\Sigma^-} - m_{\Sigma^+} \tag{142.1}
$$

the sign of the mass difference between the $\Xi^0$- and $\Xi^-$-hyperons was
predicted. This prediction was subsequently confirmed experimentally.

7. The relations between the magnetic moments of the baryons belonging
to the same unitary multiplet were derived. In particular, the following
equalities were shown to hold for the members of the octuplet $1/2^+$:

$$
\mu_p = \mu_{\Sigma^+} , \qquad \mu_{\Xi^-} = \mu_{\Sigma^-} , \qquad
\mu_n = 2\mu_\Lambda = -2\mu_{\Sigma^0} . \tag{142.2}
$$

8. Finally, Cabibbo succeeded in including the weak interaction also in
the scheme of unitary symmetry, by which this theory acquired a certain
harmony and completeness.

The first five points were discussed in §138—141. We shall not dwell on 
the remaining points, but refer the reader to the literature*. 

On the other hand, the hypothesis of unitary symmetry has a number of 
essential shortcomings. 

1. First of all, the group theoretical scheme does not contain any ele- 
ments of dynamics, and it must only be a part of a theory of elementary 
particles which is still to come. 

2. The question of the nature of unitary symmetry and its violation 
remains open. In connection with this different points of view have been put 
forward: 

(a) Unitary symmetry is a fundamental property of the strong interaction, 
and is inherent to it in the same way as, say, its invariance with respect to 
transformations of the Lorentz group. In this case it is violated either by a 
moderately strong interaction whose coupling constant is of the order of 0.1 
(we recall that the coupling constant of the strong interaction is equal to 
about 14), or owing to its interaction with the vacuum state which may have 
a complex structure (the so-called spontaneous violation of symmetry). 

(b) Unitary symmetry is approximate in its very nature, ‘because of a complex 
concurrence of different factors’**. In this case, one need not raise the 
question of the nature of the violation of SU(3) invariance, and the requirement
that a unitary triplet of particles (quarks) necessarily exists makes no
sense.


* See, for example, the monograph of Nguyen van Hieu, Lektsii po teorii unitarnoi
simmetrii elementarnikh chastits (Lectures on the theory of unitary symmetry of
elementary particles) (Atomizdat, Moscow, 1967).

** G. Chew, The analytic S-matrix (Benjamin, New York, 1966).

(c) There exists a strictly unitary symmetric superstrong interaction, and 
SU(3) invariance is violated as a result of switching on a real strong inter- 
action. Such a situation is analogous to that in the case of isospin symmetry. 

3. The quark model in its initial version described in §141 contains a 
number of logical difficulties. One of these lies in the fact that in trying to 
construct hadrons from three quarks the spatial-spin part of the wave func- 
tion turns out to be antisymmetric with respect to permutation of the quarks, 
which is unusual for the lowest state. In trying to overcome this difficulty 
the quark was assigned a new quantum number, which is equivalent to the 
consideration of not three but nine quarks, whence the harmony and elegance 
of the model are lost. There were even proposals to assume that quarks do
not obey Fermi–Dirac statistics but a certain new type (so-called parastatistics)
in which occupation numbers may take on, say, the values 0, 1
and 2.

4. The unitary symmetry scheme has a number of other shortcomings, 
which are not so fundamental but are still important. We shall only enumerate
them:

(a) The situation concerning the distribution of higher resonance states over
multiplets is not quite clear. For example, the problem as to where the
particle Λ(1405), having spin ½⁻, is to be placed has not been elucidated.

(b) The problem of ω–φ mixing has not been finally solved.

(c) The question remains open as to why the mass formula for baryons in- 
volves the mass itself, while the mass formula for mesons involves the square 
of the mass. 

(d) There is a contradiction between the high accuracy of the mass formula for
the decuplet 3/2⁺ and the poor relations for the probabilities of decay of these
resonances, and so on.

It should also be noted that for a number of reasons the framework of the 
group SU(3) turns out to be too narrow: 

1. It does not involve the baryonic number. 

2. The parameters of ω–φ mixing are not predicted but are introduced into
the theory from outside.

3. There are a number of intermultiplet relations pointing to the correlation 
of unitary quantum numbers with ordinary spin. Thus, for example, the 
parameters contained in the Okubo formula (140.10) for the octet and 
decuplet of baryons turn out to be almost equal; the following mass formula 
is valid 








$$m_{K^*}^2 - m_\rho^2 = m_K^2 - m_\pi^2 \tag{142.3}$$

and so on.


4. Some physicists are not content with the fact that the charge of quarks is 
a fractional number, which calls for a correction of the Gell-Mann—Nishijima 
relation by introducing into it a new quantum number, and this means in- 
creasing the rank of the basic symmetry group. 
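
To give a feeling for how well an intermultiplet relation of the type (142.3) holds, the sketch below compares the squared-mass splitting of the vector mesons with that of the pseudoscalar mesons, reading (142.3) as m²(K*) − m²(ρ) = m²(K) − m²(π). The meson masses are approximate modern values in GeV, supplied here as an assumption; they are not quoted in the text.

```python
# Approximate meson masses in GeV (modern tabulated values, used only
# for illustration; they are not given in the text).
MASS_GEV = {"K*": 0.8917, "rho": 0.7753, "K": 0.4937, "pi": 0.1396}


def squared_mass_splittings(m):
    """Return (vector, pseudoscalar) squared-mass differences in GeV^2."""
    vector = m["K*"] ** 2 - m["rho"] ** 2
    pseudoscalar = m["K"] ** 2 - m["pi"] ** 2
    return vector, pseudoscalar


vector, pseudoscalar = squared_mass_splittings(MASS_GEV)
relative_gap = abs(vector - pseudoscalar) / pseudoscalar
print(f"m(K*)^2 - m(rho)^2 = {vector:.3f} GeV^2")
print(f"m(K)^2  - m(pi)^2  = {pseudoscalar:.3f} GeV^2")
print(f"relative difference = {relative_gap:.0%}")
```

With these inputs the two splittings differ by somewhat more than ten per cent, so the relation is an approximate regularity rather than an exact equality.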

In trying to partially overcome the shortcomings mentioned a theoretical 
scheme was formulated invariant with respect to the group SU(6), which 
describes at the same time the unitary symmetry and spin properties of 
particles. Pseudoscalar and vector mesons fill the 35-dimensional multiplet of
this group (every 0⁻ meson has one spin state, while the 1⁻ meson has three
spin states, so that if in addition a unitary scalar 1⁻ particle is introduced,
then we shall obtain in all 8×1 + 8×3 + 1×3 = 35 members of the multiplet),
while ½⁺ and 3/2⁺ baryons fill the 56-plet (8×2 + 10×4 = 56). The most
impressive result of the group SU(6) is the formula for the ratio of the magnetic
moments of the neutron and proton:


$$\mu_n / \mu_p = -\tfrac{2}{3} \tag{142.4}$$


(the experimental value being —0.68). But the group SU(6) is essentially non- 
relativistic and is unsuitable, for example, for a description of the processes 
of scattering of particles. However, in the attempt to make it relativistic 
insuperable difficulties have arisen associated with the probabilistic interpre- 
tation of the corresponding quantum-mechanical scheme. 
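
The counting of states in the SU(6) multiplets and the value of the magnetic-moment ratio can be checked with a few lines of arithmetic. The sketch below is illustrative only. The cross-checks against 6² − 1 and against the number of components of a symmetric rank-3 tensor in six dimensions are standard group-theoretical facts that are not derived in this section. The additive quark-model expressions for the proton and neutron moments, with quark moments taken proportional to the quark charges, are likewise an assumption of the example rather than the derivation used in the text.

```python
from fractions import Fraction

# State counting quoted in the text: (unitary multiplicity) x (spin states).
meson_states = 8 * 1 + 8 * 3 + 1 * 3     # 0- octet, 1- octet, 1- singlet
baryon_states = 8 * 2 + 10 * 4           # 1/2+ octet, 3/2+ decuplet

# Cross-checks (standard results, assumed here rather than derived):
adjoint_dim = 6 * 6 - 1                  # adjoint representation of SU(6)
symmetric_rank3_dim = 6 * 7 * 8 // 6     # symmetric rank-3 tensor, 6 dimensions

# Additive quark-model moments (assumption): quark moments proportional
# to the charges +2/3 (u) and -1/3 (d), in arbitrary units.
mu_u, mu_d = Fraction(2, 3), Fraction(-1, 3)
mu_p = (4 * mu_u - mu_d) / 3
mu_n = (4 * mu_d - mu_u) / 3
ratio = mu_n / mu_p

print("meson states:", meson_states, "| adjoint dimension:", adjoint_dim)
print("baryon states:", baryon_states, "| symmetric tensor:", symmetric_rank3_dim)
print("mu_n / mu_p =", ratio, "=", round(float(ratio), 3))
```

The ratio comes out as −2/3, to be compared with the experimental value −0.68 quoted in the text.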

The trend of further development of the theory is to be indicated by 
experiment. 








SUBJECT INDEX 


Absorption, 387 

— coefficient, 389 

— cross section, 389 

— of light, 442 

— of neutrons by nuclei, 393 
Adiabatic approximation, 229 

— operator, non-, 231 

Adjoint operator, 65 

Airy function, 54 

Amplitude of a field, 439 

Angular momentum, addition of, 204 
— — conservation law, total, 112, 249 
— —, motion of particle with, 130 
— — of atom, 293

— — of N-particle system, 250
— — operator, 101, 199

— — —, matrix elements of, 200
— — projection, 202

— —, total, 248 

Annihilation operator, 198, 425 
Anticommuting operators, 62 
Antiparticle, 497 

Antisymmetric wave function, 256 
Associative law, 170 

Atom, electron density in, 289 

— in magnetic field, 308 

—, ordering of energy levels of, 297
—, radius of, 289 

—, statistical model of, 285 
Atomic form factor, 355 

— shell, 297 

Austausch integral, 267 


Band, 343 

Barrier, rectangular one-dimensional, 
45 

—, scattering by, 367 

Baryon, 550 




— meson interaction, 596 

— number, 551 

— resonance, 550 

β-decay, 506

Bilinear transformation, 246 

Bohr magneton, 236 

— quantization rule, 156 

— radius, 141 

Born approximation, 350 

— —, applicability of, 352 

Bose—Einstein particles, 257, 424 

Bosons, 257, 424 

—, wave function for, 257 

Boundary conditions on wave function, 
29 

Bound states and poles in scattering 
matrix, 413 

Box normalization, 89 

Bra vector, 188 

Breit—Wigner formulae, 399 

Bremsstrahlung, 533 

Broglie wave, de, 15 

— ’s formula, de, 15 


Canonical transformation, 177 

Capture, neutron, 400 

Causality, law of, 25 

Centrally symmetric field, 122 

— — —, conservation laws in, 122 

— — —, quasi-classical method in, 162 

Centre-of-mass system, 347 

Centrifugal energy, 125 

Channel, open, 385 

—, reaction, 385, 404 

Charge conjugation operator, 490 

— density and Dirac equation, 482 

— — and Klein—Gordon—Fock equation, 
472 










Chebyshev—Hermite polynomials, 40 
Chemical binding, 327 
Chronological operator, 410 
Classical limit, 145 

— — of cross section, 382 

— — of scattering, 378 

— — of scattering amplitude, 380
Clebsch–Gordan coefficients, 205
Coherent scattering, 456 

Collision broadening, 466 

— of identical particles, 254 

— of hard identical balls, 369 

— of spin ½ particles, 370

— of the second kind, 346 

— vector, 351 

Commutation relations of angular momentum, 102
Commutator, 62, 78, 85 
Commuting operators, 61, 78 
Complete set of eigenfunctions, 70 
— — of quantum-mechanical quantities, 
80 

Composite models, 605 

Compton scattering, 531, 535 
Conjugate matrix, 172 
Conservation law, 111 

— — and interactions, 556 

— — and light emission, 445 

— — and selection rules, 452 

— —, particle, 32 

— of symmetry property, 257 
Constant of the motion, 111 
Continuous energy spectrum, 54 

— matrix, 183 

Coordinate representation, 75 

— operator, 88, 185 

— —, eigenvalues of, 91 
Corpuscles, 4 

Correspondence principle, 22, 72 
Coulomb field, 135 

— system, 136 

Coupling constant, 506 

—, Jj-, 298 

—, Russell–Saunders, 298

CPT invariance, 509 
Creation operator, 198, 425 
Cross section and partial wave theory, 361


— — and scattering matrix, 404 
— — and unitary invariance, 597 
— —, classical limit of, 382 
— —, convergence of partial, 362 
— —, differential, 345 
— — for photoelectric effect, 456 
— — for Raman scattering, 461 
— — for scattering of identical particles, 368
— —, inelastic scattering, 388
— —, partial, 361
— —,— inelastic scattering, 387 
Current density and Dirac equation, 482 
— — and Klein—Gordon—Fock equation, 
472 
— —, probability, 31 
— — vector for particles with spin, 252 


D-coupling, 596 

Degeneracy, 37, 65, 214 

—, accidental, 140 

— in one-dimensional motion, 53 

Degenerate level, perturbation theory of, 
214 

Detailed balance, 421 

Determinant, 171 

Deuteron, 314, 577 

—, model for, 317 

—, potential energy of, 315 

—, quadrupole moment of, 317 

—, states of, 316 

Diagonalization, 181 

Diagonal matrix, 167 

Diamagnetic susceptibility, 314 

Diatomic molecule, 323 

— —, rotation of, 339 

— —, vibration of, 340 

Dielectric constant, 461 

Differential cross-section, 345 

— operator, 60 

Dimension of representation, 558, 565 

Dipole—dipole interaction, 334 

— moment of atom, electric, 298 

— radiation, 447 

— —, magnetic, 449 

— transition, 447 

— —, selection rules for, 451 

Dirac bispinor, 481 










— δ-function, 69
— equation, 478, 518
— — and electromagnetic field, 523
— — and free particle, 483
— — and hydrogen atom, 497
— — and Lorentz transformation, 501
— — and Pauli equation, 494
— — and rotation, 499
— — and spin particle, 492
— —, non-relativistic limit of, 496
— notation, 188 
— perturbation theory, 220 
Discrete energy spectrum, 53 
Dispersion formula, 459 
— relations, 415 
Displacement operator, 90 
Doppler broadening, 466 
Double scattering, 376 
Duality, wave—particle, 5 
Dyson’s formula, 411 


Ehrenfest theorem, 110 

Eigenfunction, 64, 180 

— in centrally symmetric field, 123, 126

Eigenvalue, 64, 180 

Eightfold way, 591 

Elastic scattering, 346 

Electric dipole moment of atom, 298 

Electromagnetic field, Hamiltonian of, 436 

— —, momentum of, 438 

— — with electron, 440

— interaction, 554

— vacuum, 543

— —, electron in, 543 

Electron, classical radius of, 455 

—, equivalent, 295

— states in atom, 293 

Elementary particles, properties of, 550 

— —, stable, 552 

η-meson, 550

Emission of light, 442 

—, spontaneous, 445 

—, stimulated, 444 

Energy conservation, 112 

— levels of the atom, ordering in, 290 

— — of helium atom, 278 

— spectrum in centrally symmetric field, 
128 




— — of hydrogen atom, 139 

Euler angles, 244 

Exchange energy and chemical binding, 
332 

— — in nuclei, 273 

— —, saturation of, 271 

— integral, 267 

— interaction, 264, 269 

— operator, particle, 256 

Excited state, 33 

External field, uniform, 54 

— line, 530 


F-coupling, 596 

Fermi constant, 555 

— Dirac particles, 257, 430

Fermions, 257, 430 

—, wave function for, 257 

—, wave function for two, 261 

Feynman diagram, 529 

— formulation and perturbation theory, 232

Filled shells and Thomas—Fermi model, 
289 

Final state, 401 

Forbidden transition, 117, 448 

Form factor, atomic, 355 

— — of hydrogen atom, 356 

Four-fermion interaction, 513 

Franck—Hertz experiment, 4 

Free particle and Dirac equation, 483 

— —, system of, 13, 56


Gell-Mann mass formula, 603 
— Nishijima relation, 554 
Generators, group, 559 

—, representation, 560 
γ-matrices, 482, 502
Gravitational interaction, 556 
Green’s function, 62 

— — and Dirac equation, 518 
— — for two-particle system, 524 
— —, relativistic, 526

Ground state, 33 

Group generators, 559 

— representation, 558 

— velocity, 15 










Hadenon, 550 
Hadron, 550 
—, isomultiplets of, 568 


Hamiltonian for electromagnetic field, 436

— in second quantized representation, 428

— operator, 92 

Harmonic oscillator, linear, 38, 195 
— —, three-dimensional, 43 
Hartree equations, 284 
— Fock equations, 285

— — method, 282 

Heisenberg representation, 192 

— ’s uncertainty relations, 20, 82 
Helicity operator, 512 

Helium atom, 276 

— —, eigenstates of, 278 

— —, energy levels of, 278 
Hermite polynomials, Chebyshev, 40 
Hermitian matrix, 167, 173 

— operator, 65 

— —, eigenvalues of, 67 

Hilbert space, 175 

Homopolar molecules, 327 
Hybridization, 334 

Hydrogen atom, 121, 135, 217 

— —, form factor of, 356 

— —, relativistic corrections, 498 
— — with finite nucleus, 217 

— molecule, 327 

— spectrum, 141 

Hypercharge, 554, 585 


Identical balls, collision of hard, 369 
— particles, collision of, 254 

— —, scattering of, 368 

— —, wave function for, 255 

— —, wave function for two, 261 
Impact parameter, 363 

Inelastic scattering, 346, 385 

— — cross section, 388

Initial state, 401 

Integral operator, 60 

Interaction constant, 440 

—, dipole—dipole, 334 

—, electromagnetic, 554 

—, exchange, 268 


—, four-fermion, 513 

—, gravitational, 556 

—, interatomic, 334 

— of atoms in different states, 335 
—, quadrupole—quadrupole, 335 

—, relativistic, 526 

— representation, 194 

—, spin—orbit, 377 

—, strong, 476, 555 

—, van der Waals, 335 

—, weak, 504, 555 

Intermediate state, 228 

Internal line, 530 

Invariant amplitudes, method of, 583 
— operators, 561 

Inverse matrix, 171 

— operator, 62 

Inversion operator, 114, 325 

— — and spinor transformation, 247 
Inverted multiplets, 295 

Ions and the Thomas—Fermi model, 291 
Irreducible representation, 558 
Isogroup, 562 

Isomeric nuclei, 452

Isomultiplets, 551, 567 

— of hadrons, 568 

Isospin, 551, 568 

— component operators, 567 

— conservation, 577 

—, third component of, 568 


jj-coupling, 298 


Kernel of integral operator, 60 

Ket vector, 188 

Ξ-hyperon, 550
Klein—Gordon—Fock equation, 468 
— — — — and charge density, 472 
— — — — and current density, 472 
Klein—Nishina formula, 542 
K-meson (kaon), 507, 550 
Kronecker symbol, 68 


Laboratory system, 347 

Laguerre polynomial, generalized, 139 
Lamb shift, 543 

Landé g-factor, 311 

Laplacian, 60 







Λ-hyperon, 550

Lepton, 515, 549 

— number, 551 

Lie group, 559 

— —, dimension of, 559 

Light absorption, 442 

— emission, 442 

— scattering by atoms, 456 

Linear operator, 60 

Linewidth, natural, 462 

Lorentz transformation and Dirac 
equation, 501 


Magic numbers, 322 

Magnetic field, atom in, 308 

— —, particle in, 93 

— moment operator, intrinsic, 251 
— quantum number, 104 
Many-particle system, 55, 94 

— — —, non-interacting, 13, 56 
— spin particle system, 253 
Matrices, product of, 169 

—, sum of, 169 

Matrix, 164 

— calculus, 168 

— element, 165 

Maxwell’s equations, 432 

Mean value, 76 

Measurement, 23, 74, 119 

—, simultaneous, 79 

Mendeleyev periodic system, 299 
Mesohydrogen, 142 

Meson, 550 

— baryon interaction, 596 

— resonance, 550 

Metal, electrons in a, 230 
Metastable state, 452 
Microparticles, 4 

Mixed state, 81 

μ-meson (muon), 505, 549
Molecular and atomic states, 337 
— spectrum, 342 

— states, 325 

Molecule, diatomic, 323 

—, geometric form of, 333 
Momentum conservation, 112 

— operator, 73, 87 

— — and infinitesimal translation, 90 





— —, eigenfunctions of, 88 
— representation, 88, 185 
— transfer vector, 351 
Moseley law, 305 
Multiplet, 558 

—, inverted, 295 

—, normal, 295 
Multiplicity, 561 

— of degeneracy, 37 
Muon, 505, 549 


Neutrino, 549 

—, two-component, 509 

Neutron absorption by nuclei, 393 
— capture, 400 

— nucleus scattering, 377, 391 

— — —, inelastic, 397 

— proton scattering, 581 

Normal multiplets, 295 
Normalized eigenfunction, 68 
Nuclear field, 473 

— forces, theory of, 272 

— shell, 320 

Nuclei, self-consistent field in, 320 
Nucleon—antinucleon system, 573 
— charge, 569 

— nucleon scattering, 580 

— — system, 574 

— π-meson interaction, 578

— — scattering, 580 

— — system, 575 

— resonances, 572 

Null matrix, 168 

Number operator, 425 


Occupation number representation, 423 

Ω-hyperon, 550, 594

Okubo mass formula, 603 

One-dimensional motion, 52 

Operator, 60, 188 

— and physical variables, 71 

— in second quantization representation, 425

— product, 61 

Optical model for scattering, 391 

— theorem, 390 

Orbital angular momentum quantum 
number, 105 










Orthogonal eigenfunctions, 67 
Orthohelium, 277 
Orthostates, 277 

Oscillator strength, 461 


Pair production frequency, 546 
Parahelium, 277 

Parastates, 277 

Parity, 115, 551 

— conservation, 117 

— non-conservation, 507 

Partial cross section, 361 

— — —, convergence of, 362 

— inelastic cross section, 387 

— wave theory, 357 

— width, 400 

Particle in field, 28 

—, stable, 549 

Paschen—Back effect, 313 

Pauli equation, 251 

— — and Dirac equation, 494 

— — for many-particle system, 253 
Pauli—Luders theorem, 551 

Pauli operators, 240 

— principle, 260 

P-branch, 342 

Penetration, 53, 158 

Periodic boundary conditions, 89 

— system, 299 

Perturbations, adiabatic theory of, 229 
— calculation for helium atom, 278 
— corrections to energy, 212, 213 
— — to wave function, 212, 213 

— operator, 210 

— theory and scattering matrix, 408 
— — and Feynman formulation, 232 
— —, Dirac, 220

— — of degenerate level, 214 

—, time-dependent, 219 

—, transition due to, 222 

Phase shift, 358 

— —, uncertainty in, 407 

— velocity, 15 

Photoelectric effect, 452 

— —, cross section for, 456 
Photon, 432, 549 

—, interaction of, 435 

— spin, 435, 442 


— transition, one-, 442 

—, wave function of, 433 

Physical variables and operators, 71 

π-meson (pion), 274, 473, 550, 571

— nucleon interaction, 578 

— — scattering, 580 

— — system, 575 

— production, 582 

Poisson bracket, 84 

Polarization and scattering, 371 

— of scattered beam, 376 

— vector, 372 

Polar spinor, 248 

Poles, false, 413 

— in scattering matrix and bound states, 413

Positron, 487 

—, Feynman diagram for, 531 

— state, 523 

Potential, complex, 390 

— well, 32, 155 

Principal quantum number, 138 

Probability density, 11 

Propagation function of the virtual pho- 
ton, 507 

Proton—neutron scattering, 581 

— π-meson scattering, 582

Pseudo-spinor, 248 

Pure state, 80 


Q-branch, 343 

Quadrupole radiation, 449 

— quadrupole interaction, 335 

Quantum-mechanical system, properties 
of, 35 

Quark, 607 

Quasi-classical approximation, 145 

— —, validity of, 150 

— wave function, 153 


Radial quantum number, 138 
— wave equation, 125 
Raman effect, 459 

— scattering, 456 

Rank, group, 561 

— of a matrix, 171 
R-branch, 342 

Reaction channel, 385, 404 




Reciprocity theorem, 420 


Rectangular potential well, one-dimen- 
sional, 32 


— — —, three-dimensional, 36 
Reducible representation, 558 
Reflection coefficient, 47 
Refractive index, 461 
Relativistic interaction, 526 
Renormalization, 535 
Representation, coordinate, 75 
—, dimension of, 558 

—, general, 75 

— generators, 560

—, irreducible, 558

— of a group, 558

—, operator, 177 

—, scalar, 563 

—, spinor, 563 

—, vector, 566 

Resonance, 549, 553 

—, baryon, 550 

— fluorescence, 462 

—, meson, 550 

— scattering, 364, 399 

— transition, 226 

Reversal in time, 30 

Rotation matrix, 242 

— of diatomic molecules, 339 
— operator, 107 
Russell–Saunders coupling, 298
Rutherford formula, 355 
Rydberg constant, 141 


Sakata model, 605 

Saturation of exchange energy, 271 
Scalar representation, 563 
Scatterer, 345 

Scattering amplitude, 347 

— — in Born approximation, 350 
— — in classical limit, 380 

— — and partial wave theory, 360 
— and polarization, 371

— by barrier, 367

— by spherical well, 364

—, classical limit of, 378 

—, coherent, 456 

— cross section, differential, 345 
— — —, inelastic, 388

— — —, partial inelastic, 387

—, double, 376 

—, elastic, 346 

—, inelastic, 346, 385 

— matrix, 401 

— —, analytic properties of, 412 
— — and perturbation theory, 408 
— — and time reversal, 418 

— — and transition probability, 403 
—, neutron—nucleus, 377, 391 
—, neutron—proton, 581 

— of charged particles by atoms, 354 
— of identical particles, 368 

— of spin ½ particles, 370

— of slow particles, 394 

—, π-meson–proton, 582

— process, 345 

—, Raman, 456 

— resonance, 364, 399 

— theory, partial wave, 357 
Schrödinger equation, 29

— —, integral form of, 96 

— — in matrix form, 186 

— representation, 191 
Screening of nuclear charge, 356 
Second quantization, 423 
Secular equation, 216 

Selection rules, 342, 449, 561 
Self-consistent field in nuclei, 320 
— — method, 282 

Shell, atomic, 297, 299 

—, closed, 297, 321 

—, nuclear, 320 

Schur lemma, 561

Σ-hyperon, 550

Sign function, 410 

Spectral linewidth, 118 

— term, 140 

Spectrum of an operator, 64 

—, molecular, 342 

Spherical well, 132 

— —, scattering by, 364 

Spin, classical limit of, 240 
— component operators, 239

— — —, eigenvalues of, 241 

— — —, rotation of, 242 

—, electron, 234 

— operator, 236 










— orbit interaction, 294, 377 

— particle and Dirac equation, 492 
— — system, many-, 253 

— quantum number, 235 

—, wave function for particle with, 237 
Spinor, 245 

—, polar, 248 

—, pseudo-, 248 

— representation, 563 
Spontaneous emission, 445 
Spread of wave packet, 19 
Standard conditions, 64 
Stationary states, 28, 94 
Statistical model of the atom, 285 
Stern—Gerlach experiment, 5, 235 
Stimulated emission, 444 
Strangeness, 554, 585 

Strange particle, 554 

Strong interaction, 476, 555 
Sub-levels, 214 

SU(2) group, 562 

— —, representations of, 563 
SU(3) group, 586 

— —, invariant operators of, 589 
— — representations, 587 
Superposition principle, 18 
Symmetric wave function, 256 
Symmetry group, 557 

— transformations, 557 


Thomas—Fermi equation, 287 

— — model and filled shell, 289 

— — — and ions, 291 

— — — and quasi-classical approximation, 292

Thomson formula, 543 

Threshold approximation, 394 

Time—energy uncertainty relation, 117 

Time-ordering operator, 410 

Time-reversal, 418 

Trace of a matrix, 171

Transition amplitude, 96, 528 

— —, Fourier transform of, 99 

— —, integral equation for, 101 

— due to perturbations, 222 

— due to time-independent perturbations, 
227 

— line, shape of, 466


— probability, 222, 226 

— — and scattering matrix, 403 
— via intermediate states, 228 
Transmission coefficient, 47, 51, 160 
Transpose of an operator, 66 
Trial function, 281 

Triangle relation, 583 

Tunnel effect, 48, 51 

Turning point, 148 

— —, wave function in, 150 
Two-particle system, 57 


Uncertainty relations, 19, 82 

— —, time—energy, 117 

Unit matrix, 168 

Unitary group, 586 

— invariance and cross sections, 597 
— matrix, 173 

— symmetry, 609 

— —, violation of, 601 

— transformation, 177 


Vacuum, 438, 543 

—, electron in, 544 

Valence, 306, 332 

Variational principle, 280 

— — and Hartree–Fock approximation, 282

Variation of constants method, 220 

Vector, n-dimensional, 174 

— representation, 566 

Vibration of diatomic molecules, 340 

Virtual photon, propagation function of, 
527 

— state, 228 

Vortex, 529 


Waals interaction, van der, 335 
Wave function, 10 

— —, boundary conditions on, 29 
— — for bosons, 257 

— — for fermions, 257 

— — for identical particles, 255 
— — for particle with spin, 237
— — for two fermions, 261 

— —, N-particle, 12 

— packet, 19 

Weak interaction, 504, 555 










Weight diagram, 594 

Wentzel—Kramers—Brillouin method, 
145 

Wigner formula, 366 


Yukawa potential, 475 










Zeeman effect, 307 

Zero-point energy, 43 

— — oscillations of electromagnetic field, 438, 544





© NORTH-HOLLAND PUBLISHING COMPANY, 1973 

All rights reserved. No part of this book may be reproduced, stored in a retrieval 
system, or transmitted, in any form or by any means, electronic, mechanical, photo- 
copying, recording or otherwise without the prior permission of the Copyright owner. 


Library of Congress Catalog Card Number: 68 54501 


ISBN North-Holland, complete set: 0 7204 0176 3 
Vol. 4: 0 7204 0202 6 


Printed in The Netherlands 


Title of the Russian edition: 
KURS TEORETICHESKOJ FIZIKI 


Russian edition published by: 
IZDATELSTVO ‘NAUKA’, GLAVNAJA REDAKCIJA, 


FIZIKO-MATEMATICESKOJ LITERATURY (MOSKVA, 1971) 


Publishers: 
NORTH-HOLLAND PUBLISHING COMPANY - AMSTERDAM 


Sole Distributors for the Western Hemisphere: 

WILEY INTERSCIENCE DIVISION 

JOHN WILEY & SONS, INC. - NEW YORK 
ISBN Wiley Interscience, Vol. 4: 0-471-53116-2 


FOREWORD 


The first Russian edition of ‘Theoretical Physics’, which appeared in 1962, 
has been widely used as a textbook. 

Numerous comments from colleagues, lecturers and students have been 
taken into account in preparing this new edition, which is the first one in 
English and which will also appear as the second Russian edition. 

The material has now been divided into 4 volumes covering the following 
subjects 


Volume 1 
Part I Theory of the Electromagnetic Field
Part II Theory of Relativity 


Volume 2 
Part III Statistical Physics 
Part IV Electromagnetic Processes in Matter 


Volume 3 
Part V Quantum Mechanics 


Volume 4 
Part VI Quantum Statistics and Physical Kinetics 


The rapid development of physics and the present wide interest in 
non-equilibrium and non-stationary processes has compelled us to expand the 
section on physical kinetics. It has also been transferred to the end of 
Volume 4 as it is practically impossible to expound this topic without using 
quantum mechanics. 

Part IV — ‘Electromagnetic Processes in Matter’ — has been substantially 
revised. Interest in this field has increased recently, mainly in connection with 
the study of plasmas and plasma-like media, which now have sections devoted 
to them. 







The methods of calculating electrostatic and direct-current fields, and 
other problems of classical electrodynamics in a medium, are covered very 
briefly as we have assumed that students will be able to consult the many 
monographs and handbooks on general physics, electrical- and radio- 
technology, and the equations of mathematical physics. 

As for other modifications and additions, we should draw attention to the 
introduction of tensor notation, to new ideas in the theories of relativity and 
electromagnetic fields, the broadening of the introduction to the theory of 
probability, a brief presentation of the method of correlation functions in 
statistical physics, the exposition of the thermodynamic theory of ferro- 
magnetism and the theory of propagation of electromagnetic waves in plasma. 
A number of paragraphs have been rewritten. We have tried to bring the 
content of the book even closer to the interests of present-day theoretical 
physics. 

The general level of the book has been preserved and it is still intended to 
form an introduction to theoretical physics. Problems requiring the use of 
cumbersome or special mathematical apparatus are still excluded, and the 
most difficult sections are marked by an asterisk. These may be skipped at 
will, since there is no reference to them in the main text. 

In conclusion we would like to express our gratitude to all those who 
helped us in preparing this book, in particular to A.M. Brodsky, A.M. 
Golovin, B.M. Grafov, R.R. Dogonadze, V.S. Krylov and especially V.S. 
Markin and V.V. Tolmachev. I.V. Savelyev discovered a number of misprints 
which have now been corrected. 

L.D. Konkina helped us in editing the manuscript. 

We are grateful to the readers and students who used the first Russian 
edition of the book for sending us their valuable comments which have been 
taken into account in this edition. 

August 1970 





FOREWORD TO THE FIRST RUSSIAN EDITION 


The continuous development of theoretical physics and the regular 
expansion of its areas of application create increasing demand for textbooks 
and manuals. 

The rapid development and the complexity of the most recent experi- 
mental methods of physical investigation, and the corresponding development 
and extension of the mathematical apparatus of theoretical physics, have 
meant that one man usually cannot combine the two methods of investiga- 
tion. The end of the 19th century and particularly the 20th century therefore 
saw physicists divided into ‘experimentalists’ and ‘theoreticians’, the latter 
studying physical laws by means of the mathematical methods of theoretical 
physics. 

Obviously, a background in theoretical physics is essential in the education 
of experimental as well as theoretical physicists. 

The experimental and theoretical methods of physical investigation have 
penetrated into a number of branches of science related to physics (physical 
chemistry, biophysics, geophysics, astrophysics, and so on) and into technolo- 
gy (metal physics and metallurgical science, thermophysics, electrical technol- 
ogy, radiotechnology, computation, the instrument-making industry etc.). 
Workers in these branches of science and technology also need a certain 
minimum knowledge of theoretical physics. 

The compilation of a modern textbook on theoretical physics is inevitably 
associated with certain logical and methodological difficulties. It is impossible 
at present to divide theoretical physics into classical and quantum parts so 
that it is also impossible to divide it into separate chapters and sections. For 
example, the exposition of statistical physics without taking into account the
quantum properties of atomic systems is impossible, for it would mean that 
the general theory remained without practical application. In the theory of 
electromagnetic processes in matter one has of necessity to make use of the 
ideas of statistical physics, and so on. It may be that the maximum 
consistency of composition would be obtained if the book were founded on 




quantum mechanics but this is completely inadmissible in a book intended as 
an introductory treatise. Quantum mechanics requires a certain preparedness 
and the student must be convinced of the necessity of renouncing obvious 
classical representations. Compromise solutions, which have justified them- 
selves during many years of teaching theoretical physics at the Moscow 
Engineering-Physical Institute and Moscow State University, are therefore 
inevitable. 

The following general principles have been applied. 

(1) The book is written as an introduction to theoretical physics so that 
aspects requiring the use of cumbersome or special mathematical apparatus 
have not been included. 

(2) As it is to be used for a systematic study of the subject the course is a 
unique whole and all material necessary for understanding the later sections is 
contained in the earlier ones. 

(3) It would not be feasible to elucidate experimental facts in addition to 
problems concerning purely theoretical physics. However, physics is a single 
science, and an attempt to expound the theoretical aspects without taking 
experiment into account would be quite wrong. The reader is assumed to 
have some basic experimental knowledge from university courses in general 
and atomic physics so that we have confined ourselves to references and, in a 
few instances, to a schematic description of basic experiments. 

(4) The acquaintance assumed with general courses in general and atomic 
physics has allowed us to rely on a certain (very restricted) knowledge of 
quantum mechanics in our treatment of statistical physics. 

(5) Classical mechanics usually forms a separate course so that this topic 
has been omitted although detailed reference has been made to handbooks of 
mechanics. 

(6) The book similarly does not cover hydrodynamics, aerodynamics, the 
theory of heat transfer, or problems related to electrical- and radio- 
technology. 

(7) Detailed reference is made to mathematical manuals. The mathematical 
apparatus utilized, except in the sections marked by an asterisk, is covered by 
the usual courses in analysis. In the case of quantum mechanics, however, the 
mathematical apparatus has been included, since it is of a specific character 
and is not taught in traditional mathematical courses. 

(8) As the book is intended as a systematic course in theoretical physics no 
attempt has been made to achieve the same level of accessibility in all 
sections. It is a well-known fact that a student’s comprehension and 
assimilation of difficult material increases as a course progresses, and that this 
is also true for the associated mathematical apparatus. Moreover, experimental
physicists will constantly encounter new problems in quantum
mechanics which can only be handled using advanced methods of treatment. 
The section on quantum mechanics (Part V) therefore deals with some topics 
having a more advanced character than those in other sections. The analysis 
of applications of the kinetic equations is similarly treated rather extensively. 


The uniqueness of the book’s objectives has affected the content of individual 
sections, so that some topics in modern physics have been included at the 
expense of more traditional material. 

Part I contains the foundations of the theory of the electromagnetic field 
in a vacuum, based on the system of Maxwell-Lorentz equations. A basic 
knowledge of electromagnetism is assumed. The focus of attention is the 
theory of radiation and the motion of charged particles in external fields. 

In Part II, devoted to the theory of relativity, a four-dimensional form of 
representation is adopted which not only corresponds to the spirit of the 
theory but also predominates in contemporary literature. The problems of 
dynamics in the theory of relativity are treated in some detail. A number of 
the most recent applications of the theory of relativity, particularly those 
related to nuclear physics, are covered here for the first time in a textbook. 

Part III is a revised version of Levich’s ‘Introduction to Statistical Physics’ 
and treats statistical physics and the fundamentals of statistical thermo- 
dynamics. Classical thermodynamics would require too much space, and did 
not seem indispensable. 

Part IV contains the theory of electromagnetic processes in matter. 
Relatively little attention is paid to problems in theoretical electrical- and 
radio-technology. The phenomenological theory of electric and magnetic 
properties of matter is analyzed in some detail, and the notion of the physics 
of the plasma state of matter is given. 

In Part V the basic ideas of present-day relativistic quantum mechanics are 
included as well as the traditional problems of non-relativistic quantum 
mechanics. Applications to solid-state theory are considered at length. 

Part VI contains the essential concepts of physical kinetics, which are not 
usually presented in a general course on theoretical physics. 


The experience of teaching theoretical physics shows that the greatest 
difficulties are often encountered not in understanding new physical ideas but 
in the actual mathematical treatments. All mathematical operations have 
therefore been performed in sufficient detail. 

For convenience we have presented a brief derivation of those formulae of
vector analysis which are encountered throughout, as well as the necessary
data on Fourier integrals and δ-function theory.

The numbering of formulae and sections starts afresh in each Part and 
references to appendices have been given Roman numerals. 

The author hopes that the readers, after making themselves familiar with 
the foundations of theoretical physics expounded in this book, will be able to 
proceed to a more profound study using the many-volume treatise of Landau 
and Lifshitz. The scientific and educational ideas of their work were of great 
influence on the author, who is a disciple of Landau. 

Parts I—IV and Part VI were written by B.G. Levich. Part V was written by 
Y.A. Vdovin and V.A. Myamlin under the general scientific guidance of B.G. 
Levich. Chapter XV of Part V was written by A.I. Naumov. 

The author expresses his gratitude to the colleagues who read the book 
and the manuscripts, and made a number of valuable remarks: B.M. Grafov, 
R.R. Dogonadze, V.A. Kiryanov, V.S. Krylov, V.S. Markin, V.P. Smilga, Y.A. 
Chizmadzhev and Y.I. Yalamov. 

The creation of a textbook on theoretical physics sufficiently comprehen- 
sive in content and clear in presentation is a very complex task. The author is 
therefore conscious of the fact that shortcomings and errors will be discover- 
ed and would be grateful to receive an account of them which can be taken 
into consideration in the next edition of the book. 


1962 


Theoretical Physics:
Outline of Vols. 1-4


Volume 1

Part I      Theory of the Electromagnetic Field
Chapter 1   General theory of the electromagnetic field
        2   The electrostatic field
        3   The quasistationary magnetic field
        4   The electromagnetic field of arbitrarily moving charges
        5   Radiation theory
        6   Electromagnetic field in a vacuum and electromagnetic wave
            scattering
        7   The motion of particles in electromagnetic fields

Part II     Theory of Relativity
Chapter 1   General principles of the theory of relativity
        2   Relativistic mechanics
        3   Relativistic electrodynamics

Appendix I, II and III

Subject index


Volume 2

Part III    Statistical Physics
Chapter 1   The basic concepts of the theory of probability
        2   The kinetic theory of gases
        3   Statistical distribution
        4   Statistical and phenomenological thermodynamics
        5   Ideal gases
        6   Systems of interacting particles
        7   Crystals
        8   The theory of fluctuations
        9   Systems with a variable number of particles
       10   Statistical distributions in quantum statistics and some of their
            applications

Part IV     Electromagnetic Processes in Matter
Chapter 1   Electromagnetic fields in matter
        2   Electrostatics
        3   Direct electric current and the magnetic properties of matter
        4   Quasistationary electromagnetic fields
        5   High-frequency fields
        6   Matter in the plasma state

Appendix IV

Subject index


Volume 3

Part V      Quantum Mechanics
Chapter 1   The basic concepts of quantum mechanics
        2   The Schrödinger equation
        3   The mathematical apparatus of quantum mechanics
        4   Motion in a centrally symmetric field
        5   The quasi-classical approximation
        6   The matrix form of quantum mechanics
        7   Perturbation theory
        8   Spin and identity of particles
        9   Applications of quantum mechanics to atomic and nuclear
            systems
       10   The theory of diatomic molecules
       11   Scattering theory
       12   The method of second quantization and radiation theory
       13   Relativistic quantum mechanics
       14   Some problems of quantum electrodynamics
       15   Fundamentals of the theory of elementary particles

Subject index


Volume 4 (for details see the Contents of this volume)

Part VI     Quantum Statistics and Physical Kinetics
Chapter 1   Quantum statistics
        2   Physical kinetics
        3   The kinetic theory of gases and gas-like systems
        4   The time correlation function method and Onsager’s theory
        5   Solid-state theory
        6   The kinetic properties of solids
        7   Interaction of radiation with a free-electron gas

Subject index


Contents of Volume 4


Part VI   Quantum statistics and physical kinetics


Chapter 1   Quantum statistics

§1    The density matrix and the statistical operator
§2    The statistical distribution in quantum statistics
§3    Statistical distributions in an ideal gas
§4    Degenerate ideal Bose gas
§5    Imperfect Bose gas. Superfluidity


Chapter 2   Physical kinetics

§6    Phenomenological transport equations and the general equations
      of kinetics
§7    The mass conservation law and diffusion flow
§8    The momentum conservation law and the equations of motion
      for a continuous medium
§9    The energy conservation law and entropy transport in a
      moving continuous medium
§10   The Fokker-Planck equation
§11   The basic kinetic equation
§12   Discussion of the basic kinetic equation and some simple examples
§13   Non-equilibrium systems with a negative temperature. The
      amplification of electromagnetic waves by such systems


Chapter 3   The kinetic theory of gases and gas-like systems

§14   Boltzmann’s kinetic equation
§15   The basic kinetic equation for correlation functions
§16   The derivation of Boltzmann’s equation from the basic kinetic
      equation
§17   The generalized transport equation and the properties of
      summational (additive) invariants
§18   The equations of motion of a continuous medium
§19   The laws of increase of entropy
§20   Equilibrium and local-equilibrium distributions in an ideal gas
§21   The general theory of the solution of Boltzmann’s equation
§22   The equations of hydrodynamics. The viscosity and thermal
      conductivity of gases
§23   Relaxation time
§24   The diffusion of an admixture of a gas of light particles into
      a gas of heavy particles
§25   Thermal diffusion in gases
§26   The dispersion of sound
§27   The linearized Boltzmann’s equation for quasi-gaseous systems
§28   The solution of Boltzmann’s equation for a quasi-gaseous
      system in an external force field
§29   The kinetic equation for polyatomic gases
§30   The moderation of fast neutrons
§31   The spatial distribution of neutrons
§32   Kinetic equation for a plasma disregarding collisions
§33   The dispersion and damping of plasma waves
§34   The kinetic equation for a plasma taking into account collisions
§35   The establishment of equilibrium in an electron-ion plasma


Chapter 4   The time correlation function method and Onsager’s theory

§36   The response of a system to an external dynamic perturbation.
      Classical calculation
§37   The response of a system to an external dynamic perturbation.
      Quantum calculation
§38   The response of a system to a thermal perturbation
§39   The calculation of kinetic coefficients. The connection with
      Boltzmann’s equation
§40   Onsager’s theory
§41   Discussion of Onsager’s relations
§42   Non-equilibrium processes in one-component systems
§43   Non-equilibrium processes in many-component systems (diffusion,
      thermodiffusion, thermoelectric effects)
§44   Fluctuation-dissipation theorem


Chapter 5   Solid-state theory

§45   A solid body as a quantum-mechanical system
§46   The crystal lattice
§47   Lattice vibrations
§48   Wave function of an electron moving in a periodic field
§49   The energy spectrum of an electron moving in a periodic field
§50   A system of electrons in a solid
§51   Models of a metal, a semiconductor and a dielectric
§52   Magnetic properties of metals. The paramagnetism of an
      electron gas
§53   The diamagnetism of an electron gas
§54   Ferromagnetism
§55   The interaction of electrons with lattice vibrations
§56   The total Hamiltonian of a solid


Chapter 6   The kinetic properties of solids

§57   The kinetic equation for electrons in metals
§58   The electrical conductivity of metals
§59   The Hall effect
§60   The optical properties of a system of conduction electrons
§61   The photoelectric effect
§62   The mean free path length of electrons in metals
§63   The collision integral for electrons in a metal
§64   A solution of the kinetic equation
§65   Superconductivity
§66   Theory of the Fermi fluid
§67   Electrons in dielectric crystals
§68   The energy spectrum and the distribution function of electrons
      in semiconductors
§69   The electrical conductivity and the Hall effect in semiconductors


Chapter 7   Interaction of radiation with a free-electron gas*

§70   Low-density plasma in a low-frequency radiation field
§71   Kinetic equations for electrons and photons
§72   Kinetics of Bose condensation in a photon gas
§73   Mobility of an electron in a radiation field
§74   System of electrons in an arbitrary radiation field
§75   General discussion of the results and range of the applicability
      of the theory


Subject index


* Translated by Minerva Translations Limited, London.



PART VI 


QUANTUM STATISTICS 
AND PHYSICAL KINETICS 











Quantum Statistics 


§1. The density matrix and the statistical operator 


In this chapter we shall consider some problems of statistical physics asso- 
ciated with the use of the concepts of quantum mechanics. 

A system in quantum mechanics was assumed to consist of a relatively 
limited number of particles. However, as in classical physics, it is essential to 
carry out the transition to systems with very large numbers of particles. In 
other words, it is necessary to find a way of passing from quantum mechanics 
to quantum statistics. 

In treating statistical physics in Part III, we took into account from the 
very beginning those profound changes which are brought about by quantum 
phenomena. That is, in calculating the state function the summation was 
carried out over the energy levels of the system, and the statistical weight of 
states was determined on the basis of the assumption that to each state there 
corresponds a phase-space cell of volume $(2\pi\hbar)^3$. In other words, we were,
from the very beginning, constructing statistical physics in the quasi-classical 
approximation. Furthermore, we have already seen that the identity of quan- 
tum particles leads to a fundamental change in the properties of their statis- 
tical ensembles. 

In quantum mechanics a substantiation of the quasi-classical approximation
was given and, through this, the basic propositions on which our exposition
of statistical physics was constructed were to a certain degree justified.

We are now going to pass on to a more logical construction of quantum 
statistical physics. 

Let us consider a certain subsystem interacting with a medium. The medi- 
um and subsystem together form a closed system whose behaviour is described 
by a wave function $\Psi$. This wave function depends on the state of the particles
of the subsystem as well as on the state of the particles of the medium.
Since the subsystem interacts with the medium, it is impossible to find a wave 
function of the subsystem which depends only on the coordinates of its par- 
ticles and not on those of the particles of the medium as well. Hence one can- 
not assign an independent wave function to an open subsystem. In other 
words, the states of an open subsystem are a set of mixed states. This is often 
formulated briefly as follows: ‘an open subsystem is in a mixed state’. 

As always in statistical physics, we shall be interested in the values of
quantities $L$ averaged over time (or over an ensemble). In order not to confuse
this mean with the ordinary quantum-mechanical average, we shall denote
the statistical mean by the symbol $\langle L \rangle$. We identify the quantity $L$ with
the operator $\hat{L}$. Its quantum-mechanical mean is

$$\bar{L} = \int \Psi^{*} \hat{L} \Psi \, d\tau . \tag{1.1}$$

Here $\hat{L}$ and $\bar{L}$ refer to the quasi-closed subsystem and depend only on
quantities describing it, whereas the integration is carried out over the entire
closed system.

Let us now consider our quasi-closed subsystem. We switch off its interaction
with the external medium completely. Our subsystem then becomes
closed. The Hamiltonian of the quasi-closed subsystem, $\hat{H} + \hat{H}_{int}$, where $\hat{H}_{int}$
describes its interaction with the medium, goes over into the Hamiltonian $\hat{H}$.
For clarity, we shall call such a small closed system a closed subsystem. It is
obvious that a closed subsystem possesses properties completely different
from those of a quasi-closed subsystem. Its states do not depend on the state
of the external medium and, consequently, are pure states. A closed subsystem
can be described by a particular wave function. If we denote the set of
quantum numbers characterizing the state of the closed subsystem by $n$, then
the set of wave functions $\psi_n$ of this subsystem forms a complete set of
orthonormalized functions. Hence we can write the expansion

$$\Psi = \sum_n c_n \psi_n . \tag{1.2}$$



The coefficients $c_n$ depend on variables characterizing the state of the
particles of the external medium and on the time. However, they can always be
considered to satisfy the normalization condition

$$\sum_n |c_n|^2 = 1 . \tag{1.3}$$

Substituting the expansion (1.2) into the definition of the quantum-mechanical
mean, we get

$$\bar{L} = \sum_{n,m} c_n^{*} c_m \int \psi_n^{*} \hat{L} \psi_m \, d\tau = \sum_{n,m} c_n^{*} c_m L_{nm} . \tag{1.4}$$


Formula (1.4) gives a complete quantum-mechanical description of the quantity
$L$. Its uncertainty, i.e. the fact that $L$ can take on any value and that we
can find only the mean value $\bar{L}$, is associated with the essence of the
quantum-mechanical description. If the subsystem were closed, the coefficients $c_n$
would not depend on time and $|c_n|^2$ would give the probability of finding the
closed subsystem in the $n$th state. The coefficients $c_n$ for an open subsystem
do not possess such a property.
We now pass to the statistical mean, writing

$$\langle L \rangle = \sum_{n,m} \overline{c_n^{*} c_m} \, L_{nm} . \tag{1.5}$$

The average over time is carried out for a time large in comparison with
microscopic times.
We now introduce a matrix determined by the set of elements $\overline{c_n^{*} c_m}$. The
elements of this matrix are

$$\rho_{mn} = \overline{c_n^{*} c_m} . \tag{1.6}$$

Then (1.5) can be written in the form

$$\langle L \rangle = \sum_{n,m} \rho_{mn} L_{nm} . \tag{1.7}$$

Further, we introduce an operator $\hat{\rho}$ whose matrix elements are $\rho_{mn}$. The
matrix $\rho_{mn}$ is called the statistical matrix or density matrix, and the operator
$\hat{\rho}$ the statistical operator. It is obvious that the last formula can then be
written in the form

$$\langle L \rangle = \sum_m \Big( \sum_n \rho_{mn} L_{nm} \Big) = \sum_m (\hat{\rho} \hat{L})_{mm} = \mathrm{Tr} \, \hat{\rho} \hat{L} . \tag{1.8}$$



We see that, in order to find the statistical mean of an operator L, it is 
necessary to know the density matrix (or the statistical operator) replacing 
the distribution function of classical statistics. 
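
As a minimal numerical sketch of formulas (1.5) to (1.8), the following Python fragment builds a density matrix from a set of sampled coefficient pairs and checks that the trace formula reproduces the term-by-term average. The two-state system, the random samples standing in for the time dependence of the coefficients, and the matrix `L` are all hypothetical choices made only for illustration.

```python
import cmath
import math
import random

def density_matrix(samples):
    """rho[m][n] = average of conj(c_n) * c_m over the samples, cf. (1.6)."""
    dim = len(samples[0])
    rho = [[0j for _ in range(dim)] for _ in range(dim)]
    for c in samples:
        for m in range(dim):
            for n in range(dim):
                rho[m][n] += c[n].conjugate() * c[m] / len(samples)
    return rho

def trace_of_product(a, b):
    """Tr(a b) = sum over m, n of a[m][n] * b[n][m], cf. (1.7) and (1.8)."""
    dim = len(a)
    return sum(a[m][n] * b[n][m] for m in range(dim) for n in range(dim))

def mean_term_by_term(samples, L):
    """Average of sum_{n,m} conj(c_n) c_m L_nm over the samples, cf. (1.5)."""
    dim = len(L)
    total = 0j
    for c in samples:
        total += sum(c[n].conjugate() * c[m] * L[n][m]
                     for n in range(dim) for m in range(dim))
    return total / len(samples)

# Hypothetical data: normalized coefficient pairs standing in for c_n(t).
random.seed(0)
samples = []
for _ in range(200):
    theta = random.uniform(0.0, math.pi)
    phi = random.uniform(0.0, 2.0 * math.pi)
    samples.append([complex(math.cos(theta / 2)),
                    cmath.exp(1j * phi) * math.sin(theta / 2)])

# An arbitrary Hermitian matrix of elements L_nm (illustration only).
L = [[1.0, 0.5 - 0.2j], [0.5 + 0.2j, -1.0]]

rho = density_matrix(samples)
mean_from_rho = trace_of_product(rho, L)
mean_direct = mean_term_by_term(samples, L)
trace_rho = rho[0][0] + rho[1][1]
```

Once `rho` is known, the individual samples are no longer needed: any mean value follows from `trace_of_product(rho, L)` alone.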

In the statistical mean a further step is made in comparison with the
quantum-mechanical mean: the interaction with the medium, which may have
a very complex character, is taken into account. Taking account of this
interaction accurately is replaced by the averaging over time of the whole set of
coefficients $c_n^{*} c_m$. It is clear that the statistical description (definition
of values $\langle L \rangle$) is incomplete as compared with the quantum-mechanical
description (i.e. definition of values $\bar{L}$).

In a description by means of a wave function it is possible, for example, to 
indicate possible accurate values of various quantities characterizing the sys- 
tem as a whole, even their unique possible values. In describing a system by 
means of the statistical matrix such predictions are impossible and we are 
forced to confine ourselves to the calculation of statistical means. In forming 
a statistical mean the particular features of the quantum-mechanical descrip- 
tion are taken into account automatically. The situation here is exactly the 
same as in classical mechanics and statistics. In passing to a system with a very 
large number of particles, the dynamical description is replaced by a statis- 
tical one. It is just this latter which is adequate to describe the physical prop- 
erties of the system. 

We would like to warn the reader against the incorrect understanding of
the statistical mean in quantum mechanics as a simple sequence of two averages:
the quantum-mechanical one and the statistical one. To carry out
quantum-mechanical averaging, according to the general formula, it is necessary
to define the wave function. An open subsystem is in a mixed state and
possesses no wave function. The averaging in formula (1.1) is carried out over
the existing but unknown wave function of the closed system (subsystem +
medium). It is clear that finding the wave function $\Psi$ represents a problem of
fantastic complexity.

The significance of formula (1.7) lies in the fact that to find statistical
means it is not necessary either to find the wave function $\Psi$ of the closed
system or to describe in detail the behaviour of the quasi-closed subsystem.

To find the mean $\langle L \rangle$ it is necessary to know only the density matrix $\rho_{mn}$
(or the statistical operator equivalent to it) and the matrix elements $L_{nm}$. The
latter are calculated by means of the wave functions of the closed subsystem
and are determined only by the properties of this subsystem.

The situation turns out to be very similar to that in classical statistics. In 
order to calculate means it is necessary to define the distribution functions 
and the multiplicity of degeneracy of the states of the subsystem without 
taking into account its interaction with the medium. 

It should be stressed that so far we have by no means specified the proper- 
ties of the quasi-closed subsystem. It can contain a large or a small number of 
particles. Only the fact that the subsystem is in a mixed state is essential. For 
any system in a mixed state the density matrix plays the role of a wave func- 
tion for a system in a pure state. 


§2. The statistical distribution in quantum statistics 


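Let us find the dependence of the statistical operator on time. For this we
consider the mean value $\langle \dot{L} \rangle$ of the operator $\dot{\hat{L}}$, assuming that the operator $\hat{L}$
does not depend explicitly on time. It follows from (1.8) that

$$\langle \dot{L} \rangle = \mathrm{Tr} \left( \frac{\partial \hat{\rho}}{\partial t} \hat{L} \right) . \tag{2.1}$$

On the other hand, according to (31.2) of Part V*, we can write

$$\dot{\hat{L}} = \frac{i}{\hbar} [\hat{H}, \hat{L}] , \tag{2.2}$$

where $\hat{H}$ is the Hamiltonian of the closed subsystem to which the operator $\hat{L}$
refers.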

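Substituting (2.2) into (1.8), we obtain

$$
\begin{aligned}
\langle \dot{L} \rangle &= \frac{i}{\hbar} \mathrm{Tr} \, \hat{\rho} [\hat{H}, \hat{L}] = \frac{i}{\hbar} \{ \mathrm{Tr} (\hat{\rho} \hat{H} \hat{L}) - \mathrm{Tr} (\hat{\rho} \hat{L} \hat{H}) \} \\
&= \frac{i}{\hbar} \{ \mathrm{Tr} (\hat{\rho} \hat{H} \hat{L}) - \mathrm{Tr} (\hat{H} \hat{\rho} \hat{L}) \} = \frac{i}{\hbar} \mathrm{Tr} ( [\hat{\rho}, \hat{H}] \hat{L} ) .
\end{aligned}
$$

For the two expressions for $\langle \dot{L} \rangle$ to be the same, it is necessary to have the
relation

$$\frac{\partial \hat{\rho}}{\partial t} = \frac{i}{\hbar} [\hat{\rho}, \hat{H}] = \frac{i}{\hbar} (\hat{\rho} \hat{H} - \hat{H} \hat{\rho}) . \tag{2.3}$$

The derivative with respect to time of the statistical operator differs in sign
from that of ordinary operators.
Formula (2.3) can be written in matrix form as follows:

$$\frac{\partial \rho_{mn}}{\partial t} = \frac{i}{\hbar} \sum_k (\rho_{mk} H_{kn} - H_{mk} \rho_{kn}) . \tag{2.4}$$
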

Formula (2.3) (or the equivalent formula (2.4)) determines the development
in time of the system described by the density matrix. It expresses one
of the most general laws of nature. However, the explicit form of the operator
$\hat{\rho}$ (or of the matrix $\rho_{mn}$) is unknown and cannot be determined only from
eq. (2.3) without using additional information about the properties of the
subsystem.
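
The sign convention in the equation of motion for the density matrix can be checked numerically. The Python sketch below is an illustration under stated assumptions, not part of the original derivation: it takes a two-level closed subsystem in the energy representation, assumes units with $\hbar = 1$, and assumes that the time dependence of the stationary states is carried by the coefficients, $c_n(t) = c_n(0) \exp(-i \varepsilon_n t / \hbar)$, so that the products $c_n^{*} c_m$ acquire the factor $\exp[i(\varepsilon_n - \varepsilon_m) t / \hbar]$. A central finite difference of the density matrix is then compared with $(i/\hbar)(\hat{\rho}\hat{H} - \hat{H}\hat{\rho})$. The energies and the initial matrix are hypothetical numbers.

```python
import cmath

HBAR = 1.0  # assumption: units in which hbar = 1

def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    dim = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(dim)) for j in range(dim)]
            for i in range(dim)]

def commutator_rhs(rho, H):
    """(i / hbar) (rho H - H rho), the right-hand side of the equation of motion."""
    rho_h = matmul(rho, H)
    h_rho = matmul(H, rho)
    dim = len(rho)
    return [[1j / HBAR * (rho_h[m][n] - h_rho[m][n]) for n in range(dim)]
            for m in range(dim)]

def rho_at(t, rho0, energies):
    """rho_mn(t) = rho_mn(0) exp[i (eps_n - eps_m) t / hbar] in the energy representation."""
    dim = len(rho0)
    return [[rho0[m][n] * cmath.exp(1j * (energies[n] - energies[m]) * t / HBAR)
             for n in range(dim)] for m in range(dim)]

# Hypothetical two-level example.
energies = [0.3, 1.8]
H = [[energies[0], 0.0], [0.0, energies[1]]]
rho0 = [[0.7, 0.2 + 0.1j], [0.2 - 0.1j, 0.3]]

t, step = 0.4, 1e-4
later = rho_at(t + step, rho0, energies)
earlier = rho_at(t - step, rho0, energies)
now = rho_at(t, rho0, energies)

# Central finite difference for d(rho)/dt compared with the commutator expression.
numerical = [[(later[m][n] - earlier[m][n]) / (2 * step) for n in range(2)]
             for m in range(2)]
predicted = commutator_rhs(now, H)
max_error = max(abs(numerical[m][n] - predicted[m][n])
                for m in range(2) for n in range(2))
```

With the opposite sign in `commutator_rhs` the off-diagonal elements would disagree at first order, which is the difference in sign from ordinary operators noted in the text.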

For the present, we shall confine ourselves to the case of stationary states,
where the density matrix does not change in time. Later on, in kinetics, we
shall discuss the laws of changes of state in time.

* Volume 1 consists of Parts I and II, Volume 2 of Parts III and IV and Volume 3 of
Part V.

Under stationary conditions, (2.4) gives

$$[\hat{\rho}, \hat{H}] = 0$$

or

$$\sum_k (\rho_{mk} H_{kn} - H_{mk} \rho_{kn}) = 0 . \tag{2.5}$$

We see that the statistical operator commutes with the Hamiltonian and is a
constant of the motion.

We now go over to the energy representation, where the basis functions $\psi_n$
in expansion (1.2) are the eigenfunctions of the Hamiltonian of the isolated
subsystem, $\hat{H} \psi_n = \varepsilon_n \psi_n$. In this case, only the diagonal matrix elements
$H_{kn} = \varepsilon_n \delta_{kn}$ are different from zero.

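It follows from formula (2.5) that in this case

$$\sum_k (\rho_{mk} \delta_{kn} \varepsilon_n - \rho_{kn} \delta_{km} \varepsilon_m) = 0 , \tag{2.6}$$

that is, $\rho_{mn} (\varepsilon_n - \varepsilon_m) = 0$.
This relation shows that only the diagonal elements of the density matrix can
differ from zero. If one returns to the definition (1.6), then this means that

$$\overline{c_n^{*} c_m} = \delta_{nm} \overline{|c_n|^2} . \tag{2.7}$$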

This equality emphasizes once more the particular features of a description 
by means of the density matrix. The existing interaction always brings the 
subsystem into a mixed state and disturbs the interference between states 
which is characteristic of systems in a pure state. 

Relation (2.7) is often called the random-phase condition. Indeed, if the
coefficients $c_n$, $c_m$ depend on time, then the phases of individual coefficients
are completely uncorrelated. Their averaging gives (2.7).
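
The random-phase condition can be illustrated by a small Monte Carlo sketch in Python. It assumes, purely for illustration, that each coefficient has a fixed modulus and an independent phase distributed uniformly over $(0, 2\pi)$, and that an average over many such samples stands in for the time average. The moduli, the sample count and the seed are hypothetical.

```python
import cmath
import math
import random

def averaged_products(amplitudes, n_samples, seed=0):
    """Average conj(c_n) * c_m over samples with independent random phases.

    Returns a matrix rho with rho[m][n] equal to the sample average.
    """
    rng = random.Random(seed)
    dim = len(amplitudes)
    rho = [[0j for _ in range(dim)] for _ in range(dim)]
    for _ in range(n_samples):
        # Each coefficient keeps its modulus and receives its own random phase.
        c = [a * cmath.exp(1j * rng.uniform(0.0, 2.0 * math.pi))
             for a in amplitudes]
        for m in range(dim):
            for n in range(dim):
                rho[m][n] += c[n].conjugate() * c[m] / n_samples
    return rho

amplitudes = [0.6, 0.8]   # hypothetical moduli, with 0.36 + 0.64 = 1
rho = averaged_products(amplitudes, 20000)

diagonal = [rho[0][0].real, rho[1][1].real]
off_diagonal = abs(rho[0][1])
```

The diagonal elements keep the values $|c_n|^2$, while the off-diagonal element shrinks roughly as the inverse square root of the number of samples.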

It is convenient to denote the non-zero diagonal elements of the density
matrix by

$$\rho_{nn} = \overline{|c_n|^2} \equiv w_n . \tag{2.8}$$

Then the formula for the statistical mean assumes the familiar form

$$\langle L \rangle = \sum_n w_n L_{nn} . \tag{2.9}$$

Formula (2.9) shows that the quantities $w_n$ represent the probabilities for the
subsystem to be found in the $n$th energy state.


Our next problem is to find the probability distribution $w_n$.

Let us consider, first of all, the case of a closed system. In quantum me- 
chanics there are no closed systems in the literal sense of the word. Every real 
system consisting of atoms undergoes an interaction with its environment. 
This can be, for example, an interaction with an electromagnetic field. More- 
over, in §34 of Part V we have seen that the total energy of a closed system 
cannot have a definite value constant in time. Hence the states of a closed 
system containing a large number of particles can be considered to be mixed. 
The existing interactions, which do not change the energy of the system sub- 
stantially, lead to violation of the interference between the states and to a 
random distribution of phases. The energy of a closed system can be considered
to lie in an interval $\delta\varepsilon \ll \varepsilon$ but to vary constantly within this interval.

The hypothesis that a macroscopic system is in a mixed state is the basis of
statistical physics. However, to find the explicit form of $w_n$ it is necessary to
put forward another hypothesis. Namely, it should be assumed that all the
states of a closed system are equally probable. Hence it follows that $w_n$ can
be written in the form of the microcanonical distribution

$$w_n = \frac{1}{\Omega(\varepsilon)} . \tag{2.10}$$


We shall not reproduce here the reasoning of §15 and 16 of Part III, where 
the hypothesis of equal probabilities of states of a closed system was dis- 
cussed in adequate detail. The derivation of the canonical distribution from 
the microcanonical one, given in §16 of Part III, is of general character. 

In this derivation it was assumed only that the quasi-closed subsystem is a 
small part of the closed system described by the microcanonical distribution. 
The discrete character of the energy conditions has already been taken into 
account in the expression for the canonical distribution given in Part III. 

In quantum statistics it is more convenient to write the normalized canonical
distribution in the form

$$w_n = \frac{\exp(-\varepsilon_n / kT)}{\sum_n \exp(-\varepsilon_n / kT)} = \frac{\exp(-\varepsilon_n / kT)}{Z} , \tag{2.11}$$

where the summation is carried out over all the states. The partition function
(statistical sum) is of the form

$$Z = \sum_n \exp(-\varepsilon_n / kT) , \tag{2.12}$$

so that $\sum_n w_n = \sum_n \rho_{nn} = 1$.


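Each degenerate state brings a number of terms equal to the multiplicity of
the degeneracy into $Z$. For the statistical operator corresponding to the
probability distribution (2.11) we obtain the expression

$$\hat{\rho} = Z^{-1} e^{-\hat{H}/kT} . \tag{2.13}$$

Indeed, if $\psi_n$ is an eigenfunction of the operator $\hat{H}$, one can write the equality

$$
\begin{aligned}
\hat{\rho} \psi_n &= Z^{-1} e^{-\hat{H}/kT} \psi_n = Z^{-1} \left( 1 - \frac{\hat{H}}{kT} + \frac{\hat{H}^2}{2(kT)^2} - \dots \right) \psi_n \\
&= Z^{-1} \left( 1 - \frac{\varepsilon_n}{kT} + \frac{\varepsilon_n^2}{2(kT)^2} - \dots \right) \psi_n \\
&= Z^{-1} \exp(-\varepsilon_n / kT) \, \psi_n = w_n \psi_n .
\end{aligned} \tag{2.14}
$$

The mean values and the partition function (statistical sum) can be written
by means of the operator $\hat{\rho}$ as

$$\langle L \rangle = \mathrm{Tr} \, \hat{\rho} \hat{L} = Z^{-1} \, \mathrm{Tr} ( e^{-\hat{H}/kT} \hat{L} ) \tag{2.15}$$

and

$$Z = \mathrm{Tr} \, e^{-\hat{H}/kT} . \tag{2.16}$$

In calculating the trace, the summation is carried out over all the states of the
subsystem.
We note that the operator form of notation of (2.13)-(2.15) does not
depend on the choice of representation.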
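
The independence of the choice of representation can be illustrated with a short Python sketch. It is only an illustration under assumptions: the $2 \times 2$ real symmetric matrix `H` is a hypothetical Hamiltonian written in a basis in which it is not diagonal, and the exponential is summed as a truncated power series of the same kind as the one used in the equality for $\hat{\rho} \psi_n$. The trace of the series is compared with the sum over energy levels, and the mean energy is computed both ways.

```python
import math

def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    dim = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(dim)) for j in range(dim)]
            for i in range(dim)]

def exp_series(matrix, terms=60):
    """exp(matrix) from the power series 1 + A + A^2/2! + ... (truncated)."""
    dim = len(matrix)
    result = [[1.0 if i == j else 0.0 for j in range(dim)] for i in range(dim)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        # After this step, term holds A^k / k!.
        term = [[x / k for x in row] for row in matmul(term, matrix)]
        result = [[result[i][j] + term[i][j] for j in range(dim)]
                  for i in range(dim)]
    return result

kT = 1.0
H = [[0.5, 0.3], [0.3, 1.2]]   # hypothetical Hamiltonian, not diagonal in this basis

# Statistical operator built directly in the non-diagonal representation.
boltzmann = exp_series([[-h / kT for h in row] for row in H])
Z_trace = boltzmann[0][0] + boltzmann[1][1]
rho = [[x / Z_trace for x in row] for row in boltzmann]

# The same quantities from the energy levels of the 2x2 symmetric matrix.
centre = (H[0][0] + H[1][1]) / 2.0
half_gap = math.sqrt(((H[0][0] - H[1][1]) / 2.0) ** 2 + H[0][1] ** 2)
levels = [centre - half_gap, centre + half_gap]
Z_levels = sum(math.exp(-e / kT) for e in levels)
w = [math.exp(-e / kT) / Z_levels for e in levels]

energy_from_levels = sum(wn * e for wn, e in zip(w, levels))
rho_H = matmul(rho, H)
energy_from_trace = rho_H[0][0] + rho_H[1][1]
```

Both routes give the same partition function and the same mean energy, although `rho` itself has non-zero off-diagonal elements in the chosen basis.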
In the same way, without repeating the derivation of §59 of Part III, one
can consider systems with a variable number of particles.
The statistical operator in the grand canonical distribution is of the form

$$\hat{\rho} = \frac{e^{(\mu \hat{n} - \hat{H})/kT}}{\mathrm{Tr} \, e^{(\mu \hat{n} - \hat{H})/kT}} , \tag{2.17}$$

where $\hat{n}$ is the operator of the number of particles, and $\mu$ is the chemical
potential.
The grand partition function is

$$\mathcal{Z} = \mathrm{Tr} \, e^{(\mu \hat{n} - \hat{H})/kT} \tag{2.18}$$

or in matrix form

$$w_{n,i} = \mathcal{Z}^{-1} \exp[(\mu n - \varepsilon_{n,i})/kT] , \tag{2.19}$$

$$\mathcal{Z} = \sum_{n,i} \exp[(\mu n - \varepsilon_{n,i})/kT] = \sum_n z^n Z_n , \tag{2.20}$$

where $n$ are the occupation numbers, $z$ is the activity (see §59 of Part III),
and $Z_n$ is the partition function (2.12) for a subsystem made up of $n$ particles,
i.e.

$$z = e^{\mu/kT} , \qquad Z_n = \sum_i \exp(-\varepsilon_{n,i}/kT) .$$



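By means of the grand canonical distribution one can find mean values
according to formula (1.8). In particular, the mean value of the number of
particles in a subsystem is

$$\langle N \rangle = \mathrm{Tr} \, \hat{\rho} \hat{n} = \frac{\mathrm{Tr} ( e^{(\mu \hat{n} - \hat{H})/kT} \hat{n} )}{\mathrm{Tr} \, e^{(\mu \hat{n} - \hat{H})/kT}} . \tag{2.21}$$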
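
A toy calculation may help to show how the sum over particle numbers and the mean number of particles are evaluated in practice. The Python sketch below is an assumption-laden illustration, not a result taken from the text: it considers a hypothetical subsystem with three single-particle levels, each of which can hold at most one particle, so that a state is an occupation pattern and its energy is the sum of the occupied level energies. The level energies, the chemical potential and the temperature are arbitrary numbers.

```python
import math
from itertools import product

def grand_canonical(levels, mu, kT):
    """Grand partition function and mean particle number of the toy system.

    States are occupation patterns with 0 or 1 particle per level (assumption).
    Z_n[n] collects exp(-energy / kT) over all states with n particles.
    """
    z = math.exp(mu / kT)          # the activity
    Z_n = {}
    for occupation in product((0, 1), repeat=len(levels)):
        n = sum(occupation)
        energy = sum(o * e for o, e in zip(occupation, levels))
        Z_n[n] = Z_n.get(n, 0.0) + math.exp(-energy / kT)
    grand_Z = sum(z ** n * Zn for n, Zn in Z_n.items())
    mean_N = sum(n * z ** n * Zn for n, Zn in Z_n.items()) / grand_Z
    return grand_Z, mean_N, Z_n

levels = [0.0, 0.7, 1.5]   # hypothetical single-particle energies
mu, kT = 0.4, 0.5          # hypothetical chemical potential and temperature

grand_Z, mean_N, Z_n = grand_canonical(levels, mu, kT)

# For independent levels the sum factorizes into a product over levels,
# and the mean occupation of each level can be written in closed form.
product_form = 1.0
for e in levels:
    product_form *= 1.0 + math.exp((mu - e) / kT)
occupation_sum = sum(1.0 / (math.exp((e - mu) / kT) + 1.0) for e in levels)
```

The agreement of `grand_Z` with `product_form` is specific to this non-interacting toy model; the sum over $n$ of $z^n Z_n$ itself does not rely on it.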





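According to §59 of Part III, the equation of state has the form

$$pV = kT \ln \mathcal{Z} , \tag{2.22}$$

where $\mathcal{Z}$ is the grand partition function (2.18).
In what follows we shall restrict ourselves to the case of an ideal gas confined
in a container of volume $V$.
In the ideal gas the operator $\hat{H}$ is just the kinetic energy operator

$$\hat{H}_0 = \sum_a \frac{\hat{\mathbf{p}}_a^2}{2m} . \tag{2.23}$$

Correspondingly, the normalized wave functions of free particles are of the
form

$$\psi_{\mathbf{p}}(\mathbf{r}) = V^{-1/2} e^{i \mathbf{p} \cdot \mathbf{r} / \hbar} . \tag{2.24}$$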


In a large volume the momenta of the particles can be considered to vary
practically continuously.

The total symmetrized gas wave function (see §65 of Part V) is given by
the general formula

$$\Psi = \frac{1}{\sqrt{N!}} \sum_P (\pm 1)^P P \, \psi_{\mathbf{p}_1}(\mathbf{r}_1) \psi_{\mathbf{p}_2}(\mathbf{r}_2) \cdots \psi_{\mathbf{p}_N}(\mathbf{r}_N) . \tag{2.25}$$

The summation in (2.25) is carried out over all the permutations, the plus sign
referring to bosons and the minus sign to fermions.
It is obvious that $\Psi$ is an eigenfunction of the operator $\hat{H}_0$, i.e.

$$\hat{H}_0 \Psi = E \Psi , \tag{2.26}$$

where $E$ is the energy of the entire gas

$$E = \sum_a \frac{\mathbf{p}_a^2}{2m} . \tag{2.27}$$








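According to (2.16) and (2.23), the partition function assumes the form

$$Z = \mathrm{Tr} \, e^{-\hat{H}_0/kT} = \sum \int \Psi^{*} e^{-\hat{H}_0/kT} \Psi \, d\tau . \tag{2.28}$$

Since $\Psi$ is an eigenfunction of the operator $\hat{H}_0$, one can write, analogously to
(2.14),

$$e^{-\hat{H}_0/kT} \Psi = e^{-E/kT} \Psi ,$$

so that

$$Z = \sum e^{-E/kT} \int |\Psi|^2 \, d\tau . \tag{2.29}$$

The summation is carried out over all the states, i.e. over all possible values of
the momenta of the gas particles. Since the momenta vary practically
continuously, the summation can be replaced by integration, writing

$$
\begin{aligned}
Z &= \frac{V^N}{N! (2\pi\hbar)^{3N}} \int e^{-E/kT} |\Psi|^2 \, d\mathbf{p}_1 \cdots d\mathbf{p}_N \, d\tau \\
&= \frac{V^N}{N! (2\pi\hbar)^{3N}} \int \exp\Big( -\sum_{a=1}^{N} \mathbf{p}_a^2 / 2mkT \Big) |\Psi|^2 \, d\mathbf{p}_1 \cdots d\mathbf{p}_N \, d\tau ,
\end{aligned} \tag{2.30}
$$

where $d\tau = d\mathbf{r}_1 \cdots d\mathbf{r}_N$.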
1 


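To calculate the integral the simplest procedure is to write $\Psi$ for a system
of two particles. The generalization to $N$ particles can then easily be made.
Thus we calculate the auxiliary integral

$$
\begin{aligned}
I_2 &= \frac{1}{2} \int \exp\left( -\frac{p_1^2 + p_2^2}{2mkT} \right) \left| \psi_{\mathbf{p}_1}(\mathbf{r}_1) \psi_{\mathbf{p}_2}(\mathbf{r}_2) \pm \psi_{\mathbf{p}_1}(\mathbf{r}_2) \psi_{\mathbf{p}_2}(\mathbf{r}_1) \right|^2 d\mathbf{p}_1 \, d\mathbf{p}_2 \\
&= \frac{1}{2} V^{-2} \int \exp\left( -\frac{p_1^2 + p_2^2}{2mkT} \right) \left| e^{i(\mathbf{p}_1 \cdot \mathbf{r}_1 + \mathbf{p}_2 \cdot \mathbf{r}_2)/\hbar} \pm e^{i(\mathbf{p}_1 \cdot \mathbf{r}_2 + \mathbf{p}_2 \cdot \mathbf{r}_1)/\hbar} \right|^2 d\mathbf{p}_1 \, d\mathbf{p}_2 \\
&= \frac{1}{2} V^{-2} \int \exp\left( -\frac{p_1^2 + p_2^2}{2mkT} \right) \Big[ 2 \pm e^{i \mathbf{p}_1 \cdot (\mathbf{r}_1 - \mathbf{r}_2)/\hbar} e^{-i \mathbf{p}_2 \cdot (\mathbf{r}_1 - \mathbf{r}_2)/\hbar} \pm e^{-i \mathbf{p}_1 \cdot (\mathbf{r}_1 - \mathbf{r}_2)/\hbar} e^{i \mathbf{p}_2 \cdot (\mathbf{r}_1 - \mathbf{r}_2)/\hbar} \Big] d\mathbf{p}_1 \, d\mathbf{p}_2 \\
&= V^{-2} \Big[ (2\pi mkT)^3 \pm \frac{1}{2} \int e^{-p_1^2/2mkT + i \mathbf{p}_1 \cdot (\mathbf{r}_1 - \mathbf{r}_2)/\hbar} d\mathbf{p}_1 \int e^{-p_2^2/2mkT - i \mathbf{p}_2 \cdot (\mathbf{r}_1 - \mathbf{r}_2)/\hbar} d\mathbf{p}_2 \\
&\qquad\qquad \pm \frac{1}{2} \int e^{-p_1^2/2mkT - i \mathbf{p}_1 \cdot (\mathbf{r}_1 - \mathbf{r}_2)/\hbar} d\mathbf{p}_1 \int e^{-p_2^2/2mkT + i \mathbf{p}_2 \cdot (\mathbf{r}_1 - \mathbf{r}_2)/\hbar} d\mathbf{p}_2 \Big] \\
&= V^{-2} (2\pi mkT)^3 \left[ 1 \pm \exp\left( -\frac{mkT \, r_{12}^2}{\hbar^2} \right) \right] \\
&= V^{-2} (2\pi mkT)^3 \left[ 1 \pm \exp\left( -\frac{2\pi r_{12}^2}{\lambda_T^2} \right) \right]
\end{aligned} \tag{2.31}
$$

with

$$\lambda_T = \left( \frac{2\pi\hbar^2}{mkT} \right)^{1/2} , \tag{2.32}$$

where $\lambda_T$ is the wavelength of a particle with energy of the order of $kT$
(‘thermal wavelength’ of a particle of mass $m$) and $r_{12} = |\mathbf{r}_1 - \mathbf{r}_2|$.
If we denote by $f_{ik}$ the expression

$$f_{ik} = \exp\left( -\frac{\pi r_{ik}^2}{\lambda_T^2} \right) , \tag{2.33}$$

then (2.31) can be written in the form

$$I_2 = V^{-2} (2\pi mkT)^3 [1 \pm f_{12} f_{21}] . \tag{2.34}$$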
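
The Gaussian integrals behind the exchange term can be checked numerically. The Python sketch below is an illustration under assumptions: it uses units with $m = kT = \hbar = 1$, takes the vector $\mathbf{r}_1 - \mathbf{r}_2$ along one coordinate axis so that the three-dimensional momentum integral factorizes, and evaluates the remaining one-dimensional integral by the trapezoidal rule on a finite interval. The separation `r` is an arbitrary number. Only the real part (the cosine) of the oscillating factor contributes, since the sine part is odd in the momentum.

```python
import math

M = KT = HBAR = 1.0   # assumption: units with m = kT = hbar = 1

def thermal_wavelength():
    """lambda_T = (2 pi hbar^2 / (m kT))^(1/2)."""
    return math.sqrt(2.0 * math.pi * HBAR ** 2 / (M * KT))

def gaussian_cosine_integral(x, p_max=12.0, n_steps=4000):
    """Trapezoidal estimate of the integral of exp(-p^2/2mkT) cos(p x / hbar) dp."""
    h = 2.0 * p_max / n_steps
    total = 0.0
    for j in range(n_steps + 1):
        p = -p_max + j * h
        weight = 0.5 if j in (0, n_steps) else 1.0
        total += weight * math.exp(-p * p / (2.0 * M * KT)) * math.cos(p * x / HBAR)
    return total * h

def f(r):
    """f_ik = exp(-pi r^2 / lambda_T^2)."""
    return math.exp(-math.pi * r * r / thermal_wavelength() ** 2)

r = 0.9   # hypothetical separation |r_1 - r_2| in these units

# Ratio of the oscillating integral to the plain Gaussian one, for one particle.
# The two transverse momentum components cancel in this ratio.
ratio = gaussian_cosine_integral(r) / gaussian_cosine_integral(0.0)

# The exchange term contains one such factor for p_1 and one for p_2.
exchange_numeric = ratio ** 2
exchange_formula = f(r) * f(r)   # f_12 * f_21
```

The product of the two single-particle factors reproduces $f_{12} f_{21}$, that is, the exponential in the bracket of the two-particle integral.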


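By analogy with (2.34), using (2.33) and substituting the value of $\Psi$ from
(2.25) into (2.30), we can easily write the general equation for the partition
function

$$Z = \frac{1}{N!} \left[ \frac{(2\pi mkT)^{3/2}}{(2\pi\hbar)^3} \right]^N \int \Big[ 1 \pm \sum_{i<k} f_{ik}^2 + \dots \Big]\, d\Gamma_r = \frac{1}{N!\,(2\pi\hbar)^{3N}} \int e^{-E/kT}\, d\Gamma_p \int \Big[ 1 \pm \sum_{i<k} f_{ik}^2 + \dots \Big]\, d\Gamma_r . \tag{2.35}$$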


The integration is carried out with respect to the coordinates of all the par- 
ticles. 
We see, first of all, that the expression for the partition function of an
ideal gas is not the same as the quasi-classical expression

$$Z_{\text{qu.cl.}} = \frac{(2\pi mkT)^{3N/2}\, V^N}{(2\pi\hbar)^{3N}\, N!} ,$$

which we used in Part III, §37.
If, however, the gas density is so small that the following inequality holds

$$\lambda_T \ll \bar r , \tag{2.36}$$

where $\bar r$ is the mean distance between the particles, then it is seen from (2.33)
that the quantities $f_{ik}$ turn out to be very small in comparison with unity.
Neglecting the quantities $f_{ik}$ in (2.35), we arrive at the equality

$$Z = Z_{\text{qu.cl.}} . \tag{2.37}$$













It is easily seen that for atomic or molecular gases the condition (2.36) is 
much milder than that of neglecting the interaction between the particles. 

Thus for not too dense atomic gases quantum corrections can always be 
disregarded. 

In the case of dense gases these corrections are small in comparison with 
corrections for the fact that the gas is imperfect. Nevertheless, the character 
of quantum corrections to the quasi-classical expression for the partition 
function is of great theoretical interest. 

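Assuming the quantities $f_{ik}$ to be small, we can, obviously, write

$$1 \pm \sum_{i<k} f_{ik}^2 \approx \prod_{i<k} \left( 1 \pm f_{ik}^2 \right) = \exp\Big( -\sum_{i<k} U(r_{ik}, T)/kT \Big) . \tag{2.38}$$

$U(r_{ik}, T)$ is by definition equal to

$$U(r_{ik}, T) = -kT \ln\left( 1 \pm f_{ik}^2 \right) = -kT \ln\left[ 1 \pm \exp\left( -2\pi |\mathbf{r}_i - \mathbf{r}_k|^2 / \lambda_T^2 \right) \right] . \tag{2.39}$$

Substituting expression (2.38) into the partition function, we find

$$Z = \left( N!\,(2\pi\hbar)^{3N} \right)^{-1} \int e^{-E/kT}\, d\Gamma_p \int \exp\Big[ -\sum_{i<k} U(r_{ik}, T)/kT \Big]\, d\Gamma_r . \tag{2.40}$$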


We see that in our approximation, the quantum correction amounts to the
appearance of an effective interaction between the particles. This interaction
is characterized by the potential energy $U(r_{ik}, T)$, which depends on the distance
between the particles $i$ and $k$ of each pair, as well as on the temperature. This last
fact shows at once that the function $U(r_{ik}, T)$ can only be considered as an
effective, not the true, potential energy.

Further, it should be stressed that the effective interaction is a purely
quantum effect which disappears as $\hbar \to 0$.

We have discussed the existence of the effective interaction in §67 of
Part V. This effective interaction is not associated with the appearance of new
forces, but is due only to the symmetry of the wave function. In the case of
bosons, $U(r_{ik}, T)$ is negative for all values of $r_{ik}$, i.e. the effective interaction
has the character of an attraction. For fermions the effective interaction represents
a repulsion, in accordance with what was said in §67 of Part V.
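The sign of the effective interaction can be checked directly from (2.39). The following is a minimal sketch, working in units of $kT$ and with distances measured in units of $\lambda_T$; the function name and the sample distances are illustrative assumptions.

```python
import math

def effective_potential(r_over_lambda, boson=True):
    """U(r, T)/kT from (2.39) as a function of r/lambda_T (r > 0)."""
    f_squared = math.exp(-2.0 * math.pi * r_over_lambda**2)
    sign = 1.0 if boson else -1.0
    return -math.log(1.0 + sign * f_squared)

distances = (0.2, 0.5, 1.0, 2.0)  # sample values of r/lambda_T
u_bose = [effective_potential(x, boson=True) for x in distances]
u_fermi = [effective_potential(x, boson=False) for x in distances]
```

The boson values are all negative (attraction) and the fermion values all positive (repulsion); both die away rapidly once $r$ exceeds $\lambda_T$.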

We now turn to the discussion of condition (2.36). 

If we substitute into it the value of the thermal wavelength $\lambda_T$, then we
can rewrite it in the form

$$\frac{2\pi\hbar}{(2\pi mkT)^{1/2}} \ll \Big( \frac{V}{N} \Big)^{1/3} . \tag{2.36'}$$



Comparing this with formula (72.16) of Part III, we see that condition (2.36') 
represents none other than the condition for an ideal gas to be considered 
non-degenerate. Thus the criterion of degeneracy can be given an obvious 
meaning: an ideal gas can be considered non-degenerate if the thermal wave- 
length of the particles is small in comparison with the mean distance between 
them. Otherwise there is strong degeneracy. In this case the quantities $f_{ik}$ turn
out to be of the order of magnitude of unity, and the quantum effects no
longer have the character of corrections but assume fundamental significance. 
In particular, the effective quantum interaction between particles determines 
to a considerable degree the properties of the system as a whole. We have seen 
this in Part III in the example of a degenerate Fermi gas. To avoid misunder- 
standing, we stress that in this case one can no longer make use of formula 
(2.39), since it was obtained on the basis of inequality (2.36). 
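To give the criterion (2.36') a numerical feel, the sketch below compares $\lambda_T$ with the mean interparticle distance $(V/N)^{1/3}$ for two cases. The inputs are assumptions chosen for illustration: helium gas at 300 K and atmospheric pressure (density from the ideal-gas law) and conduction electrons at a typical metallic density of about $8.5\times10^{28}\ \mathrm{m^{-3}}$.

```python
import math

HBAR = 1.054571817e-34  # J*s
K_B = 1.380649e-23      # J/K

def degeneracy_parameter(mass, temperature, number_density):
    """Ratio lambda_T / (V/N)**(1/3); values much less than 1 mean no degeneracy."""
    lam = math.sqrt(2.0 * math.pi * HBAR**2 / (mass * K_B * temperature))
    return lam * number_density ** (1.0 / 3.0)

# Helium-4 gas at 300 K and 101325 Pa (illustrative conditions)
helium_gas = degeneracy_parameter(6.6464731e-27, 300.0, 101325.0 / (K_B * 300.0))
# Conduction electrons at an assumed metallic density, 300 K
metal_electrons = degeneracy_parameter(9.1093837e-31, 300.0, 8.5e28)
```

Under these assumptions the ratio is of the order of $10^{-2}$ for the atomic gas and well above unity for the electrons, which is why the latter must be treated as a degenerate Fermi gas.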

It can be shown that an analogous situation holds in the case of systems of
interacting particles*. However, for such systems the criterion (2.36), i.e. the
criterion of applicability of the quasi-classical approximation in statistical
physics, takes the form

$$\lambda_T \ll d , \tag{2.41}$$


where d is a certain effective distance of interaction between the particles. 
This criterion is fulfilled for gases and condensed systems of atomic and mo- 
lecular particles (except for liquid helium II; see below). 


§3. Statistical distributions in an ideal gas 


In Part III we have already obtained statistical distributions in an ideal gas. 
For what follows, however, we shall need a somewhat different derivation of 
them, based on the grand canonical distribution. In the case of an ideal gas, 
one can write for the grand partition function 


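$$\mathcal{Z}(V, T, z) = \sum_{n=0}^{\infty} z^n \sum_{\{n_k\}} \exp\Big( -\sum_k \varepsilon_k n_k / kT \Big) = \sum_{n=0}^{\infty} z^n \sum_{\{n_k\}} \prod_k \left[ \exp(-\varepsilon_k/kT) \right]^{n_k} , \tag{3.1}$$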


* K.Huang, Statistical mechanics (Wiley, New York, 1963). 










since for an ideal gas the energy $\varepsilon$ of the system is equal to

$$\varepsilon = \sum_k \varepsilon_k n_k ,$$

where $n_k$ is the number of particles in the $k$th state. It is obvious that the
occupation numbers $n_k$ obey the condition

$$n = \sum_k n_k .$$

The double summation (over the number of particles in the system, $n$, and
over the numbers of particles in the individual states, $n_k$) is equivalent to a single
summation over all the independent values of $n_k$. This is most simply seen in the
example of the particles of a Fermi gas, for which $n_k = 0, 1$. Obviously, in
this case we have

$$\mathcal{Z}(V, T, z) = \sum_{n=0}^{\infty} z^n \sum_{\{n_k\}} \prod_k \left[ \exp(-\varepsilon_k/kT) \right]^{n_k} = \prod_k \sum_{n_k=0}^{1} \left[ z \exp(-\varepsilon_k/kT) \right]^{n_k} = \prod_k \left( 1 + z \exp(-\varepsilon_k/kT) \right) = \prod_k \left( 1 + \exp[(\mu - \varepsilon_k)/kT] \right) .$$


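Analogously for bosons

$$\mathcal{Z}(V, T, z) = \sum_{n=0}^{\infty} z^n \sum_{\{n_k\}} \prod_k \left[ \exp(-\varepsilon_k/kT) \right]^{n_k} = \sum_{n_1} \left( z e^{-\varepsilon_1/kT} \right)^{n_1} \sum_{n_2} \left( z e^{-\varepsilon_2/kT} \right)^{n_2} \cdots = \prod_k \frac{1}{1 - z \exp(-\varepsilon_k/kT)} = \prod_k \frac{1}{1 - \exp[(\mu - \varepsilon_k)/kT]} . \tag{3.2}$$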
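The equivalence of the double summation and the product over states can be verified by brute force for a small system. The sketch below is only an illustration: the three single-particle energies, the values of $\mu$ and the cut-off on the boson occupation numbers are hypothetical choices, and energies are measured in units of $kT$.

```python
import itertools
import math

def direct_sum(levels, mu, kT, max_occupation):
    """Sum exp(sum_k n_k*(mu - e_k)/kT) over all sets {n_k}, 0 <= n_k <= max_occupation."""
    total = 0.0
    for occ in itertools.product(range(max_occupation + 1), repeat=len(levels)):
        total += math.exp(sum(n * (mu - e) for n, e in zip(occ, levels)) / kT)
    return total

def product_form(levels, mu, kT, fermi):
    """Closed product over states: (1 + x_k) for fermions, 1/(1 - x_k) for bosons."""
    result = 1.0
    for e in levels:
        x = math.exp((mu - e) / kT)
        result *= (1.0 + x) if fermi else 1.0 / (1.0 - x)
    return result

LEVELS = (0.5, 1.0, 1.5)  # hypothetical single-particle energies, units of kT

fermi_direct = direct_sum(LEVELS, 0.3, 1.0, max_occupation=1)
fermi_closed = product_form(LEVELS, 0.3, 1.0, fermi=True)
# For bosons the occupation numbers are cut off at 40; the neglected terms are tiny.
bose_direct = direct_sum(LEVELS, -0.2, 1.0, max_occupation=40)
bose_closed = product_form(LEVELS, -0.2, 1.0, fermi=False)
```

For fermions the two expressions agree to rounding error; for bosons they agree up to the truncation of the geometric series.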


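In the case of a Fermi gas

$$\sum_{n_k} \left( \exp[(\mu - \varepsilon_k)/kT] \right)^{n_k} = 1 + \exp[(\mu - \varepsilon_k)/kT] .$$

For the Bose gas $n_k = 0, 1, 2, \dots, N$, so that

$$\sum_{n_k} \left( \exp[(\mu - \varepsilon_k)/kT] \right)^{n_k} = \left[ 1 - e^{(\mu - \varepsilon_k)/kT} \right]^{-1} .$$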


The grand partition function (or its logarithm, which is actually needed for
the calculations) can be written in the symmetric form

$$\ln \mathcal{Z} = \pm \sum_k \ln\left( 1 \pm \exp[(\mu - \varepsilon_k)/kT] \right) , \quad \text{where} \quad \begin{cases} + & \text{for Fermi gas} \\ - & \text{for Bose gas} \end{cases} \tag{3.3}$$



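All mean values can be expressed in terms of the grand partition function
$\mathcal{Z}$. The latter depends on the variables $\mu$, $V$ and $T$. Instead of the chemical
potential it is always possible to introduce the total number of particles. Indeed,
the mean occupation numbers are given by formula (2.21)

$$\bar n_k = kT\, \frac{\partial \ln \mathcal{Z}_k}{\partial \mu} = \frac{1}{\exp[(\varepsilon_k - \mu)/kT] \pm 1} , \quad \text{where} \quad \begin{cases} + & \text{for Fermi gas} \\ - & \text{for Bose gas} \end{cases} \tag{3.4}$$

and $\mathcal{Z}_k$ denotes the factor of $\mathcal{Z}$ referring to the $k$th state.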
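The connection between the logarithm of the grand partition function and the mean occupation numbers can be checked numerically for a single state. In this sketch the contribution of one state to $\ln \mathcal{Z}$ is taken from (3.3) and differentiated with respect to $\mu$ by a central difference; the function names, the step `h` and the sample values of $\mu$ and $\varepsilon$ are assumptions of the example.

```python
import math

def ln_z_state(mu, eps, kT, fermi):
    """Contribution of one state to ln Z, as in (3.3)."""
    s = 1.0 if fermi else -1.0
    return s * math.log(1.0 + s * math.exp((mu - eps) / kT))

def mean_occupation(mu, eps, kT, fermi):
    """Closed form (3.4): 1 / (exp((eps - mu)/kT) +/- 1)."""
    s = 1.0 if fermi else -1.0
    return 1.0 / (math.exp((eps - mu) / kT) + s)

def occupation_from_derivative(mu, eps, kT, fermi, h=1e-5):
    """kT * d(ln Z_k)/d(mu), estimated by a central difference."""
    up = ln_z_state(mu + h, eps, kT, fermi)
    down = ln_z_state(mu - h, eps, kT, fermi)
    return kT * (up - down) / (2.0 * h)
```

For example, with $\mu = -0.5$, $\varepsilon = 1$ and $kT = 1$ the two routes agree to about six significant figures for both statistics.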


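The chemical potential $\mu$ is determined by the relation

$$N = \sum_k \bar n_k = \sum_k \frac{1}{\exp[(\varepsilon_k - \mu)/kT] \pm 1} , \tag{3.5}$$

which allows one, in principle, to find the dependence of $\mu$ on $T$, $V$ and $N$.

In the case of the Bose gas, condition (3.5) requires that the chemical potential
$\mu$ be negative, since otherwise the mean occupation numbers of the lowest states
would be negative. For the Fermi gas no restrictions are imposed upon the
values of $\mu$.

The most obvious thermodynamic characteristic is the equation of state of
the gas. According to §59 of Part III, the equation of state of the gas can be
expressed in terms of $\mathcal{Z}$ in the form

$$pV = \pm kT \sum_k \ln\left( 1 \pm \exp[(\mu - \varepsilon_k)/kT] \right) , \quad \text{where} \quad \begin{cases} + & \text{for Fermi gas} \\ - & \text{for Bose gas} \end{cases} \tag{3.6}$$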

We have written all the relations without discussion, since this was given in
Part III. We recall only that in actual use of the above relations for a macroscopic
gas one can pass from summation to integration, writing

$$\sum_k \to \frac{gV}{(2\pi\hbar)^3} \int d^3p , \qquad \varepsilon_k = \frac{\hbar^2 k^2}{2m} ,$$

where $g = (2s+1)$, and $s$ is the spin of the particle. The transition to Boltzmann's
statistics corresponds to the fulfillment of the condition

$$e^{\mu/kT} \ll 1 . \tag{3.7}$$


Substituting the chemical potential of an ideal gas, one can write (3.7) in 
the form of (2.36) or in the form of (72.16) of Part III. 

In Part III we dwelt in detail on the behaviour of an ideal Fermi gas at low
temperatures. To avoid repetition, we shall now consider only some general 
problems. 

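Writing (3.6) in the form

$$\begin{aligned}
pV &= \pm \frac{4\pi g V kT}{(2\pi\hbar)^3} \int_0^\infty p^2\, dp\, \ln\left( 1 \pm e^{(\mu - \varepsilon)/kT} \right) \\
&= \pm\, gkTV\, \frac{2\pi (2m)^{3/2}}{(2\pi\hbar)^3} \int_0^\infty \varepsilon^{1/2}\, d\varepsilon\, \ln\left( 1 \pm e^{(\mu - \varepsilon)/kT} \right) \\
&= gV\, \frac{4\pi (2m)^{3/2}}{3 (2\pi\hbar)^3} \int_0^\infty \frac{\varepsilon^{3/2}\, d\varepsilon}{e^{(\varepsilon - \mu)/kT} \pm 1} = \tfrac{2}{3} E
\end{aligned} \tag{3.8}$$

(the last integral is obtained on integrating by parts),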
we see that between the pressure and energy of Fermi and Bose gases there 
exists the same relation as for the classical ideal gas. 
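At high temperatures, when condition (3.7) is fulfilled, one can expand
the integrand in a series and write

$$pV = \frac{2}{3}\, gV\, \frac{2\pi (2m)^{3/2}}{(2\pi\hbar)^3}\, e^{\mu/kT} \int_0^\infty \left( 1 \mp e^{(\mu - \varepsilon)/kT} \right) e^{-\varepsilon/kT}\, \varepsilon^{3/2}\, d\varepsilon = gV\, \frac{(2\pi mkT)^{3/2}}{(2\pi\hbar)^3}\, kT\, e^{\mu/kT} \Big( 1 \mp \frac{e^{\mu/kT}}{2^{5/2}} \Big) . \tag{3.9}$$

To the same accuracy, (3.5) gives

$$N = gV\, \frac{(2\pi mkT)^{3/2}}{(2\pi\hbar)^3}\, e^{\mu/kT} \Big( 1 \mp \frac{e^{\mu/kT}}{2^{3/2}} \Big) .$$

Eliminating $e^{\mu/kT}$ between these two expressions, and using in the correction
term the chemical potential of a Boltzmann gas (see (60.9) of Part III), we can
rewrite (3.9) in the form

$$pV = NkT \Big( 1 \pm \frac{\pi^{3/2} N \hbar^3}{2gV (mkT)^{3/2}} \Big) , \tag{3.10}$$

where the plus sign refers to the Fermi gas, and the minus sign to the Bose
gas, in accordance with the repulsive and attractive character of the effective
interaction in the two cases.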
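The sign of the first quantum correction to the pressure at a fixed number of particles can be checked numerically. The sketch below is a check under stated assumptions, not the derivation used in the text: it uses the standard expansions of $N$ and $pV/kT$ in powers of the fugacity $z = e^{\mu/kT}$, with terms $(\mp 1)^{l+1} z^l / l^{3/2}$ and $(\mp 1)^{l+1} z^l / l^{5/2}$ respectively (both in units of $gV(2\pi mkT)^{3/2}/(2\pi\hbar)^3$), and compares the exact ratio with the first-order estimate. Function names and the value $z = 0.05$ are illustrative.

```python
def fugacity_series(z, power, sign, terms=60):
    """Sum over l >= 1 of sign**(l+1) * z**l / l**power (sign = -1 Fermi, +1 Bose)."""
    return sum(sign ** (l + 1) * z**l / l**power for l in range(1, terms + 1))

def pv_over_nkt(z, fermi):
    """Exact ratio pV/(N k T) at fugacity z, from the two series."""
    sign = -1 if fermi else 1
    return fugacity_series(z, 2.5, sign) / fugacity_series(z, 1.5, sign)

def first_order_estimate(z, fermi):
    """1 +/- x / 2**(5/2), with x = N (2 pi hbar)**3 / (g V (2 pi m k T)**1.5)."""
    sign = -1 if fermi else 1
    x = fugacity_series(z, 1.5, sign)
    return 1.0 - sign * x / 2**2.5

Z_SMALL = 0.05  # illustrative fugacity well inside the Boltzmann regime
```

At this fugacity the pressure of the Fermi gas lies slightly above $NkT$ and that of the Bose gas slightly below, and both agree with the first-order estimate to within the neglected second-order terms.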
The corrections found for the pressure are, as was stressed above, small
compared with the corrections associated with the interaction between particles.

In the case of fermions, as we have seen in Part III, the effect of repulsion,
associated with the Pauli exclusion principle, assumes a decisive importance
for a degenerate gas.

In the next section we shall discuss the manifestation of the effect of attraction
in a degenerate Bose gas.


§4. Degenerate ideal Bose gas* 


The properties of a degenerate Bose gas differ radically from those of a
Fermi gas. This difference is associated with the absence of any restrictions 
upon the accumulation of particles in the zero-energy state in the Bose gas. 


* We follow the treatment in the book of K.Huang, Statistical mechanics (Wiley, New 
York, 1963) where the reader can find a number of additional data. 




The mean number of particles in the zero-energy state can, according to
(3.4), be written in the form

$$\bar n(0) = \frac{1}{e^{-\mu/kT} - 1} . \tag{4.1}$$

We see that the value of $\bar n(0)$ is determined by the behaviour of $\mu(T)$ as
$T \to 0$. As we have seen in the preceding section, the chemical potential for
the Bose gas is always less than zero. At high temperatures it decreases in absolute
value with decreasing temperature.

The chemical potential is defined by the formula (3.5) for the total number
of particles. This formula can be written in the form

$$N = \bar n(0) + N' = \frac{1}{e^{-\mu/kT} - 1} + \int \frac{1}{e^{(\varepsilon - \mu)/kT} - 1}\, \frac{d\Gamma}{(2\pi\hbar)^3} . \tag{4.2}$$

The first term represents the mean number of particles in the zero-energy
state, and the second term the number of particles in all the other states. The
statistical weight of the zero-energy state is equal to unity (see §35 of
Part III).

The second term can be written in the form

$$N' = I(\mu, V, T) = \int \frac{1}{e^{(\varepsilon - \mu)/kT} - 1}\, \frac{d\Gamma}{(2\pi\hbar)^3} .$$

Hence the number of particles in the zero-energy state is

$$\bar n(0) = N - I(\mu, V, T) . \tag{4.3}$$


Formula (4.2) allows one to express the chemical potential in terms of
temperature and density, while formula (4.3) relates the number of particles
in the zero-energy state and in states with $\varepsilon \neq 0$ to the total number of particles.

We see, first of all, that if the value of the integral $I$ on the right-hand side
is bounded by a certain value $N_{\max}$, then for $N > N_{\max}$ the number of particles
in the zero-energy state

$$\bar n(0) \geq N - N_{\max} > 0$$

turns out to be finite and to represent a certain fraction of the total number
of particles.
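Let us consider the quantity $I(\mu, V, T)$. It is obvious that it can be written
in the form

$$\begin{aligned}
I(\mu, V, T) &= \frac{(2\pi m)^{3/2}}{(2\pi\hbar)^3}\, \frac{2}{\sqrt{\pi}}\, V \int_0^\infty \frac{\varepsilon^{1/2}\, d\varepsilon}{e^{(\varepsilon - \mu)/kT} - 1} \\
&= \frac{(2\pi m)^{3/2}}{(2\pi\hbar)^3}\, \frac{2}{\sqrt{\pi}}\, V \int_0^\infty \varepsilon^{1/2}\, d\varepsilon \sum_{l=1}^{\infty} e^{l\mu/kT}\, e^{-l\varepsilon/kT} \\
&= \frac{(2\pi m)^{3/2}}{(2\pi\hbar)^3}\, \frac{2}{\sqrt{\pi}}\, V \sum_{l=1}^{\infty} e^{l\mu/kT} \int_0^\infty e^{-l\varepsilon/kT}\, \varepsilon^{1/2}\, d\varepsilon \\
&= \frac{V (2\pi mkT)^{3/2}}{(2\pi\hbar)^3} \sum_{l=1}^{\infty} \frac{e^{l\mu/kT}}{l^{3/2}} = \frac{V (2\pi mkT)^{3/2}}{(2\pi\hbar)^3}\, F\Big( -\frac{\mu}{kT} \Big) .
\end{aligned} \tag{4.4}$$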
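Thus the value of $I$ is mainly determined by the sum

$$F\Big( -\frac{\mu}{kT} \Big) = \sum_{l=1}^{\infty} \frac{e^{l\mu/kT}}{l^{3/2}} . \tag{4.5}$$

This sum converges for all values $\mu < 0$. The quantity $F(-\mu/kT)$ represents a
monotonically decreasing function of its argument, and

$$F\Big( -\frac{\mu}{kT} \Big) \leq F(0) = \sum_{l=1}^{\infty} \frac{1}{l^{3/2}} \approx 2.61 .$$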
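The numerical value of $F(0)$ and the monotonic decrease of $F$ are easy to confirm. The sketch below sums the series (4.5) directly; for $F(0)$, where convergence is slow, the remainder of the sum is estimated by an integral. The function names and the number of terms are choices made for this illustration.

```python
import math

def F(x, terms=20000):
    """F(x) = sum over l >= 1 of exp(-l*x) / l**1.5, for x = -mu/kT > 0 (truncated)."""
    return sum(math.exp(-l * x) / l**1.5 for l in range(1, terms + 1))

def F_zero(terms=20000):
    """F(0): partial sum plus an integral estimate of the slowly converging tail."""
    partial = sum(l**-1.5 for l in range(1, terms + 1))
    return partial + 2.0 / math.sqrt(terms + 0.5)
```

The tail estimate uses $\int_{L+1/2}^{\infty} x^{-3/2}\,dx = 2/\sqrt{L + 1/2}$, which is what brings the partial sum up to the quoted value.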








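Correspondingly,

$$I(\mu, V, T) \leq N_{\max} = \frac{2.61\, V (2\pi mkT)^{3/2}}{(2\pi\hbar)^3} . \tag{4.6}$$

Hence

$$\bar n(0) \geq N - \frac{2.61\, V (2\pi mkT)^{3/2}}{(2\pi\hbar)^3} . \tag{4.7}$$

If the equality

$$N = \frac{2.61\, V (2\pi mkT_0)^{3/2}}{(2\pi\hbar)^3} \tag{4.8}$$

or

$$T_0 = \frac{1}{(2.61)^{2/3}}\, \frac{(2\pi\hbar)^2}{2\pi mk} \Big( \frac{N}{V} \Big)^{2/3} \tag{4.9}$$

holds, then $\bar n(0) = 0$.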
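To see what order of magnitude (4.9) gives, the sketch below evaluates $T_0$ for an ideal gas of helium-4 atoms. The number density is an assumption: it is taken from a mass density of about $145\ \mathrm{kg/m^3}$, roughly that of liquid helium, purely for illustration; the function name is likewise hypothetical.

```python
import math

HBAR = 1.054571817e-34  # J*s
K_B = 1.380649e-23      # J/K
M_HE4 = 6.6464731e-27   # kg

def condensation_temperature(mass, number_density):
    """T_0 from (4.9), written as (2*pi*hbar**2 / (m*k)) * (n / 2.612)**(2/3)."""
    return 2.0 * math.pi * HBAR**2 / (mass * K_B) * (number_density / 2.612) ** (2.0 / 3.0)

n_helium = 145.0 / M_HE4  # atoms per cubic metre at the assumed mass density
t0_helium = condensation_temperature(M_HE4, n_helium)
```

With these inputs $T_0$ comes out at about 3 K; doubling the density raises it by the factor $2^{2/3}$.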




At $T > T_0$ there are no particles in the zero-energy state. More precisely,
the number of particles in this state is $\bar n(0) \sim 1$ and is negligible in comparison
with the number of particles in excited states ($N' \approx N$). When a temperature
lower than $T_0$ is reached, a finite number of particles $\bar n(0)$, representing a
certain fraction of the total number of particles, appears in the zero-energy
state. The distribution of particles assumes a qualitatively new form: a finite
number of particles is in one (zero) energy state, while the remaining particles
are practically continuously distributed over all excited states. This phenomenon
is called the Bose-Einstein condensation, and the temperature $T_0$ is said
to be the condensation temperature.

The explicit dependence of $\bar n(0)$ on $T$ and $N/V$ can be found if the dependence
of the chemical potential on these quantities is established. As we have
seen, the latter is defined by eq. (4.2). Its solution (for example, graphical)
shows that for $T = T_0$ the chemical potential reduces to zero. The curve of
the dependence of $(-\mu/kT)$ on $T/T_0$ is given in fig. VI.1.
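Instead of a graphical solution, the chemical potential above $T_0$ can be found numerically. Neglecting $\bar n(0)$ in (4.2) and dividing (4.4) by (4.8) gives the condition $F(-\mu/kT) = 2.61\,(T_0/T)^{3/2}$, which the sketch below solves by bisection. The function names, the bracketing interval and the tolerance are assumptions of this example, and the truncated series is adequate only for temperatures not too close to $T_0$.

```python
import math

def F(x, terms=5000):
    """Truncated series (4.5) for x = -mu/kT > 0."""
    return sum(math.exp(-l * x) / l**1.5 for l in range(1, terms + 1))

def reduced_mu(t, tol=1e-10):
    """Solve F(x) = 2.612 * t**-1.5 for x = -mu/kT, where t = T/T0 > 1."""
    target = 2.612 * t**-1.5
    lo, hi = 1e-3, 50.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if F(mid) > target:  # F decreases with x, so the root lies to the right
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

x_at_2 = reduced_mu(2.0)  # value of -mu/kT at T = 2*T0
```

The root grows with $T/T_0$, so that $-\mu/kT$ rises from zero at the condensation point as the temperature increases.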

For temperatures lower than the condensation temperature $T_0$, $\mu$ can be
set equal to zero in eq. (4.2). We then find

$$\bar n(0) = N - I(\mu = 0, T, V) = N - N' = N - \frac{2.61\, V (2\pi mkT)^{3/2}}{(2\pi\hbar)^3} = N \Big[ 1 - \Big( \frac{T}{T_0} \Big)^{3/2} \Big] . \tag{4.10}$$


Formula (4.10) defines the distribution of particles over the states for $T < T_0$.
We see that $N (T/T_0)^{3/2}$ of the particles are in excited states,
while the remaining particles are condensed in the region of phase space corresponding
to $\varepsilon = 0$ (or $p = 0$).

[Fig. VI.1: $-\mu/kT$ plotted against $T/T_0$.]
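The division of the particles between the condensate and the excited states in (4.10) can be put as a small function. This is a minimal sketch; the function name and the convention of returning zero above $T_0$ (where $\bar n(0)$ is negligible) are choices made for the example.

```python
def condensate_fraction(t):
    """n(0)/N from (4.10) as a function of t = T/T0; taken as zero above T0."""
    return 1.0 - t**1.5 if t < 1.0 else 0.0

fractions = [condensate_fraction(t) for t in (0.0, 0.25, 0.5, 0.75, 1.0, 1.2)]
```

At $T = T_0/2$, for instance, about 65 per cent of the particles are in the zero-energy state.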

The analogy with condensation becomes even more obvious if the expres- 
sion for the pressure is determined. 

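According to §59 of Part III, we have

$$\begin{aligned}
p &= \frac{kT}{V} \ln \mathcal{Z} = -kTV^{-1} \sum_k \ln\left( 1 - e^{-(\varepsilon_k - \mu)/kT} \right) \\
&= -kTV^{-1} \Big[ \ln\left( 1 - e^{\mu/kT} \right) + {\sum_k}' \ln\left( 1 - e^{-(\varepsilon_k - \mu)/kT} \right) \Big] \\
&= -kTV^{-1} \ln\left( 1 - e^{\mu/kT} \right) - kTV^{-1} \int \frac{d\Gamma}{(2\pi\hbar)^3}\, \ln\left( 1 - e^{-(\varepsilon - \mu)/kT} \right) .
\end{aligned} \tag{4.11}$$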





The prime means that the term with $\varepsilon = 0$ is eliminated from the sum. In this
formula we can set $V \to \infty$, since the volume of a macroscopic system is always
large. Let us consider the regions $T > T_0$ and $T < T_0$. For $T > T_0$ and
$V \to \infty$ the first term reduces to zero. In the second term one can write for $\mu$
the expression (60.9) of Part III for a Boltzmann gas, and we arrive at eq.
(3.9). For $T < T_0$, as is shown by the analysis of eq. (4.2) for the chemical
potential, the first term also reduces to zero as $V \to \infty$ and $\mu \to 0$.
Hence, integrating by parts, we find

$$p = \frac{4\pi (2m)^{3/2}}{3 (2\pi\hbar)^3} \int_0^\infty \frac{\varepsilon^{3/2}\, d\varepsilon}{e^{\varepsilon/kT} - 1} = \frac{1.34\, (2\pi m)^{3/2} (kT)^{5/2}}{(2\pi\hbar)^3} \approx \frac{0.085\, m^{3/2} (kT)^{5/2}}{\hbar^3} . \tag{4.12}$$

The pressure at $T < T_0$ turns out to be independent of density. Only particles
in states with $\varepsilon \neq 0$ give a contribution to the pressure. Since their number is
$\sim T^{3/2}$ and the mean energy of each is of the order of $kT$, the total pressure
turns out to be proportional to $T^{5/2}$.
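The numerical coefficient in the pressure below $T_0$ can be recomputed from the standard value of the integral, $\int_0^\infty x^{3/2}\,dx/(e^x - 1) = \Gamma(5/2)\,\zeta(5/2)$, which reduces (4.12) to $p = \zeta(5/2)\,(2\pi)^{-3/2}\, m^{3/2} (kT)^{5/2}/\hbar^3$. The sketch below evaluates this; the function names, the helium-4 mass and the temperatures are assumptions of the example.

```python
import math

HBAR = 1.054571817e-34  # J*s
K_B = 1.380649e-23      # J/K

def zeta_five_halves(terms=2000):
    """Sum of l**-2.5 over l >= 1, with an integral estimate of the tail."""
    partial = sum(l**-2.5 for l in range(1, terms + 1))
    return partial + (2.0 / 3.0) * (terms + 0.5) ** -1.5

PRESSURE_COEFFICIENT = zeta_five_halves() / (2.0 * math.pi) ** 1.5

def pressure_below_t0(mass, temperature):
    """Pressure of the ideal Bose gas for T < T0; it does not depend on the density."""
    return PRESSURE_COEFFICIENT * mass**1.5 * (K_B * temperature) ** 2.5 / HBAR**3

p_1K = pressure_below_t0(6.6464731e-27, 1.0)
p_2K = pressure_below_t0(6.6464731e-27, 2.0)
```

The function takes no density argument at all, which is the numerical counterpart of the flat part of the isotherm, and doubling the temperature multiplies the pressure by $2^{5/2}$.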

The isotherms of the Bose gas have the form shown in fig. VI.2, where the
dependence of the pressure on specific volume is plotted. At the point determined
by condition (4.8), the specific volume is equal to

$$v_0 = \frac{V}{N} = \frac{(2\pi\hbar)^3}{2.61\, (2\pi mkT)^{3/2}} . \tag{4.13}$$


As can be seen from fig. VI.2, for $v < v_0$ the pressure is constant, whereas
for $v > v_0$ it decreases with increasing specific volume.

The analogy between the curve of fig. VI.2 and the corresponding isotherm
for a liquid-vapour system is obvious. The region $v < v_0$ corresponds to the
condensed phase, and the region $v > v_0$ to the vapour phase. The analogy
with a phase transition is supplemented by the fact that at the condensation
point $T = T_0$ or $v = v_0$ latent heat of transition is released.

[Fig. VI.2: isotherm of the ideal Bose gas, pressure $p$ plotted against specific volume $v$, with the point $v_0(T)$ marked.]

We shall not dwell on the calculations, but shall confine ourselves to two 
remarks. First, in addition to the similarity, it is necessary to stress the differ- 
ence between the processes of condensation of ordinary vapour and the phe- 
nomenon of the Bose condensation. Ordinary condensation and transition in- 
to the liquid state are due to the interaction between molecules. 

The Bose condensation is a process occurring in an ideal gas. Its nature is 
quite different and it is associated only with the quantum effect of the ac- 
cumulation of bosons in the zero-energy state. 

The second remark concerns the problem of the realization of the phenom- 
enon of the Bose condensation. It is clear that since this phenomenon can 
occur only at low temperatures or large densities, the interaction between 
particles, which we have disregarded, must play an important role in its reali- 
zation. 

Liquid helium II, the only liquid which does not crystallize at low temper- 
atures, displays a phase transition (see §77 of Part III). The 4He particles 
have spin zero and obey Bose—Einstein statistics. However, it is not estab- 
lished whether this phase transition in the liquid is related to the phenomenon 
of the Bose—Einstein condensation in an ideal gas. As we shall see in the next 
section, the interaction between particles in an imperfect Bose gas is of basic 
importance. It turns out that the fundamental difference in the behaviour of 
ideal and non-ideal Bose gases is associated with this interaction. Hence, at- 
tempts at interpreting the phase transition in liquid helium as a manifestation 
of the Bose—Einstein condensation are somewhat arguable. Therefore, not- 
withstanding the theoretical importance of the phenomenon of the Bose— 
Einstein condensation, it is not certain that it occurs in nature. 

In conclusion we stress that all the aforesaid referred only to Bose systems 
with a fixed number of particles. The phenomenon of the Bose—Einstein con- 
densation cannot occur in Bose systems in which the number of particles is 
indefinite. Examples of such systems are photons or phonons. It is clear that 










since the number of particles in such systems has no definite value, one can- 
not apply to them formula (4.2) which was assumed as the basis of the overall 
theory of the Bose—Einstein condensation. 

In this chapter we shall confine ourselves to the consideration of bosons. 
In the chapter devoted to solid-state theory we shall consider the properties 
of Fermi systems in more detail. 


§5. Imperfect Bose gas. Superfluidity 


We now pass on to the consideration of an imperfect gas of bosons. 

We shall assume that a certain sufficiently weak interaction exists between 
the particles of the Bose gas. It may have the character either of an attractive 
or a repulsive force. 

The Hamiltonian of the system of particles in the x-representation is of
the form

$$\hat H = -\sum_{\alpha=1}^{N} \frac{\hbar^2}{2m} \nabla_\alpha^2 + \frac12 \sum_{\alpha \neq \beta} U(|\mathbf{r}_\alpha - \mathbf{r}_\beta|) ,$$

where $N$ is the total number of particles in the system, and $U(|\mathbf{r}_\alpha - \mathbf{r}_\beta|)$ is the
energy of interaction between the particles $\alpha$ and $\beta$. For a rarefied gas one
need take into account only pair interactions.

For what follows it is convenient to consider the gas to be confined in a
cube of edge $L$. Then the Hamiltonian of a free particle

$$\hat H_0 = -\frac{\hbar^2}{2m} \nabla^2$$

will have a discrete spectrum. The momentum of a free particle will have
components running over a discrete series of values

$$p_x = \frac{2\pi\hbar}{L} n_x , \qquad p_y = \frac{2\pi\hbar}{L} n_y , \qquad p_z = \frac{2\pi\hbar}{L} n_z ,$$

where $n_x$, $n_y$, $n_z$ are arbitrary integers, including zero. The eigenfunctions of
the operator $\hat H_0$, normalized to a volume $V = L^3$, are of the form

$$\psi_{\mathbf{p}}(\mathbf{r}) = V^{-1/2}\, e^{(i/\hbar)\, \mathbf{p} \cdot \mathbf{r}} .$$

We now apply the method of second quantization to the system of interacting
particles. The Hamiltonian of a system of particles with pair interactions
is given by formula (99.25) of Part V

$$\hat H = \hat H_0 + \hat H' , \tag{5.1}$$

$$\hat H = \sum_i E(\mathbf{p}_i)\, \hat a_i^\dagger \hat a_i + \frac12 \sum_{i,k,l,m} (lm|U|ik)\, \hat a_l^\dagger \hat a_m^\dagger \hat a_i \hat a_k . \tag{5.2}$$

The summation is carried out over all discrete values of momenta (both positive
and negative).

The energy $E(\mathbf{p}_i)$ represents the energy of a free particle

$$E(\mathbf{p}_i) = p_i^2/2m .$$


By virtue of what has already been said the energy of interaction U involved 
in the matrix element of the Hamiltonian depends only on the coordinates of 
the two particles. 

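To calculate the matrix element use can be made of the wave functions of
a free particle. This gives

$$(lm|U|ik) = V^{-2} \int e^{-(i/\hbar)(\mathbf{p}_l \cdot \mathbf{r}_1 + \mathbf{p}_m \cdot \mathbf{r}_2)}\, U(|\mathbf{r}_1 - \mathbf{r}_2|)\, e^{(i/\hbar)(\mathbf{p}_i \cdot \mathbf{r}_1 + \mathbf{p}_k \cdot \mathbf{r}_2)}\, dV_1\, dV_2 .$$

Introducing new variables $\mathbf{q} = \mathbf{r}_1 - \mathbf{r}_2$ and $\mathbf{R} = \tfrac12 (\mathbf{r}_1 + \mathbf{r}_2)$ and integrating,
taking into account that

$$\int_V e^{(i/\hbar)\, \mathbf{p} \cdot \mathbf{R}}\, dV = \begin{cases} V & \text{for } \mathbf{p} = 0 , \\ 0 & \text{for } \mathbf{p} \neq 0 , \end{cases}$$

we obtain

$$(lm|U|ik) = \begin{cases} V^{-1}\, \nu(\mathbf{p}_l - \mathbf{p}_i) & \text{for } \mathbf{p}_l + \mathbf{p}_m = \mathbf{p}_i + \mathbf{p}_k , \\ 0 & \text{for } \mathbf{p}_l + \mathbf{p}_m \neq \mathbf{p}_i + \mathbf{p}_k , \end{cases}$$

where

$$\nu(\mathbf{p}) = \int U(|\mathbf{q}|)\, e^{-(i/\hbar)\, \mathbf{p} \cdot \mathbf{q}}\, d\mathbf{q} .$$

Integration over all the angles can be carried out directly since $U$ depends
only on the absolute value of the vector $\mathbf{q}$. This gives

$$\nu(\mathbf{p}) = \int U(|\mathbf{q}|)\, q^2\, dq \int e^{-(i/\hbar)\, pq \cos\theta} \sin\theta\, d\theta \int d\varphi = 4\pi \int_0^\infty U(|\mathbf{q}|)\, \frac{\sin(pq/\hbar)}{pq/\hbar}\, q^2\, dq .$$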
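The radial formula for $\nu(\mathbf{p})$ can be tested on a model potential for which the Fourier component is known in closed form. In the sketch below the Gaussian $U(q) = e^{-q^2}$ is a hypothetical choice made only for the check, units with $\hbar = 1$ are assumed, and the integral is evaluated by Simpson's rule; the exact result for this potential is $\pi^{3/2} e^{-p^2/4}$.

```python
import math

def nu(p, potential, q_max=20.0, steps=4000):
    """4*pi * integral of U(q) * sin(p*q)/(p*q) * q**2 dq over 0..q_max (hbar = 1)."""
    def integrand(q):
        x = p * q
        sinc = 1.0 if abs(x) < 1e-12 else math.sin(x) / x
        return potential(q) * sinc * q * q
    h = q_max / steps  # steps must be even for Simpson's rule
    total = integrand(0.0) + integrand(q_max)
    for i in range(1, steps):
        total += (4.0 if i % 2 else 2.0) * integrand(i * h)
    return 4.0 * math.pi * total * h / 3.0

def gaussian(q):
    """Model interaction energy, chosen only because its transform is known."""
    return math.exp(-q * q)

def exact_gaussian_nu(p):
    return math.pi**1.5 * math.exp(-p * p / 4.0)
```

The numerical values are real, depend only on $|\mathbf{p}|$, and at $p = 0$ reduce to the volume integral of the potential.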





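We see that the function $\nu(\mathbf{p})$ is real and that the equality $\nu(\mathbf{p}) = \nu(-\mathbf{p})$ is
fulfilled. The operator of the total energy of the system of particles can, in
accordance with formulae (5.1) and (5.2), be expressed in the form

$$\hat H = \sum_{\mathbf{p}} \frac{p^2}{2m}\, \hat a_{\mathbf{p}}^\dagger \hat a_{\mathbf{p}} + \frac12 \sum V^{-1}\, \nu(\mathbf{p}_l - \mathbf{p}_i)\, \hat a_{\mathbf{p}_l}^\dagger \hat a_{\mathbf{p}_m}^\dagger \hat a_{\mathbf{p}_i} \hat a_{\mathbf{p}_k} . \tag{5.3}$$

In the second term of formula (5.3) the sum is taken over only those
values of the momenta $\mathbf{p}_i$, $\mathbf{p}_k$, $\mathbf{p}_l$, $\mathbf{p}_m$ for which the following relation holds

$$\mathbf{p}_l + \mathbf{p}_m = \mathbf{p}_i + \mathbf{p}_k .$$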


The next problem consists in determining the eigenvalues of the Hamil- 
tonian (5.3), i.e. in reducing the energy matrix to diagonal form. 

Perturbation theory is inapplicable in studying the lowest energy levels. 
This is clearly seen from the fact that in the lower energy states, i.e. for small 
values of the momentum, the kinetic energy of the system tends to zero. On 
the other hand, the energy of interaction has a finite value and is therefore 
large in comparison with the kinetic energy. Therefore a special approximate 
method was developed by N.N. Bogoliubov for the investigation of weakly 
excited states. Let us first consider a system in which there is no interaction. 
In the energy ground state of such a system the momenta of all the particles 
are equal to zero, i.e. 

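$$n_{\mathbf{p}} = \begin{cases} 0 & \text{for } \mathbf{p} \neq 0 , \\ N & \text{for } \mathbf{p} = 0 , \end{cases} \tag{5.4}$$

where $n_{\mathbf{p}}$ is the number of particles with momentum $\mathbf{p}$. It is natural to think
that in the presence of a weak interaction, in the lowest state of the system
also, most particles will be found with momenta equal to zero.

In accordance with this we assume that

$$\hat a_0^\dagger \hat a_0 = n_0 \approx N , \qquad \hat a_{\mathbf{p}}^\dagger \hat a_{\mathbf{p}} = n_{\mathbf{p}} \ll N \quad (\mathbf{p} \neq 0) . \tag{5.5}$$

The operators $\hat a_0^\dagger$ and $\hat a_0$ satisfy the commutation relation

$$\hat a_0 \hat a_0^\dagger - \hat a_0^\dagger \hat a_0 = 1 .$$

Since $\hat a_0^\dagger \hat a_0 = n_0$ is large compared with unity, we shall disregard the
non-commutativity of these operators in what follows, i.e. we shall replace the
operators $\hat a_0$ and $\hat a_0^\dagger$ by ordinary numbers.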

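The assumptions (5.4) and (5.5) substantially simplify the expression for
the interaction Hamiltonian. Namely, in the sum over all momenta, only large
terms, in which the factors $\hat a_0^\dagger$ or $\hat a_0$ are involved in pairs or quadruply, need
be retained. Such terms are

$$\hat a_0^\dagger \hat a_0^\dagger \hat a_{\mathbf{p}} \hat a_{-\mathbf{p}} , \qquad \hat a_{\mathbf{p}}^\dagger \hat a_{-\mathbf{p}}^\dagger \hat a_0 \hat a_0 ,$$

and so on. On the other hand, terms of the type $\hat a_0^\dagger \hat a_{\mathbf{p}_i}^\dagger \hat a_{\mathbf{p}_k} \hat a_{\mathbf{p}_l}$, where
$\mathbf{p}_i \neq 0$, $\mathbf{p}_k \neq 0$, $\mathbf{p}_l \neq 0$, are small and can be dropped.

Thus for $\hat H'$ we find

$$\begin{aligned}
\hat H' &= \frac12 \sum_{\mathbf{p}_l + \mathbf{p}_m = \mathbf{p}_i + \mathbf{p}_k} V^{-1}\, \nu(\mathbf{p}_l - \mathbf{p}_i)\, \hat a_{\mathbf{p}_l}^\dagger \hat a_{\mathbf{p}_m}^\dagger \hat a_{\mathbf{p}_i} \hat a_{\mathbf{p}_k} \\
&\approx \frac12 \sum_{\mathbf{p} \neq 0} V^{-1}\, \nu(\mathbf{p}) \left\{ \hat a_{\mathbf{p}}^\dagger \hat a_0^\dagger \hat a_{\mathbf{p}} \hat a_0 + \hat a_0^\dagger \hat a_{-\mathbf{p}}^\dagger \hat a_0 \hat a_{-\mathbf{p}} + \hat a_0^\dagger \hat a_0^\dagger \hat a_{\mathbf{p}} \hat a_{-\mathbf{p}} + \hat a_{\mathbf{p}}^\dagger \hat a_{-\mathbf{p}}^\dagger \hat a_0 \hat a_0 \right\} + \frac12 V^{-1}\, \nu(0)\, \hat a_0^\dagger \hat a_0^\dagger \hat a_0 \hat a_0 \\
&\approx \frac12 V^{-1}\, \nu(0)\, n_0^2 + \frac12 \sum_{\mathbf{p} \neq 0} V^{-1}\, \nu(\mathbf{p}) \left[ 2 \hat a_{\mathbf{p}}^\dagger \hat a_{\mathbf{p}}\, n_0 + \hat a_{\mathbf{p}}^\dagger \hat a_{-\mathbf{p}}^\dagger\, n_0 + \hat a_{\mathbf{p}} \hat a_{-\mathbf{p}}\, n_0 \right] .
\end{aligned}$$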


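Since the total number of particles in the system, $N$, is given by

$$N = n_0 + \sum_{\mathbf{p} \neq 0} \hat a_{\mathbf{p}}^\dagger \hat a_{\mathbf{p}} ,$$

where the second term is small in comparison with the first, we have, in the
same approximation

$$\hat H' \approx \frac12 V^{-1}\, \nu(0)\, N^2 + \frac12 N \sum_{\mathbf{p} \neq 0} V^{-1}\, \nu(\mathbf{p}) \left( \hat a_{\mathbf{p}}^\dagger \hat a_{-\mathbf{p}}^\dagger + \hat a_{\mathbf{p}} \hat a_{-\mathbf{p}} \right) + N \sum_{\mathbf{p} \neq 0} V^{-1}\, \nu(\mathbf{p})\, \hat a_{\mathbf{p}}^\dagger \hat a_{\mathbf{p}} . \tag{5.6}$$
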
The terms dropped are of the order of V3. 
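To within a quantity of the order of $N$, the complete Hamiltonian $\hat H$ assumes
the form

$$\hat H = \sum_{\mathbf{p}} \Big[ \frac{p^2}{2m} + \frac{N}{V}\, \nu(\mathbf{p}) \Big] \hat a_{\mathbf{p}}^\dagger \hat a_{\mathbf{p}} + \frac12 N^2 V^{-1}\, \nu(0) + \frac12 N V^{-1} \sum_{\mathbf{p} \neq 0} \nu(\mathbf{p})\, \hat a_{\mathbf{p}}^\dagger \hat a_{-\mathbf{p}}^\dagger + \frac12 N V^{-1} \sum_{\mathbf{p} \neq 0} \nu(\mathbf{p})\, \hat a_{\mathbf{p}} \hat a_{-\mathbf{p}} . \tag{5.7}$$
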


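We now introduce new operators $\hat b_{\mathbf{p}}^\dagger$ and $\hat b_{\mathbf{p}}$ which we define in the following
way:

$$\hat b_{\mathbf{p}} = \frac{\hat a_0^\dagger \hat a_{\mathbf{p}}}{(n_0)^{1/2}} , \qquad \hat b_{\mathbf{p}}^\dagger = \frac{\hat a_0 \hat a_{\mathbf{p}}^\dagger}{(n_0)^{1/2}} . \tag{5.8}$$

The operators $\hat b_{\mathbf{p}}$ and $\hat b_{\mathbf{p}}^\dagger$ defined in this way satisfy the same commutation
relations as the operators $\hat a_{\mathbf{p}}$ and $\hat a_{\mathbf{p}}^\dagger$, since we consider $\hat a_0$ and $\hat a_0^\dagger$ to be ordinary
numbers. Furthermore, as is easily seen,

$$\hat b_{\mathbf{p}}^\dagger \hat b_{\mathbf{p}} = \hat a_{\mathbf{p}}^\dagger \hat a_{\mathbf{p}} = n_{\mathbf{p}} , \tag{5.9}$$

because $\hat a_0^\dagger \hat a_0 = n_0$. By means of the operators $\hat b_{\mathbf{p}}$ and $\hat b_{\mathbf{p}}^\dagger$ the Hamiltonian (5.7)
can be written in the form

$$\hat H = \sum_{\mathbf{p}} \frac{p^2}{2m}\, \hat b_{\mathbf{p}}^\dagger \hat b_{\mathbf{p}} + \frac12 N^2 V^{-1}\, \nu(0) + \frac12 n_0 V^{-1} \sum_{\mathbf{p} \neq 0} \nu(\mathbf{p}) \left( \hat b_{\mathbf{p}}^\dagger \hat b_{-\mathbf{p}}^\dagger + \hat b_{\mathbf{p}} \hat b_{-\mathbf{p}} + 2 \hat b_{\mathbf{p}}^\dagger \hat b_{\mathbf{p}} \right) . \tag{5.10}$$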


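In order to bring the Hamiltonian (5.10) to diagonal form we carry out linear
transformations to new operators $\hat \xi_{\mathbf{p}}$ and $\hat \xi_{\mathbf{p}}^\dagger$ by the substitutions

$$\hat b_{\mathbf{p}} = u_{\mathbf{p}} \hat \xi_{\mathbf{p}} + v_{\mathbf{p}} \hat \xi_{-\mathbf{p}}^\dagger , \qquad \hat b_{\mathbf{p}}^\dagger = u_{\mathbf{p}} \hat \xi_{\mathbf{p}}^\dagger + v_{\mathbf{p}} \hat \xi_{-\mathbf{p}} , \tag{5.11}$$

where

$$u_{\mathbf{p}}^2 - v_{\mathbf{p}}^2 = 1 , \qquad u_{\mathbf{p}} = u_{-\mathbf{p}} , \qquad v_{\mathbf{p}} = v_{-\mathbf{p}} .$$

If $u_{\mathbf{p}}$ and $v_{\mathbf{p}}$ are real functions of the momentum $\mathbf{p}$, then the new Bose operators
$\hat \xi_{\mathbf{p}}$ and $\hat \xi_{\mathbf{p}}^\dagger$ satisfy the commutation relations (99.9) of Part V. Substituting
(5.11) into (5.10) and requiring that the coefficients of operators of the type
$\hat \xi_{\mathbf{p}}^\dagger \hat \xi_{-\mathbf{p}}^\dagger$ and $\hat \xi_{\mathbf{p}} \hat \xi_{-\mathbf{p}}$ reduce to zero, we find the functions $u_{\mathbf{p}}$ and $v_{\mathbf{p}}$. A simple but
somewhat lengthy calculation gives

$$u_{\mathbf{p}} = \frac{1}{(1 - A_{\mathbf{p}}^2)^{1/2}} , \qquad v_{\mathbf{p}} = \frac{A_{\mathbf{p}}}{(1 - A_{\mathbf{p}}^2)^{1/2}} ,$$

$$A_{\mathbf{p}} = \frac{V}{n_0\, \nu(\mathbf{p})} \Big\{ \varepsilon(\mathbf{p}) - \frac{p^2}{2m} - \frac{n_0}{V}\, \nu(\mathbf{p}) \Big\} ,$$

$$\varepsilon(\mathbf{p}) = \Big[ \frac{n_0\, p^2\, \nu(\mathbf{p})}{mV} + \frac{p^4}{4m^2} \Big]^{1/2} . \tag{5.12}$$

Then only diagonal terms are preserved in the operator $\hat H$, and it is brought to
the form

$$\hat H = E_0 + \sum_{\mathbf{p} \neq 0} \varepsilon(\mathbf{p})\, \hat \xi_{\mathbf{p}}^\dagger \hat \xi_{\mathbf{p}} . \tag{5.13}$$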
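The diagonalization can be verified numerically for a single pair of momenta $(\mathbf{p}, -\mathbf{p})$. The sketch below assumes the quadratic form of (5.10) with the abbreviations $T = p^2/2m$ and $g = n_0 \nu(\mathbf{p})/V$ (names introduced only for this example) and uses the fact that, on substituting (5.11), the coefficient of $\hat\xi_{\mathbf{p}}^\dagger \hat\xi_{-\mathbf{p}}^\dagger$ is $2 u v (T + g) + g (u^2 + v^2)$ while that of $\hat\xi_{\mathbf{p}}^\dagger \hat\xi_{\mathbf{p}}$ is $(T + g)(u^2 + v^2) + 2 g u v$.

```python
import math

def bogoliubov(T_kin, g):
    """Return (eps, u, v) from (5.12), with T_kin = p**2/(2m) and g = n0*nu(p)/V > 0."""
    eps = math.sqrt(T_kin**2 + 2.0 * T_kin * g)
    A = (eps - T_kin - g) / g
    norm = math.sqrt(1.0 - A * A)
    return eps, 1.0 / norm, A / norm

def off_diagonal_coefficient(T_kin, g):
    """Coefficient of the pair-creation term after the substitution; should vanish."""
    _, u, v = bogoliubov(T_kin, g)
    return 2.0 * u * v * (T_kin + g) + g * (u * u + v * v)

def diagonal_coefficient(T_kin, g):
    """Coefficient of the number operator after the substitution; should equal eps."""
    _, u, v = bogoliubov(T_kin, g)
    return (T_kin + g) * (u * u + v * v) + 2.0 * g * u * v
```

For any positive $T$ and $g$ the off-diagonal coefficient vanishes, the diagonal one reproduces $\varepsilon(\mathbf{p})$, and $u^2 - v^2 = 1$.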
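The eigenvalues of the Hamiltonian (5.13) are

$$E = E_0 + \sum_{\mathbf{p} \neq 0} \varepsilon(\mathbf{p})\, n'(\mathbf{p}) , \tag{5.14}$$

where

$$E_0 = \frac12 N^2 V^{-1}\, \nu(0) + \frac12 \sum_{\mathbf{p} \neq 0} \Big[ \varepsilon(\mathbf{p}) - \frac{p^2}{2m} - \frac{n_0}{V}\, \nu(\mathbf{p}) \Big] , \tag{5.15}$$

and the $n'(\mathbf{p})$ are integers.

One can also find the total momentum of a system of particles. It turns
out to be equal to

$$\mathbf{P} = \sum_{\mathbf{p}} \mathbf{p}\, \hat a_{\mathbf{p}}^\dagger \hat a_{\mathbf{p}} = \sum_{\mathbf{p}} \mathbf{p}\, \hat b_{\mathbf{p}}^\dagger \hat b_{\mathbf{p}} .$$

If in this equality we pass from the operators $\hat b_{\mathbf{p}}^\dagger$ and $\hat b_{\mathbf{p}}$ to the operators $\hat \xi_{\mathbf{p}}^\dagger$
and $\hat \xi_{\mathbf{p}}$ we obtain

$$\mathbf{P} = \sum_{\mathbf{p}} \mathbf{p}\, \hat \xi_{\mathbf{p}}^\dagger \hat \xi_{\mathbf{p}} = \sum_{\mathbf{p}} \mathbf{p}\, n'(\mathbf{p}) . \tag{5.16}$$

Formulae (5.14) and (5.16) have a simple corpuscular interpretation. We see
that the energy of the system is written in the form of the sum of two terms.
The first term $E_0$ represents the energy of the ground (lowest) state. The
second term can be interpreted as the energy of quasi-particles which represent
the collective excitations of the system; $n'(\mathbf{p})$ is the number of elementary
excitations in the state with momentum $\mathbf{p}$. The energy of each elementary
excitation is equal to $\varepsilon(\mathbf{p})$ (5.12).

For small momenta the energy of excitation can be written in the form

$$\varepsilon(\mathbf{p}) = \Big( \frac{\nu(0)}{mV_0} \Big)^{1/2} |\mathbf{p}| . \tag{5.17}$$

Here $V_0 = V/n_0 \approx V/N$ represents the volume per particle. It follows from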








formula (5.17) that the following inequality must be fulfilled

$$\nu(0) = \int U(q)\, d\mathbf{q} > 0 , \tag{5.18}$$

which means that repulsive forces are predominant. If, on the contrary,
$\nu(0) < 0$, the energy turns out to be imaginary, which corresponds to an unstable
state of the system.

For large momenta the energy is of the form

$$\varepsilon(\mathbf{p}) = \frac{p^2}{2m} + \frac{\nu(\mathbf{p})}{V_0} . \tag{5.19}$$
It should be noted that the existence of elementary excitations represents a 
collective effect of the entire system. Each excitation is associated with the 
state of the system as a whole, and not with the state of an individual particle. 
An ideal excitation gas, characterized by the Hamiltonian (5.13), obeys Bose
statistics since the operators $\hat \xi_{\mathbf{p}}^\dagger$ and $\hat \xi_{\mathbf{p}}$ satisfy the commutation relations
(99.9) of Part V.

It is easily shown that the system considered possesses the property of 
superfluidity. Suppose that the whole set of particles acquires an additional 
velocity v with respect to a certain system at rest, for example with respect to 
the walls of the tube or container containing the imperfect Bose gas. Then 
one can assume that all the results obtained will be valid for a system moving 
with velocity $\mathbf{v}$ with respect to a system at rest. If the energy in the moving
system is equal to $E'$, and that in the system at rest is equal to $E$, then the
following relation exists between them

$$E' = E + \tfrac12 N m v^2 + \mathbf{v} \cdot \mathbf{P} , \tag{5.20}$$

where $\mathbf{P}$ is the total momentum of the Bose gas in the system at rest.
Using expressions (5.14) and (5.16), we obtain

$$E' = E_0 + \tfrac12 N m v^2 + \sum_{\mathbf{p}} n'(\mathbf{p}) \left\{ \varepsilon(\mathbf{p}) + \mathbf{v} \cdot \mathbf{p} \right\} . \tag{5.21}$$

To decelerate the whole set of particles, there must arise excitations with
momenta directed against the velocity $\mathbf{v}$. The energy gain when one of these
excitations appears is equal to

$$\Delta\varepsilon = \varepsilon(\mathbf{p}) - |\mathbf{v}|\, |\mathbf{p}| . \tag{5.22}$$


If the energy gain of the system is $\Delta\varepsilon > 0$, the appearance of excitations is
energetically unfavourable. This means that the system will move with the
velocity $\mathbf{v}$ for an indefinitely long time without the appearance of such excitations.
There is no deceleration in the system, and it possesses the property of
superfluidity. Let us now formulate the condition for superfluidity. For the
quantity $\Delta\varepsilon$ to be positive it is necessary that the following inequality be fulfilled

$$\frac{\varepsilon(\mathbf{p})}{|\mathbf{p}|} > |\mathbf{v}|$$

for any $\mathbf{p}$. We denote the minimum value of the ratio $\varepsilon(\mathbf{p})/|\mathbf{p}|$ by $v^*$. Then
superfluidity occurs in motion with a velocity $v < v^*$. Consequently, superfluidity
is possible in the system if the following inequality is fulfilled:

$$|v^*| = \min\Big( \frac{\varepsilon(\mathbf{p})}{|\mathbf{p}|} \Big) > 0 . \tag{5.23}$$

It follows from the expression (5.12) for $\varepsilon(\mathbf{p})$ that

$$v^* = \Big( \frac{\varepsilon}{|\mathbf{p}|} \Big)_{p \to 0} = \Big( \frac{\nu(\mathbf{p})}{mV_0} + \frac{p^2}{4m^2} \Big)^{1/2}_{p \to 0} = \Big( \frac{\nu(0)}{mV_0} \Big)^{1/2} .$$

If $\nu(0) > 0$, then $v^*$ is real and positive. Thus in the case of the predominance
of repulsive forces in an imperfect Bose gas there exists a real and positive
value of $v^*$, and the Bose gas possesses the property of superfluidity.

In the case where there is no interaction between the particles (ideal Bose
gas), $U(q) = 0$, $\nu(\mathbf{p}) = 0$ and the energy of the excitations is given by the formula

$$\varepsilon(\mathbf{p}) = p^2/2m . \tag{5.24}$$

Then $v^* = 0$, so that an ideal Bose gas possesses no superfluidity.
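The contrast between the two spectra can be made concrete by minimizing $\varepsilon(\mathbf{p})/|\mathbf{p}|$ on a grid of momenta. The sketch below is an illustration under simplifying assumptions: $\nu(\mathbf{p})$ is replaced by the constant $\nu(0)$, the combination $g = \nu(0)/V_0$ and the mass are expressed in arbitrary units with $m = 1$, and the grid limits are chosen arbitrarily.

```python
import math

def excitation_energy(p, g, m=1.0):
    """Spectrum (5.12) with a constant nu(p): sqrt(p**2 * g / m + p**4 / (4 m**2))."""
    return math.sqrt(p * p * g / m + p**4 / (4.0 * m * m))

def critical_velocity(g, m=1.0, p_min=1e-4, p_max=10.0, points=2000):
    """Minimum of eps(p)/p over a uniform grid of momenta between p_min and p_max."""
    step = (p_max - p_min) / (points - 1)
    momenta = [p_min + i * step for i in range(points)]
    return min(excitation_energy(p, g, m) / p for p in momenta)

v_star_interacting = critical_velocity(1.0)  # close to sqrt(g/m) = 1
v_star_ideal = critical_velocity(0.0)        # tends to zero with p_min
```

With repulsion the minimum is reached as $p \to 0$ and equals $(g/m)^{1/2}$; for the ideal gas the ratio is $p/2m$ and can be made as small as desired, so that no finite critical velocity exists.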

Thus we see that the property of superfluidity manifests itself in an imper- 
fect Bose gas and is absent in an ideal Bose gas. Superfluidity is not associated 
with the specific properties of the system of bosons. For its existence a partic- 
ular form of the energy spectrum of the collective excitations of the system is 
required. Namely, according to (5.23), it is necessary that the ratio of the 
minimum excitation energy of the system as a whole to the momentum of 
this excitation has a finite value. 

The spectrum of excitations of a gas of non-interacting bosons has the 
form (5.24) for small excitations and does not satisfy (5.23). 

The spectrum of excitations of a gas of interacting bosons satisfies condi- 
tion (5.23) only in the presence of repulsive forces between the particles. At- 
tractive forces do not lead to superfluidity for a system of interacting bosons. 
However, it should be stressed that in obtaining this result use was made of a 
property of bosons: we have assumed that the bulk of interacting particles is 
in the state with zero momentum, i.e. that it forms a condensate in momen- 
tum space. This is a necessary condition for the appearance of superfluidity in 
a Bose gas; however, this condition is not sufficient since the condensate also 







forms in an ideal Bose gas but there is no superfluidity. The difference be- 
tween the condensates of ideal and imperfect Bose gases is seen from the fol- 
lowing reasoning. Let an ideal Bose gas move as a whole with a certain veloci- 
ty v with respect to the walls of the container. If one of the particles is 
stopped as a result of an interaction with a wall of the container, the remain- 
ing gas particles will continue to move with a lower kinetic energy. Repetition 
of this process will in the end decelerate the gas. 

The situation is different in the case of a gas whose particles repel one
another. Stopping individual particles is in this case impossible.
Interaction with the wall would excite the system as a whole. For a velocity
of motion v below the critical value fixed by condition (5.23), this turns out
not to be possible.
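The criterion just described can be illustrated numerically. The following is
a minimal sketch and is not taken from the text: it assumes the free-particle
spectrum p^2/2m for the ideal gas and, for the gas with repulsion, a
Bogoliubov-type spectrum with sound velocity u. The function names and all
numerical values are hypothetical. It evaluates the smallest ratio of
excitation energy to momentum on a grid of momenta.

```python
import math

m = 1.0  # particle mass, arbitrary units (assumed value)
u = 1.0  # sound velocity of the interacting gas (assumed value)


def critical_velocity(energy, momenta):
    """Smallest value of energy(p) / p over the given momenta (all p > 0)."""
    return min(energy(p) / p for p in momenta)


def free_particle(p):
    # Ideal gas: an excitation is a single particle with momentum p.
    return p * p / (2.0 * m)


def phonon_like(p):
    # Assumed Bogoliubov-type spectrum: sound-like (linear) at small p.
    return math.sqrt((u * p) ** 2 + (p * p / (2.0 * m)) ** 2)


# Momentum grid reaching down to very small p, where the minimum is decided.
grid = [10.0 ** (-k) for k in range(0, 7)] + [2.0, 5.0, 10.0]

v_ideal = critical_velocity(free_particle, grid)
v_interacting = critical_velocity(phonon_like, grid)
print("ideal gas:       ", v_ideal)
print("interacting gas: ", v_interacting)
```

On this grid the ratio for the free-particle spectrum keeps decreasing as the
grid is extended towards p = 0, whereas for the sound-like spectrum it stays
close to u, so that only in the second case is there a finite velocity below
which excitations cannot be created.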

It appears natural to apply this conclusion, obtained for an imperfect Bose 
gas, to a Bose fluid. Although the formal assumption of the pair character of 
the interactions is not fulfilled in a fluid, the qualitative discussion of the 
spectrum of small excitations of the collective motion is also applicable to a 
Bose fluid*. 


* See K. Brueckner and K. Sawada, Phys. Rev. 106 (1957) 1117, 1128. 





Physical Kinetics 


§6. Phenomenological transport equations and the general equations of 
kinetics 


The study of the behaviour of macroscopic systems which are not in a 
state of total thermodynamic equilibrium is as important in modern physics 
as consideration of the equilibrium states of macroscopic systems. Processes 
occurring in systems which are not in a state of total thermodynamic equili- 
brium are irreversible. The study of irreversible processes is the purpose and 
content of physical kinetics. 

It is clear that the properties of non-equilibrium macroscopic systems, 
which are subjected to external actions and which develop in time, are im- 
measurably more complex than those of equilibrium systems. This can be 
seen from the infinite diversity of external actions which can violate the equil- 
ibrium in a system. 

Hence it is not surprising that at present, physical kinetics has not reached 
in its development that degree of completeness and universality which is 
characteristic of statistical physics. 

We need, first of all, to formulate certain general laws concerning changes 
of state of macroscopic systems which are not in an equilibrium state. 

Two approaches to the solution of the problem are possible; a quasi- 
macroscopic one and a kinetic one. 







34 PHYSICAL KINETICS Ch. 2 


In the quasi-macroscopic treatment the state of the system is defined by 
certain macroscopic parameters; temperature, concentration and so on. One 
seeks the laws of change of these parameters in space and time, when the sys- 
tem is not in an equilibrium state but is subjected to the action of external 
forces. In this case it is assumed that changes in the states of a macroscopic 
system can be characterized by the probability of transition from one macro- 
scopic state into another. In other words, it is assumed that the concept of 
probable change of the macroscopic parameters characterizing a system can 
be introduced. 

In the kinetic approach to the solution of the problems of physical kin- 
etics it is assumed that the system can be described by a certain distribution 
function depending, in general, on the generalized coordinates and momenta 
of the particles. Assuming the probabilities of transition from one microstate 
to another (i.e. of the transition of a figurative point from one phase-space 
cell to another) to be defined, one works out the law of change of the distri- 
bution function, the so-called kinetic equation. The solution of this allows 
one to find the distribution function of the non-equilibrium system. It is 
clear that the kinetic approach is more detailed. In particular, it makes it pos- 
sible to calculate quantities such as the kinetic coefficients; thermal conduc- 
tivity, diffusion coefficient and so on. 

Deferring the discussion of the kinetic approach to the study of non- 
equilibrium systems, we shall first discuss the quasi-macroscopic method. It 
is clear that it has a limited region of application: if a system is not in equili- 
brium, its state cannot, in general, be characterized by means of macroscopic 
parameters; for example, one cannot speak of the temperature or pressure of 
a body which is not in an equilibrium state. These concepts, as we have al- 
ready seen, have a definite meaning only for a system placed in a reservoir. 
Nevertheless, in a number of cases an important property of macroscopic sys- 
tems which we have already discussed, in §24 and §25 of Part III, allows us 
to make use of macroscopic quantities for describing the states of non- 
equilibrium systems. 

We have seen that in a system which is not in an equilibrium state and is 
left to itself the phenomenon of relaxation occurs. After the lapse of the 
relaxation time the system goes over into an equilibrium state. In a real 
macroscopic system, the approach to an equilibrium state is very frequently 
accompanied by a number of processes. We shall illustrate this by a simple
example.

Let two gases whose particles are of substantially different masses, for
example gases of ions and electrons, exist in a container. The following
sequence of processes occurs in the transition of the system into an
equilibrium state: (1) the gases in mixing fill the container uniformly;
(2) statistical equilibrium is established in each of the gases: because the
mass difference hinders the transfer of momentum in collisions between the
particles of the different gases, only collisions between identical particles
are effective, as a result of which, after the lapse of relaxation times
$\tau_1$ and $\tau_2$ respectively, a Maxwell distribution is established in
each of the gases separately; (3) as a result of momentum transfer in
collisions between particles of different mass, after the lapse of a
considerably longer relaxation time $\tau_3$, a common Maxwell distribution
will be established in the mixture of gases; (4) capture of electrons by
ions, accompanied by the emission of the corresponding energy, will, under
certain conditions, lead to the transition of the gas into a neutral
equilibrium state (time $\tau_4$). Thus the approach of the gas to total
equilibrium proceeds in several stages having relaxation times which differ
substantially from each other.

The properties mentioned are possessed by a very large class of physical 
systems, in particular by any systems which consist of small parts. Since the 
relaxation time increases rapidly with increasing size of the system, equili- 
brium is established more rapidly inside the parts than between the parts (see 
§25 of Part II). 

In practice, the study of systems in which an equilibrium with respect to 
fast processes is established during the relaxation time $\tau_f$ but in which an
equilibrium for slowly varying parameters does not have time to be estab- 
lished, is often of great interest. Such systems are said to be in a state of in- 
complete equilibrium, or are quasi-equilibrium systems. 

The mixture considered above is in incomplete equilibrium. If the time $t$
which has elapsed since the moment of mixing satisfies the inequality

$$\tau_1, \tau_2 \ll t \ll \tau_3,$$


each of the parts in equilibrium, the electron gas and the ion gas, can be 
characterized by ordinary macroscopic parameters. However, as distinct from 
equilibrium systems, these parameters will vary slowly in time. 

If the same gas mixture is not isolated, but is in an external force field, for 
example in an electric field, then its state will change under the action of the 
field. For a field constant in time, and for $t \gg \tau_1, \tau_3$, the two systems can also
be assumed to be independent of time, but to be non-equilibrium systems. In- 
deed, a systematic motion (flux) of charges in the direction of the field arises 
in this system. 

A time-independent non-equilibrium state of a system is called a stationary 
state. In such a state, as well as in a state of incomplete equilibrium, the sys- 
tem can be characterized by the values of macroscopic parameters. 







It follows from what has been said that a quasi-macroscopic consideration 
of non-equilibrium systems is possible in the case where the processes occur- 
ring in them are slow. 

A disadvantage of the reasoning given is the relative character of the no- 
tions of fast and slow processes. For actual systems, one is often able to carry 
out this division sufficiently clearly. It is obvious that a necessary condition 
for such a division is the requirement that $\tau_{\text{slow}} \gg \tau_{\text{fast}}$.

If the system is in a state of incomplete equilibrium, the thermodynamic 
concepts such as temperature, thermodynamic potentials and so on are appli- 
cable to it or to its macroscopic parts. 

For example, one can speak of a system with pressure and temperature 
varying from point to point or varying in time. 

In a macroscopic system which is not in equilibrium, displacement of its 
parts and mass transport often occur. However, in this case these parts can be 
characterized by thermodynamic parameters; density, pressure and tempera- 
ture. 

In what follows we shall formulate empirical laws of transport in such
non-equilibrium systems, and then we shall go on to the derivation of
theoretical relations characterizing the behaviour of non-equilibrium systems.


§7. The mass conservation law and diffusion flow 


As we have already pointed out in §6, the violation of equilibrium in a 
system is often associated with a macroscopic motion of its parts. Let us 
divide our macroscopic system into small but still macroscopic elements, and 
let us assume that these elements are in a state of local equilibrium. This 
means that each element can be assigned ordinary thermodynamic character- 
istics; definite temperature, mean density and thermodynamic potentials. 

We shall see below that such local equilibrium for small elements is estab- 
lished extremely rapidly in the simplest system (the ideal gas). A departure 
from the equilibrium state of the system as a whole, in particular mechanical 
motion of its parts, does not violate the local equilibrium in small elements. 
By making the assumption of local equilibrium in small elements and of non- 
equilibrium of the system as a whole, i.e. considering the system to be in in- 
complete equilibrium, one can formulate general laws of change of state of 
such a system. It is then necessary to take into account its internal motion. 

We shall abstract from the molecular structure of the system and assume it to
be a continuous medium. This means that the velocity of displacement
$\mathbf{v}(\mathbf{r}, t)$, which is a continuous function of coordinates
$\mathbf{r}$ and time $t$, is considered to be defined at each point of the
system.




A continuous medium can be a liquid, a gas or a solid body, flowing under 
the action of applied forces. However, one deals most often with incompress- 
ible fluids and gases. Accordingly, the density, pressure, temperature and other 
thermodynamic characteristics of the medium will be assumed to be continu- 
ous functions of coordinates and time. In order to avoid misunderstanding, 
we stress that the dependence of thermodynamic quantities on coordinates
and time is to be understood as a variation of local equilibrium
characteristics. For example, the energy $E$ of a certain small (but
macroscopic) element of a system changes when it is displaced from a position
with pressure $p_1$ and temperature $T_1$ to another position with pressure
$p_2$ and temperature $T_2$. However, at each position the relation between
$E$, $p$ and $T$ has the equilibrium thermodynamic character.

The continuous medium approximation corresponds to a thermodynamic 
description of equilibrium systems. In this approximation we shall formulate 
general laws determining the motion of the medium, and mass transport and 
energy transfer in it. In this case, naturally, we shall have to make use of cer- 
tain empirical relations. The latter will not have the general character which is 
possessed by the empirical laws (the first and second) of thermodynamics, be- 
cause, as we have already stressed, non-equilibrium systems display a great 
diversity of properties depending on their actual structure and on the charac- 
ter of the processes. 

One of the most important problems of macroscopic kinetics is the deriva- 
tion of the transport equations for a continuous medium and finding the con- 
stants, called the kinetic coefficients, involved in these equations. 

Such general transport equations are those of mass, momentum, energy
and entropy transfer. Let us first formulate the mass transport law. In a
system of $n$ components, the law of conservation of mass of each of the
components holds. If the mean macroscopic velocity of the $\alpha$th component
is denoted by $\mathbf{v}_\alpha$ and its mass per unit volume by
$\rho_\alpha$, then the mass conservation law can be written in the form

$$\frac{\partial \rho_\alpha}{\partial t} + \nabla\cdot(\rho_\alpha \mathbf{v}_\alpha) = 0. \tag{7.1}$$

Here it is assumed that no chemical reaction takes place between the
components. In the more general case, one should write on the right-hand side
of (7.1) the rate of appearance or disappearance of the particles of the
$\alpha$th component per unit volume. Eq. (7.1) is just the mass transport
equation.

The motion of a system is more conveniently characterized by the velocity
of the centre of mass rather than by the velocities of the components. By
definition, the velocity of the centre of mass $\mathbf{v}$ is equal to

$$\mathbf{v} = \frac{\sum_\alpha \rho_\alpha \mathbf{v}_\alpha}{\sum_\alpha \rho_\alpha} = \frac{1}{\rho}\sum_\alpha \rho_\alpha \mathbf{v}_\alpha, \tag{7.2}$$

where $\rho = \sum_\alpha \rho_\alpha$ is the total density in the system.

Summing (7.1) over all the components, we easily find the relation between
$\rho$ and $\mathbf{v}$:

$$\frac{\partial \rho}{\partial t} + \nabla\cdot(\rho\mathbf{v}) = 0. \tag{7.3}$$

This is the so-called hydrodynamic equation of continuity, expressing the
law of conservation of mass of the entire system. If the total density of the
medium is constant, then $d\rho/dt = 0$ and instead of (7.3) one can write

$$\nabla\cdot\mathbf{v} = 0. \tag{7.4}$$


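Media with constant density are said to be incompressible. In hydrodynamics
it turns out that if the velocity of a continuous medium (a liquid or a gas) is
small in comparison with the velocity of sound, such a medium can be
considered incompressible.

We rewrite (7.1) in the form

$$\frac{\partial \rho_\alpha}{\partial t} + (\mathbf{v}\cdot\nabla)\rho_\alpha = -\rho_\alpha(\nabla\cdot\mathbf{v}) - \nabla\cdot\bigl(\rho_\alpha(\mathbf{v}_\alpha - \mathbf{v})\bigr)$$

or

$$\frac{d\rho_\alpha}{dt} = -\rho_\alpha(\nabla\cdot\mathbf{v}) - \nabla\cdot\mathbf{j}_\alpha, \tag{7.5}$$

where $d/dt = \partial/\partial t + (\mathbf{v}\cdot\nabla)$ and
$\mathbf{j}_\alpha$ denotes the so-called diffusion flux of the $\alpha$th
component,

$$\mathbf{j}_\alpha = \rho_\alpha(\mathbf{v}_\alpha - \mathbf{v}). \tag{7.6}$$

Formula (7.5) expresses the law of conservation of mass of the $\alpha$th
component. The vector $\mathbf{j}_\alpha$ shows to what degree the motion of
particles of the $\alpha$th component differs from the mean velocity of motion
of the system as a whole. The law of conservation of mass of the $\alpha$th
component is usually rewritten by introducing the mass concentration

$$c_\alpha = \rho_\alpha/\rho. \tag{7.7}$$

Then instead of (7.5), taking into account (7.6) and the continuity equation
(7.3), we obtain

$$\rho\frac{dc_\alpha}{dt} = -(\nabla\cdot\mathbf{j}_\alpha). \tag{7.8}$$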




It is obvious that in a system of $n$ components, $n - 1$ concentrations and
$n - 1$ diffusion fluxes are independent, since

$$\sum_\alpha c_\alpha = 1, \qquad \sum_\alpha \mathbf{j}_\alpha = 0. \tag{7.9}$$


Of course, in a one-component liquid there is no diffusion flow. In this case, 
the law of conservation of mass is expressed by formula (7.3). 
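The definitions of the centre-of-mass velocity and of the diffusion fluxes can
be checked on a small numerical example. The sketch below is illustrative
only: the helper names and the densities and velocities of the two components
are hypothetical values chosen for the check. It forms the mass-weighted mean
velocity, the fluxes $\rho_\alpha(\mathbf{v}_\alpha - \mathbf{v})$ and the
mass concentrations, and verifies that the fluxes sum to zero and the
concentrations to unity.

```python
def centre_of_mass_velocity(rho, v):
    """Mass-weighted mean velocity of the components (three Cartesian parts)."""
    total = sum(rho)
    return [sum(r * vel[i] for r, vel in zip(rho, v)) / total for i in range(3)]


def diffusion_fluxes(rho, v):
    """Diffusion flux rho_a * (v_a - V) for every component a."""
    V = centre_of_mass_velocity(rho, v)
    return [[r * (vel[i] - V[i]) for i in range(3)] for r, vel in zip(rho, v)]


# Hypothetical two-component mixture: partial densities and mean velocities.
rho = [1.2, 0.3]
v = [[1.0, 0.0, 0.0], [-2.0, 0.5, 0.0]]

V = centre_of_mass_velocity(rho, v)
j = diffusion_fluxes(rho, v)
flux_sum = [sum(j_a[i] for j_a in j) for i in range(3)]
conc = [r / sum(rho) for r in rho]

print("centre-of-mass velocity:", V)
print("diffusion fluxes:", j)
print("sum of fluxes:", flux_sum)
print("concentrations:", conc)
```

Since the fluxes are measured relative to the mass-weighted velocity, their sum
vanishes whatever component velocities are supplied.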

For simplicity of notation, we shall consider a two-component system, in 
which one concentration and one diffusion flux are independent. 

In the absence of external fields, equilibrium conditions are the constancy 
of the chemical potential and of the temperature in the system (see Part III). 
Diffusion flow in equilibrium is equal to zero. 

We now consider a system in a non-equilibrium state. For sufficiently
small departures of the system from the equilibrium state, it is natural to
assume that the diffusion flux which arises is proportional to the gradient of
the chemical potential $\mu$:

$$\mathbf{j} = -\gamma\nabla\mu. \tag{7.10}$$

We assume that $\gamma > 0$. Here the minus sign shows that the flow which
arises is directed from a higher towards a lower chemical potential. This
corresponds to the requirement of a minimum chemical potential in the
equilibrium state. Formula (7.10) represents, in essence, the first term of
the expansion in powers of the quantity $\nabla\mu$, and is meaningless for
large departures from the equilibrium state.

In this section we shall confine ourselves to the case of an isothermal
system. For an isothermal system one can write

$$\mathbf{j}^{(M)} = -\gamma\left(\frac{\partial\mu}{\partial c}\right)_{p,T}\nabla c - \gamma\left(\frac{\partial\mu}{\partial p}\right)_{c,T}\nabla p = -\rho D\left(\nabla c + \frac{k_p}{p}\nabla p\right), \tag{7.11}$$

where the superscript $(M)$ marks the mass flux (7.10) and two new coefficients
are introduced: the molecular diffusion coefficient

$$D = \frac{\gamma}{\rho}\left(\frac{\partial\mu}{\partial c}\right)_{p,T} \tag{7.12}$$

and the barodiffusion coefficient

$$k_p = \frac{\gamma p\,(\partial\mu/\partial p)_{c,T}}{\rho D}. \tag{7.13}$$










As a rule, the barodiffusion coefficient is very small, and the second term of
(7.11) can be neglected. Then the diffusion flux assumes the form

$$\mathbf{j}^{(M)}_\alpha = -\rho D_\alpha \nabla c_\alpha. \tag{7.14}$$

Instead of the mass concentration $c_\alpha$, use is often made of the number
of particles per unit volume, and the diffusion flux is related to the number
of particles rather than to the mass. In this case

$$\mathbf{j}_\alpha = -D_\alpha \nabla c_\alpha. \tag{7.15}$$


Formulae (7.11), (7.14) and (7.15) represent the well-known empirical law of 
diffusion (Fick’s law). 

We shall see below that for non-isothermal systems the law of diffusion 
must be generalized somewhat. In what follows we shall see also that for the 
case of an ideal gas the diffusion law can be derived theoretically from the 
general laws of physical kinetics. 

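Taking into account (7.11), the mass conservation law (7.8) is written, for
constant $\rho$, $D_\alpha$ and $k_p$, in the form

$$\frac{dc_\alpha}{dt} = D_\alpha\left[\nabla^2 c_\alpha + k_p\nabla\cdot\left(\frac{\nabla p}{p}\right)\right] \tag{7.16}$$

or, disregarding barodiffusion,

$$\frac{\partial c_\alpha}{\partial t} + (\mathbf{v}\cdot\nabla c_\alpha) = D_\alpha \nabla^2 c_\alpha. \tag{7.17}$$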


The last expression is called the equation of convective diffusion. It des- 
cribes convective mass transport in a moving medium, as well as molecular 
diffusion. 
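How the equation of convective diffusion acts can be seen in the simplest case
of a medium at rest in one dimension, where it reduces to
$\partial c/\partial t = D\,\partial^2 c/\partial x^2$. The following is a
rough numerical sketch under assumptions that are not part of the text: an
explicit finite-difference scheme, impermeable ends of the interval, and
arbitrary values of D, of the step sizes and of the initial profile. The name
`diffuse` is hypothetical.

```python
def diffuse(c, D, dx, dt, steps):
    """Explicit steps for dc/dt = D d2c/dx2 with zero flux through both ends."""
    r = D * dt / dx ** 2
    if r > 0.5:
        raise ValueError("explicit scheme unstable: need D*dt/dx**2 <= 0.5")
    c = list(c)
    n = len(c)
    for _ in range(steps):
        # Fick's law between neighbouring cells: flux[i] flows from i to i+1.
        flux = [-D * (c[i + 1] - c[i]) / dx for i in range(n - 1)]
        new = c[:]
        for i in range(n):
            j_in = flux[i - 1] if i > 0 else 0.0       # closed left end
            j_out = flux[i] if i < n - 1 else 0.0      # closed right end
            new[i] = c[i] + dt * (j_in - j_out) / dx   # discrete conservation
        c = new
    return c


# Hypothetical initial state: the component fills the left half only.
c0 = [1.0] * 20 + [0.0] * 20
c_final = diffuse(c0, D=1.0, dx=1.0, dt=0.25, steps=200)

print("total before:", sum(c0), " total after:", sum(c_final))
print("spread before:", max(c0) - min(c0),
      " spread after:", max(c_final) - min(c_final))
```

Because every update is written as a difference of fluxes, the total amount of
the component is conserved step by step, while the concentration differences
are gradually smoothed out.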


§8. The momentum conservation law and the equations of motion for a con- 
tinuous medium 


In the preceding section we have considered the motion of a continuous 
medium to be defined. We now consider how to find the equations of motion 
of such a medium under the action of external forces applied to the system. 

For this we write the equations of motion of a small (but finite) volume
element $dV$ of the medium in the form

$$\frac{d}{dt}\int \rho\mathbf{v}\,dV = \int \mathbf{F}\,dV + \oint \mathbf{F}^{(n)}\,dS, \tag{8.1}$$

where $\mathbf{F}$ is the force per unit volume, and $\mathbf{F}^{(n)}$ is the
surface force per unit area with which the surrounding medium acts upon a
given volume. The vector $\mathbf{F}^{(n)}$ is called the stress. To define
the stress, it is necessary to indicate the direction of the vector
$\mathbf{n}$ normal to the surface.

We introduce an elementary parallelepiped bounded by the planes with normals
$\mathbf{n}_1 = \mathbf{i}$, $\mathbf{n}_2 = \mathbf{j}$,
$\mathbf{n}_3 = \mathbf{k}$. The vector $\mathbf{F}^{(i)}$ has, obviously, the
components $F^{(i)}_x$, $F^{(i)}_y$, $F^{(i)}_z$, each of which represents a
component of the force acting upon an area of 1 cm² with normal $\mathbf{i}$.
The components of the vectors $\mathbf{F}^{(j)}$ and $\mathbf{F}^{(k)}$ have
analogous meanings.

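We can introduce the stress tensor $\sigma_{ik}$, defining it by

$$F^{(n)}_i = \sigma_{ik} n_k. \tag{8.2}$$

It is obvious that the components of the tensor $\sigma_{ik}$ are the same as
those of the vectors $\mathbf{F}^{(i)}$, $\mathbf{F}^{(j)}$ and
$\mathbf{F}^{(k)}$. We transform the surface integral into a volume integral
by the Gauss–Ostrogradsky formula

$$\oint F^{(n)}_i\,dS = \oint \sigma_{ik} n_k\,dS = \int \frac{\partial\sigma_{ik}}{\partial x_k}\,dV.$$

We then have finally*

$$\frac{d}{dt}\int \rho v_i\,dV = \int \rho\frac{dv_i}{dt}\,dV = \int F_i\,dV + \int \frac{\partial\sigma_{ik}}{\partial x_k}\,dV. \tag{8.3}$$

* It is evident that $\int v_i\,(d\rho/dt)\,dV = 0$.

Since the volume $V$ is arbitrary, it follows from (8.3) that

$$\rho\frac{dv_i}{dt} = F_i + \frac{\partial\sigma_{ik}}{\partial x_k} \tag{8.4}$$

or, if the external volume forces are conservative, then

$$\rho\frac{dv_i}{dt} = \rho\left(\frac{\partial v_i}{\partial t} + v_k\frac{\partial v_i}{\partial x_k}\right) = -\frac{\partial U}{\partial x_i} + \frac{\partial\sigma_{ik}}{\partial x_k}, \tag{8.5}$$

where $U$ is the potential energy per unit volume. Eqs. (8.4) and (8.5)
express the law of motion of a continuous medium.

Eq. (8.5) can be rewritten in another form by slightly transforming its
left-hand side. Namely, making use of the equation of continuity, we can write

$$\frac{\partial(\rho v_i)}{\partial t} = \rho\frac{\partial v_i}{\partial t} + v_i\frac{\partial\rho}{\partial t} = \rho\frac{\partial v_i}{\partial t} - v_i\frac{\partial(\rho v_k)}{\partial x_k}.$$

Hence, substituting for $\rho\,\partial v_i/\partial t$ from (8.5), we have

$$\frac{\partial}{\partial t}\rho v_i = -\frac{\partial}{\partial x_k}(\rho v_i v_k - \sigma_{ik}) - \frac{\partial U}{\partial x_i}. \tag{8.6}$$

The left-hand side of (8.6) represents the rate of change of momentum per
unit volume. The first term on the right-hand side is the divergence of the
momentum flow density $j_{ik} = \rho v_i v_k - \sigma_{ik}$. Thus (8.6)
expresses the momentum conservation law. The stress tensor $\sigma_{ik}$ can
be related to the velocities on the basis of experimental data.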

For the equilibrium state, when the velocity of motion of the medium
reduces to zero, one can write the condition of thermodynamic mechanical
equilibrium:

$$F_i = \frac{\partial p}{\partial x_i}. \tag{8.7}$$

Comparing (8.4), taken at zero velocity, with (8.7), we arrive at the
conclusion that for a liquid at rest

$$\sigma_{ik} = -p\,\delta_{ik}. \tag{8.8}$$


As shown by experiment, irreversible processes arise in moving continuous 
media. It is impossible to write an expression for $\sigma_{ik}$ valid for all moving
media, liquids, gases and solid bodies, for any mode of flow. 

We shall confine ourselves to the most fully investigated (although not in 
practice the most frequent) case of the so-called Newtonian fluids. For New- 
tonian fluids the stress tensor is a linear function of the velocity gradient. 

When there is laminar flow between two rigid walls, one of which is at
rest and the other moving with velocity $v$, the surface of the solid body is
acted upon by the stress $\sigma_{xy} = \eta v/L$, where $L$ is the distance
between the walls. It is easily seen that under these conditions
$\sigma_{xy} = \text{const}$. Consequently, the same stress acts upon each cm²
of an imaginary plane in the fluid. For an arbitrary law of motion, one can
write a general expression for $\sigma_{ik}$, proceeding from the following
two requirements:

(1) $\sigma_{ik}$ is a linear function of the derivatives
$\partial v_i/\partial x_k$ or $\partial v_k/\partial x_i$ ($i, k = 1, 2, 3$).

(2) For uniform rotation of the fluid as a whole, $\sigma_{ik}$ reduces to its
static expression (8.8), since no relative displacement of fluid layers occurs.

The combinations of derivatives which satisfy this condition are

$$\frac{\partial v_i}{\partial x_k} + \frac{\partial v_k}{\partial x_i}, \qquad \delta_{ik}\frac{\partial v_l}{\partial x_l}. \tag{8.9}$$

It can be seen by a direct check that for
$\mathbf{v} = \boldsymbol{\omega}\times\mathbf{r}$, where $\boldsymbol{\omega}$
is the angular velocity, these combinations of derivatives reduce to zero.
Hence in the most general case one can write

$$\sigma_{ik} = -p\,\delta_{ik} + \alpha\left(\frac{\partial v_i}{\partial x_k} + \frac{\partial v_k}{\partial x_i}\right) + \beta\,\delta_{ik}\frac{\partial v_l}{\partial x_l}.$$

This expression is usually written in the identical form

$$\sigma_{ik} = -p\,\delta_{ik} + \eta\left(\frac{\partial v_i}{\partial x_k} + \frac{\partial v_k}{\partial x_i} - \frac{2}{3}\delta_{ik}\frac{\partial v_l}{\partial x_l}\right) + \zeta\,\delta_{ik}\frac{\partial v_l}{\partial x_l} = -p\,\delta_{ik} + \sigma'_{ik},$$

where

$$\eta = \alpha \tag{8.10}$$

and

$$\zeta - \tfrac{2}{3}\eta = \beta. \tag{8.11}$$

The kinetic coefficients $\eta$ and $\zeta$ are called the first (or shear)
and second (or bulk) coefficients of viscosity.
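The direct check for uniform rotation mentioned above can be carried out
numerically. The sketch below is illustrative only; it assumes the usual
Newtonian form of the viscous part of the stress tensor with a shear and a
bulk coefficient, and the function names and numerical values are
hypothetical. For $\mathbf{v} = \boldsymbol{\omega}\times\mathbf{r}$ the
velocity gradient $\partial v_i/\partial x_k$ is a constant antisymmetric
matrix, so the viscous stress must vanish; for a simple shear it does not.

```python
def velocity_gradient_rigid_rotation(omega):
    """Matrix g[i][k] = dv_i/dx_k for v = omega x r (the same at every r)."""
    wx, wy, wz = omega
    return [[0.0, -wz, wy],
            [wz, 0.0, -wx],
            [-wy, wx, 0.0]]


def viscous_stress(g, eta, zeta):
    """Viscous stress: eta*(g_ik + g_ki - (2/3) d_ik div v) + zeta*d_ik div v."""
    div = g[0][0] + g[1][1] + g[2][2]
    return [[eta * (g[i][k] + g[k][i] - (2.0 / 3.0) * (i == k) * div)
             + zeta * (i == k) * div
             for k in range(3)] for i in range(3)]


eta, zeta = 1.0e-3, 2.0e-3  # hypothetical viscosity coefficients

# Rigid rotation with an arbitrary angular velocity: no viscous stress.
g_rot = velocity_gradient_rigid_rotation((0.3, -1.2, 2.0))
stress_rot = viscous_stress(g_rot, eta, zeta)

# Simple shear v_x = s*y with s = 5: the only stress components are xy and yx.
g_shear = [[0.0, 5.0, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
stress_shear = viscous_stress(g_shear, eta, zeta)

print("rotation:", stress_rot)
print("shear xy, yx:", stress_shear[0][1], stress_shear[1][0])
```

The rotation case returns zero for every component regardless of the values of
the two coefficients, which is the content of requirement (2).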

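For an incompressible fluid $\partial v_l/\partial x_l = 0$, and the stress
tensor assumes the form

$$\sigma_{ik} = -p\,\delta_{ik} + \eta\left(\frac{\partial v_i}{\partial x_k} + \frac{\partial v_k}{\partial x_i}\right). \tag{8.12}$$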


Newton’s law of viscosity holds for gases and certain liquids (above all for 
water). 
The viscosity is a function of temperature:

$$\eta \propto \begin{cases} T^{1/2} & \text{for gases,} \\ e^{\text{const}/T} & \text{for liquids.} \end{cases} \tag{8.13}$$

It will be shown below that the expression for the stress tensor (8.12) as well
as for the temperature dependence of viscosity in the case of gases can be
obtained theoretically.


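Making use of the expression for $\sigma_{ik}$, one can write the equation of
motion of an incompressible fluid in the form

$$\rho\left(\frac{\partial v_i}{\partial t} + v_k\frac{\partial v_i}{\partial x_k}\right) = F_i - \frac{\partial p}{\partial x_i} + \eta\frac{\partial}{\partial x_k}\left(\frac{\partial v_i}{\partial x_k} + \frac{\partial v_k}{\partial x_i}\right)$$

or

$$\rho\left(\frac{\partial v_i}{\partial t} + v_k\frac{\partial v_i}{\partial x_k}\right) = F_i - \frac{\partial p}{\partial x_i} + \eta\frac{\partial^2 v_i}{\partial x_k^2}. \tag{8.14}$$

This last equation is called the Navier–Stokes equation.
The Navier–Stokes equation can be rewritten in the form

$$\frac{\partial}{\partial t}\rho v_i = -\frac{\partial}{\partial x_k}\left[\rho v_i v_k + p\,\delta_{ik} - \eta\left(\frac{\partial v_i}{\partial x_k} + \frac{\partial v_k}{\partial x_i}\right)\right] + F_i$$

or

$$\frac{\partial}{\partial t}(\rho v_i) = -\frac{\partial \Pi_{ik}}{\partial x_k} + F_i,$$

where

$$\Pi_{ik} = \rho v_i v_k + p\,\delta_{ik} - \eta\left(\frac{\partial v_i}{\partial x_k} + \frac{\partial v_k}{\partial x_i}\right) = \rho v_i v_k - \sigma_{ik}. \tag{8.15}$$

Integrating the momentum equation with $\Pi_{ik}$ from (8.15) over the volume
and transforming the integral into a surface integral, we have, in the absence
of volume forces ($F_i = 0$),

$$\frac{\partial}{\partial t}\int \rho v_i\,dV = -\oint \Pi_{ik}\,dS_k. \tag{8.16}$$

Clearly, this formula expresses the momentum conservation law: the change
of momentum in a given volume of the fluid is equal to the momentum flux
through the surface which bounds it. The tensor $\Pi_{ik}$ represents
the momentum flux density tensor for an incompressible fluid.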
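As an illustration of the Navier–Stokes equation, consider the steady flow
between two parallel walls, one at rest and one moving, with no pressure
gradient and no volume forces; the equation then reduces to
$\eta\,d^2v/dy^2 = 0$. The following sketch, which is not part of the text,
solves this by simple relaxation on a grid and evaluates the shear stress in
every layer. The function names, the grid size and the numerical values of
the viscosity, wall velocity and gap are hypothetical.

```python
def couette_profile(v_wall, n_points=11, sweeps=2000):
    """Relax d2v/dy2 = 0 with v = 0 on one wall and v = v_wall on the other."""
    v = [0.0] * n_points
    v[-1] = v_wall
    for _ in range(sweeps):
        for i in range(1, n_points - 1):
            # Each interior value tends to the mean of its two neighbours.
            v[i] = 0.5 * (v[i - 1] + v[i + 1])
    return v


def shear_stress(profile, eta, gap):
    """Stress eta * dv/dy evaluated between neighbouring grid points."""
    dy = gap / (len(profile) - 1)
    return [eta * (profile[i + 1] - profile[i]) / dy
            for i in range(len(profile) - 1)]


eta, v_wall, gap = 1.0e-3, 0.2, 0.01  # hypothetical values in SI units

profile = couette_profile(v_wall)
stress = shear_stress(profile, eta, gap)

print("velocity profile:", profile)
print("stress range:", min(stress), max(stress))
print("eta * v / L:   ", eta * v_wall / gap)
```

The converged profile is linear, and the stress is the same in every layer and
equal to the viscosity times the wall velocity divided by the gap.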


§9. The energy conservation law and entropy transport in a moving contin- 
uous medium 


The energy of a moving continuous medium is made up of its internal energy
$E$ and kinetic energy $\tfrac{1}{2}\rho v^2$. Here the internal energy $E$ is
related to unit mass of the fluid, so that the internal energy per unit volume
is $\rho E$, and we do not take into account the potential energy in an
external force field*.

The energy conservation law can be written in integral or differential form:

$$\frac{\partial}{\partial t}\int \left(\rho E + \tfrac{1}{2}\rho v^2\right)dV = -\oint \mathbf{j}^{(E)}\cdot d\mathbf{S}, \tag{9.1}$$

$$\frac{\partial}{\partial t}\left(\rho E + \tfrac{1}{2}\rho v^2\right) = -\nabla\cdot\mathbf{j}^{(E)}, \tag{9.2}$$

where $\mathbf{j}^{(E)}$ is the energy flux density.

* See S. R. de Groot and P. Mazur, Non-equilibrium thermodynamics (North-Holland
Publishing Company, Amsterdam, 1962).

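We first of all find the derivative
$\partial(\tfrac{1}{2}\rho v^2)/\partial t$, making use of eq. (8.6) without
the external force and of the continuity equation (7.3). Multiplying (8.6) by
$v_i$, we have

$$\frac{\partial}{\partial t}\frac{\rho v^2}{2} = v_i\frac{\partial(\rho v_i)}{\partial t} - \frac{v^2}{2}\frac{\partial\rho}{\partial t} = -v_i\frac{\partial}{\partial x_k}(\rho v_i v_k - \sigma_{ik}) + \frac{v^2}{2}\frac{\partial(\rho v_k)}{\partial x_k}$$

$$= -\frac{\partial}{\partial x_k}\left(\frac{\rho v^2}{2}v_k - \sigma_{ik}v_i\right) - \sigma_{ik}\frac{\partial v_i}{\partial x_k}. \tag{9.3}$$

This equation has an obvious interpretation. The first term on the right-hand
side is the divergence of the kinetic energy flux density. The latter consists
of the directly transferred kinetic energy $\tfrac{1}{2}\rho v^2\mathbf{v}$ and
the flux associated with the mechanical work $\sigma_{ik}v_i$ done on the
medium. The last term, as can be seen from what follows, involves the energy
dissipation due to viscosity.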

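For an incompressible fluid, one can write, according to (8.12),

$$\sigma_{ik}\frac{\partial v_i}{\partial x_k} = -p\,\delta_{ik}\frac{\partial v_i}{\partial x_k} + \eta\left(\frac{\partial v_i}{\partial x_k} + \frac{\partial v_k}{\partial x_i}\right)\frac{\partial v_i}{\partial x_k} = \frac{\eta}{2}\left(\frac{\partial v_i}{\partial x_k} + \frac{\partial v_k}{\partial x_i}\right)^2. \tag{9.4}$$

For an incompressible fluid, eq. (9.3) assumes the form

$$\frac{\partial}{\partial t}\frac{\rho v^2}{2} = -\frac{\partial}{\partial x_k}\left[\left(\frac{\rho v^2}{2} + p\right)v_k - \eta v_i\left(\frac{\partial v_i}{\partial x_k} + \frac{\partial v_k}{\partial x_i}\right)\right] - \frac{\eta}{2}\left(\frac{\partial v_i}{\partial x_k} + \frac{\partial v_k}{\partial x_i}\right)^2. \tag{9.5}$$

We see that the change of kinetic energy in a given volume is associated
with the energy flow from this volume and with a quantity which does not
have the character of flow and which is proportional to the viscosity. It is
clear that the latter represents an energy dissipation. Since the dissipation
is associated with a decrease in kinetic energy, we always have $\eta > 0$.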
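The sign and size of the dissipative term can be checked on two simple flows.
The sketch below is an illustration under stated assumptions: it takes the
dissipation per unit volume of an incompressible fluid to be half the
viscosity times the sum over i and k of the squared symmetrized velocity
gradient, and evaluates it for a simple shear and for a rigid rotation. The
function name and the numerical values are hypothetical.

```python
def dissipation_rate(g, eta):
    """(eta/2) * sum over i, k of (dv_i/dx_k + dv_k/dx_i)**2."""
    return 0.5 * eta * sum((g[i][k] + g[k][i]) ** 2
                           for i in range(3) for k in range(3))


eta = 1.0e-3   # hypothetical viscosity
s = 50.0       # hypothetical shear rate dv_x/dy

# Simple shear: only dv_x/dy differs from zero.
g_shear = [[0.0, s, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]

# Rigid rotation about the z axis with angular velocity w: antisymmetric g.
w = 3.0
g_rot = [[0.0, -w, 0.0], [w, 0.0, 0.0], [0.0, 0.0, 0.0]]

q_shear = dissipation_rate(g_shear, eta)
q_rot = dissipation_rate(g_rot, eta)

print("shear dissipation:", q_shear, " eta * s**2 =", eta * s ** 2)
print("rotation dissipation:", q_rot)
```

The expression is a sum of squares multiplied by the viscosity, so with a
positive viscosity it can only remove kinetic energy; it vanishes for rigid
rotation, where no layers slide over one another.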

Returning to the equation of total energy conservation (9.2), we transform 
it in such a way as to obtain the law of change of internal energy. 

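Subtracting (9.3) from (9.2), we find

$$\frac{\partial}{\partial t}(\rho E) = -\frac{\partial j^{(E)}_k}{\partial x_k} + \frac{\partial}{\partial x_k}\left(\frac{\rho v^2}{2}v_k - \sigma_{ik}v_i\right) + \sigma_{ik}\frac{\partial v_i}{\partial x_k}. \tag{9.6}$$

The total energy flux density, $\mathbf{j}^{(E)}$, in an isothermal fluid is by
definition made up of the total energy flux transferred by the fluid and of
the energy flux due to mechanical work:

$$j^{(E)}_{k,\,\text{isotherm}} = \left(\rho E + \tfrac{1}{2}\rho v^2\right)v_k - \sigma_{ik}v_i. \tag{9.7}$$

For $\mathbf{v} = 0$, i.e. in a medium at rest, there is no energy flow.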

However, if the fluid is not isothermal, then an energy flow arises in it
even if it is at rest. Since the condition of thermodynamic equilibrium is the
condition of constancy of temperature, for sufficiently small departures from
equilibrium it is natural to set this energy flux equal to

$$\mathbf{j}_T = -\kappa\nabla T, \tag{9.8}$$

where $\kappa$ is a kinetic coefficient called the thermal conductivity. The
vector $\mathbf{j}_T$ is called the heat flux density. Since the vector
$\mathbf{j}_T$ is oriented in the direction of decreasing temperature, we
always have $\kappa > 0$.

Of course, the law of transfer (9.8) is empirical and holds only for small
departures from thermal equilibrium.

In the general case of a non-isothermal medium, the total energy flux density
vector can be written in the form

$$j^{(E)}_k = -\kappa\frac{\partial T}{\partial x_k} + \left(\rho E + \tfrac{1}{2}\rho v^2\right)v_k - \sigma_{ik}v_i. \tag{9.9}$$


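Substituting $\mathbf{j}^{(E)}$ from (9.9) into (9.6), we obtain

$$\frac{\partial(\rho E)}{\partial t} + \frac{\partial}{\partial x_k}(\rho E v_k) = \frac{\partial}{\partial x_k}\left(\kappa\frac{\partial T}{\partial x_k}\right) + \sigma_{ik}\frac{\partial v_i}{\partial x_k}. \tag{9.10}$$

Like the kinetic energy, the internal energy of a fluid is not conserved. A
conservation law holds only for the sum of these, i.e. for the total energy.
The last formula is conveniently written in another form. That is, one can
write

$$\rho\frac{dE}{dt} = \frac{\partial}{\partial t}(\rho E) + \nabla\cdot(\rho E\mathbf{v}). \tag{9.11}$$

Furthermore, we define the heat content in unit volume by the relation

$$\rho\frac{dQ}{dt} = -\nabla\cdot\mathbf{j}_T. \tag{9.12}$$

On separating the term with pressure in the stress tensor $\sigma_{ik}$ by
writing

$$\sigma_{ik} = \sigma'_{ik} - p\,\delta_{ik}, \tag{9.13}$$

we have (using (9.8), (9.10), (9.11) and (9.12))

$$\rho\frac{dE}{dt} = \rho\frac{dQ}{dt} - p\frac{\partial v_k}{\partial x_k} + \sigma'_{ik}\frac{\partial v_i}{\partial x_k}.$$

Dividing this equation by $\rho$, we obtain

$$\frac{dE}{dt} = \frac{dQ}{dt} - \frac{p}{\rho}\frac{\partial v_k}{\partial x_k} + \frac{\sigma'_{ik}}{\rho}\frac{\partial v_i}{\partial x_k}. \tag{9.14}$$

The continuity equation gives

$$\frac{d\rho}{dt} = -\rho\frac{\partial v_k}{\partial x_k}. \tag{9.15}$$

Hence, finally,

$$\frac{dE}{dt} = \frac{dQ}{dt} + \frac{p}{\rho^2}\frac{d\rho}{dt} + \frac{\sigma'_{ik}}{\rho}\frac{\partial v_i}{\partial x_k}. \tag{9.16}$$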
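As we have stressed in §7, we have adopted the assumption of the existence
of local equilibrium in a moving medium. Therefore the relation between
thermodynamic functions, in particular between internal energy and
entropy, is given by the formulae of thermodynamics.

We write the basic thermodynamic equality in the form

$$T\frac{dS}{dt} = \frac{dE}{dt} - \frac{p}{\rho^2}\frac{d\rho}{dt} - \sum_\alpha \mu_\alpha\frac{dc_\alpha}{dt} \tag{9.17}$$

and eliminate internal energy from (9.17) and (9.16). We then arrive at the
equation for entropy balance

$$T\frac{dS}{dt} = \frac{dQ}{dt} + \frac{\sigma'_{ik}}{\rho}\frac{\partial v_i}{\partial x_k} - \sum_\alpha \mu_\alpha\frac{dc_\alpha}{dt} \tag{9.18}$$

or, making use of (7.8) and (9.12) and writing down the total derivative of
the entropy, we find

$$\rho\left(\frac{\partial S}{\partial t} + v_k\frac{\partial S}{\partial x_k}\right) = -\frac{1}{T}\nabla\cdot\mathbf{j}_T + \frac{1}{T}\sum_\alpha \mu_\alpha\nabla\cdot\mathbf{j}^{(D)}_\alpha + \frac{\sigma'_{ik}}{T}\frac{\partial v_i}{\partial x_k}, \tag{9.19}$$

where $\mathbf{j}^{(D)}_\alpha$ denotes the diffusion flux (7.6) of the
$\alpha$th component.

This equation is conveniently written in the form

$$\rho\left(\frac{\partial S}{\partial t} + (\mathbf{v}\cdot\nabla)S\right) = -\nabla\cdot\left[\frac{1}{T}\left(\mathbf{j}_T - \sum_\alpha \mu_\alpha\mathbf{j}^{(D)}_\alpha\right)\right] + \mathbf{j}_T\cdot\nabla\frac{1}{T} - \sum_\alpha \mathbf{j}^{(D)}_\alpha\cdot\nabla\frac{\mu_\alpha}{T} + \frac{\sigma'_{ik}}{T}\frac{\partial v_i}{\partial x_k} \tag{9.20}$$

or, introducing the entropy flow density vector $\mathbf{j}_S$ defined by the
equality

$$\mathbf{j}_S = \frac{1}{T}\left(\mathbf{j}_T - \sum_\alpha \mu_\alpha\mathbf{j}^{(D)}_\alpha\right) \tag{9.21}$$

and denoting by $\Theta$ the quantity

$$\Theta = \mathbf{j}_T\cdot\nabla\frac{1}{T} - \sum_\alpha \mathbf{j}^{(D)}_\alpha\cdot\nabla\frac{\mu_\alpha}{T} + \frac{\sigma'_{ik}}{T}\frac{\partial v_i}{\partial x_k}, \tag{9.22}$$

we write (9.20) in the form

$$\rho\left(\frac{\partial S}{\partial t} + \mathbf{v}\cdot\nabla S\right) = -\nabla\cdot\mathbf{j}_S + \Theta \tag{9.23}$$

or in the form

$$\frac{\partial}{\partial t}\rho S = -\nabla\cdot(\rho S\mathbf{v} + \mathbf{j}_S) + \Theta. \tag{9.24}$$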


Eq. (9.20) shows that entropy is not conserved: the change of entropy in a
given volume per unit time is associated not only with its transport by the
moving medium and by thermal conduction and molecular diffusion, but also
with the appearance of the quantity $\Theta$ defined by (9.22). This quantity
is called the entropy production.
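For a fluid at rest with heat conduction only, the entropy production can be
evaluated directly. The sketch below rests on two assumptions that should be
checked against the formulae of this section: that the heat flux is minus the
thermal conductivity times the temperature gradient, and that the thermal part
of the entropy production is the scalar product of the heat flux with the
gradient of 1/T. The function name and the numerical values are hypothetical.

```python
def entropy_production_heat(kappa, grad_T, T):
    """Thermal part of the entropy production, j_T . grad(1/T)."""
    j_T = [-kappa * g for g in grad_T]            # heat flux density
    grad_inv_T = [-g / T ** 2 for g in grad_T]    # gradient of 1/T
    return sum(j * gi for j, gi in zip(j_T, grad_inv_T))


kappa = 0.6   # hypothetical thermal conductivity, W/(m K)
T = 300.0     # hypothetical local temperature, K

theta_up = entropy_production_heat(kappa, [100.0, 0.0, 0.0], T)
theta_down = entropy_production_heat(kappa, [-100.0, 0.0, 0.0], T)

print("gradient along +x:", theta_up)
print("gradient along -x:", theta_down)
print("kappa * |grad T|**2 / T**2:", kappa * 100.0 ** 2 / T ** 2)
```

The result is the conductivity times the squared temperature gradient divided
by the squared temperature, so it is positive for either direction of the
gradient as long as the conductivity is positive.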

For an incompressible fluid, the last term of the entropy production is, ac- 
cording to (9.4), related to the dissipation due to viscosity. For compressible 
fluids it has the same meaning but is related to the first as well as second vis- 
cosity*. 

In a one-component system there is no diffusion flow,
$\mathbf{j}^{(D)} = 0$, and the corresponding terms of (9.21) and (9.22)
reduce to zero. For a single incompressible fluid eq. (9.23) can be written in
the form of the equation for temperature.

It is appropriate to recall here that by an incompressible fluid we mean a 
fluid moving slowly compared with the velocity of sound. For such a motion 
the density of the medium does not depend explicitly on coordinates and 
time. Under non-isothermal conditions it can, however, depend on tempera- 
ture and, consequently, vary in time and space. Nevertheless, the continuity 
equation for a non-isothermal incompressible fluid is defined by formula 
(7.4). 

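* See L. D. Landau and E. M. Lifshitz, Fluid mechanics (Pergamon Press, London, 1959).

The entropy of a fluid can be written as a function of temperature and
pressure. We express the derivatives of entropy with respect to coordinates
and time in terms of the derivatives of temperature. For this we write the
basic equalities

$$\frac{\partial S}{\partial t} = \left(\frac{\partial S}{\partial T}\right)_p\frac{\partial T}{\partial t} = \frac{c_p}{T}\frac{\partial T}{\partial t},$$

$$\frac{\partial S}{\partial x_i} = \left(\frac{\partial S}{\partial T}\right)_p\frac{\partial T}{\partial x_i} = \frac{c_p}{T}\frac{\partial T}{\partial x_i},$$

and, for $\mathbf{j}^{(D)} = 0$,

$$\frac{\sigma'_{ik}}{T}\frac{\partial v_i}{\partial x_k} = \frac{\eta}{T}\left(\frac{\partial v_i}{\partial x_k} + \frac{\partial v_k}{\partial x_i}\right)\frac{\partial v_i}{\partial x_k} = \frac{\eta}{2T}\left(\frac{\partial v_i}{\partial x_k} + \frac{\partial v_k}{\partial x_i}\right)^2.$$

Substituting these equalities into (9.20) with $\mathbf{j}^{(D)} = 0$, we
easily obtain

$$\rho c_p\left(\frac{\partial T}{\partial t} + (\mathbf{v}\cdot\nabla)T\right) = \nabla\cdot(\kappa\nabla T) + \frac{\eta}{2}\left(\frac{\partial v_i}{\partial x_k} + \frac{\partial v_k}{\partial x_i}\right)^2. \tag{9.25}$$

This equation represents the equation of thermal conduction for a moving
medium.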

Further simplifications can often be made in the equation of thermal
conduction. Although the kinetic coefficients $\kappa$ and $\eta$ are
functions of temperature, this dependence can be disregarded for small changes
in temperature. In this case the kinetic coefficients of a homogeneous system
turn out to be constant at all its points, and can be taken out of the
derivative sign. Furthermore, for small velocities of motion the last term is
usually (although not always!) small and can be dropped. The equation of
thermal conduction then assumes the form

$$\frac{\partial T}{\partial t} + \mathbf{v}\cdot\nabla T = \chi\nabla^2 T, \tag{9.26}$$

where $\chi = \kappa/\rho c_p$ is called the thermal diffusivity.
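An order-of-magnitude feeling for the thermal diffusivity is given by the
short calculation below. It is only a sketch: the material constants are
approximate handbook-style values for water near room temperature, used here
as assumptions, and the estimate of the equalization time as the square of the
size divided by the diffusivity is a dimensional estimate that is not derived
in the text. The function names are hypothetical.

```python
def thermal_diffusivity(kappa, rho, c_p):
    """Thermal diffusivity: conductivity divided by (density * specific heat)."""
    return kappa / (rho * c_p)


def equalization_time(size, chi):
    """Dimensional estimate of the temperature equalization time, size**2/chi."""
    return size ** 2 / chi


# Approximate values for water near room temperature (assumed, SI units).
kappa = 0.6      # W/(m K)
rho = 1000.0     # kg/m^3
c_p = 4180.0     # J/(kg K)

chi = thermal_diffusivity(kappa, rho, c_p)
t_small = equalization_time(0.01, chi)   # a region 1 cm across
t_large = equalization_time(0.10, chi)   # a region 10 cm across

print("thermal diffusivity, m^2/s:", chi)
print("equalization time for 1 cm, s: ", t_small)
print("equalization time for 10 cm, s:", t_large)
```

A tenfold increase in size lengthens the estimated time a hundredfold, which
illustrates why equilibrium is established much faster inside small parts of a
system than between them.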

We have obtained a system of transport equations for moving continuous
media. The set of mass transport equations, the equations of motion and the
equation of energy (or entropy) transport form a complete system describing
the motion of a medium. Indeed, the equations of motion contain five
variables ($\mathbf{v}$, $p$, $\rho$) for a single fluid. The entropy equation
introduces two more unknown functions, $S$ and $T$. To determine the seven
functions there are three equations of motion, the continuity equation, the
entropy equation, the thermodynamic relation $S = S(T, p)$ and the equation of
state $\rho = \rho(T, p)$. The whole set of equations contains three kinetic
coefficients: $\eta$, $\zeta$ and $\kappa$.










We shall dwell neither on the boundary conditions with which these equa- 
tions must be supplemented for their solutions to become single-valued, nor 
on obtaining these solutions even in the simplest cases. All these problems are 
discussed in courses in the mechanics of continuous media (or hydrodynam- 
ics). 

Our purpose has been only to formulate, under certain assumptions, mac- 
roscopic (phenomenological) equations for a non-equilibrium continuous 
medium. It will be shown below that the macroscopic equations themselves, 
as well as the kinetic coefficients involved in them, can be obtained from
kinetic considerations on the basis of molecular concepts, at least for the
simplest systems.


§10. The Fokker—Planck equation 


We can now go on to the kinetic description of macroscopic systems based 
on the statistical description. 

We shall, first of all, formulate a theory of slow processes which has a very 
general although somewhat formal character. 

Let us consider an arbitrary macroscopic system in a state of incomplete
equilibrium. We shall follow changes of its state in the course of the time
interval $\tau_0$, satisfying the inequality

$$\tau_f \ll \tau_0 \ll \tau_s, \qquad (10.1)$$

where $\tau_f$ and $\tau_s$ are the relaxation times for the fast and the slow
processes in the system.

For incomplete equilibrium states it is possible to define various
microscopically determined characteristics of the system. A microscopic state
of the system will be characterized by certain parameters, the whole set of
which we shall denote by $\lambda$. This set of quantities $\lambda$ will be
assumed to run over a continuous sequence of values. The statistical
description of the behaviour of the system will be carried out by defining the
probability distribution function $\rho(\lambda, t)\,d\lambda$ which
characterizes the probability that at the instant of time $t$ the system is in
a state in the interval $\lambda$, $\lambda + d\lambda$. We shall consider the
distribution function to be normalized to unity,
$\int \rho(\lambda, t)\,d\lambda = 1$.

Let the probability for the system to be in state $\lambda_0$ at a certain
instant of time $t$ be $\rho(\lambda_0, t)\,d\lambda_0$. During a time
$\Delta t \ll \tau_0$ the state of the system will change and at the instant of
time $t' = t + \Delta t$ it will have the probability
$\rho(\lambda, t + \Delta t)\,d\lambda$ of being in the state $\lambda$.

Our further reasoning will be based on the following assumption. The
probability of getting into the state $\lambda$ in time $\Delta t$ is
completely determined by defining the state $\lambda_0$ and does not depend on
how the system got into the state $\lambda_0$, i.e. does not depend on the
prehistory of the process. Thus the probability of transition from one
arbitrary state into another depends only on this pair of states. In
probability theory such states are said to form a Markov chain. Hence our
assumption can be formulated briefly as the assumption that transitions between
states of the system in time $\Delta t$ form a Markov chain. In this case the
transition probability $w$ can be written in the form
$w(\lambda_0, \lambda, \Delta t)$. If the transitions did not have the
character of a Markov chain, then $w$ would depend on the state in which the
system was at an earlier time $t' < t$ and on the transitions by means of which
the system got into the state $\lambda_0$.

We shall consider the probability $w(\lambda_0, \lambda, \Delta t)$ to be
normalized to unity:

$$\int w(\lambda_0, \lambda, \Delta t)\,d\lambda = 1. \qquad (10.2)$$


Finally, we shall assume that the probability of transition
$\lambda_0 \to \lambda$ decreases rapidly with increasing difference
$|\lambda - \lambda_0|$. This means that transitions with a large change of
state of the system are unlikely. Transitions for which the state of the system
changes relatively little have a high probability. In other words, we assume
processes in the system to be slow. Quantitative restrictions imposed upon the
properties of $w(\lambda_0, \lambda, \Delta t)$ by this assumption will be
given below.

On the basis of the assumptions made one can find a very general equation
defining the dependence of the distribution function on the parameters
$\lambda$ and time. For this we note, first of all, that the probability
$\rho(\lambda, t + \Delta t)\,d\lambda$ can be written in the form

$$\rho(\lambda, t + \Delta t)\,d\lambda = d\lambda \int \rho(\lambda_0, t)\, w(\lambda_0, \lambda, \Delta t)\,d\lambda_0, \qquad (10.3)$$

where the integration is carried out with respect to all possible values of the
variable $\lambda_0$. As a matter of fact,
$\rho(\lambda_0, t)\, w(\lambda_0, \lambda, \Delta t)\,d\lambda_0\,d\lambda$
represents the probability that the system in the state $\lambda_0$ will in
time $\Delta t$ go over into the state $\lambda$. The total probability that
the system at an instant of time $t + \Delta t$ will be in the state $\lambda$
will be obtained by summing over all possible values of $\lambda_0$.
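
A discrete analogue of (10.3) may help to fix ideas. In the sketch below the continuous parameter is replaced by a finite ring of states and the transition probability by a matrix whose rows sum to unity, in analogy with (10.2). The ring geometry, the function names and the jump probabilities are assumptions introduced for illustration only.

```python
def smoluchowski_step(rho, w):
    """Discrete form of (10.3): rho_new[j] = sum_i rho[i] * w[i][j]."""
    n = len(rho)
    return [sum(rho[i] * w[i][j] for i in range(n)) for j in range(n)]


def nearest_neighbour_kernel(n, p_left, p_right):
    """Transition matrix on a ring of n >= 3 states with small jumps only.

    Each row sums to one, which is the discrete counterpart of (10.2).
    """
    w = [[0.0] * n for _ in range(n)]
    for i in range(n):
        w[i][(i - 1) % n] = p_left
        w[i][(i + 1) % n] = p_right
        w[i][i] = 1.0 - p_left - p_right
    return w


w = nearest_neighbour_kernel(5, p_left=0.1, p_right=0.2)
rho_0 = [1.0, 0.0, 0.0, 0.0, 0.0]        # system known to be in state 0
rho_1 = smoluchowski_step(rho_0, w)      # distribution after one interval
rho_2 = smoluchowski_step(rho_1, w)      # and after two intervals
```

Because only the current distribution enters each step, the sketch embodies the Markov assumption. The restriction to nearest-neighbour jumps plays the role of the assumption that the process is slow.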

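The integral equation (10.3) is called the Smoluchowski equation. In deriving
it we made use only of the assumption that the states of the system form a
Markov chain. We now use the assumption that the process is slow. We multiply
both sides of (10.3) by an arbitrary function $\varphi(\lambda)/\Delta t$ which
is only assumed to be continuous and to have the property $\varphi \to 0$ as
$|\lambda| \to \infty$, and then integrate over all values of $\lambda$. Then
we have

$$\frac{1}{\Delta t}\int \rho(\lambda, t + \Delta t)\,\varphi(\lambda)\,d\lambda = \frac{1}{\Delta t}\int \rho(\lambda_0, t)\,d\lambda_0 \int w(\lambda_0, \lambda, \Delta t)\,\varphi(\lambda)\,d\lambda. \qquad (10.4)$$

Further, we expand the function $\varphi(\lambda)$ in a series in powers of
$(\lambda - \lambda_0)$, writing

$$\varphi(\lambda) = \varphi(\lambda_0) + \varphi'(\lambda_0)(\lambda - \lambda_0) + \tfrac{1}{2}\varphi''(\lambda_0)(\lambda - \lambda_0)^2 + \dots$$

Substituting this expansion into the right-hand side of (10.4), we obtain

$$\begin{aligned}
&\int \rho(\lambda_0, t)\,d\lambda_0 \int \left[\varphi(\lambda_0) + \varphi'(\lambda_0)(\lambda - \lambda_0) + \tfrac{1}{2}\varphi''(\lambda_0)(\lambda - \lambda_0)^2 + \dots\right] w(\lambda_0, \lambda, \Delta t)\,d\lambda = \\
&\quad = \int \rho(\lambda_0, t)\,\varphi(\lambda_0)\,d\lambda_0 \int w(\lambda_0, \lambda, \Delta t)\,d\lambda
 + \int \rho(\lambda_0, t)\,\varphi'(\lambda_0)\,d\lambda_0 \int (\lambda - \lambda_0)\,w(\lambda_0, \lambda, \Delta t)\,d\lambda \\
&\quad + \tfrac{1}{2}\int \rho(\lambda_0, t)\,\varphi''(\lambda_0)\,d\lambda_0 \int (\lambda - \lambda_0)^2\,w(\lambda_0, \lambda, \Delta t)\,d\lambda + \dots
\end{aligned}$$

Let us consider the inner integrals with respect to the variable $\lambda$. The
first of these, according to (10.2), is equal to unity. In view of the rapid
decrease in $w$ with increasing $|\lambda - \lambda_0|$, the integrals exist
and converge rapidly. It is clear that for such a behaviour of $w$ the
successive values of the integrals

$$I_n = \int (\lambda - \lambda_0)^n\, w(\lambda_0, \lambda, \Delta t)\,d\lambda$$

decrease rapidly with increasing $n$. We confine ourselves to terms of the
second order of small quantities. We then find

$$\begin{aligned}
\int \rho(\lambda_0, t)\,d\lambda_0 \int \varphi(\lambda)\,w(\lambda_0, \lambda, \Delta t)\,d\lambda
 &= \int \rho(\lambda_0, t)\,\varphi(\lambda_0)\,d\lambda_0 + \int \rho(\lambda_0, t)\,\varphi'(\lambda_0)\,I_1(\lambda_0, \Delta t)\,d\lambda_0 \\
 &+ \tfrac{1}{2}\int \rho(\lambda_0, t)\,\varphi''(\lambda_0)\,I_2(\lambda_0, \Delta t)\,d\lambda_0 .
\end{aligned}$$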


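Hence

$$\begin{aligned}
\frac{1}{\Delta t}\int \rho(\lambda, t + \Delta t)\,\varphi(\lambda)\,d\lambda
 &= \frac{1}{\Delta t}\int \rho(\lambda_0, t)\,\varphi(\lambda_0)\,d\lambda_0 + \frac{1}{\Delta t}\int \rho(\lambda_0, t)\,\varphi'(\lambda_0)\,I_1(\lambda_0, \Delta t)\,d\lambda_0 \\
 &+ \frac{1}{2\Delta t}\int \rho(\lambda_0, t)\,\varphi''(\lambda_0)\,I_2(\lambda_0, \Delta t)\,d\lambda_0 . \qquad (10.5)
\end{aligned}$$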


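Denoting the integration variable $\lambda_0$ of the first integral on the
right by $\lambda$, transferring the integral to the left-hand side and passing
to the limit $\Delta t \to 0$, we find

$$\begin{aligned}
\lim_{\Delta t \to 0}\int \frac{\rho(\lambda, t + \Delta t) - \rho(\lambda, t)}{\Delta t}\,\varphi(\lambda)\,d\lambda
 &= \int \rho(\lambda_0, t)\,\varphi'(\lambda_0)\left(\lim_{\Delta t \to 0}\frac{I_1(\lambda_0, \Delta t)}{\Delta t}\right)d\lambda_0 \\
 &+ \tfrac{1}{2}\int \rho(\lambda_0, t)\,\varphi''(\lambda_0)\left(\lim_{\Delta t \to 0}\frac{I_2(\lambda_0, \Delta t)}{\Delta t}\right)d\lambda_0
\end{aligned}$$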


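or, introducing the notation

$$\lim_{\Delta t \to 0}\frac{I_1(\lambda_0, \Delta t)}{\Delta t} = \lim_{\Delta t \to 0}\frac{1}{\Delta t}\int (\lambda - \lambda_0)\,w(\lambda_0, \lambda, \Delta t)\,d\lambda = a(\lambda_0), \qquad (10.6)$$

$$\lim_{\Delta t \to 0}\frac{I_2(\lambda_0, \Delta t)}{2\Delta t} = \lim_{\Delta t \to 0}\frac{1}{2\Delta t}\int (\lambda - \lambda_0)^2\,w(\lambda_0, \lambda, \Delta t)\,d\lambda = D(\lambda_0), \qquad (10.7)$$

and replacing $\lambda_0 \to \lambda$ on the right-hand side, we have

$$\int \frac{\partial \rho(\lambda, t)}{\partial t}\,\varphi(\lambda)\,d\lambda = \int a(\lambda)\,\rho(\lambda, t)\,\varphi'(\lambda)\,d\lambda + \int D(\lambda)\,\rho(\lambda, t)\,\varphi''(\lambda)\,d\lambda. \qquad (10.8)$$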


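Integrating the two terms on the right-hand side by parts and making use of
the properties of the function $\varphi(\lambda)$, we obtain

$$\int a\,\rho\,\varphi'\,d\lambda = \Big[a\,\rho\,\varphi\Big]_{-\infty}^{\infty} - \int \varphi\,\frac{\partial (a\rho)}{\partial \lambda}\,d\lambda = -\int \varphi\,\frac{\partial (a\rho)}{\partial \lambda}\,d\lambda,$$

$$\int D\,\rho\,\varphi''\,d\lambda = \int \varphi\,\frac{\partial^2 (D\rho)}{\partial \lambda^2}\,d\lambda.$$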


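Substituting these expressions into (10.8) and transferring all the terms to
the left-hand side, we find

$$\int \varphi(\lambda)\left[\frac{\partial \rho}{\partial t} + \frac{\partial (a\rho)}{\partial \lambda} - \frac{\partial^2 (D\rho)}{\partial \lambda^2}\right]d\lambda = 0.$$

In view of the arbitrariness of $\varphi(\lambda)$ we get finally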










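$$\frac{\partial \rho}{\partial t} = -\frac{\partial}{\partial \lambda}\left[a\rho - \frac{\partial (D\rho)}{\partial \lambda}\right]. \qquad (10.9)$$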


Eq. (10.9) is called the Fokker-Planck equation. It defines the dependence of
the probability density on the time and on the whole set of parameters
$\lambda$ in the case of an arbitrary slow process. For brevity of notation, we
did not take into account that the quantity $\lambda$ can represent an
arbitrary set of parameters. In this case, without repeating the calculations,
we can write

$$\frac{\partial \rho}{\partial t} = -\frac{\partial}{\partial \lambda_i}\left[a_i\rho - \frac{\partial (D_{ik}\rho)}{\partial \lambda_k}\right], \qquad (10.10)$$


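where the summation is carried out over all the values of $i$ and $k$. The
coefficients $a_i$ and $D_{ik}$ are of the form

$$a_i = \lim_{\Delta t \to 0}\frac{1}{\Delta t}\int (\lambda_i - \lambda_{0i})\,w(\lambda_0, \lambda, \Delta t)\,d\lambda,$$

$$D_{ik} = \lim_{\Delta t \to 0}\frac{1}{2\Delta t}\int (\lambda_i - \lambda_{0i})(\lambda_k - \lambda_{0k})\,w(\lambda_0, \lambda, \Delta t)\,d\lambda.$$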
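
The limits defining the coefficients can be checked numerically for a simple one-parameter kernel. The sketch below assumes, purely for illustration, a Gaussian transition probability with mean displacement `v * dt` and variance `2 * D * dt`; neither this kernel nor the function names come from the text. The first moment divided by the time step should then approach `v`, and the second moment divided by twice the time step should approach `D`.

```python
import math


def gaussian_kernel(lam0, lam, dt, v, D):
    """Assumed transition probability density w(lam0, lam, dt)."""
    var = 2.0 * D * dt
    return math.exp(-(lam - lam0 - v * dt) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def kernel_moment(n, lam0, dt, v, D, half_width=12.0, points=4001):
    """I_n = integral of (lam - lam0)^n * w over lam, by the trapezoidal rule."""
    sigma = math.sqrt(2.0 * D * dt)
    start = lam0 + v * dt - half_width * sigma
    h = 2.0 * half_width * sigma / (points - 1)
    total = 0.0
    for k in range(points):
        lam = start + k * h
        weight = 0.5 if k in (0, points - 1) else 1.0
        total += weight * (lam - lam0) ** n * gaussian_kernel(lam0, lam, dt, v, D)
    return total * h


def drift_and_diffusion(lam0, dt, v, D):
    """Finite-dt estimates of a = I_1 / dt and D = I_2 / (2 dt)."""
    a = kernel_moment(1, lam0, dt, v, D) / dt
    d = kernel_moment(2, lam0, dt, v, D) / (2.0 * dt)
    return a, d


a_est, d_est = drift_and_diffusion(lam0=0.3, dt=1e-3, v=0.5, D=2.0)
```

For a finite time step the second estimate exceeds `D` by `v**2 * dt / 2`, which vanishes in the limit of a vanishing time step. Higher moments of this kernel are of higher order in the time step, which is the quantitative content of the assumption that the process is slow.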


The Fokker-Planck equation can be given an obvious meaning, if the set of
representative points corresponding to a set of identical non-equilibrium
systems is considered. In this case the quantity $\rho(\lambda, t)$ can be
assumed to be the density of representative points. Then the vector

$$j_i = a_i\rho - \frac{\partial}{\partial \lambda_k}(D_{ik}\rho) \qquad (10.11)$$

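represents the flux of the representative points in phase space. The
Fokker-Planck equation has the meaning of the continuity equation in this
space. Multiplying (10.10) by $\lambda_i$ and carrying out the integration over
$\lambda$, we obtain

$$\int \lambda_i\,\frac{\partial \rho}{\partial t}\,d\lambda = \frac{d}{dt}\int \lambda_i\,\rho\,d\lambda = \dot{\bar{\lambda}}_i = -\int \lambda_i\,\frac{\partial}{\partial \lambda_k}\left(a_k\rho - \frac{\partial (D_{kl}\rho)}{\partial \lambda_l}\right)d\lambda .$$

Thus, if we assume that $\rho \to 0$ for $\lambda \to \infty$, we have
$\dot{\bar{\lambda}}_i = \int a_i\,\rho\,d\lambda$. The vector $\mathbf{a}$ is
the mobility vector. The second term in the vector $\mathbf{j}$ does not
contribute to the mean velocity. It represents the diffusion current. The
tensor $D_{ik}$ is the tensor of the generalized diffusion coefficient.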




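Frequently the diffusion current $\mathbf{j}$ is conveniently rewritten in the
form

$$j_i = \alpha_i\rho - D_{ik}\frac{\partial \rho}{\partial \lambda_k}, \qquad (10.12)$$

where comparison with (10.11) gives
$\alpha_i = a_i - \partial D_{ik}/\partial \lambda_k$. Then the Fokker-Planck
equation transforms to

$$\frac{\partial \rho}{\partial t} = -\frac{\partial}{\partial \lambda_i}\left(\alpha_i\rho - D_{ik}\frac{\partial \rho}{\partial \lambda_k}\right). \qquad (10.13)$$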


The formula (10.13) represents the so-called second form of the Fokker-Planck
equation. In the particular case when the tensor $D_{ik}$ and the mobility
$a_i$ are independent of the variables $\lambda_i$, the vector $j_i$ is the
direct generalization of the empirical laws (7.10) and (9.8). In this case the
Fokker-Planck equation is simplified and turns out to be the usual diffusion
equation

$$\frac{\partial \rho}{\partial t} = -a\frac{\partial \rho}{\partial \lambda} + D\frac{\partial^2 \rho}{\partial \lambda^2}.$$


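In the stationary state and in the one-dimensional case the Fokker-Planck
equation has the form

$$\frac{\partial}{\partial \lambda}\left(a\rho - \frac{\partial (D\rho)}{\partial \lambda}\right) = 0.$$

If the flux of particles tends to zero at infinity, we obtain immediately
$a\rho = \partial (D\rho)/\partial \lambda$. Hence

$$\rho(\lambda) = \frac{\mathrm{const}}{D(\lambda)}\exp\left\{\int^{\lambda}\frac{a(\lambda')}{D(\lambda')}\,d\lambda'\right\}.$$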
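
The stationary solution can be evaluated numerically for any given drift and diffusion coefficient. The following sketch is an illustration under assumed inputs: a linear restoring drift `a = -gamma * lam` and a constant diffusion coefficient, for which the formula should reduce to a Gaussian. The lower integration limit `lam_ref` only changes the normalization constant.

```python
import math


def stationary_density(a, D, lam, lam_ref=0.0, steps=2000):
    """Unnormalized rho(lam) = exp( integral of a/D from lam_ref to lam ) / D(lam).

    The integral is evaluated by the trapezoidal rule; a and D are callables.
    """
    h = (lam - lam_ref) / steps
    integral = 0.0
    for k in range(steps + 1):
        x = lam_ref + k * h
        weight = 0.5 if k in (0, steps) else 1.0
        integral += weight * a(x) / D(x)
    integral *= h
    return math.exp(integral) / D(lam)


gamma, D0 = 2.0, 0.5                      # assumed illustrative values
rho_at_1 = stationary_density(lambda x: -gamma * x, lambda x: D0, 1.0)
gaussian_at_1 = math.exp(-gamma * 1.0 ** 2 / (2.0 * D0)) / D0
```

For these inputs the two numbers agree, since the exponent is the integral of a linear function. With a coordinate-dependent diffusion coefficient the same routine applies unchanged.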





Let us now consider the case of the Brownian diffusion of a particle taking
place in an external field of force. Owing to the action of the external force
the probability of transition $w(x_0, x, \Delta t)$ is no longer symmetric with
respect to the displacements of the particle in the direction of the force and
in the opposite direction. Hence the mean velocity $\bar{v} \neq 0$ and the
Fokker-Planck equation can be written in the form

$$\frac{\partial \rho}{\partial t} = -\frac{\partial}{\partial x}(\bar{v}\rho) + D\frac{\partial^2 \rho}{\partial x^2} = -\frac{\partial j}{\partial x}. \qquad (10.13')$$


The probability current $j$ is made up of the diffusion flux and the flux due
to the action of the force $f$. As a rule, the mean velocity acquired by the
particle can be written in the form $\bar{v} = bf$ (see §56 of Part III). We
then obtain the generalized equation of diffusion:

$$\frac{\partial \rho}{\partial t} = -\frac{\partial}{\partial x}\left(bf\rho - D\frac{\partial \rho}{\partial x}\right). \qquad (10.14)$$

For an equilibrium state, eq. (10.14) gives

$$bf\rho - D\frac{\partial \rho}{\partial x} = 0$$

or, in a potential force field,

$$b\,\frac{\partial U}{\partial x}\,\rho + D\frac{\partial \rho}{\partial x} = 0,$$

or

$$\rho = \mathrm{const}\cdot e^{-bU/D}. \qquad (10.15)$$


For (10.15) to be the same as the Boltzmann distribution, the Einstein
relation must be valid for the connection between the coefficient of diffusion
and the mobility: $b/D = 1/kT$. Hence (10.14) can be written in the form

$$\frac{\partial \rho}{\partial t} = \frac{D}{kT}\frac{\partial}{\partial x}\left(\frac{\partial U}{\partial x}\,\rho\right) + D\frac{\partial^2 \rho}{\partial x^2} = -\frac{\partial j}{\partial x},$$

where

$$j = -D\frac{\partial \rho}{\partial x} - \frac{D}{kT}\frac{\partial U}{\partial x}\,\rho. \qquad (10.14')$$

For a stationary but non-equilibrium state, $j \neq 0$.
We note that in §56 of Part III we have already arrived at formula (10.14'),
proceeding from an obvious consideration of the process of displacement of a
Brownian particle.
Integration of (10.13') gives for the probability distribution

$$\rho(x, t) = (4\pi Dt)^{-1/2}\exp\{-(x - \bar{v}t)^2/4Dt\},$$

i.e. a Gaussian distribution whose centre moves with the velocity $\bar{v}$.
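
That a Gaussian whose centre drifts with constant velocity satisfies the drift-diffusion equation can be checked by finite differences. The sketch below is an illustrative check only; the sample point, the step size and the parameter values are arbitrary assumptions.

```python
import math


def gaussian_density(x, t, v, D):
    """Gaussian of width sqrt(2 D t) centred at x = v t."""
    return math.exp(-(x - v * t) ** 2 / (4.0 * D * t)) / math.sqrt(4.0 * math.pi * D * t)


def pde_residual(x, t, v, D, h=1e-3):
    """Central-difference estimate of d(rho)/dt + v d(rho)/dx - D d2(rho)/dx2."""
    rho = gaussian_density
    d_t = (rho(x, t + h, v, D) - rho(x, t - h, v, D)) / (2.0 * h)
    d_x = (rho(x + h, t, v, D) - rho(x - h, t, v, D)) / (2.0 * h)
    d_xx = (rho(x + h, t, v, D) - 2.0 * rho(x, t, v, D) + rho(x - h, t, v, D)) / h ** 2
    return d_t + v * d_x - D * d_xx


residual = pde_residual(x=0.7, t=1.0, v=0.5, D=0.3)
```

The residual is zero up to the truncation error of the differences, which is of second order in the step `h`. The maximum of the density sits at `x = v * t`, in agreement with the statement that the centre moves with the mean velocity.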

As another example, consider the thermal transport of particles through a
potential barrier. Let a system of non-interacting particles be in the region
of minimum potential energy (potential well). We choose the bottom of the well
to be the origin and assume that near the bottom of the well (fig. VI.3)
$U = \tfrac{1}{2}k_1x^2$. Particles in the potential well are in a state of
statistical equilibrium, so that

$$n\,dx = w(0)\,e^{-U/kT}\,dx .$$


Fig. VI.3


Near the point of minimum energy one can write

$$n\,dx = w(0)\,e^{-U/kT}\,dx \approx w(0)\,e^{-k_1x^2/2kT}\,dx ,$$

where $w(0)$ is the probability of finding the particle at the point $x = 0$.
The total number of particles in the well is

$$N = w(0)\int_{-\infty}^{\infty} e^{-k_1x^2/2kT}\,dx = w(0)\,(2\pi kT/k_1)^{1/2} .$$

We have extended the range of integration to the interval $(-\infty, \infty)$
in view of the rapid decrease of the integrand.
In the region x ~ 1 let the potential well border upon a potential barrier 


whose top is at the point x = 1. We shall assume that the barrier height $U_0$
satisfies the inequality

$$U_0 \gg kT. \qquad (10.16)$$


Behind the potential barrier the particles again get into a potential well. If the 
barrier height satisfies inequality (10.16), the number of particles penetrating 
the barrier is very small. In this case, thermal diffusion through the barrier is 
such a slow process that it can be considered as stationary. 

Making use of (10.14'), we find for the current $j$

$$j = \frac{D\,\left(w\,e^{U/kT}\right)\big|_{x=0}}{\int e^{U/kT}\,dx} = \frac{D\,w(0)}{\int e^{U/kT}\,dx} .$$


The potential energy near the maximum can be written in the form 







$$U = U_0 - \tfrac{1}{2}k_2(x - x_{\max})^2 .$$


In view of the rapid decrease of the integrand, the integration range can be
extended to the entire $x$-axis. We then find

$$\int e^{U/kT}\,dx = e^{U_0/kT}\int_{-\infty}^{\infty} e^{-k_2x^2/2kT}\,dx = e^{U_0/kT}\,(2\pi kT/k_2)^{1/2} .$$


Correspondingly

$$j = D\,w(0)\left(\frac{k_2}{2\pi kT}\right)^{1/2} e^{-U_0/kT}. \qquad (10.17)$$


Dividing by the total number of particles at the point x = 0, we find the 
probability that a particle which was initially in one potential well will pene- 
trate through the barrier and get into the other well: 





$$P = \frac{j}{N} = \frac{D\,(k_1k_2)^{1/2}}{2\pi kT}\,e^{-U_0/kT}. \qquad (10.18)$$


Formula (10.18) is applied for calculating the rates of chemical reactions. In
this case, $\Delta U$ represents the difference between the energies of the
initial and final products.
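
The structure of formula (10.18) can be illustrated by evaluating it directly and comparing it with the quotient of the two Gaussian integrals from which it was assembled. The sketch below uses dimensionless illustrative parameters chosen here; the function names are hypothetical and the quadrature is a plain trapezoidal rule.

```python
import math


def escape_rate(D, k1, k2, U0, kT):
    """Closed form (10.18): D * sqrt(k1 * k2) / (2 pi kT) * exp(-U0 / kT)."""
    return D * math.sqrt(k1 * k2) / (2.0 * math.pi * kT) * math.exp(-U0 / kT)


def gaussian_integral(k, kT, half_width=12.0, points=20001):
    """Trapezoidal estimate of the integral of exp(-k x^2 / (2 kT)) over the x-axis."""
    sigma = math.sqrt(kT / k)
    h = 2.0 * half_width * sigma / (points - 1)
    total = 0.0
    for n in range(points):
        x = -half_width * sigma + n * h
        weight = 0.5 if n in (0, points - 1) else 1.0
        total += weight * math.exp(-k * x * x / (2.0 * kT))
    return total * h


def escape_rate_from_integrals(D, k1, k2, U0, kT):
    """P = j / N with j = D w(0) / (barrier integral) and N = w(0) * (well integral)."""
    well = gaussian_integral(k1, kT)
    barrier = math.exp(U0 / kT) * gaussian_integral(k2, kT)
    return D / (well * barrier)


closed = escape_rate(D=1.0, k1=2.0, k2=3.0, U0=5.0, kT=1.0)
numeric = escape_rate_from_integrals(D=1.0, k1=2.0, k2=3.0, U0=5.0, kT=1.0)
```

Raising the barrier by one unit of `kT` lowers the rate by a factor `e`, which is the exponential temperature dependence familiar from reaction rates.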

In what follows we shall frequently have to deal with slow processes and 
shall see that their behaviour is described by equations of the Fokker—Planck 
type. 

In consequence of its general character, the Fokker—Planck equation gives 
no detailed information about the behaviour of a system of particles. It has a 
quasi-macroscopic character and contains unknown coefficients whose value 
must be determined experimentally or found on the basis of the kinetic
description of macroscopic systems.


§11. The basic kinetic equation 


The development of an arbitrary quasi-closed subsystem is defined by the 
equation (2.3) for the density matrix. It is impossible to obtain its exact solu- 
tion. Hence, instead of an exact equation for the density matrix, use is often 
made in physical kinetics of the so-called basic kinetic equation which we are 
now going to derive. 

Let us consider a closed system consisting of a large number, $N$, of
interacting particles. We assume that the interaction between the particles was
absent for times $t$, $-\infty < t < 0$, and was switched on at the instant of
time $t = 0$.

The wave function of the system at $t = 0$ can be represented by the expansion

$$\Psi_0 = \sum_i c_i^{(0)}\,\Psi_i ,$$

where $\Psi_i$ is the set of eigenfunctions of certain operators describing the
system of particles.

The interaction between the particles leads to a change in the state of the
system, and its wave function varies in time, $\Psi_0 \to \Psi(t)$. The wave
function $\Psi(t)$ can again be expanded in terms of the functions $\Psi_i$,

$$\Psi(t) = \sum_i c_i(t)\,\Psi_i .$$


We shall assume the interaction to be weak and shall make use of time depen- 
dent perturbation theory. 

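The probability that at an instant of time $t$ the system will be in the $i$th
state is given by the quantity $|c_i(t)|^2$.

According to (55.8) of Part V,

$$c_i(t) = c_i^{(0)} + c_i^{(1)}(t), \qquad (11.1)$$

where

$$c_i^{(1)}(t) = \sum_k H'_{ik}\,c_k^{(0)}\,\frac{1 - \exp[(i/\hbar)(\varepsilon_i - \varepsilon_k)t]}{\varepsilon_i - \varepsilon_k} .$$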


Here $H'_{ik}$ is the matrix element of the interaction operator $H'$ between
the wave functions $\Psi_i$ and $\Psi_k$.

We now pass from the quantum-mechanical description of the system to the
statistical one. For this, as was explained earlier, we replace the values
$|c_i(t)|^2$ by their mean values averaged over time, $\overline{|c_i|^2}$. In
this case we adopt the random phase hypothesis (2.7). In averaging the square
of the sum (11.1), all products $c_i^{(0)}c_k^{(0)*}$ and $c_i^{(0)*}c_k^{(0)}$
with $i \neq k$ reduce to zero. We then obtain

$$\overline{|c_i|^2} = |c_i^{(0)}|^2 + 2\sum_k |H'_{ik}|^2\,|c_k^{(0)}|^2\,D(t), \qquad (11.2)$$

where

$$D(t) = \frac{1 - \cos[(\varepsilon_i - \varepsilon_k)t/\hbar]}{(\varepsilon_i - \varepsilon_k)^2} . \qquad (11.3)$$







Formula (11.2) defines the probability of transition in time $t$ of particles
from the $k$th state into the $i$th state. The number of particles in the $k$th
state is proportional to $|c_k|^2$, so that

$$\Delta N_i^{+} = 2\sum_k |H'_{ik}|^2\,N_k(0)\,D(t) .$$

Here $\Delta N_i^{+}$ is the change in the number of particles in the $i$th
state, associated with it being occupied by particles which were in the $k$th
state. The balance of particles in the $i$th state can be written in the form

$$\begin{aligned}
\Delta N_i = \Delta N_i^{+} - \Delta N_i^{-} &= 2\sum_k |H'_{ik}|^2\,N_k(0)\,D(t) - 2\sum_k |H'_{ki}|^2\,N_i(0)\,D(t) \\
 &= 2\sum_k |H'_{ik}|^2\,\bigl(N_k(0) - N_i(0)\bigr)\,D(t) . \qquad (11.4)
\end{aligned}$$


The first term represents the number of particles getting into the $i$th state
in time $t$, the second term is the number of particles leaving it in the same
time. As we have seen in §56 of Part V, for $t \to \infty$ the factor $D(t)$ is
one of the representations of the $\delta$-function:

$$D(t) \to \frac{\pi}{\hbar}\,t\,\delta(\varepsilon_i - \varepsilon_k) \quad \text{as } t \to \infty. \qquad (11.5)$$

Assuming that the time $t$ is long enough for formula (11.5) to be used, but
very short from the macroscopic point of view, we can write

$$\Delta N_i = \frac{2\pi}{\hbar}\,t\sum_k |H'_{ik}|^2\,(N_k - N_i)\,\delta(\varepsilon_i - \varepsilon_k), \qquad (11.6)$$


or, since $N_i$ is a microscopically defined quantity, we can replace
$\Delta N_i/t$ by the derivative and write finally

$$\frac{dN_i}{dt} = \sum_k W_{ik}\,(N_k - N_i), \qquad (11.7)$$

where $W_{ik}$ denotes the probability of a transition from the $k$th into the
$i$th state:

$$W_{ik} = \frac{2\pi}{\hbar}\,|H'_{ik}|^2\,\delta(\varepsilon_i - \varepsilon_k) . \qquad (11.8)$$
Eq. (11.7) is called the basic kinetic equation. It plays a very important
role in physical kinetics. It can be rewritten in an equivalent form,
introducing instead of the number of particles the probability of occupation of
a given state, $w_i$. We then have

$$\frac{dw_i}{dt} = \sum_k W_{ik}\,(w_k - w_i) . \qquad (11.9)$$
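
The qualitative behaviour of the basic kinetic equation is easily seen in a small numerical example. The sketch below integrates (11.7) for three states with an assumed symmetric matrix of transition probabilities; the matrix entries, the initial occupation numbers and the forward Euler scheme are illustrative choices and not taken from the text.

```python
def master_rhs(N, W):
    """Right-hand side of (11.7): dN_i/dt = sum_k W[i][k] * (N[k] - N[i])."""
    n = len(N)
    return [sum(W[i][k] * (N[k] - N[i]) for k in range(n)) for i in range(n)]


def integrate_master(N0, W, dt, steps):
    """Forward Euler integration of the kinetic equation (assumed scheme)."""
    N = list(N0)
    for _ in range(steps):
        rhs = master_rhs(N, W)
        N = [N[i] + dt * rhs[i] for i in range(len(N))]
    return N


# Symmetric transition probabilities, W[i][k] == W[k][i] (assumed values).
W = [[0.0, 1.0, 0.5],
     [1.0, 0.0, 2.0],
     [0.5, 2.0, 0.0]]
N_initial = [90.0, 10.0, 0.0]
N_final = integrate_master(N_initial, W, dt=1e-3, steps=20000)
```

With symmetric transition probabilities the total number of particles is conserved and the occupation numbers relax towards equal values, whatever the initial distribution. The relaxation is one-directional in time although the matrix itself is symmetric.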


It should be stressed, first of all, what the difference is between the kinetic
equation (11.7) and the exact equation for the density matrix (2.4). The
kinetic equation contains only the probability $w_i$ but no probability
amplitudes. In other words, it contains only the diagonal elements of the
density matrix $\rho_{nn}$.

We have presented its derivation, reproducing to a considerable degree that 
of formula (56.11) given in §56 of Part V, in order to elucidate the assump- 
tions made and the limits of applicability of the basic kinetic equation. 

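Let us begin by considering the applicability of formula (11.5). The
transition to a $\delta$-function can be made if the interval of values
$\Delta\varepsilon = \varepsilon_i - \varepsilon_k$ for all $i$ and $k$
satisfies the condition $\Delta\varepsilon \cdot t \gg \hbar$. For a
macrosystem $\Delta\varepsilon \lesssim kT$, so that

$$t \gg \hbar/kT. \qquad (11.10)$$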
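
Two numbers make this condition more concrete. The sketch below evaluates the time scale on the right-hand side of (11.10) at an assumed room temperature of 300 K, and checks numerically, in units with Planck's constant set to unity, that the area under the factor `(1 - cos(x t)) / x**2` grows as `pi * t`, which is what its replacement by a delta function with that weight requires. The cutoff and the grid are arbitrary choices made for the example.

```python
import math

HBAR = 1.054571817e-34   # J s
K_B = 1.380649e-23       # J / K


def quantum_time_scale(T):
    """Return hbar / (k T) in seconds for a temperature T in kelvin."""
    return HBAR / (K_B * T)


def d_factor_area(t, cutoff=200.0, points=200000):
    """Midpoint-rule integral of (1 - cos(x t)) / x^2 over -cutoff < x < cutoff.

    Here x stands for the energy difference in units with hbar = 1.
    """
    h = cutoff / points
    total = 0.0
    for k in range(points):
        x = (k + 0.5) * h
        total += (1.0 - math.cos(x * t)) / (x * x)
    return 2.0 * total * h      # the integrand is even in x


tau_room = quantum_time_scale(300.0)
area_ratio = d_factor_area(5.0) / (math.pi * 5.0)
```

At 300 K the time scale is of the order of 1e-14 s, so the condition leaves ample room for times that are still very short macroscopically. The area ratio falls slightly short of unity only because of the finite cutoff.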


As we have stressed, the laws of quantum mechanics are reversible and do not
change under the substitution $t \to -t$. This is reflected in the symmetry of
transition probabilities or in the principle of detailed balance:

$$W_{ik} = W_{ki}. \qquad (11.11)$$


We have also seen in quantum mechanics (see §98 of Part V) that the 
principle of detailed balance is not associated with the application of pertur- 
bation theory nor with the concrete form of H’ in formula (11.9), but is of a 
general character. 

The irreversibility of processes in kinetics arises in averaging the
coefficients $|c_i|^2$ over time. This averaging is based on the random phase
hypothesis underlying the derivation of the kinetic equation given above.

The random phase hypothesis is used not only for the initial state of the 
system but also for all other states. In essence, the picture of development 
described by the kinetic equation (11.7) amounts to the following: 

In a certain time $t$ the interaction brings the system from the initial state
1, representing a mixed state, into state 2. Then phases get mixed, and
thereupon the system goes over from the mixed state 2 into state 3 and so on.
The application of perturbation theory restricts the time $t$ over which
formula (11.8) can be used.

Suppose the interaction has the character of collisions and is characterized
by a certain time $\tau$. The transition probability must then satisfy the
condition $W\tau \ll 1$. This inequality can be written in the form







Ae kT 
h h- 

It should be noted that, in spite of the wide range of applicability of the 
basic kinetic equation, the reasoning and assumptions underlying it do not 
appear to be quite cogent. 

The picture of the mixing of phases does not seem to be well substantiated 
and even seems inconsistent. Indeed, if after a transition the system “forgets” 
its past, then a unidirectional development in time cannot be understood. 

In recent studies there has been success in obtaining a much more cogent 
derivation of the basic kinetic equation, allowing one not only to give up the 
phase mixing hypothesis but also to give an answer to the general question of 
the nature of irreversible processes. 

We cannot expound the modern theory and give the derivation of the kine- 
tic equation within the framework of this book. Only the most important 
concepts of this theory can be presented here*. 

We have emphasized that the development in time of macroscopic systems 
possessing a very large number of degrees of freedom differs essentially from 
the corresponding processes in microscopic systems possessing a continuous 
spectrum. 

A system undergoing scattering in the continuous spectrum has an infinite- 
ly large density of states or, in other words, possesses an infinitely large num- 
ber of degrees of freedom. However, the character of scattering processes is 
essentially different from that of processes in macroscopic systems. Scattering 
processes are strictly reversible in time, whereas processes in non-equilibrium 
macroscopic systems are always irreversible. Thus the presence of a large 
number of degrees of freedom in the system does not in itself mean irreversi- 
bility. It turns out that the difference between macroscopic systems and 
microscopic systems with a large number of degrees of freedom is associated 
with particular features of the form of the Hamiltonian. 

Let us consider a closed macroscopic system which at the initial instant of
time is in a pure state. The Hamiltonian of the system of particles can be
written in the form

$$H = H_0 + \lambda U, \qquad (11.13)$$

where $H_0$ is the Hamiltonian in the absence of any interaction, $U$
represents the interaction energy, and $\lambda$ is a small parameter.
In scattering processes the interaction energy has singularities at certain 


1 
do (11.12) 


* See, for example, G.V.Chester, The theory of irreversible processes, Rep. Progress in 
Physics V, ch. XXVI, p. 411 (1963). 




points of space, whereas in macroscopic systems U is distributed in space 
throughout the volume of the system. This is the first specific feature of 
macrosystems. 

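In the Heisenberg representation

$$A(t) = e^{(i/\hbar)Ht}\,A(0)\,e^{-(i/\hbar)Ht}, \qquad (11.14)$$

so that

$$\bar{A}(t) = \mathrm{Tr}\,\{e^{(i/\hbar)Ht}\,A(0)\,e^{-(i/\hbar)Ht}\,\rho(0)\}. \qquad (11.15)$$

For a Hamiltonian of the form of (11.13) one can write the expansion

$$e^{-(i/\hbar)Ht} = e^{-(i/\hbar)H_0t}\left[1 + \sum_{n=1}^{\infty}\left(-\frac{i\lambda}{\hbar}\right)^{n}\int_0^{t}dt_1\int_0^{t_1}dt_2\cdots\int_0^{t_{n-1}}dt_n\;U(t_1)\,U(t_2)\cdots U(t_n)\right], \qquad (11.16)$$

where $U(t) = e^{(i/\hbar)H_0t}\,U\,e^{-(i/\hbar)H_0t}$. If one substitutes the
expansions (11.16) for $e^{(i/\hbar)Ht}$ and $e^{-(i/\hbar)Ht}$ into formula
(11.15), then the latter will contain terms of the form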


2 Tr { ULUp(0)} = 2 2 (0) Y; l ULUI Yp Ygl ULUI Yp, (11.17) 


where $\Psi_i$ is the set of basis functions of the Hamiltonian without
interaction $H_0$.

Terms proportional to higher powers of A will have an analogous form. 

Study of expressions of the type (11.17) shows that with certain restrictions
imposed upon the form of the operator $U$ they display a characteristic
behaviour: owing to the fact that the interaction energy is distributed
throughout the volume of the system of particles, the number of terms in
excited states in sum (11.17) for $i = l$ is $N$ times larger than for
$i \neq l$.

For a system with a very large number of particles, $N \to \infty$ ($N/V$
being finite), this behaviour of sum (11.17) leads to the appearance of the
factor $\delta(\Psi_i - \Psi_l)$. This allows one to retain the sequence of
major terms in the total expression for the mean $\bar{A}(t)$ representing an
infinite series. Summing these major terms leads automatically to the kinetic
equation (11.7). This behaviour of the matrix elements in (11.7) is
characteristic only of systems with the property of $U$ mentioned. It does not
take place for the corresponding scattering processes in microsystems.

In this derivation it has been shown that no application of the random 
phase hypothesis to intermediate states is needed. A statistical description of 
the initial state assumes the random phase distribution only for this state. 







Finally, the notion of a weak interaction was defined. That is, weakness of 
the interaction means that the duration of the process of interaction (‘colli- 
sion’) must be small in comparison with the time lapse between two consecu- 
tive interactions. 

Studies of this kind have confirmed not only the validity and the wide 
region of applicability of the basic kinetic equation but also established its 
relation to the time equation for the density matrix. We shall return to this 
problem later. 


§12. Discussion of the basic kinetic equation and some simple examples 


We now pass on to the discussion of the basic kinetic equation (11.7). Its
simplicity is usually only apparent. Transition probabilities $W_{ik}$ depend
on the numbers $N_i$ and $N_k$. This will be particularly clearly seen in the
following sections, where it will be shown that a change of state of the
particles of an ideal gas arises from collisions, primarily pair collisions. In
this last case the probability of transition is proportional to the number of
colliding particles in the
two states. If one passes from summation to integration, then the kinetic 
equation (11.7) will transform into a non-linear integro-differential equation. 
Hence there is only a very limited number of cases in which one can obtain 
solutions, if only approximate, of the kinetic equation. Some of these will be 
found below. 

We are going to dwell on some simple applications of the basic kinetic 
equation. 

Let us consider, first of all, a simple set of atoms which can be in two
states, 1 and 2. Let the energies of these states be $\varepsilon_1$ and
$\varepsilon_2$. If this system is closed and in a state of statistical
equilibrium, then the following equalities hold:

$$\frac{dN_1}{dt} = -W_{12}N_1 + W_{21}N_2 = 0, \qquad (12.1)$$

$$\frac{dN_2}{dt} = W_{12}N_1 - W_{21}N_2 = 0. \qquad (12.2)$$

One can apply the principle of detailed balance directly to the closed system
and obtain the obvious equality $N_1 = N_2$.

One of the main problems in the theory of irreversible processes is the
solution of the question: how can the irreversibility of macroscopic phenomena
arise if the behaviour of microscopic particles is strictly reversible? At the
same time the theory of irreversible processes is supposed to answer a lot of
more practical questions in such areas as the laws of kinetic behaviour of
various macroscopic systems, including the calculation of kinetic coefficients.
Within the last ten years great progress has been made in both these
directions. The basic kinetic equation was derived in a quite satisfactory way.
Further on we shall return to the question of irreversibility in the example of
the simplest physical system, i.e. the ideal gas.

Let us now consider a system of atoms in a state of statistical equilibrium
with a reservoir. The principle of detailed balance is not directly applicable
to such a system. It is applicable only to the closed system 'the set of atoms
+ the reservoir'.

For the number of atoms in levels 1 and 2 one can write the same expressions
(12.1) and (12.2). However, in these $W_{12} \neq W_{21}$. $W_{12}$ now denotes
the probability of change of state of the closed system, corresponding to the
transition


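$$\left\{\begin{matrix}\text{atom in state 1}\\ \text{reservoir in one of the states with energy } E_0\end{matrix}\right\} \to \left\{\begin{matrix}\text{atom in state 2}\\ \text{reservoir in any of the states with energy } E_0 - (\varepsilon_2 - \varepsilon_1)\end{matrix}\right\} .$$

Correspondingly, $W_{21}$ represents the probability of the transition

$$\left\{\begin{matrix}\text{atom in state 2}\\ \text{reservoir in any of the states with energy } E_0 - (\varepsilon_2 - \varepsilon_1)\end{matrix}\right\} \to \left\{\begin{matrix}\text{atom in state 1}\\ \text{reservoir in one of the states with energy } E_0\end{matrix}\right\} .$$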


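Since the states of the reservoir in the transitions $1 \to 2$ and $2 \to 1$
can be different, it is clear that the principle of detailed balance cannot be
applied to these transitions.

We can apply the canonical distribution to the probabilities $W_{12}$ and
$W_{21}$, writing

$$\frac{W_{12}}{W_{21}} = \frac{\Omega(E_0 - (\varepsilon_2 - \varepsilon_1))}{\Omega(E_0)} = e^{\sigma(E_0 - (\varepsilon_2 - \varepsilon_1)) - \sigma(E_0)} \approx e^{-(d\sigma/dE)(\varepsilon_2 - \varepsilon_1)} = e^{(\varepsilon_1 - \varepsilon_2)/kT} .$$

Hence we find

$$W_{12} = W_{21}\,e^{-(\varepsilon_2 - \varepsilon_1)/kT}. \qquad (12.3)$$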







Then (12.1) and (12.2) lead to the Gibbs distribution:

$$N_2^{(0)} = N_1^{(0)}\,\frac{W_{12}}{W_{21}} = N_1^{(0)}\,e^{-(\varepsilon_2 - \varepsilon_1)/kT}. \qquad (12.4)$$


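Let us now consider the same system in a reservoir, which at the instant of
time $t$ is in a non-equilibrium state. Then

$$\frac{dN_1}{dt} = -N_1W_{12} + W_{21}N_2 = -\frac{dN_2}{dt} . \qquad (12.5)$$

Suppose that the system is in a state which is near equilibrium. Then we have:

$$N_1 = N_1^{(0)} + N_1', \quad N_1' \ll N_1^{(0)}, \qquad (12.6)$$

$$N_2 = N_2^{(0)} + N_2', \quad N_2' \ll N_2^{(0)}. \qquad (12.7)$$

In this case we find

$$\frac{dN_1'}{dt} = -W_{12}N_1' + W_{21}N_2' = -\frac{dN_2'}{dt} . \qquad (12.8)$$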

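In the kinetic equations (12.8) one can substitute for $W_{12}$ or $W_{21}$ the
equilibrium relation (12.3). Then

$$-\frac{d(N_2' - N_1')}{dt} = 2W_{21}\left(N_2' - N_1'\,e^{-(\varepsilon_2 - \varepsilon_1)/kT}\right) .$$

For high temperatures $(\varepsilon_2 - \varepsilon_1)/kT \ll 1$ and,
consequently,

$$-\frac{d(N_2' - N_1')}{dt} \approx 2W_{21}(N_2' - N_1') = \frac{N_2' - N_1'}{\tau} ,$$

hence

$$N_2' - N_1' = (N_2' - N_1')_0\,e^{-t/\tau} .$$

The number of systems in the non-equilibrium state decreases exponentially
with a relaxation time $\tau$ equal to

$$\tau = 1/2W_{21} . \qquad (12.9)$$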
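
The relaxation of the two-level system can be followed numerically. The sketch below integrates (12.5) with the upward probability fixed by (12.3); the numerical values, the forward Euler scheme and the function names are assumptions made for illustration. In the high-temperature regime chosen here the deviation from equilibrium should fall by about a factor `e` after the time `1 / (2 * W21)`.

```python
import math


def upward_rate(W21, gap, kT):
    """W12 obtained from relation (12.3)."""
    return W21 * math.exp(-gap / kT)


def equilibrium_populations(N_total, gap, kT):
    """Populations satisfying N2 / N1 = exp(-gap / kT), as in (12.4)."""
    ratio = math.exp(-gap / kT)
    N1 = N_total / (1.0 + ratio)
    return N1, N_total - N1


def relax_two_level(N1, N2, W21, gap, kT, dt, steps):
    """Forward Euler integration of (12.5) (assumed scheme)."""
    W12 = upward_rate(W21, gap, kT)
    for _ in range(steps):
        flow = (W12 * N1 - W21 * N2) * dt    # net number of atoms going 1 -> 2
        N1 -= flow
        N2 += flow
    return N1, N2


W21, gap, kT = 1.0, 0.01, 1.0                # gap / kT << 1 (assumed values)
N1_eq, N2_eq = equilibrium_populations(1000.0, gap, kT)
N1_t, N2_t = relax_two_level(N1_eq - 50.0, N2_eq + 50.0, W21, gap, kT,
                             dt=1e-4, steps=5000)   # evolve for t = 0.5 = 1/(2 W21)
decay = (N2_t - N2_eq) / 50.0
```

Without the high-temperature approximation the deviation decays with the rate `W12 + W21`, which reduces to `2 * W21` when the level spacing is small compared with `kT`.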
Let us pass to the consideration of a more complex system, where a reservoir
contains a system of atoms and radiation.

Atoms in state 1 of energy $\varepsilon_1$ can absorb photons of energy
$h\nu = \varepsilon_2 - \varepsilon_1$. The probability of such an absorption
is

$$W_{12} = B_{12}\,\rho(\nu, T),$$


where $\rho(\nu, T)$ is the energy density per unit frequency interval (see §73
of Part III), and $B_{12}$ is the factor of proportionality. A decrease in the
number of atoms in state 1 is given by the expression

$$-\frac{dN_1}{dt} = B_{12}\,\rho(\nu, T)\,N_1 . \qquad (12.10)$$

Atoms in state 2 go over into state 1 as a result of two processes:
spontaneous and induced emission. Spontaneous emission takes place in the
absence as well as presence of radiation. We shall denote the probability of
spontaneous emission for the transition $2 \to 1$ by $A_{21}$.

On the basis of the principle of detailed balance an emission process inverse
to the absorption process must take place in a system of atoms interacting with
radiation. This emission is called induced emission. The probability of induced
emission is equal to

$$W_{21} = B_{21}\,\rho(\nu, T),$$

where

$$B_{21} = B_{12} .$$

Hence the change in the number of atoms in state 2 is given by the formula

$$-\frac{dN_2}{dt} = A_{21}N_2 + B_{21}\,\rho(\nu, T)\,N_2 . \qquad (12.11)$$
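If the system (atoms + radiation) is in equilibrium, then

$$\frac{dN_1}{dt} = \frac{dN_2}{dt}$$

and

$$N_1B_{12}\,\rho(\nu, T) = A_{21}N_2 + B_{21}\,\rho(\nu, T)\,N_2 ,$$

or

$$\frac{N_2}{N_1} = \frac{B_{12}\,\rho(\nu, T)}{A_{21} + B_{12}\,\rho(\nu, T)} . \qquad (12.12)$$

On the other hand, for a state of equilibrium $N_2/N_1$ is expressed by
Boltzmann's formula. Hence for the equilibrium density of radiation energy we
find

$$\rho(\nu, T) = \frac{A_{21}}{B_{12}}\,\frac{1}{e^{(\varepsilon_2 - \varepsilon_1)/kT} - 1} = \frac{A_{21}}{B_{12}}\,\frac{1}{e^{h\nu/kT} - 1} . \qquad (12.13)$$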







We thus arrive at the Planck formula. 

The ratio A 5,/B,> can be found by passing to the Rayleigh—Jeans formula 
(i.e. going to a small value of hv/kT <1). 

The derivation of the Planck formula presented above was given by Ein- 
stein. Its generalization to the case of atoms with an arbitrary number of 


levels presents no difficulty. 
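
The argument above can be checked numerically. The following is a minimal sketch, not part of the original derivation: it assumes the standard value $A_{21}/B_{12} = 8\pi h\nu^3/c^3$ obtained from the Rayleigh-Jeans limit, and the function names are illustrative only. It verifies that (12.13) reduces to the Rayleigh-Jeans density for $h\nu \ll kT$, and that (12.12) then reproduces the Boltzmann ratio $N_2/N_1 = e^{-h\nu/kT}$.

```python
import math

H = 6.62607015e-34   # Planck constant, J s
K = 1.380649e-23     # Boltzmann constant, J/K
C = 2.99792458e8     # speed of light, m/s


def a_over_b(nu):
    """A21/B12 as fixed by the Rayleigh-Jeans limit (assumed standard value)."""
    return 8.0 * math.pi * H * nu**3 / C**3


def planck_density(nu, temp):
    """rho(nu, T) from (12.13): (A21/B12) / (exp(h nu / kT) - 1)."""
    return a_over_b(nu) / math.expm1(H * nu / (K * temp))


def rayleigh_jeans_density(nu, temp):
    """Classical limit of the energy density, valid for h nu << kT."""
    return 8.0 * math.pi * nu**2 * K * temp / C**3


def population_ratio(nu, temp):
    """N2/N1 from (12.12) with B21 = B12; only the ratio A21/B12 enters."""
    rho = planck_density(nu, temp)
    return rho / (a_over_b(nu) + rho)


if __name__ == "__main__":
    nu, temp = 1.0e9, 300.0          # radio frequency, room temperature
    print(planck_density(nu, temp) / rayleigh_jeans_density(nu, temp))
    print(population_ratio(1.0e13, temp), math.exp(-H * 1.0e13 / (K * temp)))
```

The first printed ratio is close to unity because $h\nu/kT$ is of order $10^{-4}$ at these values; the second line compares the two sides of the detailed-balance condition.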


§13. Non-equilibrium systems with a negative temperature. The amplification 
of electromagnetic waves by such systems 


Statistical systems possessing a finite number of levels have certain re- 
markable features. We stress, first of all, that if all energy levels of the system 
lie in a limited energy interval, then the argument given in §17 of Part III, 
concerning essentially positive values of the temperature, loses validity.
Indeed, in §17 of Part III it was stated that if the temperature $\theta$ were less than
zero, then the Gibbs distribution could not be made to obey the normalization
condition and would be meaningless. However, if the energy runs over
only a finite sequence of values, normalization of the Gibbs distribution can
be carried out for any value of $\theta$.

Let us consider a system of N atoms with a finite number, n, of energy 
levels, placed in a reservoir. Although in reality atoms do not have a finite 
number of levels, for subsequent reasoning it is sufficient that a group of 
close levels be separated from other levels by a large enough energy interval. 

At $T = 0$ all atoms are at the lowest energy level. For $kT \gg \epsilon_n - \epsilon_1$
the level distribution becomes uniform. For intermediate
values of temperature the number of atoms in a level with given energy, often 
called the ‘level population’, is defined by formula (12.4). 

Let us now imagine that the atoms are not left to themselves but that ener- 
gy is supplied to them from outside. We shall consider below one of the ways 
in which, in practice, energy can be supplied to a system of atoms. If in the 
system of atoms there is also a mechanism of energy loss, then under certain 
conditions the system will get into a non-equilibrium but stationary state. The 
amounts of energy supplied and lost will be equal to each other, and a time- 
independent energy distribution of atoms will be established in the system. 
This distribution will be different from the equilibrium one. The level popu- 
lation of the system will be different from (12.4). Namely, under certain con- 
ditions, in particular for a sufficiently large value of the energy supplied to 
the system, the upper level population may become larger than the lower 
level population. 


If we wished to describe the state of our non-equilibrium system in terms
of the Gibbs distribution (12.4), we would have to say that the system has a negative
temperature. Such terminology is very convenient for describing stationary states.

Obviously, the behaviour of the temperature scale is defined by the fol- 
lowing reasoning, which we shall, for simplicity, carry out for the example of 
a system with two levels. 

At $T = 0$ all atoms are at the lower level, so that the entropy of the system
is equal to zero. When $T \to \infty$ (actually when $kT \gg \epsilon_2 - \epsilon_1$) both levels are
uniformly occupied, and the entropy is a maximum. It is clear, however, that for
$T \to -\infty$ the same result is obtained: the population of the upper level is
equal to that of the lower one. When $|T|$ decreases in the region of negative
temperatures the particles progressively go over to the upper level, and for
$T \to -0$ all the particles turn out to be at this level, whereas the lower level is
not occupied at all. The entropy of the system is again equal to zero.
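
The behaviour of the temperature scale described above can be illustrated with a short sketch. It assumes a two-level system with non-degenerate levels obeying the Gibbs distribution, and uses the dimensionless variable $x = (\epsilon_2 - \epsilon_1)/kT$, so that $x < 0$ corresponds to a negative temperature; the function names are hypothetical.

```python
import math


def upper_fraction(x):
    """Fraction of atoms in the upper level for x = (eps2 - eps1)/(kT).

    From N2/N1 = exp(-x); negative x corresponds to negative temperature.
    """
    return 1.0 / (1.0 + math.exp(x))


def entropy_per_atom(x):
    """Entropy per atom in units of k: -sum(w ln w) over the two levels."""
    w2 = upper_fraction(x)
    return -sum(w * math.log(w) for w in (w2, 1.0 - w2) if w > 0.0)


if __name__ == "__main__":
    # x = +50: T -> +0;  x = 0: |T| -> infinity;  x = -50: T -> -0
    for x in (50.0, 1.0, 0.0, -1.0, -50.0):
        print(x, upper_fraction(x), entropy_per_atom(x))
```

The entropy vanishes at both ends of the scale ($T \to +0$ and $T \to -0$) and reaches its maximum $\ln 2$ per atom at $|T| \to \infty$, where both levels are equally populated.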

Let us now consider quantitatively one of the possible methods of ob- 
taining systems with negative temperature. 

Let there be, for example, a plasma in which electrons and atoms (or ions)
have different temperatures (i.e. different mean kinetic energies), and let the
atoms have three energy levels $\epsilon_1$, $\epsilon_2$ and $\epsilon_3$. We shall write the kinetic
equation for the number of atoms in these levels.

In collisions with electrons, atoms go over to an upper level with probabilities
$W_{ik}$ ($k > i$). Corresponding to such collisions there is transformation of
kinetic energy of the electrons into internal energy of the atoms (collisions of
the first kind). In addition, on the basis of the principle of detailed balance,
there occur collisions of a second kind in which the internal energy of the
atoms is transferred to the electrons with probability $W_{ki}$ ($k > i$). Finally,
atoms in excited states can emit energy with probability $W_{\mathrm{emis}}$.

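Taking into account these processes, we can write for the number of excited
particles

$$-\frac{dN_1}{dt} = W_{12}N_1 + W_{13}N_1 - W_{21}N_2 - W_{31}N_3 - W_{\mathrm{emis}}^{(21)}N_2 - W_{\mathrm{emis}}^{(31)}N_3,$$

$$-\frac{dN_2}{dt} = W_{21}N_2 + W_{23}N_2 + W_{\mathrm{emis}}^{(21)}N_2 - W_{12}N_1 - W_{32}N_3 - W_{\mathrm{emis}}^{(32)}N_3,$$

$$-\frac{dN_3}{dt} = W_{31}N_3 + W_{32}N_3 + W_{\mathrm{emis}}^{(31)}N_3 + W_{\mathrm{emis}}^{(32)}N_3 - W_{13}N_1 - W_{23}N_2.$$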


In the case where excitation by electrons is compensated by de-excitation
by collisions of the second kind and emission of radiation, a stationary state
arises in which all derivatives with respect to time reduce to zero. Assuming
the probability of transition from the second to the third level $W_{23} = W_{32}$ to
be equal to zero, and also setting $W_{\mathrm{emis}}^{(32)}$ equal to zero (forbidden transition),
we obtain from the second equation

$$-W_{12}N_1 + W_{21}N_2 + W_{\mathrm{emis}}^{(21)}N_2 = 0.$$

If, furthermore, the probability of emission $W_{\mathrm{emis}}^{(21)}$ is larger than the
probability of collision of the second kind $W_{21}$, then

$$N_2 = \frac{W_{12}}{W_{\mathrm{emis}}^{(21)}}\,N_1.$$

Under the condition $W_{\mathrm{emis}}^{(31)} \gg W_{31}$ the third equation gives

$$N_3 = \frac{W_{13}}{W_{\mathrm{emis}}^{(31)}}\,N_1 = \frac{W_{\mathrm{emis}}^{(21)}}{W_{\mathrm{emis}}^{(31)}}\,\frac{W_{13}}{W_{12}}\,N_2.$$
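For the probability ratio $W_{13}/W_{12}$ one can write a formula analogous to
(12.3) which involves the temperature of the reservoir with which the energy
transfer takes place, in the given case an electron gas with temperature $T_{\mathrm{el}}$:

$$W_{12} = W_{21}\,e^{-(\epsilon_2 - \epsilon_1)/kT_{\mathrm{el}}},$$

$$W_{13} = W_{31}\,e^{-(\epsilon_3 - \epsilon_1)/kT_{\mathrm{el}}},$$

$$\frac{W_{13}}{W_{12}} = \frac{W_{31}}{W_{21}}\,e^{-(\epsilon_3 - \epsilon_2)/kT_{\mathrm{el}}}.$$

Hence, finally,

$$N_2 = \frac{W_{21}}{W_{\mathrm{emis}}^{(21)}}\,e^{-(\epsilon_2 - \epsilon_1)/kT_{\mathrm{el}}}\,N_1, \tag{13.1}$$

$$N_3 = \frac{W_{\mathrm{emis}}^{(21)}}{W_{\mathrm{emis}}^{(31)}}\,\frac{W_{31}}{W_{21}}\,e^{-(\epsilon_3 - \epsilon_2)/kT_{\mathrm{el}}}\,N_2. \tag{13.2}$$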


The final formula involves the ratios of probabilities of two possible modes 
of transition from levels 2 and 3, via emission, and via collisions of the second 
kind. By assumption, the first probability is much higher than the second. 
Since $\epsilon_2 > \epsilon_1$, it follows from (13.1) that $N_2 < N_1$. On the contrary, although
$\epsilon_3 > \epsilon_2$, this formula leads to values $N_3 \gg N_2$ for a sufficiently large value of
the factor in front of the exponential in (13.2). This means that, owing to the 
fact that the transition from the third to the second level is forbidden, colli- 
sions with electrons bring a larger number of atoms to level 3 than to level 2. 
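
As a rough numerical illustration of (13.1) and (13.2), the sketch below evaluates the stationary ratios $N_2/N_1$ and $N_3/N_2$. All rate values, level energies and the function name are hypothetical and chosen only so that the emission probabilities exceed the probabilities of collisions of the second kind, as assumed in the text.

```python
import math


def stationary_ratios(w21, w31, wemis21, wemis31, eps, kt_el):
    """Return (N2/N1, N3/N2) according to (13.1) and (13.2).

    w21, w31         -- probabilities of collisions of the second kind
    wemis21, wemis31 -- emission probabilities for 2->1 and 3->1
    eps              -- level energies (eps1, eps2, eps3)
    kt_el            -- k * T_el of the electron gas, same units as eps
    """
    e1, e2, e3 = eps
    n2_over_n1 = (w21 / wemis21) * math.exp(-(e2 - e1) / kt_el)
    n3_over_n2 = (wemis21 / wemis31) * (w31 / w21) * math.exp(-(e3 - e2) / kt_el)
    return n2_over_n1, n3_over_n2


if __name__ == "__main__":
    # Illustrative numbers only: level 2 empties by emission ten times
    # faster than level 3, and both emissions dominate over collisions.
    r21, r32 = stationary_ratios(
        w21=1.0, w31=1.0, wemis21=100.0, wemis31=10.0,
        eps=(0.0, 1.0, 1.5), kt_el=1.0,
    )
    print("N2/N1 =", r21)
    print("N3/N2 =", r32)
```

With these assumed values $N_2/N_1 < 1$ while $N_3/N_2 > 1$, i.e. the population of level 3 exceeds that of level 2 because the pre-exponential factor in (13.2) outweighs the Boltzmann factor.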

Thus level 3 has a negative temperature with respect to level 2. We stress 
that the procedure considered for obtaining negative temperatures is not the 
most important and most widely used in practice. However, it makes it possi- 
ble to elicit the physical conditions necessary for this in the simplest way. 

We can now pass on to the properties of systems with negative temperature 
mentioned above. 

If the interaction of a system with radiation is considered for T < 0, then 
it is immediately clear that this interaction differs essentially from that of a 
system at a temperature $T > 0$. Suppose that the system is acted upon by
monochromatic radiation of frequency

$$\nu = \Delta\epsilon/h = (\epsilon_3 - \epsilon_2)/h.$$


This radiation will be absorbed, and the intensity $I^-$ absorbed will be
proportional to the number of atoms in state 2 and to the radiation density:

$$I^- = B_{23} N_2 \rho.$$

In addition to the absorption, spontaneous and induced emission by
atoms in state 3 will take place. The emitted radiation intensity is equal to
(see (80.14))

$$I^+ = B_{32} N_3 \rho + A_{32} N_3.$$

The difference between the intensities is

$$I^+ - I^- = B_{32}\,\rho\,(N_3 - N_2) + A_{32} N_3.$$

If $N_3 - N_2 > 0$, i.e. if level 3 has a negative temperature with respect to level
2, then, on passing through the system, the radiation will not be attenuated 
by absorption but amplified owing to stimulated emission. 
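
The sign change of the induced part of the intensity balance can be made explicit with a small sketch. It assumes $B_{23} = B_{32}$, as in the two-level case treated in §12; the numerical values and function names are hypothetical.

```python
def induced_balance(n2, n3, rho, b32):
    """Induced emission minus absorption, B32 * rho * (N3 - N2)."""
    return b32 * rho * (n3 - n2)


def net_intensity(n2, n3, rho, b32, a32):
    """I+ - I- = B32 * rho * (N3 - N2) + A32 * N3, assuming B23 = B32."""
    return induced_balance(n2, n3, rho, b32) + a32 * n3


if __name__ == "__main__":
    # Normal population (N3 < N2): the induced part attenuates the beam.
    print(induced_balance(n2=1.0, n3=0.1, rho=2.0, b32=1.0))
    # Inverted population (N3 > N2): the induced part amplifies it.
    print(induced_balance(n2=0.1, n3=1.0, rho=2.0, b32=1.0))
```

Only the induced term is proportional to the radiation density, which is why amplification of an incident wave requires $N_3 > N_2$ and cannot be obtained from spontaneous emission alone.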

The operation of quantum-mechanical amplifiers and generators (masers 
with molecular systems or lasers with atomic systems) is based on this effect. 
They are applied ever more widely in modern radio engineering. Lasers are 
the most effective generators in the infra-red and optical regions nowadays. 








The Kinetic Theory of Gases 
and Gas-Like Systems 


§14. Boltzmann’s kinetic equation 


We have emphasized above that solving the basic kinetic equation is a mat- 
ter of considerable difficulty. Hence, major practical results have been ob- 
tained by considering those physical systems for which the basic kinetic equa- 
tion can be replaced by a simpler kinetic equation. 

The basic kinetic equation defines the numbers $N_i$, or the state distribution
of particles, taking into account all the connections and interactions
isting between the particles of the system. 

However, it is clear that in a number of cases such a description of macro- 
systems is too detailed. This is particularly the case for ideal gases. 

In the classical approximation, to which we shall restrict ourselves in the 
following, instead of enumerating the number of particles in a given state we 
can describe the state of the system by a continuous distribution function. 
The latter, by virtue of the absence of interaction between the particles, may 
be separated into a product of distribution functions of individual particles. 
The definition of the distribution function of an individual particle allows 
one to describe completely the properties of the ideal gas as a whole. 

Let us consider the distribution function for molecules of a non- 
equilibrium ideal gas. We know that in an ideal gas each molecule can be con- 
sidered to be a quasi-closed subsystem. 




As distinct from the case of an equilibrium gas, the distribution function
depends in general on coordinates $x, y, z$, momentum components $p_x, p_y, p_z$
and time $t$. In what follows it will be convenient to introduce the following
notation. Let $dn$ be the number of molecules whose representative points lie
in a phase-space element

$$d\gamma = dx\,dy\,dz\,dp_x\,dp_y\,dp_z = d\mathbf{p}\,dV$$

at the instant of time $t$. Then

$$dn = f(\mathbf{r}, \mathbf{p}, t)\,d\gamma,$$

where $f(\mathbf{r}, \mathbf{p}, t)$ is the distribution function sought. In the following we shall
make use of the classical approximation. The change in time of the quantity
$dn$ (i.e. change in the number of representative points lying in a volume
element $d\gamma$) is determined by collisions between the molecules of the gas.

If as a result of a collision between two molecules having momenta $\mathbf{p}_2$ and
$\mathbf{p}_3$ one of them acquires momentum $\mathbf{p}$, then its representative point will enter
the phase-space element $d\gamma$. If, on the contrary, a molecule having momentum
$\mathbf{p}$ collides with another molecule and acquires a new momentum, its
representative point leaves the volume $d\gamma$. It is obvious that the larger the
volume $d\gamma$, the larger (other things being equal) the number of molecules
whose representative points come in and go out of this volume per unit time.
The change in the number of particles in an element of phase volume per unit
time can be written in the form

$$\frac{d(dn)}{dt} = b\,d\gamma - a\,d\gamma,$$

where $a\,d\gamma$ is the number of molecules whose representative points go out of
the volume element $d\gamma$ as a result of collisions of the type $(\mathbf{p}, \mathbf{p}_1) \to (\mathbf{p}_2, \mathbf{p}_3)$;
and, analogously, $b\,d\gamma$ is the number of molecules whose representative points
come in to the volume element $d\gamma$ as a result of collisions of the type

$$(\mathbf{p}_2, \mathbf{p}_3) \to (\mathbf{p}, \mathbf{p}_1).$$
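Thus we have

$$\frac{df}{dt} = b - a.$$

Obviously, the total derivative $df(\mathbf{p}, \mathbf{r}, t)/dt$ can be written in the form

$$\frac{df}{dt} = \frac{\partial f}{\partial t} + \frac{\partial f}{\partial p_x}\frac{dp_x}{dt} + \frac{\partial f}{\partial p_y}\frac{dp_y}{dt} + \frac{\partial f}{\partial p_z}\frac{dp_z}{dt} + \frac{\partial f}{\partial x}\frac{dx}{dt} + \frac{\partial f}{\partial y}\frac{dy}{dt} + \frac{\partial f}{\partial z}\frac{dz}{dt} = \frac{\partial f}{\partial t} + \frac{d\mathbf{p}}{dt}\cdot\frac{\partial f}{\partial \mathbf{p}} + \frac{d\mathbf{r}}{dt}\cdot\frac{\partial f}{\partial \mathbf{r}},$$

or

$$\frac{df(\mathbf{r}, \mathbf{p}, t)}{dt} = \frac{\partial f}{\partial t} + \frac{\mathbf{p}}{m}\cdot\frac{\partial f}{\partial \mathbf{r}} + \mathbf{F}\cdot\frac{\partial f}{\partial \mathbf{p}},$$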





where $\mathbf{p}/m = \mathbf{v}$ is the velocity of the molecule, and $d\mathbf{p}/dt = \mathbf{F}$ is the force
acting upon a molecule having momentum and coordinates lying in the
phase-space element $d\gamma$ at the instant of time $t$.

Finally, we find

$$\frac{\partial f}{\partial t} + \frac{\mathbf{p}}{m}\cdot\frac{\partial f}{\partial \mathbf{r}} + \mathbf{F}\cdot\frac{\partial f}{\partial \mathbf{p}} = I, \tag{14.1}$$

where $I = b - a$.

The next problem consists of finding the quantity $I = b - a$, which is usually
called the collision integral. This integral can be found for a sufficiently
rarefied gas in which collisions occur only in pairs. We shall assume that in 
collisions no transformation of kinetic energy into energy corresponding to 
other degrees of freedom occurs, i.e. that the collisions between the mole- 
cules obey the laws of elastic collisions between rigid spheres. 

Momentum and energy conservation laws, which can be written in the form

$$\mathbf{p}_1 + \mathbf{p} = \mathbf{p}_3 + \mathbf{p}_2, \tag{14.2}$$

$$p_1^2 + p^2 = p_3^2 + p_2^2, \tag{14.3}$$

hold for pair-wise elastic collisions between identical particles. The process of
elastic collision between molecules can be characterized by the differential
cross section for scattering into a solid angle element $d\Omega$. This cross section
depends on the absolute value of the relative velocity of the colliding particles
$v_{\mathrm{rel}} = |\mathbf{v} - \mathbf{v}_1|$ and the scattering angle $\alpha = \alpha(\mathbf{v}, \mathbf{v}_1)$. The differential cross
section for the scattering of a particle of velocity $\mathbf{v}$ into a solid angle $d\Omega$ can be
written in the form

$$d\sigma = \sigma(v_{\mathrm{rel}}, \alpha)\,d\Omega. \tag{14.4}$$

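Each of the $f(\mathbf{p}, \mathbf{r}, t)\,d\mathbf{p}\,dV$ particles having momentum $\mathbf{p}$ undergoes a certain
number of collisions per second with particles which have momentum $\mathbf{p}_1$
and which are contained in a cylinder of height $v_{\mathrm{rel}}$ and base $d\sigma$. This number
is just $v_{\mathrm{rel}}\,d\sigma\,f(\mathbf{p}_1, \mathbf{r}, t)\,d\mathbf{p}_1$. Hence the total number of pair collisions
undergone per second by the particles confined in the phase-space element $d\gamma$ is
equal to

$$v_{\mathrm{rel}}\,\sigma(v_{\mathrm{rel}}, \alpha)\,d\Omega\,f(\mathbf{p}, \mathbf{r}, t)\,d\mathbf{p}\,dV\,f(\mathbf{p}_1, \mathbf{r}, t)\,d\mathbf{p}_1.$$

To find the quantity $a\,d\gamma$ in which we are interested, we have to take into
account the fact that any collision undergone by a particle with momentum $\mathbf{p}$
leads to a change in $\mathbf{p}$ and to the particle's representative point leaving the
volume $d\gamma$. Hence

$$a\,d\gamma = d\mathbf{p}\,dV \iint v_{\mathrm{rel}}\,\sigma(v_{\mathrm{rel}}, \alpha)\,f(\mathbf{p}, \mathbf{r}, t)\,f(\mathbf{p}_1, \mathbf{r}, t)\,d\mathbf{p}_1\,d\Omega,$$

or

$$a = \iint v_{\mathrm{rel}}\,\sigma(v_{\mathrm{rel}}, \alpha)\,f(\mathbf{p}, \mathbf{r}, t)\,f(\mathbf{p}_1, \mathbf{r}, t)\,d\mathbf{p}_1\,d\Omega. \tag{14.5}$$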


The integration is carried out over the entire solid angle $\Omega$ and all the values
of momentum $\mathbf{p}_1$. Similarly one can find the number of collisions as a result
of which representative points move into the volume $d\gamma$, i.e. the number of
collisions of the type

$$(\mathbf{p}_2, \mathbf{p}_3) \to (\mathbf{p}, \mathbf{p}_1).$$

Obviously, one can write that the number of collisions of molecules having
momenta $\mathbf{p}_2$ and $\mathbf{p}_3$ per second is equal to

$$v'_{\mathrm{rel}}\,\sigma(v'_{\mathrm{rel}}, \alpha)\,d\Omega\,f(\mathbf{p}_2, \mathbf{r}, t)\,f(\mathbf{p}_3, \mathbf{r}, t)\,dV\,d\mathbf{p}_2\,d\mathbf{p}_3,$$

where $v'_{\mathrm{rel}} = |\mathbf{v}_2 - \mathbf{v}_3|$.
The quantity $b\,d\gamma$ in which we are also interested is obtained by integrating
the last expression with respect to all values of the momenta $\mathbf{p}_2$ and $\mathbf{p}_3$
satisfying conditions (14.2) and (14.3):

$$b\,d\gamma = dV \iint v'_{\mathrm{rel}}\,\sigma(v'_{\mathrm{rel}}, \alpha)\,f(\mathbf{p}_2, \mathbf{r}, t)\,f(\mathbf{p}_3, \mathbf{r}, t)\,d\mathbf{p}_2\,d\mathbf{p}_3\,d\Omega. \tag{14.6}$$


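The elastic collision laws allow $v'_{\mathrm{rel}}$ to be expressed in terms of $v_{\mathrm{rel}}$, and
$d\mathbf{p}_2\,d\mathbf{p}_3$ in terms of $d\mathbf{p}\,d\mathbf{p}_1$. Namely, the relative velocity after collision
$v'_{\mathrm{rel}} = |\mathbf{v}_2 - \mathbf{v}_3|$ is equal to the relative velocity before collision $v_{\mathrm{rel}} = |\mathbf{v} - \mathbf{v}_1|$.
To make the transition from integration over $d\mathbf{p}_2\,d\mathbf{p}_3$ to $d\mathbf{p}\,d\mathbf{p}_1$ one can write

$$d\mathbf{p}_2\,d\mathbf{p}_3 = \left|\frac{\partial(\mathbf{p}_2, \mathbf{p}_3)}{\partial(\mathbf{p}, \mathbf{p}_1)}\right| d\mathbf{p}\,d\mathbf{p}_1.$$

The modulus of the Jacobian of the transformation $\mathbf{p}_2, \mathbf{p}_3 \to \mathbf{p}, \mathbf{p}_1$ is
conveniently calculated by directing the vector $\mathbf{p}$ along the x-axis.
A simple calculation gives

$$\left|\frac{\partial(\mathbf{p}_2, \mathbf{p}_3)}{\partial(\mathbf{p}, \mathbf{p}_1)}\right| = 1;$$

hence

$$d\mathbf{p}_2\,d\mathbf{p}_3 = d\mathbf{p}_1\,d\mathbf{p}.$$

Substituting this last equality into (14.6), we find

$$b = \iint v_{\mathrm{rel}}\,\sigma(v_{\mathrm{rel}}, \alpha)\,f(\mathbf{p}_2, \mathbf{r}, t)\,f(\mathbf{p}_3, \mathbf{r}, t)\,d\mathbf{p}_1\,d\Omega. \tag{14.7}$$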
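
The properties of elastic collisions used above (conservation laws (14.2) and (14.3), equality of the relative velocities before and after collision, and a Jacobian of unit modulus) can be checked numerically. The sketch below is an illustration under an explicit assumption: identical smooth hard spheres, for which the collision exchanges the momentum components along the unit vector `n` joining the centres. The function names are hypothetical, and the Jacobian is evaluated for fixed `n`.

```python
import math


def collide(p, p1, n):
    """Elastic collision of identical hard spheres (assumed model).

    n is the unit vector along the line of centres; returns (p2, p3).
    """
    dot = sum((a - b) * c for a, b, c in zip(p, p1, n))
    p2 = [a - dot * c for a, c in zip(p, n)]
    p3 = [b + dot * c for b, c in zip(p1, n)]
    return p2, p3


def jacobian_matrix(n):
    """Matrix of the linear map (p, p1) -> (p2, p3) for fixed n.

    Row k holds the image of the k-th basis vector, i.e. the transpose of
    the Jacobian; the determinant is unaffected by transposition.
    """
    rows = []
    for k in range(6):
        unit = [1.0 if j == k else 0.0 for j in range(6)]
        p2, p3 = collide(unit[:3], unit[3:], n)
        rows.append(p2 + p3)
    return rows


def determinant(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    m = [row[:] for row in matrix]
    size = len(m)
    det = 1.0
    for col in range(size):
        pivot = max(range(col, size), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-14:
            return 0.0
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            det = -det
        det *= m[col][col]
        for r in range(col + 1, size):
            factor = m[r][col] / m[col][col]
            for c in range(col, size):
                m[r][c] -= factor * m[col][c]
    return det


def norm(v):
    return math.sqrt(sum(c * c for c in v))


if __name__ == "__main__":
    n = [1.0 / math.sqrt(3.0)] * 3
    p, p1 = [1.0, 2.0, 3.0], [-0.5, 0.4, 2.0]
    p2, p3 = collide(p, p1, n)
    print("momentum:", [a + b for a, b in zip(p, p1)], [a + b for a, b in zip(p2, p3)])
    print("energy:", norm(p) ** 2 + norm(p1) ** 2, norm(p2) ** 2 + norm(p3) ** 2)
    print("|Jacobian|:", abs(determinant(jacobian_matrix(n))))
```

For equal masses the relative velocity is proportional to the relative momentum, so comparing $|\mathbf{p} - \mathbf{p}_1|$ with $|\mathbf{p}_2 - \mathbf{p}_3|$ is equivalent to comparing $v_{\mathrm{rel}}$ with $v'_{\mathrm{rel}}$.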


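To carry out the integration, the vectors $\mathbf{p}_2$ and $\mathbf{p}_3$ in the functions
$f(\mathbf{p}_2, \mathbf{r}, t)$ and $f(\mathbf{p}_3, \mathbf{r}, t)$ must be expressed in terms of $\mathbf{p}$ and $\mathbf{p}_1$.
By means of the values $b$ and $a$ which have been found, the collision
integral can be written in the symmetric form

$$I = b - a = \iint v_{\mathrm{rel}}\,\sigma(v_{\mathrm{rel}}, \alpha)\left[f(\mathbf{p}_2, \mathbf{r}, t)\,f(\mathbf{p}_3, \mathbf{r}, t) - f(\mathbf{p}, \mathbf{r}, t)\,f(\mathbf{p}_1, \mathbf{r}, t)\right] d\mathbf{p}_1\,d\Omega. \tag{14.8}$$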


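Substituting the value of the collision integral into (14.1), we arrive at the
equation

$$\frac{\partial f}{\partial t} + \frac{\mathbf{p}}{m}\cdot\frac{\partial f}{\partial \mathbf{r}} + \mathbf{F}\cdot\frac{\partial f}{\partial \mathbf{p}} = \iint v_{\mathrm{rel}}\,\sigma(v_{\mathrm{rel}}, \alpha)\left[f(\mathbf{p}_2, \mathbf{r}, t)\,f(\mathbf{p}_3, \mathbf{r}, t) - f(\mathbf{p}, \mathbf{r}, t)\,f(\mathbf{p}_1, \mathbf{r}, t)\right] d\mathbf{p}_1\,d\Omega. \tag{14.9}$$

In what follows, instead of $f(\mathbf{p}, \mathbf{r}, t)$, $f(\mathbf{p}_1, \mathbf{r}, t)$, $f(\mathbf{p}_2, \mathbf{r}, t)$ and $f(\mathbf{p}_3, \mathbf{r}, t)$ we
shall for brevity write respectively $f$, $f_1$, $f_2$ and $f_3$, while $\sigma(v_{\mathrm{rel}}, \alpha)$ will be
denoted by $\sigma$.

In this notation we have

$$\frac{\partial f}{\partial t} + \frac{\mathbf{p}}{m}\cdot\frac{\partial f}{\partial \mathbf{r}} + \mathbf{F}\cdot\frac{\partial f}{\partial \mathbf{p}} = \iint \sigma\,v_{\mathrm{rel}}\left[f_2 f_3 - f f_1\right] d\mathbf{p}_1\,d\Omega. \tag{14.10}$$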


The integro-differential equation (14.9) (or, in the brief notation, (14.10)) 
for the distribution function is called Boltzmann’s kinetic equation. 

The significance of Boltzmann’s equation extends far beyond the frame- 
work of the physical kinetics of ideal gases. As will be seen from a number of 
examples to be considered in this and other chapters of this book, a number 
of physical systems which are in essence very different from an ideal gas, but 
which formally satisfy the requirements underlying the derivation of Boltz- 
mann’s kinetic equation, are described by it. 

From the mathematical point of view, Boltzmann’s equation represents a 
non-linear integro-differential equation with partial derivatives. For Boltz- 
mann’s equation to assume a concrete meaning, it is necessary to know the 
dependence of the cross section on the relative velocity and on the scattering 
angle, as well as the force field acting on the particles of the gas. However, 
even for the simplest assumptions concerning the form of the function 
$\sigma(v_{\mathrm{rel}}, \alpha)$ and the character of the force field, it is a matter of considerable
difficulty to solve Boltzmann’s kinetic equation.
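
Although the general solution is difficult, the structure of the collision integral is easy to probe numerically. The sketch below is a side illustration, not a solution method from the text: it evaluates the bracket $f_2 f_3 - f f_1$ for one collision and shows that it vanishes identically for a Maxwellian $f \propto e^{-p^2/2mkT}$, because of energy conservation (14.3), but not for an arbitrary distribution. The hard-sphere collision rule and all names are assumptions made for the example.

```python
import math


def collide(p, p1, n):
    """Elastic collision of identical hard spheres (assumed model)."""
    dot = sum((a - b) * c for a, b, c in zip(p, p1, n))
    p2 = [a - dot * c for a, c in zip(p, n)]
    p3 = [b + dot * c for b, c in zip(p1, n)]
    return p2, p3


def maxwellian(p, mass=1.0, kt=1.0):
    """Unnormalized Maxwell distribution exp(-p^2 / (2 m k T))."""
    return math.exp(-sum(c * c for c in p) / (2.0 * mass * kt))


def quartic(p):
    """An arbitrary non-equilibrium test distribution, exp(-(p^2)^2)."""
    return math.exp(-sum(c * c for c in p) ** 2)


def collision_bracket(f, p, p1, n):
    """The factor f2*f3 - f*f1 appearing under the collision integral."""
    p2, p3 = collide(p, p1, n)
    return f(p2) * f(p3) - f(p) * f(p1)


if __name__ == "__main__":
    p, p1, n = [1.0, 0.5, 0.0], [-0.3, 0.2, 0.1], [1.0, 0.0, 0.0]
    print("Maxwellian:", collision_bracket(maxwellian, p, p1, n))
    print("quartic:   ", collision_bracket(quartic, p, p1, n))
```

The vanishing of the bracket for the Maxwellian holds for every choice of `p`, `p1` and `n`, so the whole collision integral is zero in that case.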

Below we shall give methods of solving Boltzmann’s equation. In the 
meanwhile we note that instead of the momenta of the molecules it is often 
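convenient to use their velocities as variables. Using the variables $(\mathbf{v}, \mathbf{r}, t)$
Boltzmann’s equation assumes the form

$$\frac{\partial f}{\partial t} + \mathbf{v}\cdot\frac{\partial f}{\partial \mathbf{r}} + \frac{\mathbf{F}}{m}\cdot\frac{\partial f}{\partial \mathbf{v}} = \iint \sigma\,v_{\mathrm{rel}}\left[f_2 f_3 - f f_1\right] d\mathbf{v}_1\,d\Omega. \tag{14.11}$$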


In what follows we shall change from the variables $\mathbf{p}$ to the variables $\mathbf{v}$ or
from (14.9) to (14.11) without further notice. We shall also not specify the
transition from vector to tensor notation in which (14.11) assumes the form

$$\frac{\partial f}{\partial t} + v_i\frac{\partial f}{\partial x_i} + \frac{F_i}{m}\frac{\partial f}{\partial v_i} = \iint \sigma\,v_{\mathrm{rel}}\left[f_2 f_3 - f f_1\right] d\mathbf{v}_1\,d\Omega. \tag{14.12}$$
Boltzmann’s equation is sometimes written in a more symmetric representa- 
tion in which the integration in the collision integral is carried out over all 


values of the momenta of the colliding particles. Namely, by virtue of the 
momentum and energy conservation laws, one can write 


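$$b - a = \int \sigma\,v_{\mathrm{rel}}\left[f_2 f_3 - f f_1\right] d\mathbf{p}_1\,d\Omega = \int \sigma\,v_{\mathrm{rel}}\left[f_2 f_3 - f f_1\right] \delta(\mathbf{p}_3 + \mathbf{p}_2 - \mathbf{p}_1 - \mathbf{p})\,d\mathbf{p}_1\,d\mathbf{p}_2 \times \delta(\epsilon_2 + \epsilon_3 - \epsilon_1 - \epsilon)\,d\epsilon_3\,d\Omega, \tag{14.13}$$

where the delta-function containing a vector argument, $\delta(\mathbf{p})$, means

$$\delta(\mathbf{p}) = \delta(p_x)\,\delta(p_y)\,\delta(p_z).$$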


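Instead of integration with respect to the energy $\epsilon_3$, integration with respect
to the momentum $\mathbf{p}_3$ can be introduced, since

$$d\epsilon_3\,d\Omega\,\delta(\epsilon_2 + \epsilon_3 - \epsilon_1 - \epsilon) = 2m\,\delta(p_3^2 + p_2^2 - p_1^2 - p^2)\,d\epsilon_3\,d\Omega = 2p_3\,d\Omega\,dp_3\,\delta(p_3^2 + p_2^2 - p_1^2 - p^2) = \frac{2\,d\mathbf{p}_3}{p_3}\,\delta(p_3^2 + p_2^2 - p_1^2 - p^2).$$

Hence (14.8) can be rewritten in the form

$$b - a = \int \sigma\,v_{\mathrm{rel}}\left[f_2 f_3 - f f_1\right] \delta(\mathbf{p}_2 + \mathbf{p}_3 - \mathbf{p}_1 - \mathbf{p})\,\delta(p_3^2 + p_2^2 - p_1^2 - p^2)\left(\frac{1}{p_3} + \frac{1}{p_2}\right) d\mathbf{p}_1\,d\mathbf{p}_2\,d\mathbf{p}_3. \tag{14.14}$$

Here instead of $2/p_3$ we have written the symmetric expression $1/p_2 + 1/p_3$,
making use of the complete equivalence of the momenta $\mathbf{p}_3$ and $\mathbf{p}_2$.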




§15. The basic kinetic equation for correlation functions 


The very simple and obvious derivation of Boltzmann’s equation given 
above suffers from a number of shortcomings, both theoretical and practical. 
Only pair collisions between molecules are taken into account in this derivation,
for which the pair character of the collisions is essential. It is not possible
to see how the derivation can be generalized to the case of triple, quadruple
and more complex collisions. The entire region of applicability of the general
theory is confined to the case of very rarefied gases. In addition, the follow- 
ing important theoretical aspect of the derivation which was presented is not 
at all clear. On the one hand, it is based on the equations of classical mechan- 
ics reversible in time. The motion of particles and their collisions obey strictly 
defined laws. On the other hand, it follows from Boltzmann’s equation that, 
by consequence of the collisions, molecular chaos is established in the gas, 
and the entropy of the gas increases monotonically, tending to a certain limit 
(see §19). It is not clear at what point the statistical, probabilistic character 
is introduced into the train of calculations. 

Many critics of Boltzmann’s work have seen in this a paradoxical and 
groundless result. It is therefore very important to obtain a more logical and 
obvious derivation of Boltzmann’s equation. We shall present a variant of 
such a derivation, based on the use of correlation functions*.

In Part III we defined correlation functions $\rho_s$, for which a system of
connected equations is obtained in which correlation functions of lower order
are expressed in terms of functions of higher order.

In this case, however, we confined ourselves to correlation functions de- 
pending on the coordinates of the particles. We now have to generalize the 
definition of correlation functions to the case where they also depend on 
momenta and time. 

In order to avoid cumbersome notation, we shall denote the whole set of
variables $(p_1, p_2, \ldots, p_{3N};\ q_1, q_2, \ldots, q_{3N};\ t)$ by $(x_N, t)$. The distribution
function of a system containing $N$ particles, $\rho(x_N, t)$, must satisfy the general
equation

$$\frac{d\rho(x_N, t)}{dt} = 0, \tag{15.1}$$


expressing the law of conservation of the number of representative points in 


* See N.N.Bogoliubov, Problems of dynamic theory in statistical physics, in ‘Studies 
in statistical mechanics’, ed. J.de Boer and G.E.Uhlenbeck, vol. 1 (North-Holland, 
Amsterdam, 1962). 





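phase space. We write the total derivative in the form

$$\frac{d\rho}{dt} = \frac{\partial \rho}{\partial t} + \sum_i\left(\frac{\partial \rho}{\partial p_i}\frac{dp_i}{dt} + \frac{\partial \rho}{\partial q_i}\frac{dq_i}{dt}\right) = \frac{\partial \rho}{\partial t} + \sum_i\left(-\frac{\partial \rho}{\partial p_i}\frac{\partial H}{\partial q_i} + \frac{\partial \rho}{\partial q_i}\frac{\partial H}{\partial p_i}\right) = \frac{\partial \rho}{\partial t} - \{H; \rho\} = 0. \tag{15.2}$$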





Eq. (15.2) is called Liouville’s equation. 
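
A quick numerical check of the stationary case of Liouville's equation is sketched below. It assumes the bracket convention $\{A; B\} = \partial A/\partial q\,\partial B/\partial p - \partial A/\partial p\,\partial B/\partial q$ and a one-dimensional harmonic oscillator as a stand-in Hamiltonian; both choices, and all names, are illustrative assumptions. Any distribution depending on $q$ and $p$ only through $H$, such as $\rho \propto e^{-H}$, gives $\{H; \rho\} = 0$ and hence $\partial\rho/\partial t = 0$.

```python
import math


def hamiltonian(q, p, mass=1.0, stiffness=1.0):
    """One-dimensional harmonic oscillator, H = p^2/2m + k q^2/2."""
    return p * p / (2.0 * mass) + 0.5 * stiffness * q * q


def gibbs_density(q, p):
    """Unnormalized density depending on (q, p) only through H."""
    return math.exp(-hamiltonian(q, p))


def poisson_bracket(a, b, q, p, step=1e-5):
    """{a; b} = da/dq * db/dp - da/dp * db/dq by central differences."""
    da_dq = (a(q + step, p) - a(q - step, p)) / (2.0 * step)
    da_dp = (a(q, p + step) - a(q, p - step)) / (2.0 * step)
    db_dq = (b(q + step, p) - b(q - step, p)) / (2.0 * step)
    db_dp = (b(q, p + step) - b(q, p - step)) / (2.0 * step)
    return da_dq * db_dp - da_dp * db_dq


if __name__ == "__main__":
    # Stationary density: the bracket with H vanishes (up to rounding).
    print(poisson_bracket(hamiltonian, gibbs_density, 0.7, -0.4))
    # Canonical pair: {q; p} = 1 in this convention.
    print(poisson_bracket(lambda q, p: q, lambda q, p: p, 0.7, -0.4))
```

The second line is only a consistency check of the sign convention used in the sketch.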
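In the case considered, of a system of particles where the Hamiltonian is of
the form

$$H = \sum_i \frac{p_i^2}{2m} + \frac{1}{2}\sum_{i \neq j} U_{ij}(|\mathbf{q}_i - \mathbf{q}_j|)$$

and the Poisson bracket $\{H; \rho\}$ is defined by the equality

$$\{H; \rho\} = \sum_i\left(\frac{\partial H}{\partial q_i}\frac{\partial \rho}{\partial p_i} - \frac{\partial H}{\partial p_i}\frac{\partial \rho}{\partial q_i}\right), \tag{15.3}$$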


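the interaction between the particles is assumed to be pair-wise. We introduce
the correlation function $\rho_s$, defining it in the same way as in §47 of Part III:

$$\rho_s(x_1, x_2, \ldots, x_s, t) = V^s \int \rho(x_1, x_2, \ldots, x_N)\,dx_{s+1} \ldots dx_N. \tag{15.4}$$

In what follows we shall be interested only in the function

$$\rho_1(x_1, t) = \rho_1(\mathbf{r}_1, \mathbf{p}_1, t) = (V/N)\,f(\mathbf{r}_1, \mathbf{p}_1, t), \tag{15.5}$$

representing to within a factor $V/N$ the distribution function of the particles of
the gas, and in the binary function

$$\rho_{12}(x_1, x_2, t) = \rho_{12}(\mathbf{r}_1, \mathbf{p}_1, \mathbf{r}_2, \mathbf{p}_2, t) = (V^2/N^2)\,f_{12}(\mathbf{r}_1, \mathbf{r}_2, \mathbf{p}_1, \mathbf{p}_2, t). \tag{15.5'}$$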


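Making use of the definitions (15.2) and (15.3), we find

$$\frac{\partial \rho_1}{\partial t} = V\int \{H; \rho\}\,dx_2 \ldots dx_N = V\sum_{i=1}^{N}\int\left(\frac{\partial H}{\partial \mathbf{q}_i}\cdot\frac{\partial \rho}{\partial \mathbf{p}_i} - \frac{\partial H}{\partial \mathbf{p}_i}\cdot\frac{\partial \rho}{\partial \mathbf{q}_i}\right) dx_2 \ldots dx_N$$

$$= V\int\left(\frac{\partial H}{\partial \mathbf{q}_1}\cdot\frac{\partial \rho}{\partial \mathbf{p}_1} - \frac{\partial H}{\partial \mathbf{p}_1}\cdot\frac{\partial \rho}{\partial \mathbf{q}_1}\right) dx_2 \ldots dx_N + V\sum_{i=2}^{N}\int\left(\frac{\partial H}{\partial \mathbf{q}_i}\cdot\frac{\partial \rho}{\partial \mathbf{p}_i} - \frac{\partial H}{\partial \mathbf{p}_i}\cdot\frac{\partial \rho}{\partial \mathbf{q}_i}\right) dx_2 \ldots dx_N$$

$$= -\frac{\mathbf{p}_1}{m}\cdot V\int\frac{\partial \rho}{\partial \mathbf{r}_1}\,dx_2 \ldots dx_N + V\int\frac{\partial \rho}{\partial \mathbf{p}_1}\cdot\frac{\partial H}{\partial \mathbf{r}_1}\,dx_2 \ldots dx_N + V\sum_{i=2}^{N}\int\left(\frac{\partial H}{\partial \mathbf{q}_i}\cdot\frac{\partial \rho}{\partial \mathbf{p}_i} - \frac{\partial H}{\partial \mathbf{p}_i}\cdot\frac{\partial \rho}{\partial \mathbf{q}_i}\right) dx_2 \ldots dx_N. \tag{15.6}$$

Evidently, we have

$$-\frac{\mathbf{p}_1}{m}\cdot V\int\frac{\partial \rho}{\partial \mathbf{r}_1}\,dx_2 \ldots dx_N = -\frac{\mathbf{p}_1}{m}\cdot\frac{\partial \rho_1}{\partial \mathbf{r}_1}. \tag{15.7}$$

Further, since all the particles are identical

$$V\int\frac{\partial \rho}{\partial \mathbf{p}_1}\cdot\sum_{i=2}^{N}\frac{\partial U_{1i}}{\partial \mathbf{r}_1}\,dx_2 \ldots dx_N = (N-1)\,V\int\frac{\partial \rho}{\partial \mathbf{p}_1}\cdot\frac{\partial U_{12}}{\partial \mathbf{r}_1}\,dx_2 \ldots dx_N = \frac{N-1}{V}\iint\frac{\partial U_{12}}{\partial \mathbf{r}_1}\cdot\frac{\partial \rho_{12}}{\partial \mathbf{p}_1}\,d\mathbf{r}_2\,d\mathbf{p}_2. \tag{15.8}$$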





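Finally, the terms of the remaining sum can be written in the form

$$\int\left(\frac{\partial H}{\partial \mathbf{q}_i}\cdot\frac{\partial \rho}{\partial \mathbf{p}_i} - \frac{\partial H}{\partial \mathbf{p}_i}\cdot\frac{\partial \rho}{\partial \mathbf{q}_i}\right) dx_2 \ldots dx_N = \int dx_2 \ldots dx_{i-1}\,dx_{i+1} \ldots dx_N\left[\int d\mathbf{r}_i\,\frac{\partial H}{\partial \mathbf{r}_i}\cdot\int\frac{\partial \rho}{\partial \mathbf{p}_i}\,d\mathbf{p}_i - \int d\mathbf{p}_i\,\frac{\mathbf{p}_i}{m}\cdot\int\frac{\partial \rho}{\partial \mathbf{r}_i}\,d\mathbf{r}_i\right].$$

For all reasonable assumptions of the form of the distribution function it can
be considered that

$$\int\frac{\partial \rho}{\partial \mathbf{r}_i}\,d\mathbf{r}_i \to 0, \qquad \int\frac{\partial \rho}{\partial \mathbf{p}_i}\,d\mathbf{p}_i \to 0 \tag{15.9}$$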


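for an indefinite increase in the integration range. Thus all the terms of the
third sum in (15.6) are equal to zero and, making use of (15.7) and (15.8) we
obtain, instead of (15.6),

$$\frac{\partial \rho_1}{\partial t} + \frac{\mathbf{p}_1}{m}\cdot\frac{\partial \rho_1}{\partial \mathbf{r}_1} = \frac{N-1}{V}\iint\frac{\partial U_{12}}{\partial \mathbf{r}_1}\cdot\frac{\partial \rho_{12}}{\partial \mathbf{p}_1}\,d\mathbf{r}_2\,d\mathbf{p}_2. \tag{15.10}$$

Since $U_{12}$ does not depend on momenta, we can write

$$\frac{\partial U_{12}}{\partial \mathbf{r}_1}\cdot\frac{\partial \rho_{12}}{\partial \mathbf{p}_1} = \frac{\partial U_{12}}{\partial \mathbf{r}_1}\cdot\frac{\partial \rho_{12}}{\partial \mathbf{p}_1} - \frac{\partial U_{12}}{\partial \mathbf{p}_1}\cdot\frac{\partial \rho_{12}}{\partial \mathbf{r}_1} = \{U_{12}; \rho_{12}\},$$

so that (15.10) can be written in the form (dropping the index 1)

$$\frac{\partial \rho_1}{\partial t} + \frac{\mathbf{p}}{m}\cdot\frac{\partial \rho_1}{\partial \mathbf{r}} = \frac{N-1}{V}\iint \{U_{12}; \rho_{12}\}\,d\mathbf{r}_2\,d\mathbf{p}_2. \tag{15.11}$$

Passing to the limit of an infinitely large system ($N \to \infty$; $V \to \infty$) having a
large but finite specific volume, $v = V/N$, we have

$$\frac{\partial f}{\partial t} + \frac{\mathbf{p}}{m}\cdot\frac{\partial f}{\partial \mathbf{r}} = \iint \{U_{12}; f_{12}\}\,d\mathbf{r}_2\,d\mathbf{p}_2. \tag{15.12}$$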


Eq. (15.12) defines the law of change of the distribution function $f(\mathbf{r}, \mathbf{p}, t)$
relating coordinates, momenta and time for the particles in the gas. We shall
call it the basic kinetic equation for the distribution function $f$.

As in the case of an equilibrium gas, $\rho_1$ is expressed in terms of $\rho_{12}$. In
turn, reproducing the previous calculations, it is possible to obtain an analogous
equation for $\rho_{12}(x_1, x_2, t)$. It is of similar form and contains the ternary
function $\rho_{123}$. However, we shall not write down this equation, since for a
rarefied gas the quantity $1/v$ is very small and is a small parameter in the
equation for $\rho_1$. If we are interested in the value of $\rho_1$ in the first approximation,
with respect to the small parameter $1/v$, then we have to substitute the
value of $\rho_{12}$ in the zero order approximation in expression (15.12). The latter
can be found without the kinetic equation for $\rho_{12}$.


§16. The derivation of Boltzmann’s equation from the basic kinetic equation 


Let us consider a gas sufficiently rarefied that particles undergo only pair 
collisions. 

For simplicity we shall, in what follows, assume the gas to be spatially uniform,
so that the distribution function $\rho_1$ depends only on momenta $\mathbf{p}$ and
time $t$, i.e. $\rho_1(t, x_1) = \rho_1(t, \mathbf{p})$. In this case the binary distribution function
should not change when the two particles are displaced by a constant vector
$\mathbf{a}$, i.e.

$$\rho_{12}(t, \mathbf{r}_1 + \mathbf{a}, \mathbf{r}_2 + \mathbf{a}, \mathbf{p}_1, \mathbf{p}_2) = \rho_{12}(t, \mathbf{r}_1, \mathbf{r}_2, \mathbf{p}_1, \mathbf{p}_2).$$

This condition is fulfilled only in the case where $\rho_{12}$ is a function of the distance
between the particles, $\mathbf{r}_{12} = \mathbf{r}_1 - \mathbf{r}_2$, $\rho_{12} = \rho_{12}(t, \mathbf{r}_{12}, \mathbf{p}_1, \mathbf{p}_2)$. Obviously,
we can introduce the following scale of increasing times: (1) the collision
time $\tau_c \sim r_0/v$, where $r_0$ is the radius of the interaction sphere, and $v$ is the
mean velocity; (2) the mean time between two consecutive collisions,
$\tau \sim \lambda/v$, where $\lambda$ is the mean free path. As we shall see in what follows, $\tau$
represents the relaxation time in microscopic gas volumes; (3) the macroscopic
relaxation time or hydrodynamic time, $\tau_{\mathrm{macro}} \sim L/v$, where $L$ is a macroscopic
length, for example the size of the container in which the gas is
placed.
It is obvious that always

$$\tau_c \ll \tau \ll \tau_{\mathrm{macro}}. \tag{16.1}$$
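
An order-of-magnitude estimate shows how widely the three time scales in (16.1) are separated. The sketch below uses illustrative values for a gas at roughly room conditions; the numbers, the crude estimate $\lambda \sim 1/(n\pi r_0^2)$ for the mean free path, and the function name are all assumptions made for the example, not data from the text.

```python
import math


def time_scales(r0, v, number_density, size):
    """Return (tau_c, tau, tau_macro) in seconds.

    r0             -- radius of the interaction sphere, m
    v              -- mean molecular velocity, m/s
    number_density -- molecules per cubic metre
    size           -- macroscopic length L, m
    """
    mean_free_path = 1.0 / (number_density * math.pi * r0**2)  # rough estimate
    tau_c = r0 / v
    tau = mean_free_path / v
    tau_macro = size / v
    return tau_c, tau, tau_macro


if __name__ == "__main__":
    # Assumed values: r0 ~ 3 angstrom, v ~ 500 m/s, n ~ 2.5e25 m^-3, L ~ 10 cm
    tau_c, tau, tau_macro = time_scales(3.0e-10, 500.0, 2.5e25, 0.1)
    print(f"tau_c     ~ {tau_c:.1e} s")
    print(f"tau       ~ {tau:.1e} s")
    print(f"tau_macro ~ {tau_macro:.1e} s")
```

With these assumed values each scale exceeds the previous one by several orders of magnitude, which is what allows intermediate intervals $\tau_c \ll \Delta t \ll \tau$ to be chosen.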


Let us consider the behaviour of the gas in the course of time intervals $\Delta t$
such that $\tau_c \ll \Delta t \ll \tau$. In this interval only a few molecules of the gas have
time to undergo collisions. Hence, the behaviour in time of the majority of
particles in time intervals $\Delta t$ is described by the function $\rho_1$.

Those few particles which, in an interval $\Delta t$, have time to undergo collisions
are imagined to be combined into pairs. It is clear that the number of
pairs is small in comparison with the total number of particles, $N$. The behaviour
of the pairs is evidently described by the binary correlation function $\rho_{12}$.
Our problem is to establish the relationship between $\rho_1$ and $\rho_{12}$. This will be
done by reasoning which, strictly speaking, is valid only for times of the order
of $\Delta t$.
Let us consider the behaviour of correlation functions in the limiting case
$1/v \to 0$.
The equation for the distribution function for an infinitesimal gas density
in a spatially uniform gas assumes the form

$$\frac{\partial \rho_1(\mathbf{p}, t)}{\partial t} = 0. \tag{16.2}$$

Formula (16.2) has a simple meaning: in this approximation the particles are
moving independently of each other.
The solution of eq. (16.2) is conveniently written in the symbolic form

$$\rho_1(\mathbf{p}, t) = S^{(1)}\rho_1(\mathbf{p}, t - \tau_c). \tag{16.3}$$

The operator $S^{(1)}$ acts upon the function $\rho_1$ which depends on the variables
$x$, transforming the function taken at the initial instant of time $t - \tau_c$ into
the function at the instant of time $t$. Since without collisions the particles of
a uniform gas are moving with constant momentum,

$$\rho_1(\mathbf{p}, t) = \rho_1(\mathbf{p}, t - \tau_c). \tag{16.4}$$

Analogously, one can write in the approximation $1/v \to 0$ an equation for the
correlation function $\rho_{12}$. Clearly, it is of the form

$$\frac{d\rho_{12}}{dt} = 0. \tag{16.5}$$

Correspondingly, the solution of eq. (16.5) can be written in the symbolic
form

$$\rho_{12}(t, x_1, x_2) = S^{(2)}\rho_{12}(t - \tau_c, x_1, x_2). \tag{16.6}$$

The operator $S^{(2)}$ acts upon the function $\rho_{12}(x_1, x_2)$ which depends on the
pair of coordinates $x_1$ and $x_2$, transforming it from its value at the initial
instant of time $t - \tau_c$ to its value at the instant of time $t$.
Eq. (16.6) describes the behaviour of a pair of molecules which are not
acted upon by other molecules.


It is clear that such a consideration and eq. (16.5) itself have a strict meaning
only for time intervals smaller than the time lapse between consecutive
collisions, $\tau$. Since, however, this time is very large in comparison with the
collision time, we can also use (16.5) approximately for large times, i.e. we
shall assume that $t \to \infty$. This is the basic assumption of the theory we have
given, which was developed by N.N.Bogoliubov.

We write (16.6) in the form

$$\rho_{12}(t, x_1, x_2) = S^{(2)}\rho_{12}(t - \tau_c, x_1, x_2) = S^{(2)}\left[\rho_{12}(t - \tau_c, x_1, x_2) - \rho_1(t - \tau_c, x_1)\,\rho_1(t - \tau_c, x_2)\right] + S^{(2)}\rho_1(t - \tau_c, x_1)\,\rho_1(t - \tau_c, x_2). \tag{16.7}$$


We pass to the limit $t \to -\infty$, i.e. consider the correlation function long before
the collision. It is obvious that for $t \to -\infty$ the behaviour of a pair of particles
is not correlated, and

$$\lim_{t \to -\infty} S^{(2)}\rho_{12}(t - \tau_c, x_1, x_2) = \rho_1(t - \tau_c, x_1)\,\rho_1(t - \tau_c, x_2). \tag{16.8}$$

Hence the condition of weakening of correlation can be written

$$\lim_{t \to -\infty} \rho_{12}(t, x_1, x_2) = \rho_1(t - \tau_c, x_1)\,\rho_1(t - \tau_c, x_2). \tag{16.9}$$

The distribution function $\rho_1$ of a spatially uniform gas depends only on
the momentum. To the limit $t \to -\infty$ there correspond the limiting values, $\mathbf{P}_1$
and $\mathbf{P}_2$, of the momenta of the particles which at the instant of time $t = \tau_c$
undergo collision, so that

$$\lim_{t \to -\infty} S^{(2)}\mathbf{p}_1 = \mathbf{P}_1, \tag{16.10}$$

$$\lim_{t \to -\infty} S^{(2)}\mathbf{p}_2 = \mathbf{P}_2, \tag{16.11}$$

where $\mathbf{P}_1$ and $\mathbf{P}_2$ are the limiting values of the momenta of the particles long
before collision, when they are outside the interaction sphere.

It is obvious that $\mathbf{P}_1$ and $\mathbf{P}_2$ are connected with the momenta $\mathbf{p}_1$ and $\mathbf{p}_2$ of
the colliding particles by the relation

$$\frac{P_1^2 + P_2^2}{2m} = \frac{p_1^2 + p_2^2}{2m} + U(r_{12}). \tag{16.12}$$





Using (16.10) and (16.11) and taking into account (16.5), one can write
(16.7) approximately in the form

$$\rho_{12}(t, x_1, x_2) = \lim_{t \to -\infty} S^{(2)}\left[\rho_{12}(t - \tau_c, x_1, x_2) - \rho_1(t - \tau_c, x_1)\,\rho_1(t - \tau_c, x_2)\right] + \lim_{t \to -\infty} S^{(2)}\rho_1(t - \tau_c, x_1)\,\rho_1(t - \tau_c, x_2) = \rho_1(0, \mathbf{P}_1)\,\rho_1(0, \mathbf{P}_2) = \rho_1(t, \mathbf{P}_1)\,\rho_1(t, \mathbf{P}_2). \tag{16.13}$$


Formula (16.13) allows the binary distribution function $\rho_{12}$ to be expressed
in terms of the distribution function $\rho_1$. However, $\rho_{12}$ is a function of the
momenta $\mathbf{p}_1$ and $\mathbf{p}_2$ of the particles colliding at the instant of time $t$, whereas
in formula (16.13) the functions $\rho_1$ depend on another argument: the limiting
values of the momenta $\mathbf{P}_1$ and $\mathbf{P}_2$ before collision.

If we now want to obtain the distribution function for a finite but small 
value of the parameter 1/v, we have to substitute the binary function of the 
zero order approximation (16.13) into the right-hand side of eq. (15.11). If 
we take into account the definitions (15.9) and (15.5’), we can write 


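$$\frac{\partial f(t, \mathbf{p}_1)}{\partial t} = \iint \{U_{12}; f(t, \mathbf{P}_1)\,f(t, \mathbf{P}_2)\}\,d\mathbf{r}_2\,d\mathbf{p}_2. \tag{16.14}$$

We transform the Poisson brackets, making use of the fact that the momenta
$\mathbf{P}_1$ and $\mathbf{P}_2$ are constant defined vectors.
Because of this,

$$\{H; f(t, \mathbf{P}_1)\,f(t, \mathbf{P}_2)\} = 0. \tag{16.15}$$

Using (16.15), we find

$$\left\{\frac{p_1^2 + p_2^2}{2m} + U_{12};\ f(t, \mathbf{P}_1)\,f(t, \mathbf{P}_2)\right\} = 0.$$

Hence it follows that

$$\{U_{12}; f(t, \mathbf{P}_1)\,f(t, \mathbf{P}_2)\} = -\left\{\frac{p_1^2 + p_2^2}{2m};\ f(t, \mathbf{P}_1)\,f(t, \mathbf{P}_2)\right\} = -\frac{\mathbf{p}_1}{m}\cdot\frac{\partial\left[f(t, \mathbf{P}_1)\,f(t, \mathbf{P}_2)\right]}{\partial(\mathbf{r}_1 - \mathbf{r}_2)} + \frac{\mathbf{p}_2}{m}\cdot\frac{\partial\left[f(t, \mathbf{P}_1)\,f(t, \mathbf{P}_2)\right]}{\partial(\mathbf{r}_1 - \mathbf{r}_2)} = \frac{\mathbf{p}_2 - \mathbf{p}_1}{m}\cdot\frac{\partial}{\partial \mathbf{r}_{12}}\left[f(t, \mathbf{P}_1)\,f(t, \mathbf{P}_2)\right].$$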


§16 DERIVATION OF BOLTZMANN’S EQUATION 87 
Thus eq. (16.14) can be written in the form 


aftt,p)_ ¢(P2-P) a ; 
P= f "griz Me PDA Pod] dra dpa = f7dp3, (16.16) 





where $I$ denotes the integral


[= Eres 5 - [fKt, P) At, P>)] dro. 
12 


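To calculate this integral we introduce a cylindrical system of coordinates
$(r, \varphi, \xi)$, choosing the direction of the vector $\mathbf{p}_2 - \mathbf{p}_1$ as the positive
direction of the polar $\xi$-axis. We then have

$$x_1 = x_2 + r\sin\varphi,$$

$$y_1 = y_2 + r\cos\varphi,$$

$$z_1 = z_2 + \xi,$$

$$(\mathbf{p}_2 - \mathbf{p}_1) = [\,|\mathbf{p}_2 - \mathbf{p}_1|,\ 0,\ 0\,],$$

$$d\mathbf{r}_{12} = r\,dr\,d\varphi\,d\xi,$$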


so that we can write 


P25 P] 
m 





(P2-Pp)): = Ue, P) At, Po)] = g UU PDA, Pa) 


The momenta $\mathbf{P}_1$ and $\mathbf{P}_2$ will now be functions of the coordinates $r$, $\varphi$ and $\xi$.
We choose as the origin the point rj = 0, i.e. the point at which the collision 
takes place. For such a choice of axes we have 





AI Poa TA A 
j= fra f 4 J ag Ue PORE Po) dé = 


a E N NEP 
0 







Let us consider the product 
Kt, Py) At, Payer? = fe — t", PRE- t", PIETS , (16.17) 


where $t'' < \tau$. The situation $\xi \to \infty$ corresponds to particles a large distance
apart (outside the interaction range). In this case the particles are going away
from each other (this is seen most simply from the fact that $(\mathbf{p}_2 - \mathbf{p}_1)/m = \mathbf{v}_{\rm rel}$
increases with increasing $\xi$). This means that if the particles at the instant of
time $t$ are a large distance apart, then in the past, at a certain instant of time
$t - t''$, they were close to each other, interacted and then separated. The momenta
$\mathbf{P}_1$ and $\mathbf{P}_2$ in (16.17) denote the momenta of the particles before collision,
in particular immediately before undergoing collision. We write this in
the form
AP, DAP, OO = APT, DNP, À), (16.18) 


where $\mathbf{P}_1^*$ and $\mathbf{P}_2^*$ are the momenta immediately before collision.

The product $[f(\mathbf{P}_1, t)\,f(\mathbf{P}_2, t)]_{\xi\to-\infty}$ has another meaning. The situation
$\xi \to -\infty$ also corresponds to particles which at the instant of time $t$ are a large
distance apart, outside the interaction range. However, these particles did not
have time to draw together and collide in the time interval from $t - \tau$ to $t$.
They move with constant values of momentum, so that


SPa, OP), DETTI = fo, OAD, À, (16.19) 


where $\mathbf{p}_1$ and $\mathbf{p}_2$ are their momenta at the instant of time $t$. Substituting
(16.18) and (16.19) into the integral $I$, we find





= r * * 
ee Ma fore OARS, -AP 1, OAD, À) - (16.20) 


Further, substituting (16.20) into (16.16) we find that the kinetic equation 
assumes the form 


$$\frac{\partial f(\mathbf{p}_1, t)}{\partial t} = \int r\,dr\,d\varphi \int v_{\rm rel}\left[f(\mathbf{P}_1^*, t)\,f(\mathbf{P}_2^*, t) - f(\mathbf{p}_1, t)\,f(\mathbf{p}_2, t)\right] d\mathbf{p}_2.$$ (16.21)

Here we have introduced the relative velocity of the particles
$v_{\rm rel} = |\mathbf{p}_2 - \mathbf{p}_1|/m$. The integrand involves the momenta $\mathbf{P}_1^*$ and $\mathbf{P}_2^*$ before the
collision. In view of the symmetry of two-body problems with respect to time
reversal, one can write besides (16.21)


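$$\frac{\partial f(\mathbf{p}_1, t)}{\partial t} = \int r\,dr\,d\varphi \int v_{\rm rel}\left[f(\mathbf{p}_2', t)\,f(\mathbf{p}_1', t) - f(\mathbf{p}_2, t)\,f(\mathbf{p}_1, t)\right] d\mathbf{p}_2,$$ (16.22)

where $\mathbf{p}_2'$ and $\mathbf{p}_1'$ are the momenta after the collision. Instead of the impact
parameter $r$ and the azimuthal angle $\varphi$, the solid angle and cross section can be
introduced by the relation $r\,dr\,d\varphi = \sigma\,d\Omega$. Carrying out this substitution in
(16.22), we arrive at Boltzmann’s equation

$$\frac{\partial f(\mathbf{p}_1, t)}{\partial t} = \int v_{\rm rel}\,\sigma\left[f(\mathbf{p}_2', t)\,f(\mathbf{p}_1', t) - f(\mathbf{p}_2, t)\,f(\mathbf{p}_1, t)\right] d\mathbf{p}_2\,d\Omega.$$ (16.23)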
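The substitution $r\,dr\,d\varphi = \sigma\,d\Omega$ can be illustrated on the simplest model. The sketch below assumes rigid spheres of diameter `d`, for which the impact parameter and scattering angle are related by $r = d\cos(\theta/2)$; this model, the function names and the step sizes are illustrative assumptions, not part of the derivation above.

```python
import math


def impact_parameter(theta, d=1.0):
    """Impact parameter r of rigid spheres of diameter d deflected through angle theta."""
    return d * math.cos(theta / 2.0)


def differential_cross_section(theta, d=1.0, h=1e-6):
    """sigma(theta) from r dr dphi = sigma dOmega, i.e. sigma = r |dr/dtheta| / sin(theta)."""
    dr_dtheta = (impact_parameter(theta + h, d) - impact_parameter(theta - h, d)) / (2.0 * h)
    return impact_parameter(theta, d) * abs(dr_dtheta) / math.sin(theta)


def total_cross_section(d=1.0, steps=2000):
    """Integrate sigma over the full solid angle with the midpoint rule."""
    h = math.pi / steps
    total = 0.0
    for j in range(steps):
        theta = (j + 0.5) * h
        total += differential_cross_section(theta, d) * 2.0 * math.pi * math.sin(theta) * h
    return total


if __name__ == "__main__":
    print(differential_cross_section(1.0))  # close to d**2 / 4, independent of theta
    print(total_cross_section())            # close to pi * d**2
```

For this model the differential cross section is isotropic, and the total cross section is the geometric one.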





In the case of a non-uniform gas, one obtains in an analogous way a somewhat
modified Boltzmann’s equation. We have seen that the transition from
the reversible equations of mechanics, Liouville’s equation (15.2), to Boltzmann’s
equation involved a statistical stage, formula (16.13). Formula (16.13)
is based from the very beginning on the assumption of asymmetry of the
process in time. Fixing the gas states at a given instant of time $t$, we have
analyzed the question as to how the gas got into a given state. In this case the
statistical distribution at a given instant of time is not related to that in the
past. In other words, the system randomly gets into the situation which we
call a collision. Before the collision the ‘pair’ of particles consisted of
non-interacting particles. Their meeting at the instant of time $t$ is a random act.
Beginning from this instant and up to the instant $t + \tau$, when the particles of
the ‘pair’ undergo a subsequent collision, their motion is of a determinate
character. Thus the derivation of Boltzmann’s equation is based on the
assumption of time asymmetry and the difference between initial and final
states. The behaviour of the system at the times $t - \tau$ and $t + \tau$ is essentially
different.

The derivation of Boltzmann’s equation presented above contains in ex- 
plicit form an expansion in terms of the small parameter 1/v. 

Making use of the method developed for the expansion in terms of the
parameter 1/v, one can obtain the kinetic equation for gases more dense than
those described by Boltzmann’s equation. However, this derivation is associated
with cumbersome calculations and we shall not present it*.


* See the monograph of N.N.Bogoliubov quoted before, and also G.E.Uhlenbeck and 
G.Ford, Lectures in statistical mechanics (American Mathematical Society, Providence, 
1963). 










§17. The generalized transport equation and the properties of summational 
(additive) invariants 


From Boltzmann’s equation it is possible to obtain a number of important
general consequences, which are not associated with finding the explicit form
of the distribution function.

Suppose that a gas as a whole performs a motion with mean velocity u. It 
turns out that by means of Boltzmann’s equation one can find the general 
equation which must be satisfied by the mean value of an arbitrary function 
of the velocity of relative motion 


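$$V_i = v_i - u_i,$$ (17.1)

i.e. of the function $\psi(v_i - u_i)$, where $i = x, y, z$. The mean value of this
function is

$$\bar{\psi}(\mathbf{r}, t) = \frac{\int \psi f\,d\mathbf{v}}{\int f\,d\mathbf{v}} = \frac{1}{N}\int \psi f\,d\mathbf{v},$$ (17.2)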


where N, the total number of particles per unit volume, is, generally speaking, 
a function of coordinates and time, since the distribution function f depends 
on these variables. We set up the derivative 


Newt = fyc; Lav. (17.3) 


Making use of Boltzmann’s equation, one can write (17.3) in the form 


OVV 
V Non 


=- fuv En Lav — fw V)— Fk ae mtv tf Vw. (17.4) 


§17 GENERALIZED TRANSPORT EQUATION 91 


We transform the integrals involved in (17.4). Obviously, we have 
of ð ð 
ye =. J 
fv a dv = = T s | vos dy = TF: N,V), 


Fk af 4 ay 
jez SNE at 
Jy m Bu, °” m alt bef aM Tiy Ov, dy 


ov ate ay 
m E fra nee sw (2h 


It is assumed here that the distribution function $f$ decreases sufficiently rapidly
with increasing absolute value of the velocity, so that $\psi f$ tends to zero as
$|\mathbf{v}| \to \infty$.


Thus (17.4) takes the form 





aN, əy (24) 
U FTI TF ee a Be | 
US tNSt NOK) - N (5 f uray. (17.5) 


This equation is called Enskog’s equation, or the generalized transport equation.
It is clear that in its general form, for an arbitrary form of the function
$\psi$, Enskog’s equation is no simpler than Boltzmann’s equation. If, however,
$\psi$ is one of the additive constants of motion, i.e.

$$\psi(V_i) = m,\quad mV_i,\quad \tfrac{1}{2}mV^2,$$ (17.6)

then the generalized transport equation is substantially simplified.

To see this, let us consider an important general property of the integral 
standing on the right-hand side of the generalized transport equation. We 
denote it by G: 


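$$G = \int \psi I\,d\mathbf{v} = \int \psi\,\sigma\,|\mathbf{v} - \mathbf{v}_1|\,(f_2 f_3 - f f_1)\,d\mathbf{v}\,d\mathbf{v}_1\,d\Omega.$$

We symmetrize this expression, making use of the fact that under the
replacement of arguments

$$\mathbf{v} \to \mathbf{v}_1; \qquad \mathbf{v}_1 \to \mathbf{v}$$

the integral $G$ does not change:

$$G_1 = \int \psi_1\,\sigma\,|\mathbf{v}_1 - \mathbf{v}|\,(f_2 f_3 - f_1 f)\,d\mathbf{v}_1\,d\mathbf{v}\,d\Omega = G.$$

On the other hand, the pair of variables can be exchanged in such a way as to
exchange the velocities before and after collision:

$$\mathbf{v}, \mathbf{v}_1 \to \mathbf{v}_2, \mathbf{v}_3; \qquad \mathbf{v}_2, \mathbf{v}_3 \to \mathbf{v}, \mathbf{v}_1.$$

Then, obviously, we have

$$|\mathbf{v} - \mathbf{v}_1| = |\mathbf{v}_2 - \mathbf{v}_3|$$

and we again obtain

$$G_2 = \int \psi_2\,|\mathbf{v}_2 - \mathbf{v}_3|\,\sigma\,(f f_1 - f_2 f_3)\,d\mathbf{v}_2\,d\mathbf{v}_3\,d\Omega = G,$$

$$G_3 = \int \psi_3\,|\mathbf{v}_3 - \mathbf{v}_2|\,\sigma\,(f_1 f - f_3 f_2)\,d\mathbf{v}_3\,d\mathbf{v}_2\,d\Omega = G.$$

Summing the quantities $G_i$ we can write them in the symmetric form

$$G = \tfrac{1}{4}(G + G_1 + G_2 + G_3) = -\tfrac{1}{4}\int [\psi_2 + \psi_3 - \psi_1 - \psi]\,v_{\rm rel}\,\sigma\,[f_2 f_3 - f f_1]\,d\mathbf{v}_1\,d\mathbf{v}\,d\Omega.$$ (17.7)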


Making use of the symmetric representation (17.7), the generalized trans- 
port equation can be rewritten in the form 





-3N aw QO —. ER (IVN a 
omen No Ola” a, (74) = 
=-4[ [v+ V3- Y- V4] wef- ff) dv dv, da. (17.8) 


We now assume that $\psi$ is one of the additive constants of motion.
The following equalities hold in a collision between particles:

$$m_1 + m_2 = \text{const},$$

$$m_1\mathbf{v}_1 + m_2\mathbf{v}_2 = \text{const},$$

$$\tfrac{1}{2}m_1 v_1^2 + \tfrac{1}{2}m_2 v_2^2 = \text{const}.$$


§18 EQUATIONS OF MOTION OF A CONTINUOUS MEDIUM 93 
For these five additive constants of the motion

$$\psi + \psi_1 = \psi_2 + \psi_3,$$

so that

$$\sum_i G_i = 0.$$

Thus $G$ is an additive (summational) integral invariant of collisions. For the
five additive integrals the generalized transport equation is simplified and
assumes the form


7 oN aw 
Y ðt eN ðt 





I a ei a) = 
+N 5 O-N (= =0. (17.9) 


It is clear that eq. (17.9) is substantially simpler than the general equation 
(17.8), since it does not contain any non-linear terms describing pair colli- 
sions. For this reason eq. (17.9) is more general and is actually applicable 
even when Boltzmann’s equation loses its validity. 
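The property $\psi + \psi_1 = \psi_2 + \psi_3$ for the five additive invariants can be checked on a concrete collision. The sketch below assumes two smooth rigid spheres of equal mass, with the line of centres at contact given by a direction `n`; this collision rule and all names are illustrative assumptions rather than anything taken from the text.

```python
import math


def collide(v, v1, n):
    """Velocities after an elastic collision of two equal-mass smooth spheres.

    n is the direction of the line of centres at contact (normalised here).
    """
    norm = math.sqrt(sum(c * c for c in n))
    n = [c / norm for c in n]
    g_n = sum((a - b) * c for a, b, c in zip(v, v1, n))  # normal relative velocity
    v2 = [a - g_n * c for a, c in zip(v, n)]
    v3 = [b + g_n * c for b, c in zip(v1, n)]
    return v2, v3


def invariants(v, m=1.0):
    """The five additive invariants: mass, three momentum components, kinetic energy."""
    return [m, m * v[0], m * v[1], m * v[2], 0.5 * m * sum(c * c for c in v)]


def invariant_defect(v, v1, n):
    """Largest |psi + psi_1 - psi_2 - psi_3| over the five invariants."""
    v2, v3 = collide(v, v1, n)
    before = [a + b for a, b in zip(invariants(v), invariants(v1))]
    after = [a + b for a, b in zip(invariants(v2), invariants(v3))]
    return max(abs(a - b) for a, b in zip(before, after))


if __name__ == "__main__":
    print(invariant_defect([1.0, 2.0, -0.5], [-0.3, 0.4, 1.2], [1.0, 1.0, 0.0]))
```

The defect is zero up to rounding for any choice of `n`, which is why the bracket $[\psi_2 + \psi_3 - \psi_1 - \psi]$ in (17.7) vanishes identically for these functions.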


§18. The equations of motion of a continuous medium 


By means of eq. (17.9) it is possible to find the macroscopic laws of mo- 
mentum and energy density conservation. It is obvious that the macroscopic 
density of a homogeneous substance can be written in the form 


$$\rho = mN; \qquad \psi = m.$$

Then (17.9) gives immediately

$$\frac{\partial \rho}{\partial t} + \frac{\partial}{\partial x_k}(\rho u_k) = 0.$$ (18.1)


Eq. (18.1) represents the mass conservation law in a moving medium. 
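One often has to deal with a many-component gas consisting of molecules
of masses $m^\alpha$. In this case it is necessary to introduce the ideas of the density
and velocity of the $\alpha$th component:

$$\rho^\alpha = N^\alpha m^\alpha = m^\alpha \int f^\alpha\,d\mathbf{v},$$ (18.2)

$$u_k^\alpha = \frac{1}{N^\alpha}\int v_k f^\alpha\,d\mathbf{v},$$ (18.3)

where $\alpha = 1, 2, \ldots, r$, and the mean mass density and velocity are

$$\rho = \sum_\alpha \rho^\alpha = \sum_\alpha m^\alpha \int f^\alpha\,d\mathbf{v},$$ (18.4)

$$u_k = \frac{1}{\rho}\sum_\alpha \rho^\alpha u_k^\alpha = \frac{1}{\rho}\sum_\alpha m^\alpha \int v_k f^\alpha\,d\mathbf{v}.$$ (18.5)

For the quantity $\partial\rho^\alpha/\partial t$ we can write the two expressions:

$$\frac{\partial \rho^\alpha}{\partial t} = m^\alpha \int \frac{\partial f^\alpha}{\partial t}\,d\mathbf{v} = -m^\alpha \int v_k \frac{\partial f^\alpha}{\partial x_k}\,d\mathbf{v} = -\frac{\partial}{\partial x_k}\,m^\alpha \int v_k f^\alpha\,d\mathbf{v} = -\frac{\partial (\rho^\alpha u_k^\alpha)}{\partial x_k},$$ (18.6)

or

$$\frac{\partial \rho^\alpha}{\partial t} = -\frac{\partial}{\partial x_k}\,m^\alpha \int (v_k - u_k) f^\alpha\,d\mathbf{v} - \frac{\partial}{\partial x_k}\left(m^\alpha u_k \int f^\alpha\,d\mathbf{v}\right) = -\frac{\partial i_k^\alpha}{\partial x_k} - \frac{\partial (\rho^\alpha u_k)}{\partial x_k}.$$ (18.7)

Here we have introduced the vector

$$i_k^\alpha = m^\alpha \int (v_k - u_k) f^\alpha\,d\mathbf{v},$$ (18.8)

called the diffusion flux of the $\alpha$th component. The vector $i_k^\alpha$ represents the
flux of the $\alpha$th component with respect to the mean mass velocity $u_k$ (see
(7.6)). Formula (18.7) represents the mass conservation law written for a
many-component mixture.

Summing the expressions (18.8) for different components and taking into
account that

$$\sum_\alpha i_k^\alpha = \sum_\alpha m^\alpha \int (v_k - u_k) f^\alpha\,d\mathbf{v} = \sum_\alpha m^\alpha \int v_k f^\alpha\,d\mathbf{v} - u_k \sum_\alpha m^\alpha \int f^\alpha\,d\mathbf{v} = 0,$$

we find the total mass conservation law

$$\frac{\partial \rho}{\partial t} = -\frac{\partial}{\partial x_k}(\rho u_k).$$ (18.9)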
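The statement that the diffusion fluxes sum to zero follows directly from the definition of the mean mass velocity, and a small numerical illustration makes this concrete. In the sketch below each component is described only by its density $\rho^\alpha$ and mean velocity $u_k^\alpha$, so that $i_k^\alpha = \rho^\alpha(u_k^\alpha - u_k)$; the function name and the sample numbers are hypothetical.

```python
def diffusion_fluxes(rho, u_comp):
    """Mean mass velocity and diffusion fluxes of a mixture.

    rho    -- list of component mass densities rho^alpha
    u_comp -- list of component mean velocities u^alpha (3-vectors)
    """
    rho_total = sum(rho)
    # Mean mass velocity, as in (18.5): u = (1/rho) * sum_alpha rho^alpha u^alpha
    u_mean = [sum(r * u[k] for r, u in zip(rho, u_comp)) / rho_total for k in range(3)]
    # Diffusion flux of each component relative to the mean mass velocity, as in (18.8)
    fluxes = [[r * (u[k] - u_mean[k]) for k in range(3)] for r, u in zip(rho, u_comp)]
    return u_mean, fluxes


if __name__ == "__main__":
    u_mean, fluxes = diffusion_fluxes([1.0, 3.0], [(1.0, 0.0, 0.0), (-1.0, 2.0, 0.0)])
    print(u_mean)
    print([sum(f[k] for f in fluxes) for k in range(3)])  # each component sums to zero
```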


We now pass on to the formulation of the momentum conservation law, 
writing it first for a one-component system. 

To obtain the momentum conservation law, one can substitute the value
$\psi = mV_i$ into (17.9). However, it is more convenient to reproduce to a certain
extent the derivation of the formula, calculating the derivative of the
momentum vector per unit volume

$$\rho u_i = Nm \int v_i f\,d\mathbf{v}.$$ (18.10)


Here we shall confine ourselves to the case of a one-component gas. Using 
Boltzmann’s equation, and taking into account the general property of the 
summational integral invariant G, we obtain 


ð z OW aye 
FTA = Nm forat dv 


le Q Ays 
= -Nm f voy seedy — mN f vim IAN = 


d. 
=—Nm f@- uj) (Ug — Up) A + mNujuy Sx dv — 


3 F i 
N y CED iN Slang Vv. 


Here we have formed the differences $(v_i - u_i)$ and $(v_k - u_k)$ and integrated the
last term by parts. We transform the first integral to the derivative of the
integral


fo- uj) (Ug — Ux) = dv = 





ð ð 
Sfar [(v;— u) (vp — up f] dv liaz (v;— u) (Vp — ug) dv = 
=? )fidv + 
ax, JC Uj) (Up — uk 
ðu; dük 
+SS [or- DETA + (v;— wa] dy = 
a Ou; 
= ax; J Uj) (Up — Up) fdv + ORE {fox — Uy) dv + 


Our a 
+ a, Ser uj) dv = ax, JC Uj) (Up — up) FV . 


By definition of the mean, the integrals $\int f\,(v_i - u_i)\,d\mathbf{v} = 0$. Furthermore, we
have

$$\frac{\partial v_i}{\partial v_k} = \delta_{ik}.$$


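We introduce the force vector per unit volume

$$f_i = F_i N$$ (18.11)

and the tensor $\sigma_{ik}$, called the stress tensor

$$\sigma_{ik} = mN \int (v_i - u_i)(v_k - u_k)\,f\,d\mathbf{v} = mN\,\overline{(v_i - u_i)(v_k - u_k)}.$$ (18.12)

By definition the tensor $\sigma_{ik}$ is symmetric, $\sigma_{ik} = \sigma_{ki}$.
Then we finally obtain

$$\frac{\partial (\rho u_i)}{\partial t} = -\frac{\partial (\rho u_i u_k)}{\partial x_k} - \frac{\partial \sigma_{ik}}{\partial x_k} + f_i,$$ (18.13)

or, making use of (18.9),

$$\rho\left(\frac{\partial u_i}{\partial t} + u_k \frac{\partial u_i}{\partial x_k}\right) = -\frac{\partial \sigma_{ik}}{\partial x_k} + f_i.$$ (18.14)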
Eq. (18.14) represents the macroscopic equation of motion of a gaseous 
medium. 
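The definition (18.12) of the stress tensor as a velocity average can be tried out on sampled velocities. The sketch below is a Monte Carlo illustration under stated assumptions: velocities are drawn from a drifting Maxwellian with $kT/m = 1$ and $m = N = 1$, and the helper names, sample size and seed are hypothetical. For such a distribution the tensor should come out close to $NkT\,\delta_{ik}$.

```python
import random


def sample_local_maxwellian(n, u, kT_over_m=1.0, seed=0):
    """Draw n velocities from a Maxwellian with drift velocity u (3-vector)."""
    rng = random.Random(seed)
    s = kT_over_m ** 0.5
    return [tuple(rng.gauss(u[i], s) for i in range(3)) for _ in range(n)]


def stress_tensor(samples, m=1.0, N=1.0):
    """Mean velocity u_i and sigma_ik = m N <(v_i - u_i)(v_k - u_k)> from samples."""
    n = len(samples)
    u = [sum(v[i] for v in samples) / n for i in range(3)]
    sigma = [[m * N * sum((v[i] - u[i]) * (v[k] - u[k]) for v in samples) / n
              for k in range(3)] for i in range(3)]
    return u, sigma


if __name__ == "__main__":
    vels = sample_local_maxwellian(20000, (0.5, 0.0, -1.0))
    u, sigma = stress_tensor(vels)
    print(u)
    for row in sigma:
        print(row)
```

The off-diagonal components vanish only statistically here; for a non-equilibrium distribution they would carry the viscous stresses.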

Finally, the law of conservation of the macroscopic energy of a gaseous 
medium can be obtained in an analogous way. The mean energy per unit 
volume of a gas performing a macroscopic motion with mean velocity u can 
be written in the form 


$$E = \tfrac{1}{2}mN \int (v_i - u_i)^2 f\,d\mathbf{v}.$$ (18.15)
Differentiating with respect to time, we find, analogous to a previous expres- 


sion 


JA S UL ep 
ry bmn f (vj uj) ar dv = 


Fk af 4 
m Boe” 


of 


= —4mn f 0; — u)? vka : vg VMN fC; —uj)?- 


= —4(1, +h) mn. (18.16) 


We transform the two integrals separately. We have 


= f; — uj) vka H av= Sor - u)? (Vp — ug) pac Ay tug fC = u)? 2 dv= 





ð a 
= dx, fe u)? (vk up) f dv Mio [(v;- u)? (Uz = ux)] dv + 
ð a 
ar a e u)? fdv afi lug(v;— u)?] dv = 


ð ð 
= ax, JC uj)? (Ug — Uy) Sav + ax taf Ci uP f dy — 


a BF dened (24, + 2uj,£) Ou; A 
~ [Pig C dy = axe mN +2 fo; ¡j~ u)vgd v5 X 
_ ð 2G t QUE Ou; Oik duk 
OX, mN + ox, MN ng fio: ui) Vk Ox, dv 
ð (24, + 2uzF) Ou; Cik 
ax mN ° xp MN' 





Here we have made use of the definition (18.12) and introduced the notation

$$q_k = \tfrac{1}{2}mN \int (v_i - u_i)^2 (v_k - u_k)\,f\,d\mathbf{v}.$$ (18.17)

It is obvious that the vector $q_k$ represents the energy flux density vector.
For the integral /, we have 





Fk af 
t ZiT -—u)2 —dv= 
i h zn fo u) avy dv 
=> — A 2 — ps — s s = 
m Či “D res 2a foi NOTE 
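Finally, we find

$$\frac{\partial E}{\partial t} = -\frac{\partial}{\partial x_k}(q_k + E u_k) - \sigma_{ik}\frac{\partial u_i}{\partial x_k}.$$ (18.18)

Relation (18.18) expresses the energy conservation law. The change of energy
in a unit volume is associated with the total energy flow from this volume
$q_k + E u_k$ and the work done against internal forces ($\sigma_{ik}\,\partial u_i/\partial x_k$). If the energy
of an ideal gas is expressed in terms of its temperature, then instead of
(18.18) one can write

$$\frac{\partial (c_v T)}{\partial t} = -\frac{\partial}{\partial x_k}(q_k + c_v T u_k) - \sigma_{ik}\frac{\partial u_i}{\partial x_k},$$ (18.19)

where

$$c_v = \frac{\partial E}{\partial T}.$$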

The set of equations (18.9), (18.14) and (18.19) represents the system of 
equations of a gas in the continuous-medium approximation. 

Although these equations are derived for an ideal gas, their region of
applicability is much larger. They express the general conservation laws for a
continuous medium, and in such a general form are applicable not only to
rarefied gases but also to liquids. However, for their actual use it is necessary to
find the explicit form of the stress tensor $\sigma_{ik}$ and of the energy flux vector
$q_k$. The latter in its turn requires that the distribution function $f$ be known.
Below it will be shown that in a particular approximation the distribution
function and, correspondingly, also the quantities $\sigma_{ik}$ and $q_k$ can be found by
integrating the Boltzmann equation for an ideal gas.
For liquids one has to content oneself with empirical expressions of $\sigma_{ik}$
and $q_k$.




§19 LAWS OF INCREASE OF ENTROPY 99 


By means of analogous calculations, one can also easily obtain the mo- 
mentum and energy conservation laws for a mixture of ideal gases. 


§19. The laws of increase of entropy 


In statistical physics we dealt in detail with the law of increase of entropy. 
In §25 of Part III the principle of increase of entropy was established. It was 
shown that when the state of a closed system changes the entropy of the final 
state is always larger than that of the initial state. 

However, within the framework of statistical considerations it is impos- 
sible to establish how the transition from the initial to the final state is carried 
out. In kinetics, it turns out to be possible to investigate the character of 
change of entropy in greater detail and to show that the entropy of an ideal 
gas increases monotonically with time. 

Let us calculate the change of entropy per unit volume of a monatomic
ideal gas, using for this the expression (25.6) of Part III

$$s = -k \int f \ln f\,d\mathbf{v}.$$


Differentiating s with respect to time and making use of Boltzmann’s equa- 
tion, we have 


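$$\frac{\partial s}{\partial t} = -k\frac{\partial}{\partial t}\int f \ln f\,d\mathbf{v} = -k \int (1 + \ln f)\frac{\partial f}{\partial t}\,d\mathbf{v} =$$

$$= k \int (1 + \ln f)\,v_k \frac{\partial f}{\partial x_k}\,d\mathbf{v} - k \int (1 + \ln f)\,I\,d\mathbf{v} + k \int (1 + \ln f)\,\frac{F_k}{m}\frac{\partial f}{\partial v_k}\,d\mathbf{v}.$$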


It is easily seen that on integration the third term reduces to zero. We also 





transform the first term 


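$$\int (1 + \ln f)\,v_k \frac{\partial f}{\partial x_k}\,d\mathbf{v} = \frac{\partial}{\partial x_k}\int (1 + \ln f)\,v_k f\,d\mathbf{v} - \int f\,\frac{\partial}{\partial x_k}\left[(1 + \ln f)\,v_k\right] d\mathbf{v} =$$

$$= \frac{\partial}{\partial x_k}\int v_k f \ln f\,d\mathbf{v} = -\frac{1}{k}\frac{\partial j_k}{\partial x_k}.$$

Here $j_k = -k \int v_k f \ln f\,d\mathbf{v}$ is the entropy flux vector.
Making use of the symmetry of the collision integral (17.7), we have

$$\frac{\partial s}{\partial t} + \frac{\partial j_k}{\partial x_k} = -k \int (1 + \ln f)\,I\,d\mathbf{v} =$$

$$= -\frac{k}{4}\int [\ln f + \ln f_1 - \ln f_2 - \ln f_3]\,(f_2 f_3 - f f_1)\,\sigma v_{\rm rel}\,d\mathbf{v}\,d\mathbf{v}_1\,d\Omega.$$ (19.1)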


Formula (19.1) defines the change of entropy in a given volume element of 
the gas. This change is associated, on the one hand, with the entropy flow 
carried by gas particles and, on the other hand, with the molecular collision 
processes characterized by the right-hand side of (19.1). We integrate (19.1) 
over the entire volume V of the closed system. We then obtain 


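$$\frac{dS}{dt} = \frac{k}{4}\int \ln\left(\frac{f_2 f_3}{f f_1}\right)(f_2 f_3 - f f_1)\,\sigma v_{\rm rel}\,d\mathbf{v}\,d\mathbf{v}_1\,d\Omega\,dV,$$ (19.2)

where $S$ is the total entropy of the system, $S = \int s\,dV$.
The integral

$$\int \frac{\partial j_k}{\partial x_k}\,dV = \oint j_k\,d\Sigma_k = 0,$$

since the flow at the boundary of a closed system is equal to zero.
It is easily seen that the integrand of (19.2) is essentially positive. Indeed,

$$\ln\left(\frac{f_2 f_3}{f f_1}\right)[f_2 f_3 - f f_1] \geq 0.$$ (19.3)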


§20 EQUILIBRIUM AND LOCAL-EQUILIBRIUM DISTRIBUTIONS 101 


If $f_2 f_3 > f f_1$, then the logarithm and the term in brackets have a positive sign;
if $f_2 f_3 < f f_1$, then both are negative. When $f_2 f_3 = f f_1$, the integrand reduces to
zero. Since the integral of an essentially positive function is also positive, we
see that in a closed system

$$\frac{dS}{dt} \geq 0.$$ (19.4)


Thus it is proved that the entropy of a closed system consisting of an ideal 
monatomic gas increases monotonically in time or is constant. We see that 
the kinetic theory establishes the character and details of the mechanism of 
entropy increase, relating it to intermolecular collisions. 
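The sign argument behind (19.3) and (19.4) is elementary and can be confirmed numerically. The sketch below treats the two products $f_2 f_3$ and $f f_1$ simply as positive numbers; the function name is a hypothetical label for the integrand of (19.2) without the factor $\sigma v_{\rm rel}$.

```python
import math


def h_integrand(f2f3, ff1):
    """ln(f2 f3 / (f f1)) * (f2 f3 - f f1) for positive arguments."""
    return math.log(f2f3 / ff1) * (f2f3 - ff1)


if __name__ == "__main__":
    for x, y in [(2.0, 0.5), (0.5, 2.0), (1.3, 1.3), (1e-6, 4.0)]:
        print(x, y, h_integrand(x, y))  # never negative; zero only when x == y
```

Both factors change sign together when the arguments are interchanged, so the product is symmetric and non-negative.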


§20. Equilibrium and local-equilibrium distributions in an ideal gas


When increasing, the entropy tends to a certain limit, so that

$$\frac{dS}{dt} = 0 \quad \text{as } t \to \infty.$$ (20.1)

In reality, the entropy reaches its maximum value not for $t \to \infty$, but after
the lapse of a certain relaxation time $\tau$.

The calculation of this relaxation time is a very complex problem. We shall 
find an approximate value for it below. 

It is clear that for (20.1) to be fulfilled when $t > \tau$, the equality sign must
hold in formula (19.3), i.e.

$$f_2 f_3 = f f_1.$$ (20.2)

Formula (20.2) shows that $\ln f$ is an additive constant of the motion. For
collisions between two particles there are five and only five additive constants of
the motion*: the integrals of mass, momentum and energy.


* Note. See, for example, A. Sommerfeld, Thermodynamics and statistical mechanics
(Vol. 5 of Lectures on theoretical physics), (Academic Press, New York, 1956).

In the more general case of the three-body problem there are eight additive
integrals: those of energy, momentum, angular momentum and mass; see E. T. Whittaker,
A treatise on analytical dynamics (Cambridge University Press, Cambridge, 1927).
Apparently, this last theorem is of a general character.





Hence $\ln f$ must be a linear function of these quantities,

$$\ln f = am + b_i v_i + \tfrac{1}{2}cmv^2.$$ (20.3)


We note that the solution of the functional equation (20.2) presented in
statistical physics was based on the assumption that $f = f(v^2)$. The derivation
given here is free from this assumption.

The constants $a$, $b_i$ and $c$ can be expressed in terms of the number of
particles $N$ per cm³, the mean macroscopic velocity $u_i$, and the mean energy of
the monatomic gas in an equilibrium state. In particular, if the gas as a whole
is at rest and $u_i = 0$, then

$$\int f\,d\mathbf{v} = N,$$ (20.4)

$$\int v_i f\,d\mathbf{v} = 0,$$ (20.5)

$$\int mv^2 f\,d\mathbf{v} = 3NkT.$$ (20.6)

Whence we find Maxwell’s equilibrium distribution

$$f^{(M)} = N\left(\frac{m}{2\pi kT}\right)^{3/2} \exp(-mv^2/2kT).$$ (20.7)


In an equilibrium gas, the temperature T and density NV have constant 
values throughout the volume of the gas, and no macroscopic motion occurs 
in it. 

If, however, the temperature and density depend on coordinates and time,
and the gas is moving with mean velocity $u_i$, then the collision integral of the
Boltzmann equation reduces to zero if we set

$$f^{(0)} = N\left(\frac{m}{2\pi kT}\right)^{3/2} \exp[-m(\mathbf{v} - \mathbf{u})^2/2kT],$$ (20.8)

where $N$ and $T$ are functions of coordinates and time. For brevity we put

$$f^{(0)} = \exp(\alpha + \beta_i v_i + \gamma v^2),$$ (20.9)

$$\alpha = \ln\left[N\left(\frac{m}{2\pi kT}\right)^{3/2} \exp(-mu^2/2kT)\right],$$ (20.10)

$$\beta_i = mu_i/kT,$$ (20.11)

$$\gamma = -m/2kT.$$ (20.12)


The distribution $f^{(0)}$ represents the solution of Boltzmann’s equation, if it
reduces not only the right-hand side but also the left-hand side of this equation
to zero. For this $\alpha$, $\beta_i$ and $\gamma$ must satisfy the conditions given below.

The distribution (20.8) or (20.9) is called the local Maxwell distribution.
Substituting (20.9) into the Boltzmann equation and taking into account that
its left-hand side must be equal to zero for any value of $v_i$, we set the
coefficients of different powers of $v_i$ separately equal to zero. We then obtain the
system of equations

$$\frac{\partial \alpha}{\partial t} + \frac{F_k}{m}\beta_k = 0,$$ (20.13)

$$\frac{\partial \beta_i}{\partial t} + \frac{\partial \alpha}{\partial x_i} + \frac{2\gamma}{m}F_i = 0,$$ (20.14)

$$\delta_{ik}\frac{\partial \gamma}{\partial t} + \frac{1}{2}\left(\frac{\partial \beta_i}{\partial x_k} + \frac{\partial \beta_k}{\partial x_i}\right) = 0,$$ (20.15)

$$\frac{\partial \gamma}{\partial x_i} = 0.$$ (20.16)

Thus equations (20.13)-(20.16) impose a restriction upon the mean velocity
of motion $u_i$ consistent with the local Maxwell distribution. Formula (20.16)
shows that the gas temperature $T$ must be constant in space. However, the
temperature of the gas as a whole can change in time.
Differentiating (20.15) with respect to the coordinates and taking into
account (20.16), we have

$$\frac{\partial^2 \beta_i}{\partial x_k\,\partial x_l} + \frac{\partial^2 \beta_k}{\partial x_i\,\partial x_l} = 0.$$

Whence it follows that

$$\frac{\partial^2 \beta_i}{\partial x_k\,\partial x_l} = 0.$$ (20.17)





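The solution of eq. (20.17) is linear in the coordinates, so that

$$u_i = a_i(t) + b_{ik}(t)\,x_k.$$ (20.18)

The coefficients $b_{ik}$ can be found by substituting (20.18) into (20.15).
We then obtain

$$b_{ik} = -b_{ki} \quad (i \neq k); \qquad b_{ii} = -\frac{1}{2T}\frac{\partial T}{\partial t}.$$

Hence, finally,

$$u_i = a_i(t) + \varepsilon_{ikl}\,\omega_k x_l - \frac{1}{2T}\frac{\partial T}{\partial t}\,x_i,$$

or, in vector form,

$$\mathbf{u} = \mathbf{a}(t) + [\boldsymbol{\omega} \times \mathbf{r}] - \frac{1}{2T}\frac{\partial T}{\partial t}\,\mathbf{r}.$$ (20.19)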


Thus any translational and rotational (with angular velocity w) motion of the 
gas as a whole, and radial motion whose velocity is defined by the last term of 
(20.19), are consistent with the local Maxwell distribution. 
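A velocity field made up of a uniform translation, a rigid rotation and a radial term has a very simple symmetrized gradient: only an isotropic part survives. The sketch below checks this by finite differences. The scalar `b` stands for the coefficient of the radial term in (20.19) and, like the vectors `a` and `omega` and the sample numbers, is a free parameter of the illustration rather than a value taken from the text.

```python
def velocity(r, a, omega, b):
    """u = a + [omega x r] + b r for 3-vectors r, a, omega and scalar b."""
    cross = (omega[1] * r[2] - omega[2] * r[1],
             omega[2] * r[0] - omega[0] * r[2],
             omega[0] * r[1] - omega[1] * r[0])
    return tuple(a[i] + cross[i] + b * r[i] for i in range(3))


def symmetric_gradient(field, r, h=1e-3):
    """du_i/dx_k + du_k/dx_i at the point r, by central differences."""
    grad = [[0.0] * 3 for _ in range(3)]
    for k in range(3):
        r_plus, r_minus = list(r), list(r)
        r_plus[k] += h
        r_minus[k] -= h
        u_plus, u_minus = field(r_plus), field(r_minus)
        for i in range(3):
            grad[i][k] = (u_plus[i] - u_minus[i]) / (2.0 * h)
    return [[grad[i][k] + grad[k][i] for k in range(3)] for i in range(3)]


if __name__ == "__main__":
    a, omega, b = (0.1, -0.2, 0.3), (0.0, 0.5, 1.0), -0.25
    s = symmetric_gradient(lambda r: velocity(r, a, omega, b), (1.0, 2.0, -1.0))
    for row in s:
        print(row)  # 2*b on the diagonal, zero elsewhere (up to rounding)
```

The rotation contributes only to the antisymmetric part of the gradient, which is why it drops out of the symmetrized combination.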

Condition (20.14), which we have not used, contains forces whose form
imposes definite bounds upon the coefficients $a_i$ and $b_{ik}$. If the gas is confined
in a motionless container with impermeable walls, then stationary motion of
the type (20.19) cannot occur in it. This means that in (20.18) one must put
$u_i = 0$. Setting $\beta_i = 0$ in (20.14), we find

$$\frac{\partial \alpha}{\partial x_i} = \frac{F_i}{kT}.$$

If the external forces have a potential, then $F_i = -\partial U/\partial x_i$ and for $\alpha$ we have

$$\alpha = -U/kT + \text{const}.$$

Then the normalized equilibrium distribution assumes the form

$$f^{(M-B)} = N\left(\frac{m}{2\pi kT}\right)^{3/2} e^{-mv^2/2kT}\,e^{-U/kT}.$$ (20.20)

Thus we have arrived at the natural result: the Maxwell-Boltzmann equilibrium
distribution (20.20) is established in the gas rather than the Maxwell
local distribution (20.8).

The establishment of the Maxwell-Boltzmann equilibrium distribution is
associated with the establishment of space and velocity distributions. The
relaxation time for a process in velocity space is of the order of magnitude of
$\tau \sim \lambda/\bar{v}$, where $\lambda$ is the mean free path. Let an arbitrary distribution of
particles in space, $f(\mathbf{r}, \mathbf{v}, t)$, be defined at the initial instant of time $t = 0$. In a time
of the order of $\tau$ the velocity distribution of the molecules at each point of
space approaches a local Maxwell distribution, so that

$$f(\mathbf{r}, \mathbf{v}, t) \to f^{(0)}(\mathbf{r}, \mathbf{v}, t).$$

The density of the gas, $N$, and its temperature, $T$, do not have time to take on
equilibrium values throughout the gas, and the macroscopic motion of its parts
(if it took place at the initial instant of time) does not have time to be
damped. In order to stress this fact we have written the parameters $\mathbf{r}$ and $t$ in
$f^{(0)}$, in addition to the argument $\mathbf{v}$. The variation in time of the variables $N$,
$T$ and $\mathbf{u}$ is described by macroscopic variables and is characterized by the
macroscopic time $\tau_{\rm macro}$. The relaxation time $\tau_{\rm macro}$ is of the order of $L/c$,
where $L$ is the size of the macroscopic system, and $c$ is the velocity of
propagation of perturbations in the gas. As will be shown in §26, this is none other
than the velocity of sound. Thus

$$\tau_{\rm macro} \gg \tau.$$
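To get a feeling for how widely the two time scales are separated, one can insert rough numbers into $\tau \sim \lambda/\bar{v}$ and $\tau_{\rm macro} \sim L/c$. The values used below (mean free path about $7 \times 10^{-8}$ m, mean thermal speed about 470 m/s, sound speed about 340 m/s, vessel size 1 m) are order-of-magnitude assumptions for air at room conditions, not data from the text.

```python
def relaxation_times(mean_free_path, thermal_speed, system_size, sound_speed):
    """Return (tau, tau_macro): velocity-space and macroscopic relaxation times."""
    tau = mean_free_path / thermal_speed
    tau_macro = system_size / sound_speed
    return tau, tau_macro


if __name__ == "__main__":
    # Illustrative, assumed values in SI units (see the lead-in paragraph).
    tau, tau_macro = relaxation_times(7e-8, 470.0, 1.0, 340.0)
    print(tau, tau_macro, tau_macro / tau)
```

With these inputs the local Maxwell distribution is established many orders of magnitude faster than the macroscopic parameters can even out.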

It should be stressed that, as distinct from the Maxwell-Boltzmann
distribution, the local Maxwell distribution is never exact but only describes
approximately the velocity distribution in a limited volume of the gas.

The local Maxwell distribution is an exact solution of the Boltzmann equation
only for a velocity as given by formula (20.19) and a density satisfying
eq. (20.14). For other values of $\mathbf{u}$, $T$ and $N(\mathbf{r})$, $f^{(0)}$ represents an approximate
solution of the Boltzmann equation (see the next section), valid for time
intervals $\tau$ in the course of which the macroscopic quantities $\mathbf{u}$, $T$ and $N$ do not
have time to change and can be considered simply as constants.










§21. The general theory of the solution of Boltzmann’s equation 


The results obtained in §§17-19 were not concerned with finding explicit
solutions of Boltzmann’s equation. As we have stressed, solving Boltzmann’s
equation is a matter of considerable difficulty, both of a mathematical and a
physical character. These difficulties are associated with the fact that the
actual form of the function $\sigma(v_{\rm rel}, \alpha)$ for molecular collisions is unknown, and
hence the very problem of integrating Boltzmann’s equation is somewhat
indeterminate. In practice one always considers certain simplified models for
which the intermolecular interaction can be represented by one or other
sufficiently simple laws depending on the distance between the particles.

The simplest and most often considered models are those of particles
which are rigid spheres or of particles interacting according to a law of the
type $1/r^n$. These interaction laws are rather arbitrary and their choice is
determined mainly by the simplicity of the corresponding formulae for the
scattering cross section. Use of these models corresponds to the transition from a
real gas system to a particular model. However, solving Boltzmann’s equation 
even for a model still has considerable mathematical difficulties associated 
with it. At present a number of methods have been devised for its solution. 
We shall dwell only on the most important of them. 

The most effective method of finding the general solution of Boltzmann’s 
equation is the method of moments. 

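Moments are functions of the form

$$M^{(0)} = \int f\,d\mathbf{v},$$ (21.1)

$$M_i^{(1)} = \int v_i f\,d\mathbf{v},$$ (21.2)

$$M_{ik}^{(2)} = \int v_i v_k f\,d\mathbf{v},$$ (21.3)

$$M_{ikl}^{(3)} = \int v_i v_k v_l f\,d\mathbf{v},$$ (21.4)

$$M_{ik\ldots}^{(n)} = \int v_i v_k \ldots f\,d\mathbf{v}.$$ (21.5)

Such important quantities as density, the mean velocity of particle flux,
momentum and energy fluxes are also moments. For example,

$$\rho = NmM^{(0)},$$ (21.6)

$$u_i = \int v_i f\,d\mathbf{v} = M_i^{(1)},$$ (21.7)

$$j_i = m\int (v_i - u_i) f\,d\mathbf{v} = m(M_i^{(1)} - u_i M^{(0)}),$$ (21.8)

$$\sigma_{ik} = mN(M_{ik}^{(2)} - u_i u_k M^{(0)}),$$ (21.9)

$$q_i = \tfrac{1}{2}mN \int v^2 v_i f\,d\mathbf{v} = \tfrac{1}{2}mN M_{kki}^{(3)}.$$ (21.10)

The idea of the method of moments is as follows.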


We write the solution of Boltzmann’s equation in the form of a series in
terms of orthogonal polynomials. For such polynomials it is natural to choose
the Hermite-Sonine polynomials (§10 of Part V), which are conveniently
written in the form

$$H^{(n)}_{ijkl\ldots} = \frac{(-1)^n}{f^{(0)}}\,\frac{\partial^n f^{(0)}}{\partial v_i\,\partial v_j \ldots}.$$ (21.11)

The Hermite-Sonine polynomials are orthogonal with weight $f^{(0)}$:

$$\int f^{(0)} H^{(n)} H^{(m)}\,d\mathbf{v} = 0 \quad \text{for } n \neq m.$$

We write the distribution function in the form of a series

$$f = f^{(0)}\left(1 + c^{(1)}H^{(1)} + c^{(2)}H^{(2)} + \ldots\right)$$ (21.12)
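The orthogonality of Hermite polynomials with a Gaussian weight is easy to see in one dimension. The sketch below is a one-dimensional, dimensionless analogue only: it uses the probabilists' Hermite polynomials $He_n(x)$ with weight $e^{-x^2/2}/\sqrt{2\pi}$ in place of the tensor polynomials and $f^{(0)}$ of the text, and the function names and quadrature settings are illustrative assumptions.

```python
import math


def hermite(n, x):
    """Probabilists' Hermite polynomial He_n(x) via He_{k+1} = x He_k - k He_{k-1}."""
    h_prev, h = 1.0, x
    if n == 0:
        return h_prev
    for k in range(1, n):
        h_prev, h = h, x * h - k * h_prev
    return h


def weighted_product(n, m, x_max=12.0, steps=4000):
    """Midpoint-rule estimate of the integral of w(x) He_n(x) He_m(x) over the real line."""
    h = 2.0 * x_max / steps
    total = 0.0
    for j in range(steps):
        x = -x_max + (j + 0.5) * h
        w = math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)
        total += w * hermite(n, x) * hermite(m, x)
    return total * h


if __name__ == "__main__":
    print(weighted_product(1, 2))  # vanishes for n != m
    print(weighted_product(2, 3))  # vanishes for n != m
    print(weighted_product(3, 3))  # equals 3! for n == m == 3
```

It is this property that lets each coefficient of the series be picked out by a single integration against the corresponding polynomial.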





with the coefficients $c^{(1)}$, $c^{(2)}$, ... depending on coordinates and time. A simple
calculation shows that the coefficients of this series may be expressed in terms
of moments. Making use of the orthogonality of the Hermite polynomials, it
is easily shown that
ee o (21.13) 
“Gaia its NEF" apa 





Multiplying Boltzmann’s equation by the Hermite polynomials $H^{(m)}$ and
integrating it over velocities, we have (in the absence of external forces):

$$\int H^{(m)}\left(\frac{\partial f}{\partial t} + v_k \frac{\partial f}{\partial x_k}\right) d\mathbf{v} = \int H^{(m)} I\,d\mathbf{v}.$$ (21.14)

We then substitute the expansion (21.12) into (21.14) and, making use of the
orthogonality conditions, arrive at the equations for the coefficients $c^{(n)}$.
Since the coefficients $c^{(n)}$ can be expressed in terms of moments $M^{(n)}$, this
system can be written in the form of an infinite set of differential equations.
These equations are of the form

$$\frac{\partial M^{(m)}_{ik\ldots}}{\partial t} + \frac{\partial M^{(m+1)}_{ik\ldots l}}{\partial x_l} = \int v_i v_k \ldots I\,d\mathbf{v}.$$ (21.15)


For a particular interaction model, where the intermolecular interaction is
approximated by repulsive forces inversely proportional to the fifth power of
the distance (the so-called Maxwellian molecules) the infinite set is cut off and
reduces to a finite number of equations. However, for more realistic models
such a simplification does not arise. Nevertheless, the importance of the 
method of moments lies in the fact that, in principle, it allows one to find a 
closed system of equations with respect to macroscopic quantities (moments), 
which is equivalent to Boltzmann’s equation. The exact solution of this sys- 
tem would be equivalent to the exact solution of Boltzmann’s equation. For 
actual calculations one has to approximate the distribution function by a 
finite number of terms in the expansion (21.12). 

Usually one restricts oneself to terms of the third order. In this case it is
most convenient to keep only those moments which have a direct meaning.
These are the first thirteen moments, $M^{(0)}$, $M_i^{(1)}$, $M_{ik}^{(2)}$, $M_{ikk}^{(3)}$, in terms of
which the density, momentum and energy fluxes are expressed. In this, the
so-called thirteen-moment approximation, the distribution function is of the
form





N [( H sik UME _ TRUE (z J) ( mr) | ante) 
2N (KT)2 mN \kT SkT 
Thus expansion (21.16) involves the coefficients $\sigma_{ik}$ and $q_k$ which depend, in
general, on coordinates and time.

Substituting the approximate value of $f$ according to formula (21.16) into
(21.14), one can arrive at a system of equations for these coefficients. A merit
of this system is the fact that the unknown coefficients are directly measurable
quantities.

To increase the degree of accuracy, one can increase the number of terms
of the series which are retained in the expression approximating the distribution
function $f$*. We cannot dwell on these cumbersome calculations, the
more so as they all refer to gas models with arbitrarily defined scattering cross
sections $\sigma(v_{\rm rel}, \alpha)$.

Boltzmann’s equation permits essential simplification in two important
cases: (1) if we are interested in changes of state of the gas in time intervals
$\Delta t \gg \tau$, (2) if the gas as a whole is in a state close to equilibrium. In the first
case it can be assumed that in a time $\tau \ll \Delta t$ a velocity distribution close to
the local Maxwell distribution $f^{(0)}$ is established at each point of the gas.
Then for times $t \gg \tau$ one can try to find a solution of Boltzmann’s equation
in the form

$$ f(\mathbf r, \mathbf v, t) = f^{(0)}(\mathbf v, \mathbf r, t)\,[1 + \varphi(\mathbf r, \mathbf v, t)], \tag{21.17} $$

where $\varphi$ is a small correction to the function $f^{(0)}$, i.e. $\varphi \ll 1$.

The function $\varphi$ describes the development of the gas for times which are
large in comparison with $\tau$. This method (called the Chapman-Enskog method)
allows one to describe the macroscopic behaviour of the gas, for example
to calculate the momentum flux or heat flux in the gas.

If the gas is in a state close to an equilibrium state, then one can set

$$ f(\mathbf r, \mathbf v, t) = f^{(M)}(\mathbf v)\,[1 + \varphi(\mathbf r, \mathbf v, t)], \tag{21.18} $$

where $f^{(M)}(\mathbf v)$ is an equilibrium Maxwell distribution, and $\varphi \ll 1$. Such an
approximation makes sense if small perturbations are applied to a gas which was
initially in equilibrium.


*M.Kogan, Dynamika razrezhonnovo gaza (Dynamics of a rarefied gas) (Nauka, 
Moscow, 1967). 







In this section we shall confine ourselves to an exposition of the general 
theory. Examples of the application of the general theory will be given below. 

In the two cases mentioned Boltzmann’s equation is linearized. We shall 
begin with the first case. 

In substituting (21.17) into Boltzmann’s equation, one need retain only
terms of the first order of small quantities. In this case, the velocity $v_i$ and the
forces $F_i$ giving rise to a departure from a local equilibrium distribution are to
be considered small quantities of the first order. Hence substituting (21.17)
into the left-hand side of Boltzmann’s equation gives

$$ \frac{\partial f^{(0)}}{\partial t} + v_k \frac{\partial f^{(0)}}{\partial x_k} + \frac{F_k}{m} \frac{\partial f^{(0)}}{\partial v_k}. \tag{21.19} $$





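In the collision integral one can write, retaining only first-order terms and
using the fact that $f^{(0)\prime} f_1^{(0)\prime} = f^{(0)} f_1^{(0)}$ for local Maxwell distributions,

$$ f' f_1' - f f_1 \approx f^{(0)\prime} f_1^{(0)\prime}\,(1 + \varphi' + \varphi_1') - f^{(0)} f_1^{(0)}\,(1 + \varphi + \varphi_1) = f^{(0)} f_1^{(0)}\,(\varphi' + \varphi_1' - \varphi - \varphi_1). \tag{21.20} $$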
Hence, finally, we arrive at the equation 


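$$ \frac{\partial f^{(0)}}{\partial t} + v_k \frac{\partial f^{(0)}}{\partial x_k} + \frac{F_k}{m} \frac{\partial f^{(0)}}{\partial v_k} = \int f^{(0)} f_1^{(0)}\, \sigma v_{\rm rel}\, (\varphi' + \varphi_1' - \varphi - \varphi_1)\, d\mathbf v_1\, d\Omega. \tag{21.21} $$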





The equation obtained is a linear non-homogeneous integro-differential equation
with respect to the function $\varphi(x_i, v_k, t)$.

In §22 we shall carry out the solution of this equation. Here we shall only 
point out an important property of the non-homogeneous integral equation 
(21.21). 

We write it in the abbreviated operator form 


$$ \hat L \varphi = A, \tag{21.22} $$


where $A$ is a known function, and $\hat L$ is a linear integral operator.
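In order to avoid cumbersome formulae, we write $\hat L\varphi$ in the symbolic form

$$ \hat L \varphi = \int w(\mathbf v, \mathbf v_1)\,(\varphi - \varphi_1)\, d\mathbf v_1, \tag{21.23} $$

where $\varphi_1 \equiv \varphi(\mathbf v_1)$, the sign being chosen so that the quadratic form (21.28) is positive, and

$$ w(\mathbf v, \mathbf v_1) = w(\mathbf v_1, \mathbf v). $$

Multiplying (21.23) by an arbitrary function $\psi(\mathbf v)$, and integrating with
respect to $d\mathbf v$, we find

$$ \int \psi \hat L \varphi \, d\mathbf v = \int \psi(\mathbf v)\, w(\mathbf v, \mathbf v_1)\,[\varphi(\mathbf v) - \varphi(\mathbf v_1)]\, d\mathbf v\, d\mathbf v_1. \tag{21.24} $$

Furthermore, interchanging the integration variables $\mathbf v$ and $\mathbf v_1$ and using the symmetry of $w$, we can write

$$ \int \psi \hat L \varphi \, d\mathbf v = \int \psi(\mathbf v_1)\, w(\mathbf v_1, \mathbf v)\,[\varphi(\mathbf v_1) - \varphi(\mathbf v)]\, d\mathbf v\, d\mathbf v_1 = - \int \psi(\mathbf v_1)\, w(\mathbf v, \mathbf v_1)\,[\varphi(\mathbf v) - \varphi(\mathbf v_1)]\, d\mathbf v\, d\mathbf v_1. \tag{21.25} $$

Comparing (21.24) and (21.25), we rewrite (21.24) in the symmetric form

$$ \int \psi \hat L \varphi \, d\mathbf v = \tfrac12 \int (\psi - \psi_1)(\varphi - \varphi_1)\, w \, d\mathbf v\, d\mathbf v_1. \tag{21.26} $$

Exchanging $\varphi$ and $\psi$, we find

$$ \int \psi \hat L \varphi \, d\mathbf v = \int \varphi \hat L \psi \, d\mathbf v. \tag{21.27} $$

Thus $\hat L$ is a self-adjoint operator. In the particular case where $\psi = \varphi$, (21.26)
gives

$$ \int \varphi \hat L \varphi \, d\mathbf v = \tfrac12 \int (\varphi - \varphi_1)^2\, w \, d\mathbf v\, d\mathbf v_1 > 0, \tag{21.28} $$

since $w(\mathbf v, \mathbf v_1) > 0$.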


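Let us consider two functions: the function $\varphi$, which satisfies eq. (21.22)
and at the same time the equation

$$ \int \varphi \hat L \varphi \, d\mathbf v = \int \varphi A \, d\mathbf v, \tag{21.29} $$

and the function $\psi$, which satisfies only the integral relation

$$ \int \psi \hat L \psi \, d\mathbf v = \int \psi A \, d\mathbf v, \tag{21.30} $$

but is not a solution of eq. (21.22).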





It is then easily shown that the following inequality holds:

$$ \int \varphi \hat L \varphi \, d\mathbf v > \int \psi \hat L \psi \, d\mathbf v. \tag{21.31} $$

This inequality means that the solution of the linearized integral Boltzmann
equation corresponds to a maximum of the integral $\int \varphi \hat L \varphi \, d\mathbf v$ with respect to
all functions satisfying the condition (21.30).

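To prove (21.31), we write the positive (by virtue of (21.28)) quantity

$$ \int (\varphi - \psi)\, \hat L (\varphi - \psi)\, d\mathbf v = \int \varphi \hat L \varphi \, d\mathbf v + \int \psi \hat L \psi \, d\mathbf v - \int \varphi \hat L \psi \, d\mathbf v - \int \psi \hat L \varphi \, d\mathbf v = \int \varphi \hat L \varphi \, d\mathbf v + \int \psi \hat L \psi \, d\mathbf v - 2 \int \psi \hat L \varphi \, d\mathbf v > 0, $$

where we have made use of (21.27).
Using (21.22) and (21.30), we have $\int \psi \hat L \varphi \, d\mathbf v = \int \psi A \, d\mathbf v = \int \psi \hat L \psi \, d\mathbf v$, so that

$$ \int (\varphi - \psi)\, \hat L (\varphi - \psi)\, d\mathbf v = \int \varphi \hat L \varphi \, d\mathbf v - \int \psi \hat L \psi \, d\mathbf v > 0. $$

Whence follows the inequality (21.31) which was to be proved.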

The extremal properties of the solutions of the integral equation (21.21)
make it possible to use ordinary variational methods in order to find them.
Thus we choose the trial function $\psi$ in the form of a linear combination of known functions $g_i$:

$$ \psi = \sum_i a_i g_i. $$

By choosing the coefficients $a_i$ in such a way that the integral $\int \psi \hat L \psi \, d\mathbf v$ has a
maximum value subject to the condition (21.30), one can find a function sufficiently close to the true solution.
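The variational procedure just described can be illustrated on a finite-dimensional stand-in. The sketch below is not taken from the text: it assumes that the operator is replaced by a random symmetric positive-definite matrix `L`, the known function by a random vector `A`, and the functions $g_i$ by the columns of an orthonormal matrix `G_full` (all three are hypothetical choices). Maximizing $\psi \cdot L\psi$ under the discrete analogue of condition (21.30) then reduces to the linear system $K a = b$ with $K_{ij} = g_i \cdot L g_j$ and $b_i = g_i \cdot A$.

```python
import numpy as np


def ritz_solution(L, A, G):
    """Return the trial function psi = G @ a and the value of psi . L psi.

    Maximizing psi . L psi subject to psi . L psi = psi . A (the discrete
    analogue of condition (21.30)) leads to the linear system K a = b.
    """
    K = G.T @ L @ G            # K_ij = g_i . L g_j
    b = G.T @ A                # b_i  = g_i . A
    a = np.linalg.solve(K, b)
    psi = G @ a
    return psi, float(psi @ L @ psi)


rng = np.random.default_rng(0)
n = 40
B = rng.normal(size=(n, n))
L = B @ B.T + n * np.eye(n)    # symmetric positive-definite stand-in for the operator
A = rng.normal(size=n)         # stand-in for the known function A

phi = np.linalg.solve(L, A)    # "exact" solution of L phi = A
exact_value = float(phi @ L @ phi)

# Orthonormal columns play the role of the known functions g_i.
G_full, _ = np.linalg.qr(rng.normal(size=(n, n)))
sizes = (2, 5, 10, 20, 40)
values = [ritz_solution(L, A, G_full[:, :m])[1] for m in sizes]

for m, value in zip(sizes, values):
    print(f"{m:2d} basis functions: {value:.6f}   (exact: {exact_value:.6f})")
```

In this model the value of the functional grows as basis functions are added and never exceeds the value reached by the exact solution, which mirrors inequality (21.31).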

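The second case of the linearization of Boltzmann’s equation is obtained
by substituting (21.18). Since the Maxwell distribution automatically satisfies
Boltzmann’s equation, we at once obtain

$$ \frac{\partial \varphi}{\partial t} + v_k \frac{\partial \varphi}{\partial x_k} + \frac{F_k}{m} \frac{\partial \varphi}{\partial v_k} = \int f_1^{(M)}\, \sigma v_{\rm rel}\, [\varphi' + \varphi_1' - \varphi - \varphi_1]\, d\mathbf v_1\, d\Omega. \tag{21.32} $$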


We thus arrive at a homogeneous linear integro-differential equation of the 
form 


V io, (21.32') 


Its solution can easily be found, if we know the solution of the homogeneous
integral equation

$$ \hat L \varphi_i = \lambda_i \varphi_i, \tag{21.33} $$

where $\lambda_i$ are the eigenvalues and $\varphi_i$ the eigenfunctions (orthogonal and
normalized) of the operator $\hat L$.
The solution of (21.32) can be written in the form of an expansion in
terms of the system of functions $\varphi_i$:

$$ \varphi = \sum_i a_i \varphi_i. \tag{21.34} $$


In this case it is assumed that the spectrum of the functions $\varphi_i$ has a
discrete character. An example of an application of this method will be given in
§25 and §26.

We note only that the first five eigenvalues of eq. (21.33) can be indicated
straight away. Namely, since the functions

$$ \varphi = 1, \qquad \varphi = \mathbf v, \qquad \varphi = \tfrac12 m v^2 $$

reduce the collision operator to zero and do not depend explicitly on $x_i$, they
are the eigenfunctions of eq. (21.33) which correspond to zero eigenvalues of
the operator $\hat L$.
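That these five functions reduce the bracket $\varphi' + \varphi_1' - \varphi - \varphi_1$ to zero is just the conservation of particle number, momentum and energy in a single collision. The following minimal sketch checks this numerically for one elastic collision of two equal-mass particles. The parametrization of the collision by a unit vector `n` and the helper names are assumptions made for the illustration, not notation from the text.

```python
import numpy as np


def elastic_collision(v, v1, n):
    """Post-collision velocities of two equal-mass particles.

    The centre-of-mass velocity and the magnitude of the relative velocity
    are conserved; the unit vector n gives the new direction of the
    relative velocity.
    """
    v_cm = 0.5 * (v + v1)
    g = np.linalg.norm(v - v1)
    return v_cm + 0.5 * g * n, v_cm - 0.5 * g * n


def collision_residual(phi, v, v1, vp, v1p):
    """Value of phi' + phi_1' - phi - phi_1 for a single collision."""
    return phi(vp) + phi(v1p) - phi(v) - phi(v1)


rng = np.random.default_rng(1)
v, v1 = rng.normal(size=3), rng.normal(size=3)
n = rng.normal(size=3)
n /= np.linalg.norm(n)
vp, v1p = elastic_collision(v, v1, n)

m = 1.0  # particle mass in arbitrary units
invariants = {
    "1": lambda w: 1.0,
    "v": lambda w: w,
    "m v^2 / 2": lambda w: 0.5 * m * np.dot(w, w),
}

for name, phi in invariants.items():
    residual = np.max(np.abs(collision_residual(phi, v, v1, vp, v1p)))
    print(f"phi = {name:10s} residual = {residual:.2e}")
```

The residuals vanish to rounding error for every choice of `n`, so the collision integral of each of these functions is zero whatever the cross section.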


§22. The equations of hydrodynamics. The viscosity and thermal conduc- 
tivity of gases 


We have seen that Boltzmann’s kinetic equation allows one to obtain, as a
consequence, the laws of mechanics of continuous media. However, to actually
find the stress tensor $\sigma_{ik}$ a knowledge of the distribution function $f$ is
required.

We pass on to a calculation of the non-equilibrium distribution function 
for an ideal gas performing a macroscopic motion. 

We shall suppose that the velocity of the macroscopic motion of the gas,
$\mathbf u$, changes from point to point. However, we shall assume this change to be
sufficiently slow, where a slow spatial change of the velocity $\mathbf u$ is understood
to mean the following: Gas volumes of spatial extent of the order of several
mean free path lengths can be considered to be moving with a common constant
velocity. As was stated in §20, a local Maxwell distribution is established
in such volumes. The velocities of motion of different gas volumes can
be different, so that $\mathbf u = \mathbf u(\mathbf r, t)$.

We shall confine ourselves to isothermal modes of motion of the gas, so
that the temperature is constant throughout the volume of the gas. Assuming
that local equilibrium is established, we substitute the value of $f^{(0)}$ into the
expression for the stress tensor $\sigma_{ik}$ and the heat flux $q_k$.

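Formula (18.12) gives

$$ \sigma_{ik} = -mN \int (v_i - u_i)(v_k - u_k)\, f^{(0)}\, d\mathbf v = -\rho\, \overline{(v_i - u_i)(v_k - u_k)} = -NkT\, \delta_{ik} = -p\, \delta_{ik}. \tag{22.1} $$

Thus the stress tensor reduces to the normal pressure. Analogously, from
(18.17) we find

$$ q_k = \tfrac12 mN \int (v_k - u_k)\,(\mathbf v - \mathbf u)^2\, f^{(0)}\, d\mathbf v = 0. \tag{22.2} $$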
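The two local-equilibrium moments above can be checked by direct sampling. The sketch below draws velocities from a Maxwell distribution centred on a mean velocity `u` and estimates the second moment $mN\,\overline{V_i V_k}$ and the third moment $\tfrac12 mN\,\overline{V_k V^2}$, where $\mathbf V = \mathbf v - \mathbf u$. The argon-like mass, the temperature, the number density and the sample size are illustrative assumptions, not values from the text.

```python
import numpy as np

K_B = 1.380649e-23            # Boltzmann constant, J/K
m = 6.63e-26                  # molecular mass, kg (argon-like, illustrative)
T = 300.0                     # temperature, K (illustrative)
N = 2.5e25                    # number density, m^-3 (illustrative)
u = np.array([10.0, 0.0, 0.0])  # mean flow velocity, m/s (illustrative)

rng = np.random.default_rng(2)
samples = 400_000
thermal_speed = np.sqrt(K_B * T / m)

# Each Cartesian component of a Maxwell distribution is Gaussian about u.
v = u + rng.normal(scale=thermal_speed, size=(samples, 3))
V = v - u                     # velocity relative to the mean flow

pressure = N * K_B * T
second_moment = m * N * np.einsum("ni,nk->ik", V, V) / samples
third_moment = 0.5 * m * N * np.mean(V * np.sum(V**2, axis=1, keepdims=True), axis=0)

print("m N <V_i V_k> / p =")
print(np.round(second_moment / pressure, 3))
print("q_k / (p * thermal speed) =", np.round(third_moment / (pressure * thermal_speed), 3))
```

Up to sampling noise the second moment is $NkT\,\delta_{ik}$ and the third moment vanishes, so in local equilibrium only the isotropic pressure survives and there is no heat flux.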


In this approximation eqs. (18.9) and (18.14) for a continuous medium assume
the form

$$ \rho \left( \frac{\partial u_i}{\partial t} + u_k \frac{\partial u_i}{\partial x_k} \right) = - \frac{\partial p}{\partial x_i}. \tag{22.3} $$


Eq. (22.3) represents Euler’s equation. As is well-known, (22.3) is the equa- 
tion of motion of an ideal fluid. 
The equation for the entropy per unit volume (19.1) can be written in the
form

$$ \frac{\partial s}{\partial t} + \frac{\partial (s u_k)}{\partial x_k} = 0. \tag{22.4} $$


This last equation shows that when a fluid is moving its specific entropy re- 
mains constant, i.e. the process of displacement is of an adiabatic character. 

In this approximation a continuous medium can be considered as an ideal 
fluid with the equation of state 


p=NkT. (22.5) 


The set of equations (22.3), (22.4) and (22.5) determines completely the
motion of a gas in the continuous medium approximation. In order to obtain the
equations of hydrodynamics of a real (viscous) gas in the same approximation,
we shall try to find the solution of Boltzmann’s equation by the Chapman-Enskog
method of successive approximations. Namely, we shall set

$$ f = f^{(0)} (1 + \varphi), \tag{22.6} $$

where $\varphi \ll 1$, and $f^{(0)}$ is the local equilibrium distribution.

For ease of calculation we shall consider the gas to be isothermal and in- 
compressible, external forces to be absent and the pressure to be constant. 
(To this there corresponds, for example, the motion of a gas between two 
plane surfaces, one moving and the other stationary.) 

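The mean velocity, $u_i$, of the gas can depend on coordinates as well as on
time. Substituting (22.6) into Boltzmann’s equation and confining ourselves
to quantities of the first order of magnitude we find

$$ \frac{\partial f^{(0)}}{\partial t} + v_k \frac{\partial f^{(0)}}{\partial x_k} = N f^{(0)}\, I(\varphi), \tag{22.7} $$

where

$$ I(\varphi) = \int f_1^{(0)}\, v_{\rm rel}\, \sigma\, [\varphi' + \varphi_1' - \varphi - \varphi_1]\, d\mathbf v_1\, d\Omega. \tag{22.8} $$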


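On the left-hand side we have dropped terms proportional to $\varphi$, since the
derivatives of $f^{(0)}$ are themselves quantities of the first order of magnitude.

Let us calculate the derivatives on the left-hand side of (22.7), making use
of expression (20.8) for $f^{(0)}$. Since $N$ and $T$ are constant quantities, we have

$$ \frac{\partial f^{(0)}}{\partial t} + v_k \frac{\partial f^{(0)}}{\partial x_k} = \frac{m}{kT} (v_i - u_i)\, f^{(0)}\, \frac{\partial u_i}{\partial t} + \frac{m}{kT} (v_i - u_i)\, f^{(0)}\, v_k \frac{\partial u_i}{\partial x_k}. $$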


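For $\partial u_i / \partial t$ use can be made of the ideal-fluid approximation, i.e. by virtue of
(22.3), with the pressure constant, one can write

$$ \frac{\partial u_i}{\partial t} = - u_k \frac{\partial u_i}{\partial x_k}. $$

Hence, finally,

$$ \frac{\partial f^{(0)}}{\partial t} + v_k \frac{\partial f^{(0)}}{\partial x_k} = \frac{m}{kT} (v_i - u_i)(v_k - u_k)\, f^{(0)}\, \frac{\partial u_i}{\partial x_k}. \tag{22.9} $$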


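For an incompressible fluid $\partial u_k / \partial x_k = 0$. In this case, instead of (22.9) one
can write

$$ \frac{\partial f^{(0)}}{\partial t} + v_k \frac{\partial f^{(0)}}{\partial x_k} = \frac{m}{2kT} \left( V_i V_k - \tfrac13 V^2 \delta_{ik} \right) \left( \frac{\partial u_i}{\partial x_k} + \frac{\partial u_k}{\partial x_i} \right) f^{(0)}, \tag{22.10} $$

where $V_i = v_i - u_i$ is the velocity relative to the mean flow. We denote by $U_{ik}$ the deformation rate tensor

$$ U_{ik} = \frac{\partial u_i}{\partial x_k} + \frac{\partial u_k}{\partial x_i}. $$

Boltzmann’s equation then assumes the form

$$ \frac{m}{2kT} \left( V_i V_k - \tfrac13 V^2 \delta_{ik} \right) U_{ik}\, f^{(0)} = N f^{(0)} \int f_1^{(0)}\, v_{\rm rel}\, \sigma\, [\varphi' + \varphi_1' - \varphi - \varphi_1]\, d\mathbf v_1\, d\Omega. \tag{22.11} $$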


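The function $\varphi(\mathbf v)$ must satisfy, in addition to eq. (22.11), subsidiary conditions
expressing the constancy of the number of particles and of the momentum
and energy of the gas as a whole, i.e.

$$ \int f \, d\mathbf v = \int f^{(0)} \, d\mathbf v = 1, $$
$$ \mathbf P = mN \int f\, \mathbf v \, d\mathbf v = mN \int f^{(0)}\, \mathbf v \, d\mathbf v = \text{const}, $$
$$ E = \tfrac12 mN \int f\, v^2 \, d\mathbf v = \tfrac12 mN \int f^{(0)}\, v^2 \, d\mathbf v = \text{const}. $$

Whence follow the equalities

$$ \int f^{(0)} \varphi \, d\mathbf v = 0, \tag{22.12} $$
$$ mN \int f^{(0)}\, \mathbf v\, \varphi \, d\mathbf v = 0, \tag{22.13} $$
$$ \tfrac12 mN \int f^{(0)}\, v^2 \varphi \, d\mathbf v = 0. \tag{22.14} $$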


It is easily seen that the fulfillment of these five equations is the necessary
condition for the existence of solutions of the non-homogeneous integral
equation (22.11). Indeed, if the homogeneous integral equation $I(\varphi) = 0$ has
the solution $\varphi = \chi(\mathbf v)$, then the non-homogeneous equation

$$ I(\varphi) = A(\mathbf v) \tag{22.15} $$

has a solution only if the orthogonality conditions

$$ \int \chi(\mathbf v)\, A(\mathbf v)\, d\mathbf v = 0 \tag{22.16} $$

are fulfilled. Solutions of the homogeneous integral equation are, for example,
the functions

$$ \chi_1 = e^{-\alpha v^2}, \qquad \chi_2 = \mathbf v\, e^{-\alpha v^2}, \qquad \chi_3 = v^2 e^{-\alpha v^2}. $$

It is clear that the relations (22.12)-(22.14) express the necessary orthogonality
conditions. From the structure of the integral equation (22.11) it is seen
that one has to try to find its solution in the form

$$ \varphi = \Psi_{ik} \left( \frac{\partial u_i}{\partial x_k} + \frac{\partial u_k}{\partial x_i} \right) = \Psi_{ik} U_{ik}. \tag{22.17} $$


In substituting (22.17) into (22.11) the quantity $U_{ik}$ on both sides of the
equation will cancel. This means that the solution (22.17) is valid for all
values of the deformation rate tensor, as is to be expected.

The tensor $\Psi_{ik}$ is symmetric. It can also be assumed that $\Psi_{ii} = 0$ always,
since for $i = k$, $U_{ii} = 0$ and $\varphi = 0$ always. Substituting (22.17) into (22.11), we
find

$$ \frac{m}{2kT} \left( V_i V_k - \tfrac13 V^2 \delta_{ik} \right) = N \int f_1^{(0)}\, v_{\rm rel}\, \sigma\, [\Psi_{ik}' + \Psi_{1,ik}' - \Psi_{ik} - \Psi_{1,ik}]\, d\mathbf v_1\, d\Omega. \tag{22.18} $$

The left-hand side of this equation does not depend on the mean velocity and,
consequently, the function $\Psi_{ik}$ depends only on the components of the relative
velocity $V_i$.

It is obvious that a transformation of the tensor $(V_i V_k - \tfrac13 V^2 \delta_{ik})$ (for example
under rotation in velocity space) should not violate eq. (22.18). Hence it follows
that for $\Psi_{ik}$ one must write

$$ \Psi_{ik} = \alpha(V) \left( V_i V_k - \tfrac13 V^2 \delta_{ik} \right), \tag{22.19} $$


satisfying the requirement mentioned. Here $\alpha(V)$ is a certain scalar function
of the scalar argument $V$. The explicit form of this function is obtained from
the solution of the integral equation (22.18). To actually obtain the solution,
it is necessary to know the dependence of the cross section on velocity and
angles. For real molecules, even monatomic ones, the form of this function is
unknown. A general idea of the character of the solutions can be obtained by
considering the simple, though unrealistic, case where $\sigma$ is a function only of
the scattering angle $\alpha$ but not of the velocity.
In this case (22.18) assumes the form

$$ \frac{m}{2kT} \left( V_i V_k - \tfrac13 V^2 \delta_{ik} \right) = \left( \frac{m}{2\pi kT} \right)^{3/2} N \int e^{-mV_1^2/2kT}\, \sigma v_{\rm rel}\, [\Psi_{ik}' + \Psi_{1,ik}' - \Psi_{1,ik} - \Psi_{ik}]\, d\Omega\, d\mathbf v_1. \tag{22.20} $$


It is easily seen that the integral term on the right possesses an important 
feature: if the function in the bracket is a polynomial, then the integral of 
this function is also a polynomial. 

For our simplified model we shall outline the general idea of calculations.
If $g = v_{\rm rel}\, \sigma(\alpha, v_{\rm rel})$ is a function of the scattering angle only, we can write


Oe (22) og iets Ure) _ (22)? o Fla). 


Ov m 





Here ø is the total cross section and F(a) = v,.0(@, U,¢))/ov. 
It is possible to present the solution of (22.20) in the form* 


ee Sil. 2 
Vik = 35'N ()' (ViV,.-4V76;x) , (22.21) 


where 


o =o fF sin? ada. 


Correspondingly, the distribution function in the first approximation takes 


* See G. Uhlenbeck and G. Ford, Lectures in statistical mechanics (American Mathe- 
matical Society, Providence, 1963). 





the form 


I on NE 
fate ie (Gea) yg? VV 4V8 Ui] . e222) 


By means of this distribution function one can find an explicit expression for 
the stress tensor, which is just our goal. Substituting the distribution function 
from (22.22) into (18.12) and calculating the integral, we find 


2m [(2kT\* (i e) 
1a e 22.2 
Oik = — Põik + 30" ( i ) (as wey (22.23) 





According to (8.12), $\sigma_{ik}$ is expressed in terms of the tensor $U_{ik}$ and the viscosity
$\eta$.

Comparing (8.12) and (22.23), we arrive at the expression for the viscosity
of an ideal gas (in our model $\sigma = \sigma(\alpha)$):


2m (2kT\% 
eS (== 22.2 
n Y ( ai ) . (22.24) 


In the so-called rigid-sphere model, in which it is assumed that the cross
section is equal to the geometrical area of the cross section of the sphere, an
analogous expression is obtained for the viscosity. Formula (22.24) is in
qualitative agreement with experimental data, although, of course, it cannot claim
quantitative accuracy. The agreement of formula (22.24) with experiment is
apparently associated with the weak dependence of the function
$\varphi$ on the law of intermolecular interaction. We note that the viscosity of the
gas turns out to be independent of the density.
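The independence of the viscosity from the density can be made plausible with the elementary mean-free-path estimate $\eta \approx \tfrac13 N m \bar v\, l$, with $l = 1/N\sigma$ and $\bar v = (8kT/\pi m)^{1/2}$. This estimate is a standard order-of-magnitude argument and is not formula (22.24); the numerical values of the mass and cross section below are illustrative assumptions.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant, J/K


def mean_speed(T, m):
    """Mean thermal speed of a Maxwell distribution."""
    return math.sqrt(8.0 * K_B * T / (math.pi * m))


def viscosity_estimate(N, T, m, sigma):
    """Elementary estimate eta ~ (1/3) N m v_mean l with l = 1 / (N sigma)."""
    free_path = 1.0 / (N * sigma)
    return N * m * mean_speed(T, m) * free_path / 3.0


m = 6.63e-26        # molecular mass, kg (argon-like, illustrative)
sigma = 4.0e-19     # collision cross section, m^2 (illustrative)

for N in (1e23, 1e25, 1e27):
    eta = viscosity_estimate(N, 300.0, m, sigma)
    print(f"N = {N:.0e} m^-3  ->  eta ~ {eta:.3e} Pa s")
```

The factor $N$ in the momentum carried per unit volume cancels against the $1/N$ in the mean free path, so only the temperature dependence $\eta \propto T^{1/2}$ remains in this estimate.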

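In the general case, without specifying the form of the cross section, one
can write for $\eta$

$$ \eta = -mN \int (v_i - u_i)(v_k - u_k)\, f^{(0)}\, \Psi_{ik}\, d\mathbf v, $$

where $\Psi_{ik}$ is expressed by formula (22.19).
Thus we have calculated the viscosity, the first of the kinetic coefficients
to be calculated.
Making use of expression (22.23) for $\sigma_{ik}$, and substituting it into (18.14),
one can write

$$ \rho \left( \frac{\partial u_i}{\partial t} + u_k \frac{\partial u_i}{\partial x_k} \right) = - \frac{\partial p}{\partial x_i} + \eta \frac{\partial^2 u_i}{\partial x_k\, \partial x_k} + f_i, \tag{22.25} $$

or, in vector form,

$$ \rho \frac{d\mathbf u}{dt} = -\nabla p + \eta \nabla^2 \mathbf u + \mathbf f. \tag{22.25'} $$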


This represents the Navier—Stokes equation. We recall that it describes the 
motion of a viscous incompressible fluid and is applicable to the case of rela- 
tively rarefied gases as well as to the case of liquids. 

The previous consideration shows that the Navier-Stokes equation can be
obtained theoretically for the case of sufficiently rarefied gases, and also that
the viscosity $\eta$ can then be calculated, at least to within a numerical
coefficient. We shall not dwell on analogous calculations for a compressible gas*.

In a completely analogous way one can calculate the heat flow in a ther- 
mally non-uniform gas. 

Assuming that the temperature $T$ of the gas changes from point to point,
and that the gas performs no macroscopic motion, we shall again try to find
the solution of Boltzmann’s equation by the Chapman-Enskog method in
the form (22.6). Then in the local equilibrium distribution $f^{(0)}$ we set the
mean velocity $u_i$ equal to zero, so that

$$ f^{(0)} = N \left( \frac{m}{2\pi kT} \right)^{3/2} e^{-mv^2/2kT} = e^{(\mu - mv^2/2)/kT}. \tag{22.26} $$


Here the chemical potential $\mu$ and the temperature $T$ change from point to
point, i.e. depend on the coordinates $x_k$. In order that the Chapman-Enskog
method may be used, it is necessary, however, to consider this change to be
sufficiently slow. We shall formulate below the quantitative criterion for a
change to be considered slow. Introduction of the chemical potential $\mu$ into
the local equilibrium distribution is associated with the following simple
consideration. In a non-isothermal gas the number of particles per cm³, as well as
the temperature, varies from point to point. If, however, the gas is at rest,
then the pressure in it must be constant. Otherwise no mechanical equilibrium
would be possible, and a macroscopic motion would arise in the gas.

If the local equilibrium distribution is written in the form (22.26) and the
chemical potential $\mu$ is assumed to be a function of the pressure and temperature,
$\mu = \mu(T, p)$, then $\mu$ will depend on the coordinates only through the
temperature.


* See, for example, L.D.Landau and E.M.Lifshitz, Fluid mechanics (Pergamon Press, 
London, 1959). 


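The function $f$ must satisfy eq. (22.7). Let us calculate its left-hand side,
making use of (22.26). Obviously, we have

$$ v_k \frac{\partial f^{(0)}}{\partial x_k} = v_k \frac{\partial f^{(0)}}{\partial T} \frac{\partial T}{\partial x_k} = v_k f^{(0)} \frac{\partial T}{\partial x_k} \left[ \frac{1}{kT} \left( \frac{\partial \mu}{\partial T} \right)_p - \frac{\mu}{kT^2} + \frac{mv^2}{2kT^2} \right] $$
$$ = v_k f^{(0)} \frac{\partial T}{\partial x_k} \left[ -\frac{s}{kT} - \frac{h - sT}{kT^2} + \frac{mv^2}{2kT^2} \right] $$
$$ = - v_k f^{(0)} \frac{\partial T}{\partial x_k}\, \frac{h - \tfrac12 mv^2}{kT^2} = - \frac{v_k}{T} \frac{\partial T}{\partial x_k} \left( \frac52 - \frac{mv^2}{2kT} \right) f^{(0)}, \tag{22.27} $$

where $h$ is the heat content per molecule. Hence eq. (22.7) assumes the form
(analogous to (22.11))

$$ - \frac{v_k}{T} \frac{\partial T}{\partial x_k} \left( \frac52 - \frac{mv^2}{2kT} \right) f^{(0)} = \int f^{(0)} f_1^{(0)}\, \sigma v_{\rm rel}\, [\varphi' + \varphi_1' - \varphi - \varphi_1]\, d\mathbf v_1\, d\Omega. \tag{22.28} $$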
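The temperature derivative entering (22.27) can be checked numerically. The sketch below assumes the local Maxwell distribution (22.26) at constant pressure, so that $N = p/kT$, and compares a central finite difference of $\ln f^{(0)}$ with respect to $T$ against the closed form $-(1/T)(5/2 - mv^2/2kT)$. The molecular mass, pressure and speed are illustrative assumptions.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant, J/K


def log_f0(T, v, m, p):
    """Logarithm of the local Maxwell distribution at constant pressure p."""
    N = p / (K_B * T)
    return (math.log(N)
            + 1.5 * math.log(m / (2.0 * math.pi * K_B * T))
            - m * v * v / (2.0 * K_B * T))


def dlogf0_dT_numeric(T, v, m, p, step=1e-3):
    """Central finite-difference derivative of ln f0 with respect to T."""
    return (log_f0(T + step, v, m, p) - log_f0(T - step, v, m, p)) / (2.0 * step)


def dlogf0_dT_formula(T, v, m):
    """Closed form following from (22.27): -(1/T) (5/2 - m v^2 / 2kT)."""
    return -(2.5 - m * v * v / (2.0 * K_B * T)) / T


m = 6.63e-26   # molecular mass, kg (argon-like, illustrative)
p = 1.0e5      # pressure, Pa (illustrative)
T = 300.0      # temperature, K
v = 500.0      # molecular speed, m/s

print("finite difference:", dlogf0_dT_numeric(T, v, m, p))
print("closed form:      ", dlogf0_dT_formula(T, v, m))
```

The factor $5/2$ arises from $N \propto 1/T$ at constant pressure together with the $T^{-3/2}$ normalization, which is the same statement as $h = \tfrac52 kT$ per molecule.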


The function $\varphi$, as before, must satisfy the conditions (22.12)-(22.14).
From the form of eq. (22.28) it is clear that one has to try to find its
solution in the form

$$ \varphi = \zeta_k \frac{\partial T}{\partial x_k}, \tag{22.29} $$

where the vector $\zeta_k$ depends on the velocity $\mathbf v$.

Substituting (22.29) into (22.28) we see that the solution (22.29) is valid
for all values of the temperature gradient $\partial T / \partial x_k$, which drops out of the
equation. We then obtain

$$ - \frac{1}{T}\, \mathbf v \left( \frac52 - \frac{mv^2}{2kT} \right) = \int f_1^{(0)}\, \sigma v_{\rm rel}\, [\boldsymbol\zeta' + \boldsymbol\zeta_1' - \boldsymbol\zeta - \boldsymbol\zeta_1]\, d\mathbf v_1\, d\Omega. \tag{22.30} $$

Eq. (22.30) represents the equation for the determination of the vector $\boldsymbol\zeta$.
The only vectorial quantity involved in (22.30) is the velocity vector $\mathbf v$. This
means that the direction of the vector $\mathbf v$ is the only specified direction in
space. Hence the vector $\boldsymbol\zeta$ must be oriented in this direction, i.e.

$$ \boldsymbol\zeta = \alpha(v)\, \mathbf v, \tag{22.31} $$


where $\alpha(v)$ is a scalar function of the scalar argument $v$. The function $\alpha(v)$
must satisfy, in addition to the integral equation, the orthogonality conditions
(22.12)-(22.14).

The actual form of the scalar function $\alpha$ depends on the form of the scattering
cross section. If, in particular, the calculation is carried out for a model
in which $\sigma$ depends only on angles, then on substituting into (22.30) it is
easily seen that the polynomial


=~ og (BE 5) n 
)=— AoT \ KT 2) \2kT 


satisfies eq. (22.30). 
Hence the solution of (22.28) is of the form 


2 + ar 

= po) h- -a 2 a) or 
F= i (Zr 2) ONT \2KT) Yk ax, | ° 
Knowing the distribution function, one can find the heat flux 


=J 2fdv =: =- (1 —— 
qk bm f vv fd 2 7 ( ) 3. . (22.32) 


Hence it follows that the thermal conductivity $\kappa$ is equal to


_ Smk 2T) $ 
KEA = $ (22.33) 


It is interesting to note that the relation

$$ \frac{\kappa}{\eta} = \frac{15}{4} \frac{k}{m} = \frac52\, c_v \tag{22.34} $$

does not depend on the unknown quantity $\sigma'$.

Formula (22.33), just as the formula for viscosity, is in qualitative agree- 
ment with experiment. In particular, it correctly gives the temperature de- 
pendence of the thermal conductivity. Formula (22.34) has quantitative sig- 
nificance and is in complete agreement with experiment. 


§23. Relaxation time 


Let us consider in somewhat more detail how a spatially uniform gas in the 
absence of a field of force gets into an equilibrium state. 

We confine ourselves to a gas in a state sufficiently close to an equilibrium 
state. Then, in accordance with what was stated in §21, we can set 


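$$ f = f^{(0)} (1 + \varphi), $$

where $f^{(0)}$ is the equilibrium distribution function.
Boltzmann’s equation assumes the form (21.32). In the absence of an
external field, for a uniform gas (21.32) is of the form

$$ \frac{\partial \varphi}{\partial t} = \int f_1^{(0)}\, \sigma v_{\rm rel}\, [\varphi' + \varphi_1' - \varphi - \varphi_1]\, d\mathbf v_1\, d\Omega. \tag{23.1} $$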


We can try to find the solution of this homogeneous integro-differential
equation in the form

$$ \varphi(\mathbf v, t) = \sum_i A_i\, \psi_i(\mathbf v)\, \exp(-t/\tau_i). \tag{23.2} $$

A linear integral equation is then obtained for $\psi_i(\mathbf v)$. To solve it, it is
necessary to define the function $\sigma(v; \alpha)$. Thus we see that the time dependence is
expressed by a series (23.2) and is defined by an infinite number of quantities
$\tau_i$. The convergence of this series and the existence of the set of discrete
quantities $\tau_i$ have so far not been sufficiently fully investigated. However, the
meaning of the series (23.2) and of the quantities $\tau_i$ is quite clear. Equilibrium
distributions are established in relaxation times $\tau_i$ which are different in
different velocity intervals.

This means that the Maxwell velocity distributions in different regions of 
velocity space will be established in different times. It is natural that different 
mean characteristics of the gas, for example the mean velocity or mean ener- 
gy, will have their own relaxation periods. 

If, however, in a certain rough approximation we disregard this fact and 
retain only one term of the series, writing 


$$ f = f^{(0)} \left( 1 + \varphi\, e^{-t/\tau} \right), \tag{23.3} $$
then for y we obtain 


MAS (23.4) 





To an order of magnitude, $\tau$ can be estimated as

$$ \tau \sim \frac{1}{N\, \overline{\sigma v}}. \tag{23.5} $$
The factor N arises from the definition of the mean. Furthermore, with the 


same degree of accuracy we can write 
P= FOTO. 
In this approximation, the kinetic equation assumes the form

$$ \frac{\partial f}{\partial t} = - \frac{f - f^{(0)}}{\tau}. \tag{23.6} $$

This last equation shows that the rate at which $f$ approaches equilibrium
is greater, the greater the departure from the equilibrium distribution. For
sufficiently small departures from equilibrium, the law of change, $\partial f / \partial t$, in
the form (23.6) gives the character of the relaxation process with an accuracy
sufficient for a number of practical applications. This is apparently associated
with the fact that the equilibrium function $f^{(0)}$ together with the corresponding weight
factor has a rather sharp maximum. Hence the major contribution to the
relaxation process is given by particles with a velocity close to the mean
velocity.
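The relaxation law (23.6) is easily explored numerically. The sketch below treats the value of the distribution function at one fixed velocity as a single number, integrates $\partial f/\partial t = -(f - f^{(0)})/\tau$ by the explicit Euler method and compares the result with the exponential solution $f = f^{(0)} + (f_{\rm init} - f^{(0)})\, e^{-t/\tau}$. The initial value, the equilibrium value and $\tau$ are arbitrary illustrative numbers.

```python
import math


def relax_euler(f_init, f_eq, tau, t_end, steps):
    """Integrate df/dt = -(f - f_eq) / tau with the explicit Euler method."""
    f = f_init
    dt = t_end / steps
    for _ in range(steps):
        f += -(f - f_eq) / tau * dt
    return f


def relax_exact(f_init, f_eq, tau, t):
    """Exact solution: the departure from equilibrium decays as exp(-t / tau)."""
    return f_eq + (f_init - f_eq) * math.exp(-t / tau)


f_init, f_eq, tau = 2.0, 1.0, 0.5   # arbitrary units, illustrative values
for t in (0.25, 0.5, 1.0, 2.0):
    numeric = relax_euler(f_init, f_eq, tau, t, steps=100_000)
    exact = relax_exact(f_init, f_eq, tau, t)
    print(f"t = {t:4.2f}: Euler {numeric:.6f}, exact {exact:.6f}")
```

After a time $t = \tau$ the departure from equilibrium has fallen by a factor $e$, which is the sense in which $\tau$ is called the relaxation time.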

We shall see in what follows that this holds in particular for a gas of 
fermions. 

As well as the kinetic equation (23.6), use is often made of the relaxation
approximation for obtaining the distribution function in the general case of a
spatially non-uniform gas in a force field. Namely, the kinetic equation is
written in the form

$$ \frac{\partial f}{\partial t} + v_k \frac{\partial f}{\partial x_k} + \frac{F_k}{m} \frac{\partial f}{\partial v_k} = - \frac{f - f^{(0)}}{\tau}. \tag{23.7} $$

Here $\tau$ is considered a function of coordinates, since the gas density $\rho$ changes
from point to point.

From the aforesaid it is clear that in this approximation the kinetic equation
has only a qualitative meaning. Nevertheless, for small departures from
equilibrium, the kinetic equation can often be brought to the form (23.7)
under certain assumptions on the character of the function $\tau$. In particular,
this holds for spatially uniform gases in an external force field.


To illustrate the degree of accuracy of the relaxation approximation, we
shall calculate, in this approximation, the viscosity of an incompressible gas.
In the relaxation approximation we have

$$ \frac{\partial f}{\partial t} + v_k \frac{\partial f}{\partial x_k} = - \frac{f - f_0}{\tau}. \tag{23.8} $$

For small departures from an equilibrium state, one can substitute expression
(22.10) into the left-hand side of (23.8), so that

$$ \frac{m}{2kT} \left( V_i V_k - \tfrac13 V^2 \delta_{ik} \right) U_{ik}\, f_0 = - \frac{f - f_0}{\tau}. $$

Hence

$$ f = f_0 - \frac{m\tau}{2kT} \left( V_i V_k - \tfrac13 V^2 \delta_{ik} \right) U_{ik}\, f_0. \tag{23.9} $$


Comparing this expression with (22.22), we see that they turn out to be iden- 
tical if we set 


5 1 l 


where $l = 1/N\sigma'$ is the mean free path.

Thus if the scattering cross section is considered to be independent of the 
velocity, the relaxation approximation gives the same result as the exact solu- 
tion of the linearized Boltzmann’s equation. 

If the cross section depends on the velocity, then the difference in the
expression for the perturbed distribution function will amount to a numerical
coefficient. Correspondingly, the numerical coefficients in the expression for
the viscosity will be different. Since the form of the functional dependence
on velocity and angles of the cross section for molecular collisions is not well
known, there is no certainty as to the values of the numerical coefficients in
the formulae for the kinetic coefficients. Therefore the relaxation approximation,
which considerably simplifies the calculations, yields in essence an equally
accurate solution of the problem.




§24. The diffusion of an admixture of a gas of light particles into a gas of 
heavy particles 


A very important case, allowing substantial simplification of Boltzmann’s 
equation, is the diffusion of gas particles of low mass into a basic gas of heavy 
particles. 

We shall assume that the concentration of the admixture to the basic gas is
small, i.e. that the number of gas particles of low mass per unit volume $n \ll N$,
where $N$ is the number density of the heavy gas particles. The mass of the
light gas particles, $m$, will be considered small in comparison with the mass $M$
of the particles of the basic gas. Motion of the admixture can occur under the
action of a concentration difference (diffusion) or temperature difference
(thermal diffusion).

We shall consider the number of particles of the basic gas per unit volume 
to be constant throughout the gas. The number of particles of admixture per 
unit volume varies along a certain direction which we shall choose as the x- 
axis. We shall assume the state of the system to be stationary, and shall not 
consider the action of an external force field. We shall write the kinetic equa- 
tion for the distribution function of the light gas particles. 

The distribution function varies only in the direction of the x-axis, so that
it can be written in the form $f(p, \theta, x)$, where $p$ is the momentum of the
particle, and $\theta$ is the angle between the momentum vector and the x-axis. Since
the number of particles of the admixture is small, their collisions with one
another can be disregarded. Hence in the collision integral one need only
retain the term taking into account collisions between the particles of the
admixture and the particles of the basic gas. Collisions between light and heavy
particles can be considered completely elastic, and the velocity of the motion
of the light particles can be considered large in comparison with the velocity
of motion of the heavy molecules of the basic gas. We shall assume the latter
to be at rest and set

$$ v_{\rm rel} \approx v, $$

where $v$ is the velocity of motion of the particles of the admixture.
Correspondingly, $\sigma(v_{\rm rel}, \alpha) \approx \sigma(v, \alpha)$.
The collision integral takes the following form

$$ I = \int v_{\rm rel}\, \sigma(v_{\rm rel}, \alpha)\, (F' f' - F f)\, d\mathbf p_1\, d\Omega \approx vN \int \sigma(v, \alpha)\, (f' - f)\, d\Omega. \tag{24.1} $$


Here $F$ denotes the distribution function of the molecules of the basic gas,
which is equivalent to the functions $f_1$ and $f_1'$ in the notation of the preceding
section; $f'$ denotes the function $f(\mathbf p')$, and $N = \int F \, d\mathbf p_1$ is the total number of
molecules of basic gas per unit volume.

Since the particles of the admixture undergo only completely elastic
collisions, their momenta $\mathbf p'$ and $\mathbf p$ respectively before and after a collision have
one and the same absolute value; the process of collision is accompanied by a
change in the direction of motion, so that

$$ f(\mathbf p', x) = f(p, \theta', x) = f(\theta', x), $$
$$ f(\mathbf p, x) = f(p, \theta, x) = f(\theta, x). $$

For brevity we shall drop the argument $p$. Physically this means that we shall
seek the dependence of the distribution of the particles of the admixture on
the x-coordinate and on the direction of motion with respect to the x-axis,
for a given absolute value of the momentum. As a result, $I$ can be written in
the form

$$ I = vN \int \sigma(p/m, \alpha)\, [f(\theta', x) - f(\theta, x)]\, d\Omega. $$

The kinetic equation can correspondingly be written as

$$ \frac{p}{m} \cos\theta\, \frac{\partial f}{\partial x} = vN \int \sigma(p/m, \alpha)\, [f(\theta', x) - f(\theta, x)]\, d\Omega. \tag{24.2} $$


As distinct from eq. (14.11), the kinetic equation found for the distribution
function of the particles of the admixture is a linear integro-differential
equation. Its approximate solution can be found in the form of an expansion
in powers of a small parameter, in the given case the ratio $v_x/v$. This method
of solving the kinetic equation is called the Lorentz method. If the
concentration difference or temperature difference giving rise to a systematic motion
of the admixture along the x-axis is sufficiently small (see below), then the
ordered motion will be superposed on the random motion. On the average it
can be assumed that the velocity of the ordered motion, $v_x$, is small in
comparison with the velocity of the random motion, $v$. The departure of the
system from an equilibrium state, in which there is total isotropy of velocities,
will be small. Hence we shall seek the solution of eq. (24.2) in the form

$$ f(\theta, x) \approx f_0(x) + v_x f_1(x) + \ldots \approx f_0 + v \cos\theta\, f_1. \tag{24.3} $$











Here $f_0(x)$ is the equilibrium distribution function at point $x$ (the argument $p$
is omitted for brevity), i.e.


mi 2 A 
Sop, x) = fo = n(x) ——— eP I 2mkT ; 
0 Q (2nkT,)3 


where $n(x)$ is the number of particles of admixture per unit volume at the
point $x$, and $f_1(x)\, v \cos\theta$ is the small correction to the statistical distribution
which is sought.

The quantity $v_x/v$ or $\cos\theta$, characterizing the degree of anisotropy of the
distribution function, is proportional to the value of the concentration
difference or temperature difference giving rise to this anisotropy. Hence the
expansion (24.3) represents, in essence, an expansion in a series of powers of the
small anisotropy of the distribution function.

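Substituting expansion (24.3) into (24.2), we note first of all that the
equilibrium distribution function reduces the collision integral to zero. Thus
the quantity

$$ I = vN \int \sigma(v, \alpha)\, v f_1(x)\, [\cos\theta' - \cos\theta]\, d\Omega = v^2 N f_1(x) \int \sigma(v, \alpha)\, [\cos\theta' - \cos\theta]\, \sin\alpha \, d\alpha \, d\beta $$

will stand on the right-hand side of (24.2). On the left-hand side we have

$$ v \cos\theta\, \frac{\partial f}{\partial x} = v \cos\theta\, \frac{\partial f_0}{\partial x} + v^2 \cos^2\theta\, \frac{\partial f_1}{\partial x} \approx v \cos\theta\, \frac{\partial f_0}{\partial x}. $$

The term containing the derivative $\partial f_1 / \partial x$ is dropped, since it is small in
comparison with the term retained. Thus we obtain

$$ \cos\theta\, \frac{\partial f_0}{\partial x} = vN f_1(x) \int \sigma(v, \alpha)\, [\cos\theta' - \cos\theta]\, d\Omega. \tag{24.4} $$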


For what follows it is necessary to relate the angles $\theta'$ and $\theta$ of the
momenta of the particles with respect to the x-axis to the scattering angle $\alpha$. For
this we make use of the well-known formula of spherical trigonometry (eq.
(1.6) of Part I) writing

$$ \cos\theta' = \cos\theta \cos\alpha + \sin\theta \sin\alpha \cos\beta, $$

where $\beta = \psi_1 - \psi_2$, and $\psi_1$ and $\psi_2$ are the azimuthal angles of the vectors of
momenta $\mathbf p$ and $\mathbf p'$ (fig. VI.4).

Fig. VI.4.
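The spherical-trigonometry relation can be verified by constructing the vectors explicitly. In the sketch below the x-axis is the reference direction, the incident unit vector makes the angle $\theta$ with it, and the scattered unit vector makes the angle $\alpha$ with the incident one. The azimuth $\beta$ is measured about the incident direction from the plane containing it and the x-axis; this convention and the function name are assumptions of the illustration.

```python
import numpy as np


def scattered_cos_theta(theta, alpha, beta):
    """Cosine of the angle between the scattered direction and the x-axis."""
    p = np.array([np.cos(theta), np.sin(theta), 0.0])     # incident direction
    e1 = np.array([np.sin(theta), -np.cos(theta), 0.0])   # normal to p, in the (x, p) plane
    e2 = np.array([0.0, 0.0, 1.0])                        # normal to both p and e1
    p_prime = np.cos(alpha) * p + np.sin(alpha) * (np.cos(beta) * e1 + np.sin(beta) * e2)
    return p_prime[0]                                     # projection on the x-axis


theta, alpha = 0.7, 1.1
betas = np.linspace(0.0, 2.0 * np.pi, 360, endpoint=False)

direct = np.array([scattered_cos_theta(theta, alpha, b) for b in betas])
formula = np.cos(theta) * np.cos(alpha) + np.sin(theta) * np.sin(alpha) * np.cos(betas)
beta_average = direct.mean()

print("largest deviation from the formula:", np.max(np.abs(direct - formula)))
print("average over beta:", beta_average, " cos(theta) cos(alpha):", np.cos(theta) * np.cos(alpha))
```

Averaging over the azimuth removes the term in $\cos\beta$ and leaves $\cos\theta \cos\alpha$, which is the step used when the integration over $\beta$ is carried out.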
Then we can write 
foo, a) [cos 0’ — cos 0 ] sin æ da df = 
= foo, a) cos 0 (cos &-— 1) sin ada dB + 
+ J ow, a) sin 0 sin & cos $ sina da dB = 
=— 2n cos fol, a) (1 —cosa)sin ada . 


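In the last transformation it is taken into account that $\int\cos\beta\, d\beta = 0$. Then (24.4) assumes the form

$$ \cos\theta\,\frac{\partial f_0}{\partial x} = -\cos\theta\left[2\pi N\int \sigma(v,\alpha)\,(1 - \cos\alpha)\sin\alpha\, d\alpha\right] f_1 v . $$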


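Hence it follows that the function $f_1$ to be determined is equal to

$$ f_1 = -\frac{1}{N v \sigma_{tr}}\,\frac{\partial f_0}{\partial x} , \tag{24.5} $$

where

$$ \sigma_{tr} = 2\pi\int_0^{\pi} \sigma(v,\alpha)\,(1 - \cos\alpha)\sin\alpha\, d\alpha . $$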







The quantity $\sigma_{tr}$ is called the transport scattering cross section. In the particular case of a cross section independent of the scattering angle, $\sigma(v,\alpha) = \sigma(v)$, we have

$$ \sigma_{tr} = \sigma(v)\, 2\pi\int_0^{\pi} (1 - \cos\alpha)\sin\alpha\, d\alpha = 4\pi\sigma(v) . $$

If the colliding particles are considered rigid spheres of radii $r_1$ and $r_2$, then $\sigma_{tr} = \pi(r_1 + r_2)^2$.
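The two special cases above can be checked numerically. The following is a minimal sketch, not part of the original text: the function name `transport_cross_section` and the midpoint-rule quadrature are illustrative choices, and the angle-independent differential cross section $(r_1 + r_2)^2/4$ used for rigid spheres is the standard classical result, assumed here rather than taken from this section.

```python
import math

def transport_cross_section(sigma, n=20000):
    """Midpoint-rule estimate of 2*pi * integral_0^pi sigma(a) (1 - cos a) sin a da."""
    h = math.pi / n
    total = 0.0
    for i in range(n):
        a = (i + 0.5) * h
        total += sigma(a) * (1.0 - math.cos(a)) * math.sin(a)
    return 2.0 * math.pi * total * h

# Isotropic scattering with sigma = 1 (arbitrary units): the text gives 4*pi*sigma.
isotropic = transport_cross_section(lambda a: 1.0)

# Rigid spheres (assumed differential cross section (r1 + r2)^2 / 4):
# the text gives pi * (r1 + r2)^2.
r1, r2 = 1.0, 2.0
hard_sphere = transport_cross_section(lambda a: (r1 + r2) ** 2 / 4.0)
```

Any other angular dependence of the cross section can be passed as the `sigma` argument; forward-peaked cross sections give a transport cross section smaller than the total one because of the factor $(1 - \cos\alpha)$.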
The quantity 


$$ \lambda_{tr} = 1/N\sigma_{tr} \tag{24.6} $$


is called the transport mean free path for the motion of light gas particles in a 
heavy gas. Its obvious meaning will be elucidated below. Introducing into 
(24.5) the definition of the mean free path, we find 


$$ f_1(x) = -\frac{\lambda_{tr}}{v}\,\frac{\partial f_0}{\partial x} \tag{24.7} $$


and 


$$ f(p, x, \theta) = f_0(p, x) - \lambda_{tr}\cos\theta\,\frac{\partial f_0}{\partial x} . \tag{24.8} $$


Knowing the distribution function, one can find the mean flux of particles of the admixture in the direction defined (i.e. along the x-axis). By definition, the mean particle flux in the direction of the x-axis, $j_x$, is equal to

$$ j_x = \int v_x f\, dp . $$

Indeed, $v_x f$ gives the number of light particles of a given momentum passing through a cross section of 1 cm² per second in the direction of the x-axis. Integrating over all values of the momentum, we find the total number of particles passing through an area of 1 cm² per second in the specified direction. Substituting the value of $f$ from (24.8), we find

$$ j_x = \int v\cos\theta\, f_0(p, x)\, dp - \int v\lambda_{tr}\cos^2\theta\,\frac{\partial f_0}{\partial x}\, dp . \tag{24.9} $$




In view of the isotropy of the distribution function fo(p, x) the first integral 
reduces to zero, so that 


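$$ j_x = -\int v\lambda_{tr}\cos^2\theta\,\frac{\partial f_0}{\partial x}\, p^2\, dp\,\sin\theta\, d\theta\, d\psi . \tag{24.10} $$

Let us first consider the case where the gas temperature is constant, and along the x-axis the admixture concentration gradient is $dc/dx$, where $c = n/(N + n) \approx n/N$. Then the integration can be carried out directly, writing

$$ j_x = -\int_0^{\pi}\!\!\int_0^{2\pi}\cos^2\theta\sin\theta\, d\theta\, d\psi\;\frac{\partial}{\partial x}\int_0^{\infty} v\lambda_{tr} f_0\, p^2\, dp = -\frac{4\pi}{3}\,\frac{\partial}{\partial x}\int_0^{\infty}\frac{v}{N\sigma_{tr}}\, f_0\, p^2\, dp = -\frac{1}{3}\,\overline{\left(\frac{v}{\sigma_{tr}}\right)}\,\frac{dc}{dx} . \tag{24.11} $$

Here the bar denotes the mean value over the equilibrium distribution, i.e.

$$ \overline{\left(\frac{v}{\sigma_{tr}}\right)} = \frac{1}{n}\int\frac{v}{\sigma_{tr}}\, f_0\, dp = 4\pi\left(\frac{1}{2\pi m kT}\right)^{3/2}\int_0^{\infty}\frac{v}{\sigma_{tr}}\, e^{-p^2/2mkT}\, p^2\, dp . $$

The quantity

$$ D = \frac{1}{3}\,\overline{\left(\frac{v}{N\sigma_{tr}}\right)} = \frac{1}{3}\,\overline{(v\lambda_{tr})} \tag{24.12} $$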


is the diffusion coefficient. Finally, for the light gas particle flux in the heavy 
gas we obtain 


$$ j_x = -DN\,\frac{dc}{dx} . \tag{24.13} $$


If instead of the concentration c use is made of the number of particles n of 
light gas per unit volume, then 


$$ j_x = -D\,\frac{\partial n}{\partial x} . \tag{24.14} $$


From the definition of the diffusion coefficient $D$ it is clear that it is an essentially positive quantity. Hence the flux of particles of the light gas admixture is always oriented in the direction of decreasing light gas concentration. It is just this phenomenon which is called diffusion.

If a constant concentration difference along the x-axis is somehow always 
maintained, then in the gas mixture there will be a stationary motion of the 
light component (admixture) in the direction from the higher to the lower 
concentration. If, on the contrary, the concentration drop is not artificially 
maintained, then the concentration will become levelled and the composition 
of the mixture will become constant throughout the container. 
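The levelling of the concentration can be illustrated with a small numerical experiment. The sketch below is an assumption-laden illustration, not the author's method: it combines the flux law $j_x = -D\,\partial n/\partial x$ of (24.14) with particle conservation (the continuity equation, which is not written out in this section) and steps it forward with a simple explicit finite-difference scheme between two impermeable walls. The function name `diffuse` and all numerical values are hypothetical.

```python
def diffuse(n, D, dx, dt, steps):
    """Explicit update of dn/dt = -dj/dx with j = -D dn/dx and zero flux at the walls."""
    r = D * dt / dx ** 2
    assert r <= 0.5, "explicit scheme is unstable for D*dt/dx^2 > 1/2"
    n = list(n)
    for _ in range(steps):
        # Flux across the face between cell i and cell i + 1.
        flux = [-D * (n[i + 1] - n[i]) / dx for i in range(len(n) - 1)]
        new = n[:]
        for i, j in enumerate(flux):
            new[i] -= j * dt / dx       # particles leaving cell i ...
            new[i + 1] += j * dt / dx   # ... arrive in cell i + 1
        n = new
    return n

# Admixture initially confined to the left half of the container.
initial = [1.0] * 10 + [0.0] * 10
final = diffuse(initial, D=1.0, dx=1.0, dt=0.25, steps=4000)
```

Because every face flux is subtracted from one cell and added to its neighbour, the total number of particles is conserved, while the profile relaxes to the uniform value 0.5.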

We have restricted ourselves to the case of diffusion of an admixture of 
light gas particles in a gas of heavy particles. Expressions for the diffusion of a 
heavy gas particle admixture in a gas of light particles can be calculated in an 
analogous way. We shall not dwell on this here. 

Let us now consider in more detail the meaning of the mean free path $\lambda_{tr}$.

If a light particle is moving with mean velocity $v$, in one second it collides with all the particles (at rest) confined in a cylinder of height $v$ and base area $\sigma_{tr}$ described by the light particle. The appearance of $\sigma_{tr}$ instead of $\sigma$ is equivalent to taking into account the fact that not all the collisions are effective. Thus, for example, collisions with a scattering angle $\alpha \approx 0$ change the direction of flight only a small amount and are less likely to make the particle go out of the cylinder than collisions with $\alpha = \pi$. The total number of collisions is correspondingly equal to $Nv\sigma_{tr}$. The path traversed on the average between two consecutive collisions, i.e. the mean free path, is equal to





$$ \lambda_{tr} = \frac{v}{Nv\sigma_{tr}} = \frac{1}{N\sigma_{tr}} . \tag{24.15} $$


It is useful to note the dependence of $\lambda_{tr}$ on the pressure in the gas. Since $p = NkT$, we have

$$ \lambda_{tr} = kT/p\sigma_{tr} . \tag{24.16} $$

If, in particular, the colliding particles are considered to be rigid spheres and $\sigma_{tr}$ is expressed by the formula $\sigma_{tr} = \pi(r_1 + r_2)^2$, where $r_1$ and $r_2$ are the radii of the particles, then

$$ \lambda_{tr} = \frac{kT}{p}\,\frac{1}{\pi(r_1 + r_2)^2} . $$




Correspondingly, the coefficient of diffusion is

$$ D = \frac{1}{3}\,\bar v\,\lambda_{tr} = \frac{1}{3\pi(r_1 + r_2)^2\, p}\left(\frac{8(kT)^3}{\pi m}\right)^{1/2} . \tag{24.17} $$





Thus the coefficient of diffusion depends on the properties of the diffusing 
particles (their mass and size) as well as on the radius of the molecules of the 
basic gas. It increases with temperature, $\sim T^{3/2}$, and is inversely proportional to
the pressure. 
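The scaling with temperature and pressure can be made concrete with a short calculation. This is a minimal sketch under stated assumptions: it uses the rigid-sphere expressions $\lambda_{tr} = kT/p\pi(r_1 + r_2)^2$ and $D = \bar v\lambda_{tr}/3$ with the Maxwell mean speed $\bar v = (8kT/\pi m)^{1/2}$; the function names and the particle mass and radii are hypothetical values chosen only for illustration.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant, J/K

def mean_free_path(T, p, r1, r2):
    """Transport mean free path kT / (p * sigma_tr) for rigid spheres."""
    sigma_tr = math.pi * (r1 + r2) ** 2
    return K_B * T / (p * sigma_tr)

def diffusion_coefficient(T, p, m, r1, r2):
    """D = (1/3) * mean speed * mean free path (mean free path independent of speed)."""
    v_mean = math.sqrt(8.0 * K_B * T / (math.pi * m))
    return v_mean * mean_free_path(T, p, r1, r2) / 3.0

# Illustrative (hypothetical) values: light particle mass in kg, radii in m.
m, r1, r2 = 6.6e-27, 1.0e-10, 2.0e-10
D_ref = diffusion_coefficient(300.0, 1.0e5, m, r1, r2)
D_hot = diffusion_coefficient(600.0, 1.0e5, m, r1, r2)
D_dense = diffusion_coefficient(300.0, 2.0e5, m, r1, r2)
```

Doubling the temperature at fixed pressure multiplies `D_ref` by $2^{3/2}$, and doubling the pressure at fixed temperature halves it.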

Let us now find the conditions of applicability of the approximate solution found. The solution is applicable if expansion (24.3) converges sufficiently rapidly. For this it is necessary that in its turn the following condition be fulfilled:

$$ |v f_1| \ll f_0 , $$

or, if the value of $f_1$ is substituted from (24.5),

$$ \lambda_{tr}\,\frac{1}{f_0}\left|\frac{\partial f_0}{\partial x}\right| \ll 1 . $$


This last inequality means that the equilibrium function must change suffi- 
ciently little over a distance equal to the mean free path. 


§25. Thermal diffusion in gases 


Above, in considering the motion of the light gas particles, we assumed the 
gas temperature to be constant. We shall now give up this assumption and 
shall consider the more general case where there is a variation in concentra- 
tion of the diffusing gas together with a temperature variation along the x- 
axis. 

We shall again find the flux of the light gas particles along the x-axis, 
making use of formula (24.10). However we cannot now take out of the 
spatial differentiation sign the quantities depending on gas temperature, since 
the temperature itself varies from point to point. Hence we rewrite formula 
(24.10) in the form 


$$ j_x = -\int\frac{\cos^2\theta}{N}\,\frac{\partial}{\partial x}\left(\frac{v}{\sigma_{tr}}\, f_0\right) dp . \tag{25.1} $$




We have brought under the spatial differentiation sign the quantities $\sigma_{tr}$ and $v$
which are functions of the true velocity of the particles (but not of the mean 
velocity of the particles, which depends on the temperature). 

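Carrying out the integration over angles, analogous to formula (24.11), we find

$$ j_x = -\frac{1}{3N}\,\frac{\partial}{\partial x}\int\frac{v}{\sigma_{tr}}\, f_0\, dp = -\frac{1}{3N}\,\frac{\partial}{\partial x}\left[n\,\overline{\left(\frac{v}{\sigma_{tr}}\right)}\right] . \tag{25.2} $$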


In order to pass to the usual notation in which $j_x$ is expressed as a function of
the light gas particle concentration, we write 


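$$ N = p/kT , $$

where $p$ is the gas pressure, constant throughout the mixture. We then have

$$ j_x = -\frac{1}{3}\,\frac{kT}{p}\,\frac{\partial}{\partial x}\left[n\,\overline{\left(\frac{v}{\sigma_{tr}}\right)}\right] = -\frac{T}{3}\,\frac{\partial}{\partial x}\left[\frac{n}{NT}\,\overline{\left(\frac{v}{\sigma_{tr}}\right)}\right] = -\frac{T}{3}\,\frac{\partial}{\partial x}\left[\frac{c}{T}\,\overline{\left(\frac{v}{\sigma_{tr}}\right)}\right] = -\frac{1}{3}\,\overline{\left(\frac{v}{\sigma_{tr}}\right)}\,\frac{dc}{dx} - \frac{Tc}{3}\,\frac{\partial}{\partial T}\left[\frac{1}{T}\,\overline{\left(\frac{v}{\sigma_{tr}}\right)}\right]\frac{dT}{dx} . \tag{25.3} $$

Introducing the diffusion coefficient, we find

$$ j_x = -DN\,\frac{dc}{dx} - Tc\,\frac{\partial}{\partial T}\left[\frac{ND}{T}\right]\frac{dT}{dx} . \tag{25.4} $$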


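The formula for the particle flux is usually written in the form

$$ j_x = -DN\left(\frac{dc}{dx} + \frac{k_T}{T}\,\frac{dT}{dx}\right) . \tag{25.5} $$

Comparison of formulae (25.4) and (25.5) gives

$$ k_T = \frac{cT^2}{ND}\,\frac{\partial}{\partial T}\left[\frac{ND}{T}\right] . \tag{25.6} $$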


The quantity $k_T$ is called the coefficient of thermal diffusion (see below).




To elucidate the meaning of the result obtained, we shall set $dc/dx = 0$, i.e.
shall consider the case where the concentration of admixture is constant in 
the container of the gas. Then 


$$ j_x = -DN\,\frac{k_T}{T}\,\frac{dT}{dx} . \tag{25.7} $$


We see that in the presence of a temperature gradient in a gas mixture of con- 
stant composition there arises a motion of the particles of the admixture with 
respect to the basic gas. This phenomenon is called thermal diffusion or 
thermodiffusion. The value of the thermodiffusion flux is determined by the 
magnitude of the temperature gradient and by the value of the quantity $k_T$.

The presence of a thermodiffusion flow, i.e. of relative motion of the par- 
ticles of an admixture in a definite direction, will cause a change in the com- 
position of the mixture i.e. the appearance of a concentration gradient 
$dc/dx \neq 0$. This latter effect will in its turn lead to an appearance of a flux of
the particles of the admixture, which will reduce the accumulation of the ad- 
mixture due to thermal diffusion. As a result, a state will be established such 
that the particle flux due to thermal diffusion is completely compensated for 
by the flux due to ordinary diffusion. In this case the total flux of admixture 
particles with respect to the basic gas will be equal to zero, and formula 
(25.5) gives 


$$ j_x = -DN\left(\frac{dc}{dx} + \frac{k_T}{T}\,\frac{dT}{dx}\right) = 0 , $$

so that

$$ \frac{dc}{dx} = -\frac{k_T}{T}\,\frac{dT}{dx} . \tag{25.8} $$


In a non-isothermal gas mixture, a stationary concentration gradient de- 
fined by formula (25.8) is established. In the general case, irrespective of 
whether or not such a steady state is reached, a temperature gradient gives 
rise to a concentration gradient in the gas mixture. 

We have not, so far, pointed out in which direction the light gas particles will move: the direction of increasing or decreasing temperature. From (25.7) it is clear that the direction of flow is defined by the sign of the coefficient of thermal diffusion $k_T$, since $D$ is an essentially positive quantity. The sign of $k_T$ depends on that of the derivative $d/dT\,[\overline{(v/\sigma_{tr})}/T]$. It is impossible to indicate the sign of this derivative for the general case. It depends on the actual law of interaction between the molecules of the admixture and the basic gas.
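How the interaction law fixes the sign can be seen in a simple model. The sketch below is illustrative only and rests on two assumptions not made in the text: the thermal diffusion coefficient is taken in the form $k_T = cT^2\,[\overline{(v/\sigma_{tr})}]^{-1}\, d/dT\,[\overline{(v/\sigma_{tr})}/T]$, i.e. $k_T/c = d\ln\overline{(v/\sigma_{tr})}/d\ln T - 1$, and the transport cross section is given a hypothetical power-law form $\sigma_{tr} \propto v^{-s}$. The function names and the reduced units ($m = k = 1$) are choices made for this example.

```python
import math

def mean_power_of_speed(T, q, n=20000, x_max=12.0):
    """Maxwell average of v**q in units m = k = 1, where v = sqrt(2*T) * x."""
    h = x_max / n
    num = den = 0.0
    for i in range(n):
        x = (i + 0.5) * h
        w = x * x * math.exp(-x * x)   # Maxwell weight in the reduced speed x
        num += w * (math.sqrt(2.0 * T) * x) ** q
        den += w
    return num / den

def kT_over_c(s, T=1.0, eps=1e-4):
    """k_T / c = d ln(X) / d ln(T) - 1 with X = mean(v / sigma_tr), sigma_tr ~ v**(-s)."""
    x_hi = mean_power_of_speed(T * (1.0 + eps), 1.0 + s)
    x_lo = mean_power_of_speed(T * (1.0 - eps), 1.0 + s)
    slope = (math.log(x_hi) - math.log(x_lo)) / (math.log(1.0 + eps) - math.log(1.0 - eps))
    return slope - 1.0

hard_spheres = kT_over_c(0.0)   # cross section independent of speed
marginal = kT_over_c(1.0)       # sigma_tr ~ 1/v
soft = kT_over_c(2.0)           # sigma_tr ~ 1/v**2
```

In this model $k_T/c = (s - 1)/2$: negative for rigid spheres, so that by (25.7) the light particles drift towards higher temperature, zero for $\sigma_{tr} \propto 1/v$, and positive for cross sections that fall off faster with speed.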

The phenomenon of thermal diffusion is of great practical importance. It 
is used for the separation of gas mixtures, in particular of mixtures of isotopes. In a container with a gas mixture let one of the walls be maintained at temperature $T_1$, and another at a temperature $T_2 > T_1$. Then a thermodiffusion flow will arise in the container. As a rule, the light gas particles move
against the heat flow, i.e. towards the warmer wall. If a mixture of constant 
composition was initially in the container, then as a result of thermal diffu- 
sion, it will be enriched in one component, for example the light one, at the 
warmer wall. By extracting the particles of the light component here, one can 
maintain a stationary thermodiffusion flow and carry out a separation of the 
light and heavy components of the mixture. 


§26. The dispersion of sound 
One of the relatively new applications of the kinetic theory of gases is to 
find the law of sound dispersion in gases. Let us consider an equilibrium 
monatomic gas in which a plane sound wave is propagating. We choose the 
direction of propagation to be the x-axis. It is natural to try to find the per- 
turbation of all the quantities characterizing the state of the gas, in particular, 
the perturbation of the distribution function in the form 
$$ \varphi = F(v)\, e^{i(\omega t - \kappa x)} . \tag{26.1} $$


The equation for the perturbed distribution function is of the form of 
(21.18). Substituting (26.1) into (21.32) gives 


$$ i\omega F(v) - i v_x\kappa F(v) = I(F) . \tag{26.2} $$

We expand $F(v)$ in a series in terms of the eigenfunctions $\psi_j$ of eq. (21.33), whose eigenvalues are written here as $-\lambda_j$:

$$ F(v) = \sum_j a_j\psi_j . $$

We then have

$$ \sum_j a_j\,[\,i\omega\psi_j - i\kappa v_x\psi_j\,] = -\sum_j \lambda_j a_j\psi_j . \tag{26.3} $$

Multiplying (26.3) by $\psi_p$ and integrating, we arrive at the system of linear algebraic equations

$$ (i\omega + \lambda_p)\, a_p - i\kappa\sum_j \beta_{pj}\, a_j = 0 , \tag{26.4} $$

where

$$ \beta_{pj} = \frac{\int \psi_p\, v_x\, \psi_j\, dv}{\int \psi_p^2\, dv} . $$

For the system (26.4) to have non-zero solutions, it is necessary that its determinant reduce to zero. This gives

$$ \|(i\omega + \lambda_j)\,\delta_{jk} - i\kappa\beta_{jk}\| = 0 . \tag{26.5} $$

Determinant (26.5) is, in general, of infinitely high order. To define the law of dispersion $\omega(\kappa)$, it is necessary to evaluate this determinant and to find its roots.

If we confine ourselves to the first known eigenfunctions of the homogeneous equation (21.32), then we have

$$ \psi_1 = 1 , \qquad \psi_2 = v , \qquad \psi_3 = \tfrac{1}{2}mv^2 . $$


In this case the first three columns and three rows are to be retained in the 
determinant. 

A simple calculation leads to the formula obtained in the hydrodynamic 
approximation 


$$ \omega^2 = \frac{5kT}{3m}\,\kappa^2 . $$


The velocity of sound $c$ turns out to be constant:

$$ c = \frac{\omega}{\kappa} = \left(\frac{5kT}{3m}\right)^{1/2} . $$


Taking into account eigenfunctions and eigenvalues of higher order, one can
find the law of dispersion as a function of frequency. 
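For orientation, the hydrodynamic value $c = (5kT/3m)^{1/2}$ is easily evaluated for real monatomic gases. The snippet below is a simple illustration; the function name `sound_speed_monatomic` and the choice of helium and argon at 273.15 K are assumptions made for this example, with atomic masses taken from standard tables.

```python
import math

K_B = 1.380649e-23        # Boltzmann constant, J/K
AMU = 1.66053906660e-27   # atomic mass unit, kg

def sound_speed_monatomic(T, mass_amu):
    """Low-frequency sound speed sqrt(5 k T / 3 m) of a monatomic ideal gas, in m/s."""
    return math.sqrt(5.0 * K_B * T / (3.0 * mass_amu * AMU))

c_helium = sound_speed_monatomic(273.15, 4.002602)
c_argon = sound_speed_monatomic(273.15, 39.948)
```

The results, roughly 970 m/s for helium and 310 m/s for argon, are independent of frequency, which is exactly the absence of dispersion in this approximation.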







It turns out that the velocity of sound in this case increases with frequency, and that at the same time absorption of sound occurs. The absorption coefficient rapidly increases with decreasing wavelength. As the sound wavelength $1/\kappa$ approaches the mean free path, the absorption coefficient increases up to the value $1/\kappa$. This means that the propagation of sound as a periodic perturbation is stopped when $\kappa \sim 1/\lambda$.


§27. The linearized Boltzmann equation for quasi-gaseous systems 


The application of Boltzmann’s equation is not confined to the case of 
ideal gases. On the contrary, its most important applications are not asso- 
ciated at all with the consideration of ideal gases. There are a large number of 
important cases where the kinetic behaviour of a system is similar to that of a 
gas. 

In the most general form the properties of such systems, which we shall 
call quasi-gaseous systems, can be formulated as follows. Let a system of 
mutually non-interacting light particles (particles of the first kind) be placed 
in a medium formed by mutually interacting heavy particles (particles of the 
second kind). Between particles of the two kinds there is a certain interaction 
having the character of pair collisions. The system of particles of the first 
kind can be described by a certain one-particle distribution function f(r, v, t), 
since there is no interaction between these particles. Then, in the general case, 
one can write for the change of the distribution function 


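$$ \frac{\partial f}{\partial t} + \mathbf v\cdot\frac{\partial f}{\partial\mathbf r} + \frac{\mathbf F}{m}\cdot\frac{\partial f}{\partial\mathbf v} = \left(\frac{\partial f}{\partial t}\right)_{\rm coll} , \tag{27.1} $$

where $(\partial f/\partial t)_{\rm coll}$ describes the change in the distribution function due to the pair interaction, i.e. collisions between particles of the first and second kinds.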

Further, we assume that the collisions can be considered completely elastic, and that the state of particles of the second kind does not change in collisions. Then we can write

$$ \left(\frac{\partial f}{\partial t}\right)_{\rm coll} = N\int \sigma v\,[f_1 - f]\, d\Omega . \tag{27.2} $$

Here, as distinct from (14.10), $v$ represents the velocity of the light particle, and $N$ is the number of heavy particles per unit volume. As a result, we obtain for the distribution function of the light particles the equation

$$ \frac{\partial f}{\partial t} + \mathbf v\cdot\frac{\partial f}{\partial\mathbf r} + \frac{\mathbf F}{m}\cdot\frac{\partial f}{\partial\mathbf v} = N\int \sigma v\,[f_1 - f]\, d\Omega , \tag{27.3} $$


which represents a linear integro-differential Boltzmann’s equation. A partic- 
ular case of it is eq. (24.2) describing the diffusion of an admixture of light 
gas particles. 

In the chapters devoted to plasma theory and solid-state theory we shall 
discuss in detail modern concepts about the behaviour of electrons in plasmas 
and in solid bodies. Bearing in mind subsequent applications, we shall assume 
the external forces acting upon particles of the first kind to be Lorentz 
forces: 


$$ \mathbf F = e\left(\mathbf E + \frac{1}{c}\,[\mathbf v\times\mathbf H]\right) . $$


We shall consider the system to be in a non-equilibrium but stationary state, 
and the temperature and the chemical potential to vary in space, so that 
$T = T(\mathbf r)$, $\mu = \mu(\mathbf r)$.

Then eq. (27.3) takes the form 


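$$ \frac{\mathbf p}{m}\cdot\frac{\partial f}{\partial\mathbf r} + e\left(\mathbf E + \frac{1}{c}\,[\mathbf v\times\mathbf H]\right)\cdot\frac{\partial f}{\partial\mathbf p} = N\int\frac{p}{m}\,\sigma\,(f_1 - f)\, d\Omega . \tag{27.4} $$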


To find the solution of this stationary linear Boltzmann equation, we shall assume that the local equilibrium distribution $f^{(0)}(\varepsilon, T, \mu)$ is defined, where $\varepsilon$ is the energy of the particle, and $T$ and $\mu$ are local values of the temperature and chemical potential. The local equilibrium distribution $f^{(0)}$ can be a Maxwell distribution, a Fermi distribution, or a Bose distribution. Assuming the departure from an equilibrium state to be sufficiently small, one can try to find the solution of (27.4) in the form

$$ f = f^{(0)} + f'(\mathbf r, \mathbf p) , $$

where $|f'| \ll f^{(0)}$. Physically this means that all the fields acting on the particles as well as all the temperature gradients and concentration gradients are small. As distinct from $f^{(0)}$, the part $f'$ of the distribution function does not have spherical symmetry in momentum space.

Conservation of the total number of particles requires that the following condition be fulfilled:

$$ \int f'(\mathbf r, \mathbf p)\, d\mathbf p = 0 . \tag{27.5} $$

In the general case one can write

$$ f^{(0)} = f^{(0)}\big((\varepsilon - \mu)/kT\big) . \tag{27.6} $$


Substituting (37.5) into (27.4), we shall retain only quantities of higher order. 
Obviously, we have 


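$$ \frac{\mathbf p}{m}\cdot\frac{\partial f}{\partial\mathbf r} \approx \frac{\mathbf p}{m}\cdot\frac{\partial f^{(0)}}{\partial\mathbf r} . $$

We transform the derivative

$$ \frac{\partial f^{(0)}}{\partial\mathbf r} = \frac{\partial f^{(0)}\big((\varepsilon - \mu)/kT\big)}{\partial\mathbf r} = \frac{\partial f^{(0)}}{\partial\varepsilon}\left[(\varepsilon - \mu)\, kT\,\nabla\!\left(\frac{1}{kT}\right) - \nabla\mu\right] , $$

so that

$$ \frac{\mathbf p}{m}\cdot\frac{\partial f^{(0)}}{\partial\mathbf r} = \frac{\partial f^{(0)}}{\partial\varepsilon}\,\frac{\mathbf p}{m}\cdot\left[(\varepsilon - \mu)\, kT\,\nabla\!\left(\frac{1}{kT}\right) - \nabla\mu\right] . \tag{27.7} $$

Further, we have

$$ e\mathbf E\cdot\frac{\partial f^{(0)}}{\partial\mathbf p} = e\mathbf E\cdot\frac{\partial f^{(0)}}{\partial\varepsilon}\,\frac{\partial\varepsilon}{\partial\mathbf p} = \frac{e\,\mathbf p\cdot\mathbf E}{m}\,\frac{\partial f^{(0)}}{\partial\varepsilon} , \tag{27.8} $$

$$ \frac{e}{mc}\,[\mathbf p\times\mathbf H]\cdot\frac{\partial f}{\partial\mathbf p} = \frac{e}{mc}\,[\mathbf p\times\mathbf H]\cdot\frac{\partial f^{(0)}}{\partial\mathbf p} + \frac{e}{mc}\,[\mathbf p\times\mathbf H]\cdot\frac{\partial f'}{\partial\mathbf p} = \frac{e}{mc}\,[\mathbf p\times\mathbf H]\cdot\frac{\partial f'}{\partial\mathbf p} , \tag{27.9} $$

since the first term identically reduces to zero.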


Finally, for the collision integral we find

$$ I = N\int\frac{p}{m}\,\sigma\,[f_1' - f']\, d\Omega , $$

since the equilibrium distribution reduces $I$ to zero.
Substituting (27.7), (27.8) and (27.9) into (27.4), we find finally 


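$$ \frac{\partial f^{(0)}}{\partial\varepsilon}\,\frac{\mathbf p}{m}\cdot\left[(\varepsilon - \mu)\, kT\,\nabla\!\left(\frac{1}{kT}\right) - \nabla\mu + e\mathbf E\right] = -\frac{e}{mc}\,[\mathbf p\times\mathbf H]\cdot\frac{\partial f'}{\partial\mathbf p} + N\,\frac{p}{m}\int\sigma\,[f_1' - f']\, d\Omega . \tag{27.10} $$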


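Because of the presence of additional terms on the right-hand side, the solution of eq. (27.9) in a magnetic field has some singularities. We shall consider this in ch.3, and in the meanwhile we set $\mathbf H = 0$. Then eq. (27.10) goes over into a linear integral equation which is conveniently written in the reduced form

$$ \frac{1}{m}\,\frac{\partial f^{(0)}}{\partial\varepsilon}\,\mathbf A\cdot\mathbf p = N\,\frac{p}{m}\int\sigma\!\left(\left|\frac{\mathbf p}{m}\right|, \alpha\right)[f_1' - f']\, d\Omega , $$

where $\mathbf A = (\varepsilon - \mu)\, kT\,\nabla(1/kT) - \nabla\mu + e\mathbf E$. We shall try to find its solution as in §24 in the form

$$ f' = -a\,\frac{\partial f^{(0)}}{\partial\varepsilon}\,\frac{pA}{m}\cos\theta , $$

where $\theta = (\mathbf p, \mathbf A)$, and $a$ is a certain constant. We then have

$$ \frac{\partial f^{(0)}}{\partial\varepsilon}\,\frac{pA}{m}\cos\theta = -a\,\frac{\partial f^{(0)}}{\partial\varepsilon}\,\frac{pA}{m}\, N\,\frac{p}{m}\int\sigma\!\left(\left|\frac{\mathbf p}{m}\right|, \alpha\right)[\cos\theta' - \cos\theta]\, d\Omega , $$

where $\theta' = (\mathbf p_1, \mathbf A)$. The value of $\cos\theta'$ can be expressed in terms of $\cos\theta$ and $\cos\alpha$ by means of a formula of spherical trigonometry (Vol. 1, (1.7)). Using this expression, we find

$$ \int\sigma\!\left(\left|\frac{\mathbf p}{m}\right|, \alpha\right)(\cos\theta\cos\alpha - \cos\theta)\, d\Omega = -\cos\theta\,\sigma_{tr} . $$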


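Thus we have

$$ a = 1/Nv\sigma_{tr} = \tau . $$

Finally, we obtain for the distribution function

$$ f = f^{(0)} - \frac{\tau}{m}\,\frac{\partial f^{(0)}}{\partial\varepsilon}\,\mathbf p\cdot\left[(\varepsilon - \mu)\, kT\,\nabla\!\left(\frac{1}{kT}\right) - \nabla\mu + e\mathbf E\right] . \tag{27.11} $$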


We note, first of all, that the collision integral in the same approximation can 
be written in the form 


$$ I = -\frac{f - f^{(0)}}{\tau} . $$


This last expression shows at once that 7 represents none other than the relax- 
ation time (compare with (23.6)). 
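The relaxation form of the collision integral rests on the angular identity $\int\sigma(\alpha)\,[\cos\theta' - \cos\theta]\, d\Omega = -\cos\theta\,\sigma_{tr}$, which holds for any dependence of the cross section on the scattering angle. The following sketch checks it numerically; it is an illustration under stated assumptions, with hypothetical function names and an arbitrarily chosen anisotropic cross section $\sigma(\alpha) = 1 + 0.5\cos\alpha$ that does not come from the text.

```python
import math

def angular_integral(theta, sigma, n_alpha=400, n_beta=400):
    """Midpoint estimate of the integral of sigma(alpha) [cos(theta') - cos(theta)] dOmega."""
    ha, hb = math.pi / n_alpha, 2.0 * math.pi / n_beta
    total = 0.0
    for i in range(n_alpha):
        a = (i + 0.5) * ha
        for j in range(n_beta):
            b = (j + 0.5) * hb
            # Spherical-trigonometry relation between theta', theta, alpha and beta.
            cos_tp = (math.cos(theta) * math.cos(a)
                      + math.sin(theta) * math.sin(a) * math.cos(b))
            total += sigma(a) * (cos_tp - math.cos(theta)) * math.sin(a)
    return total * ha * hb

def transport_cross_section(sigma, n=4000):
    """2*pi * integral_0^pi sigma(a) (1 - cos a) sin a da by the midpoint rule."""
    h = math.pi / n
    return 2.0 * math.pi * h * sum(
        sigma((i + 0.5) * h) * (1.0 - math.cos((i + 0.5) * h)) * math.sin((i + 0.5) * h)
        for i in range(n))

sigma = lambda a: 1.0 + 0.5 * math.cos(a)   # hypothetical anisotropic cross section
theta = 0.7
lhs = angular_integral(theta, sigma)
rhs = -math.cos(theta) * transport_cross_section(sigma)
```

Since every correction proportional to $\cos\theta$ is reproduced by the collision operator up to the factor $-Nv\sigma_{tr}$, the collision integral reduces to $-f'/\tau$ with $\tau = 1/Nv\sigma_{tr}$.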

The violation of spherical symmetry of the distribution function in momentum space turns out to be proportional to the value of $\cos\theta$. As a matter
of fact, the representation of the distribution function in the form (27.11) 
expresses its expansion in a series in terms of Legendre polynomials, in which 
the first term of the expansion is retained. It is easily seen that the subsidiary 
condition (27.5) on the distribution function is automatically satisfied. 

Violation of spherical symmetry of the distribution function gives rise to a 
mean particle flux whose density is equal to 


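$$ \mathbf j = \int \mathbf v f\, d\mathbf p = -\int \tau\,\frac{\partial f^{(0)}}{\partial\varepsilon}\left\{\mathbf v\cdot\left[(\varepsilon - \mu)\, kT\,\nabla\!\left(\frac{1}{kT}\right) - \nabla\mu + e\mathbf E\right]\right\}\mathbf v\, d\mathbf p . \tag{27.12} $$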


The flux j, according to (21.8), represents the first moment of the distribu- 
tion. 

It is useful to note that the flux j depends on the properties of the system 
of particles only through the derivative of the equilibrium distribution func- 
tion (27.6) with respect to energy. It is clear that if the energy distribution 
were uniform, the particle flux would reduce to zero. 

In ch. 5 we shall make use of the expressions obtained for $f$ and $\mathbf j$.


§28. The solution of Boltzmann’s equation for a quasi-gaseous system in an 
external force field 


In what follows we shall use the general solution of the linearized Boltz- 
mann equation in an external force field. 







Let us consider the equation 


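$$ \frac{\partial f}{\partial t} + \mathbf v\cdot\frac{\partial f}{\partial\mathbf r} + \frac{\mathbf F}{m}\cdot\frac{\partial f}{\partial\mathbf v} = I(f) . \tag{28.1} $$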
We assume the external force field to be weak and seek the solution of (28.1) 
in the form of a series of successive approximations. Confining ourselves to 
the zero and first order terms of the expansion, we can write 


$$ f = f^{(0)}(\mathbf v, t) + f'(\mathbf v, t) , \tag{28.2} $$

where $f^{(0)}$ corresponds to the absence of a field, and $f'$ is proportional to the field.

In order to avoid cumbersome formulae, we first restrict ourselves to a spatially uniform system, i.e. we set $\partial f/\partial\mathbf r = 0$.

Then we have 


$$ \frac{\partial f'}{\partial t} - I(f') = -\frac{\mathbf F}{m}\cdot\frac{\partial f^{(0)}}{\partial\mathbf v} , \tag{28.3} $$


where $I(f')$ is the value of the collision operator in which the function $f'$ is taken as the distribution function.
Besides eq. (28.3), we introduce Green’s equation 


$$ \frac{\partial W}{\partial t} - I(W) = \delta(\mathbf v - \mathbf v')\,\delta(t - t') , \tag{28.4} $$


where W is the Green’s function satisfying the conditions 


$$ W = 0 \quad (t < t') , \qquad \frac{\partial W}{\partial t} - I(W) = 0 \quad (t > t') . \tag{28.5} $$

The meaning of the Green's function is very simple: it represents the probability that a particle which at the instant of time $t'$ had a velocity $\mathbf v'$ will at the instant of time $t$ have a velocity $\mathbf v$.

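The solution of eq. (28.3) can be expressed in terms of Green's function by the usual formula

$$ f'(\mathbf v, t) = -\int d\mathbf v'\int_{-\infty}^{\infty} dt'\; W(\mathbf v, \mathbf v', t - t')\,\frac{\mathbf F}{m}\cdot\frac{\partial f^{(0)}(\mathbf v', t')}{\partial\mathbf v'} = -\int d\mathbf v'\int_{-\infty}^{t} dt'\; W(\mathbf v, \mathbf v', t - t')\,\frac{\mathbf F}{m}\cdot\frac{\partial f^{(0)}}{\partial\mathbf v'} . \tag{28.6} $$

Here we have made use of the first property of the Green's function (28.5). Formula (28.6) gives the general relation between the change of the distribution function and the transition probability $W$. Knowing $f(\mathbf v, t)$, one can find the particle flux by the formula

$$ \mathbf j = \int f\,\mathbf v\, d\mathbf v = -\int \mathbf v\, d\mathbf v\int d\mathbf v'\int_{-\infty}^{t} dt'\;\frac{\mathbf F}{m}\cdot\frac{\partial f^{(0)}}{\partial\mathbf v'}\; W(\mathbf v, \mathbf v', t - t') . \tag{28.7} $$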


Analogous expressions can be obtained for a spatially non-uniform but 
stationary system. Then instead of (28.3) and (28.4) we have (for simplicity, 
in the one-dimensional case): 


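$$ v\,\frac{\partial f'}{\partial x} - I(f') = -\frac{F}{m}\,\frac{\partial f^{(0)}}{\partial v} , \tag{28.8} $$

$$ v\,\frac{\partial W}{\partial x} - I(W) = \delta(x - x')\,\delta(v - v') , \tag{28.9} $$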


where W represents the probability that a particle which was at point x’ and 
had velocity v' will pass to point x and acquire velocity v in unit time. 
The solution of (28.8) has the form

$$ f'(v, x) = -\int W(v, v', x - x')\,\frac{F}{m}\,\frac{\partial f^{(0)}(v', x')}{\partial v'}\, dx'\, dv' . \tag{28.10} $$


In particular, if one sets $I(W) = -W/\tau$ then (28.9) assumes the form

$$ v\,\frac{\partial W}{\partial x} + \frac{W}{\tau} = \delta(x - x')\,\delta(v - v') . $$

Hence it follows that

$$ W = \begin{cases} \dfrac{1}{v}\, e^{-(x - x')/v\tau}\,\delta(v - v') , & x > x' , \\[1ex] 0 , & x < x' , \end{cases} $$

and (28.10) can be written in the form

$$ f'(v, x) = -\frac{1}{v}\int\frac{F}{m}\,\frac{\partial f^{(0)}}{\partial v}\, e^{-x''/v\tau}\, dx'' . \tag{28.11} $$

The integration is carried out over a segment of trajectory passing through point $x$ in the direction of the field, for which $x' < x$ (we have made the substitution $(x - x') \to x''$).

The meaning of the exponential factor becomes very simple if it is written in the form $e^{-x''/\lambda}$, where $\lambda = v\tau$ is the mean free path length.
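A quick consistency check of the exponential kernel is possible for a uniform field and a uniform equilibrium distribution. In that case the integral over the trajectory must reproduce the relaxation-time result $f' = -\tau\,(F/m)\,\partial f^{(0)}/\partial v$. The sketch below assumes that the Green's function carries the prefactor $1/v$ required by the jump condition of $v\,\partial W/\partial x + W/\tau = \delta(x - x')\,\delta(v - v')$; the function name and the numerical values are hypothetical.

```python
import math

def relaxation_factor(v, tau, n=200000, cutoff=40.0):
    """Midpoint estimate of (1/v) * integral_0^inf exp(-x / (v * tau)) dx."""
    lam = v * tau                 # mean free path
    h = cutoff * lam / n          # the integrand is negligible beyond cutoff * lam
    total = sum(math.exp(-(i + 0.5) * h / lam) for i in range(n))
    return total * h / v

v, tau = 3.0, 0.5
factor = relaxation_factor(v, tau)
```

The factor multiplying $-(F/m)\,\partial f^{(0)}/\partial v$ comes out equal to $\tau$, independently of $v$: only the last mean free path before the point $x$ contributes appreciably to the perturbation.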

Thus the linearized Boltzmann equation in an external field allows a solu- 
tion expressed in the most general form in terms of the transition probability 
W. In what follows we shall use this form of solution. 


§29. The kinetic equation for polyatomic gases 


So far we have confined ourselves to the case where gas molecules have 
only translational degrees of freedom. This is a good approximation for the 
treatment of monatomic gases. However, in the more important case of poly- 
atomic gases the applicability of this approximation is not a priori obvious. It 
turns out that it is possible to formulate, in a most general way, a kinetic 
equation for diatomic (linear) molecules or molecules of the type of a sym- 
metric top*. Such molecules have rotational degrees of freedom. Rotational 
levels are always strongly excited (see §44 of Part III), so that they can be 
considered classically. 

It can be assumed that oscillatory degrees of freedom are not excited at 
temperatures which are not too high. Thus the motion of the molecule is 
defined by three translational and two rotational degrees of freedom. The 
rotational state of the molecule can be characterized by two generalized coor- 
dinates (for example by two angles) and by two generalized momenta corres- 
ponding to these coordinates. 


* In this section we follow the studies of Y. Kagan et al., Soviet Physics JETP 14 
(1962) 1096, 604; 24 (1967) 1272. 







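However, it is more convenient to characterize rotational motion by four quantities: three momentum components $M_i$ ($i = 1, 2, 3$) and the angle $\psi$ characterizing the orientation of the molecule in a plane perpendicular to the vector $\mathbf M$. In these variables the kinetic equation assumes the form

$$ \frac{\partial f}{\partial t} + \mathbf v\cdot\frac{\partial f}{\partial\mathbf r} + \frac{\mathbf F}{m}\cdot\frac{\partial f}{\partial\mathbf v} + \dot{\mathbf M}\cdot\frac{\partial f}{\partial\mathbf M} + \dot\psi\,\frac{\partial f}{\partial\psi} = I . \tag{29.1} $$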




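The collision integral is of the form

$$ I = \int (f' f_1' w' - f f_1 w)\, d\mathbf v_1\, d\Omega\, d\mathbf M_1\, d\mathbf M\, d\psi = \int w\,(f' f_1' - f f_1)\, d\mathbf v_1\, d\Omega\, d\mathbf M_1\, d\mathbf M\, d\psi , \tag{29.2} $$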

where the transition probabilities $w'$ and $w$ represent the probabilities of direct and inverse transitions. By virtue of the principle of detailed balance $w' = w$, so that $w$ does not change when $\mathbf M$ is replaced by $(-\mathbf M)$. Here

$$ d\mathbf M = M\, dM\, d\Omega_M . $$

The solid angle elements $d\Omega$ and $d\Omega_M$ are defined by the orientation of the vectors $\mathbf v$ and $\mathbf M$.

The quantity $\dot\psi$ represents the velocity of rotation of a molecule in a plane perpendicular to the vector $\mathbf M$. In order of magnitude $1/\dot\psi \sim 10^{-13}$ sec, i.e. is of the order of the collision time. The time $1/\dot\psi$ is very short in comparison with the time lapse between two consecutive collisions, $\tau$. In order of magnitude we have


of Af. Af 
Ae et (29.3) 


Therefore it can be assumed that the term $\dot\psi\,\partial f/\partial\psi$ is the largest term of eq. (29.1). In the first approximation, (29.1) can be written as

$$ \dot\psi\,\frac{\partial f}{\partial\psi} = 0 . \tag{29.4} $$


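This means that the distribution function can be considered to be independent of the angle $\psi$, i.e. one can set

$$ \frac{\partial f}{\partial\psi} = 0 . \tag{29.5} $$

Then (29.1) takes the form

$$ \frac{\partial f}{\partial t} + \mathbf v\cdot\frac{\partial f}{\partial\mathbf r} + \frac{\mathbf F}{m}\cdot\frac{\partial f}{\partial\mathbf v} + \dot{\mathbf M}\cdot\frac{\partial f}{\partial\mathbf M} = I . \tag{29.6} $$

The local equilibrium distribution function $f^{(0)}(\mathbf v, \mathbf M)$ is of the form

$$ f^{(0)} = \left(\frac{m}{2\pi kT}\right)^{3/2}\frac{1}{4\pi JkT}\; e^{-mV^2/2kT}\, e^{-M^2/2JkT} , \tag{29.7} $$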

where, as before, V=v—u is the velocity of translational motion of the 
molecule with respect to the gas as a whole, and J is the moment of inertia 
corresponding to M. 
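The rotational factor of the equilibrium distribution can be checked for normalization. The sketch below is illustrative and assumes the form $e^{-M^2/2JkT}/4\pi JkT$ for the rotational part together with the measure $d\mathbf M = M\, dM\, d\Omega_M$; the function name `rotational_moments` and the values of $J$ and $kT$ (arbitrary units) are hypothetical.

```python
import math

def rotational_moments(J, kT, n=100000, cutoff=12.0):
    """Return (norm, mean rotational energy) of exp(-M^2/2JkT)/(4 pi J kT) with M dM dOmega."""
    m_max = cutoff * math.sqrt(J * kT)
    h = m_max / n
    norm = energy = 0.0
    for i in range(n):
        M = (i + 0.5) * h
        w = math.exp(-M * M / (2.0 * J * kT)) / (4.0 * math.pi * J * kT)
        w *= 4.0 * math.pi * M * h     # all directions of M times the shell M dM
        norm += w
        energy += w * M * M / (2.0 * J)
    return norm, energy

J, kT = 2.0, 1.5
norm, mean_energy = rotational_moments(J, kT)
```

The distribution integrates to unity and the mean rotational energy equals $kT$, as expected for the two rotational degrees of freedom of a linear molecule.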

Let us consider the thermal conductivity of a gas. Calculations are carried 
out according to the same scheme as for a monatomic gas. Setting u = 0, so 
that 





$$ f^{(0)} = \left(\frac{m}{2\pi kT}\right)^{3/2}\frac{1}{4\pi JkT}\; e^{-mv^2/2kT}\, e^{-M^2/2JkT} , \tag{29.8} $$


we write the distribution function in the presence of a temperature gradient 
in the form 


$$ f = f^{(0)}(1 + \varphi) . \tag{29.9} $$


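Carrying out calculations analogous to those performed in §22, we have instead of (22.27)

$$ \frac{\partial f^{(0)}}{\partial x_i} = \frac{\partial T}{\partial x_i}\,\frac{1}{T}\left(\frac{mv^2}{2kT} + \frac{M^2}{2JkT} - \frac{7}{2}\right) f^{(0)} . $$

Correspondingly, instead of (22.30) we obtain

$$ \frac{\mathbf v}{T}\left(\frac{7}{2} - \frac{mv^2}{2kT} - \frac{M^2}{2JkT}\right) = \int f_1^{(0)}\, w\,[\boldsymbol\zeta' + \boldsymbol\zeta_1' - \boldsymbol\zeta_1 - \boldsymbol\zeta]\, d\mathbf v_1\, d\Omega\, d\mathbf M_1\, d\mathbf M . \tag{29.10} $$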


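As distinct from eq. (22.30) containing only one vector $\mathbf v$ specifying a definite direction, for eq. (29.10) one can construct three vectors: $\mathbf v$, $\mathbf M(\mathbf M\cdot\mathbf v)$ and $[\mathbf M\times\mathbf v]$ specifying three different directions in space (the quantity $\mathbf M$ is a pseudo-vector). Hence instead of (22.31) one can write for the vector $\boldsymbol\zeta$ in the most general form

$$ \boldsymbol\zeta = \mathbf v\,\alpha + \mathbf M(\mathbf M\cdot\mathbf v)\,\beta + [\mathbf M\times\mathbf v]\,\gamma , \tag{29.11} $$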


where $\alpha$, $\beta$ and $\gamma$ are scalar functions.

However, it is easily seen that this last term must be equal to zero: $\gamma = 0$. Indeed, the left-hand side of eq. (29.10) is invariant under the replacement of $\mathbf M$ by $-\mathbf M$. On the other hand, the vector $\mathbf M\times\mathbf v$ changes sign under this replacement, and the integral term also contains the quantities $f^{(0)}$ and $w$ which are invariant under this replacement. Hence if in (29.10) we retained the last term of the expansion, the left-hand and right-hand sides of the equation would transform under the replacement of $\mathbf M$ by $-\mathbf M$ according to different laws.

The scalar functions $\alpha$ and $\beta$ depend on all the scalar arguments which can be constructed from the quantities involved in (29.10). Three such scalar quantities can be constructed: $v^2$, $M^2$, $(\mathbf M\cdot\mathbf v)^2$. Thus, finally,

$$ \boldsymbol\zeta = \mathbf v\,\alpha\big(v^2, M^2, (\mathbf M\cdot\mathbf v)^2\big) + \mathbf M(\mathbf M\cdot\mathbf v)\,\beta\big(v^2, M^2, (\mathbf M\cdot\mathbf v)^2\big) . \tag{29.12} $$


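The scalar functions $\alpha$ and $\beta$ must satisfy the integral equation (29.10) and subsidiary conditions (22.12)–(22.14). The heat flow is defined by the formula

$$ \mathbf q = N\int\left(\frac{mv^2}{2} + \frac{M^2}{2J}\right)\mathbf v\, f^{(0)}\varphi\; d\mathbf v\, d\mathbf M = -\kappa\,\frac{\partial T}{\partial\mathbf r} . \tag{29.13} $$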


To find concrete expressions for the functions $\alpha$ and $\beta$ and, hence, to calculate the coefficient of thermal conductivity, it is necessary to make assumptions about the function $w$.

Such calculations were carried out in the above quoted studies of Kagan 
et al. in the approximation where collisions can be considered to be elastic 
and molecules to be rigid spherocylinders (cylinders bounded at the top and 
bottom by hemispheres). They are too cumbersome to be given here. As a 
result, for the thermal conductivity one obtains the value 


k = 1.6mk (Z) (2 (4) ae (<) +0.3}] Be (29.14) 




where / is the height of the cylinder, and a is the radius of the hemispheres. 
Other kinetic coefficients are calculated in an analogous way. The above 
reasoning is of rather methodological interest, since it shows how one may 
seek the solutions of integral equation (29.10) for polyatomic molecules 
possessing rotational degrees of freedom. 

A more interesting qualitative effect which manifests itself in diatomic 
molecules is the dependence of the kinetic coefficients on external fields. 

The thermal conductivity and viscosity of paramagnetic gases turn out to 
depend on the external magnetic field (the Senftleben effect). Analogous
phenomena are observed in gases of polar molecules placed in a static electric 
field. The two phenomena are of the same physical nature: in the presence of 
magnetic or dipole moments the molecules are oriented by the field. In the 
kinetic equation one has to retain the term corresponding to the dependence
of the distribution function on the moment. As a result of this an additional 
heat flow or momentum flow arises. 

Let us consider, for example, the case of the thermal conductivity of a 
paramagnetic gas in a magnetic field. The kinetic equation is of the form 





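$$ f^{(0)}\,\frac{\mathbf v\cdot\nabla T}{T}\left(\frac{7}{2} - \frac{mv^2}{2kT} - \frac{M^2}{2JkT}\right) + \dot{\mathbf M}\cdot\frac{\partial\varphi}{\partial\mathbf M} = I(\varphi) . \tag{29.15} $$

Here $\dot{\mathbf M}$ can be written in quasi-classical approximation as

$$ \dot{\mathbf M} = [\boldsymbol\mu\times\mathbf H] = \frac{g\mu_0}{\hbar}\,[\mathbf M\times\mathbf H] = \gamma\,[\mathbf M\times\mathbf H] , $$

where $\boldsymbol\mu$ is the magnetic moment, $\mu_0$ is the Bohr magneton, and $g$ is the gyromagnetic ratio. Hence we have

$$ f^{(0)}\,\frac{\mathbf v\cdot\nabla T}{T}\left(\frac{7}{2} - \frac{mv^2}{2kT} - \frac{M^2}{2JkT}\right) + \gamma\,[\mathbf M\times\mathbf H]\cdot\frac{\partial\varphi}{\partial\mathbf M} = I(\varphi) . \tag{29.16} $$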


One can again try to find the solution of integral equation (29.16) in the form

$$ \varphi = \zeta_k\,\frac{\partial T}{\partial x_k} . $$

For the vector $\zeta_k$ we obtain

$$ f^{(0)}\,\frac{v_k}{T}\left(\frac{7}{2} - \frac{mv^2}{2kT} - \frac{M^2}{2JkT}\right) + \gamma\,[\mathbf M\times\mathbf H]\cdot\frac{\partial\zeta_k}{\partial\mathbf M} = I(\zeta_k) . \tag{29.17} $$

In the presence of a magnetic field, $\zeta_k$ is defined by the vectors $\mathbf v$, $\mathbf M$ and $\mathbf H$. Calculations analogous to those given above, but even more cumbersome, lead to an expression for $\zeta_k$. The expression for the thermal conductivity is found from formula (29.13). In complete agreement with experiment it turns out that the change in the coefficient of thermal conductivity is

$$ \Delta\kappa = \kappa - (\kappa)_{H=0} = F(H/p) , \tag{29.18} $$


where F is a certain function of the argument H/p, and p is the pressure of 
the gas. The temperature dependence of this quantity is determined by the 
specific form of the cross section for collisions of the molecules. 

Finally, it turns out that the thermal conductivity in a magnetic field is
anisotropic. The ratio of thermal conductivities for $\nabla T \parallel \mathbf H$ and $\nabla T \perp \mathbf H$ in
strong magnetic fields is


Ak 
MA for H > œ 
Ak, i 


and ceases to depend on any parameters. 
In the case of gases of non-linear molecules possessing oscillatory as well as
rotational degrees of freedom, the formulation of the kinetic equation turns
out to be difficult. This is associated with the possibility of transitions be- 
tween different vibrational and rotational states which arise in collisions. The 
Boltzmann equation can by no means be formulated and solved without 
making certain assumptions about the character of energy transitions from 
translational to vibrational degrees of freedom as well as on the character (in 
particular on the multiplicity of the expression) of the latter. 

In the case of non-degenerate states it is possible to write Boltzmann’s 
equation for internal degrees of freedom. However, its solution can actually 
be found only in limiting cases where the energy transition mentioned above 
proceeds without difficulty or, on the contrary, is rather hindered. 

A hindered energy transition between translational and internal degrees of
freedom gives rise to a specific relaxation process. This in its turn essentially
affects the kinetic properties of the gas, in particular the values of transport
coefficients.




In its very meaning, Boltzmann’s equation applies to rarefied systems with 
pair interaction between the particles. Hence Boltzmann’s kinetic equation 
allows one to consider the behaviour of only a rather limited range of systems. 

Nevertheless, Boltzmann’s kinetic equation is of great importance for 
modern physical kinetics. It makes possible the deduction of a number of 
general theoretical conclusions on the character of irreversible processes, al- 
lows the formulation of general transport equations, and introduces the most 
important characteristics of the behaviour of a system in an irreversible pro- 
cess, the kinetic coefficients. 

For those systems to which Boltzmann’s kinetic equation is applicable, 
one can obtain an exact expression for the transport equation, relaxation 
times and transport coefficients and, under certain model assumptions, their 
numerical values. 

Of no less importance is the fact that in a number of physical systems and, 
above all, in plasmas and solid bodies it is possible to describe the behaviour 
of the system in the form of the motion of a system of quasiparticles whose 
properties are close to those of an ideal gas. For this reason, the quantum 
generalization of Boltzmann’s kinetic equation plays a very important role in 
solid-state theory. 

Some other examples of the application of Boltzmann’s equation to the 
solution of kinetic problems will be given in subsequent sections. 


§30. The moderation of fast neutrons 


One of the most fully investigated problems of kinetics is that of the mo- 
tion of neutral particles or radiation in matter. 

Let particles originate in a region which we shall call the source, and then 
move in matter, undergoing scattering or absorption. 

In the kinetic theory of gases such a statement of the problem appears to 
be somewhat artificial. However, it is encountered very often. As an example 
we point out, first of all, the problem of the spatial distribution of radiation. 
If certain radiation sources emit a particular frequency spectrum S(w, r), 
then in passing through matter this spectrum will change. Absorption and 
scattering will change the intensity as well as the angular and frequency dis- 
tributions of the radiation. 

Another case of the interaction of neutral particles with matter, which has 
been investigated in great detail, is the passage of neutrons through matter. 
As is well known, neutrons are usually obtained as fast particles of energies of 
the order of several hundred thousand or several million eV. On passing into 




matter and undergoing collisions with nuclei, the neutrons are moderated. In 
inelastic collisions their energy decreases substantially at each scattering. If, 
however, the energy of the neutrons is lower than the first excited level of 
the nucleus, then no inelastic collisions occur. Subsequent deceleration is as- 
sociated with elastic collisions between the neutrons and nuclei. 

Although the two processes considered are of completely different physi- 
cal nature, their formal treatment turns out to be rather similar. Namely, it 
turns out that the distribution of neutrons and photons in configuration 
space and their momentum distributions are described by identical equations. 
This is associated with the fact that in both processes one can disregard the 
interaction of the particles with each other, and that the interaction with 
nuclei (for neutrons) or with atoms (for photons) has the character of a 
short-range interaction. 

In what follows we shall for concreteness speak of the motion of a neu- 
tron flux in matter. 

Let a system of neutrons be characterized by the distribution function
$f(\mathbf r, \mathbf p, t)$, where

$$dn = f(\mathbf r, \mathbf p, t)\, d\mathbf p\, dV$$

is the number of neutrons of given momentum $\mathbf p$ in volume element $dV$ at the
instant of time $t$.

For the distribution function one can write the kinetic equation character- 
izing its variation in configuration space and in momentum space with time. 
We shall write it in the form of a balance of the number of particles per second
coming into and going out of a phase-volume element $d\Gamma$.

That is, we have

$$\frac{df}{dt}\, d\Gamma = \left[ \left(\frac{\partial f}{\partial t}\right)_{\text{coll}} + q(\mathbf r, t) \right] d\Gamma, \tag{30.1}$$

where $q(\mathbf r, t)$ is the strength of the neutron source, i.e. $q(\mathbf r, t)$ is the number
of neutrons arising per cm³ per second at the point $\mathbf r$.

As a result of collisions, neutrons go out of and come in to the given 
phase-space element. For neutron energies sufficiently large in comparison 
with thermal velocities, one can disregard the motion of the nuclei and con- 
sider them to be at rest. 

Each collision of a neutron of given momentum with a nucleus takes it out
of the volume $d\Gamma$. We write the number of these collisions as

$$\frac{f(\mathbf r, \mathbf p, t)}{\tau}\, d\Gamma, \tag{30.2}$$

where $\tau$ is the total free time before collision. Let us write the latter in the
form

$$\frac{1}{\tau} = \frac{v}{l_t} = \frac{v}{l_c} + \frac{v}{l_{sc}},$$

where $l_t$, $l_c$ and $l_{sc}$ are respectively the total mean free path, the mean free
path before capture and the mean free path before scattering.
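
As a small numerical illustration of the free time entering (30.2), the sketch below assumes the usual rule that capture and scattering rates add, so that $1/l_t = 1/l_c + 1/l_{sc}$ and $\tau = l_t/v$. The function names and the sample numbers are illustrative assumptions, not values taken from the text.

```python
def total_mean_free_path(l_capture: float, l_scatter: float) -> float:
    """Total mean free path, assuming capture and scattering rates add:
    1/l_t = 1/l_c + 1/l_sc."""
    return 1.0 / (1.0 / l_capture + 1.0 / l_scatter)


def free_time(l_total: float, speed: float) -> float:
    """Free time before any collision, tau = l_t / v."""
    return l_total / speed


# Hypothetical numbers (cm and cm/s), chosen only to show the orders of magnitude.
l_t = total_mean_free_path(l_capture=100.0, l_scatter=2.0)
tau = free_time(l_t, speed=2.2e5)
print("l_t =", l_t, "cm; tau =", tau, "s")
```

The total path is always shorter than either partial path, and for weak capture ($l_c \gg l_{sc}$) it is close to $l_{sc}$.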

Because of collisions, neutrons which had a different (larger in absolute
value) momentum and were elastically scattered by nuclei go into the element
$d\Gamma$:

$$d\Gamma\; n_0 \int v'\, \sigma(\mathbf p, \mathbf p_1)\, f(\mathbf r, \mathbf p_1, t)\, d\mathbf p_1, \tag{30.3}$$

where $\sigma(\mathbf p, \mathbf p_1)$ is the cross section for scattering from an initial momentum $\mathbf p$
to a final momentum $\mathbf p_1$, and $n_0$ is the number of scattering centres (nuclei)
per unit volume.

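Taking into account (30.2) and (30.3), Boltzmann’s equation (30.1) can
be written in the form

$$\frac{\partial f}{\partial t} + \mathbf v \cdot \frac{\partial f}{\partial \mathbf r} = -\frac{f}{\tau} + n_0 \int v'\, \sigma(\mathbf p, \mathbf p_1)\, f(\mathbf r, \mathbf p_1, t)\, d\mathbf p_1 + q. \tag{30.4}$$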


Eq. (30.4) is linear, since the collisions of neutrons with each other are dis- 
regarded. 

In neutron physics there are two basic problems: (1) to find the neutron 
energy distribution (if the energy of the neutrons coming from the sources is 
known), and (2) to find the spatial distribution of the energy. 

We shall begin by considering the first problem. Since we shall not be interested
in the spatial distribution of neutrons, we integrate eq. (30.4) over all
space and write

$$\int f(\mathbf r, \mathbf p, t)\, dV = N(\mathbf p, t), \qquad \int q(\mathbf r, t)\, dV = Q. \tag{30.5}$$







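Clearly, we then obtain

$$\frac{\partial N}{\partial t} + \frac{N}{\tau} = n_0 \int v'\, \sigma(\mathbf p, \mathbf p_1)\, N(\mathbf p_1)\, d\mathbf p_1 + Q. \tag{30.6}$$

The integral $\int \mathbf v \cdot (\partial f/\partial \mathbf r)\, dV$ over all space reduces to zero. In order not to
complicate the calculation we shall confine ourselves to the case of a source which
emits neutrons of energy $E_0$ steadily. In this case $Q$ does not depend on the
time, $Q = Q_0\, \delta(E - E_0)/4\pi$. Then (30.6) becomes the stationary equation

$$\frac{N}{\tau} = n_0 \int v'\, \sigma(\mathbf p, \mathbf p_1)\, N(\mathbf p_1)\, d\mathbf p_1 + \frac{Q_0\, \delta(E - E_0)}{4\pi}. \tag{30.7}$$

It is more convenient to write (30.7) in the form

$$\psi(\mathbf p) = \int \psi(\mathbf p_1)\, w(\mathbf p, \mathbf p_1)\, \frac{l_t(p_1)}{l_{sc}(p_1)}\, d\mathbf p_1 + \frac{Q_0\, \delta(E - E_0)}{4\pi}, \tag{30.8}$$

where the auxiliary function is

$$\psi(\mathbf p) = N(\mathbf p)/\tau, \tag{30.9}$$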


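and $w(\mathbf p, \mathbf p_1)$ is the transition probability normalized to unity,

$$\int w(\mathbf p, \mathbf p_1)\, d\mathbf p_1 = 1. \tag{30.10}$$

It is obvious that

$$w(\mathbf p, \mathbf p_1)/l_{sc} = n_0\, \sigma(\mathbf p, \mathbf p_1).$$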


It is now necessary to write the explicit value for the transition probability
$w(\mathbf p, \mathbf p_1)$. Since neutrons interact with nuclei only at very small distances,
their elastic scattering obeys the laws for elastic collisions and is isotropic.
Hence

$$w(\mathbf p, \mathbf p_1) = \gamma(p)\, \delta\!\left( \frac{p_1^2 - p^2}{2m} - \frac{(\mathbf p_1 - \mathbf p)^2}{2M} \right), \tag{30.11}$$

where the argument of the $\delta$-function expresses the energy conservation law
in a collision with a nucleus at rest, and $\gamma(p)$ is a certain function of the neutron
energy $p^2/2m$. $M$ denotes the mass of the nucleus. The value of $\gamma(p)$ can
be found from condition (30.10).
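Carrying out the integration, we obtain

$$\int w(\mathbf p, \mathbf p_1)\, d\mathbf p_1 = \gamma \int \delta\!\left( \frac{p_1^2}{2}\left(\frac{1}{m} - \frac{1}{M}\right) - \frac{p^2}{2}\left(\frac{1}{m} + \frac{1}{M}\right) + \frac{p p_1 \cos\theta}{M} \right) p_1^2\, dp_1 \sin\theta\, d\theta\, d\varphi = \gamma\, \frac{2\pi M}{p} \int_{p_{\min}}^{p_{\max}} p_1\, dp_1 = \gamma\, \frac{\pi M}{p}\, (p_{\max}^2 - p_{\min}^2).$$

Here $p_{\min}$ and $p_{\max}$ are the minimum and maximum momenta before collision,
which after collision turn into the momentum $p$.
According to the elastic collision laws (see (43.33) of Part I)

$$p_{\min} = \frac{M - m}{M + m}\, p, \qquad p_{\max} = p.$$

Therefore, finally,

$$\int w(\mathbf p, \mathbf p_1)\, d\mathbf p_1 = \gamma\, \frac{4\pi m M^2 p}{(M + m)^2} = 1.$$

Hence $\gamma = (M + m)^2 / 4\pi m M^2 p$ and the normalized scattering probability
assumes the form

$$w(\mathbf p, \mathbf p_1) = \frac{1}{4\pi} \left( \frac{M + m}{M} \right)^2 \frac{1}{mp}\, \delta\!\left( \frac{p_1^2 - p^2}{2m} - \frac{(\mathbf p_1 - \mathbf p)^2}{2M} \right). \tag{30.12}$$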


The problem of integrating eq. (30.8) for the value of $w(\mathbf p, \mathbf p_1)$ given by
formula (30.12) is very complex. We shall therefore confine ourselves to two
limiting cases.

Let us first consider the scattering of neutrons on hydrogen nuclei, i.e. the
case $m = M$. Then

$$w(\mathbf p, \mathbf p_1) = \frac{1}{\pi M p}\, \delta\!\left( \frac{p_1^2 - p^2 - (\mathbf p_1 - \mathbf p)^2}{2M} \right). \tag{30.13}$$







Substituting (30.13) into (30.8) we note, first of all, that in view of the iso- 
tropy of the scattering, the function w does not depend on angles and de- 
pends only on the absolute value of the momentum, so that (30.8) assumes 


the form 


p x 

1(pı) PdP} Q5(Eo) 
(py=2 fy 30.14 
wip) J YT) pe + Gn OHS) 





or, passing to a new variable, the energy, one can write (30.14) in the form 


E I(E) dE, Q5(Eo) 
v= f VED E + (30.15) 
0 





This equation can be solved in an elementary way. Differentiating gives 


E 
dy _ E Ba 1 WE) , 20 E o) +5'(E,)| 


dé LI,.(E) PEAT BE: 





The solution of this linear equation is 


_ Qo !(Eo) E I(E) dE; 
Thn ED |- J IE) F - (30.16) 


For the energy distribution function, it follows from (30.9) that 


20 'Eo) WE) {- ii I(E) ze) (30.17) 


“An lE) v h, E D) Ei 


If there is no absorption /= /, = l and (30.17) is simplified 
Q 11 
NE) = Tse (30.18) 
Eq. (30.8) is also substantially simplified in the other limiting case, neutron 


scattering on heavy nuclei. In this case the energy transfer in each collision is 
small. 






To simplify the calculations we shall confine ourselves to the case where
the absorption of neutrons can be disregarded in the process of their moderation
in a medium. Then the general case of non-monochromatic sources can
be considered. In the absence of absorption $l_t = l_{sc}$ and eq. (30.8) assumes the
form

$$\psi(\mathbf p) = \int \psi(\mathbf p_1)\, w(\mathbf p, \mathbf p_1)\, d\mathbf p_1 + Q(E)/4\pi. \tag{30.19}$$
Since we are interested only in the energy spectrum of the moderated neutrons,
(30.19) can be integrated over all angles. Then it turns out to be convenient
to pass from the absolute value of momentum to a new independent
variable*

$$u = \ln (E_0/E) = 2 \ln (p_0/p). \tag{30.20}$$

The grounds for the choice of such a variable will be seen in further calculations.

We define the new unknown function $\psi_0(u)$ by the relation

$$\psi_0(u) = \int \psi(\mathbf p)\, d\Omega. \tag{30.21}$$

We set $\mu = \cos(\mathbf p, \mathbf p_1)$ and

$$w\, d\mathbf p_1 = n(u, \mu)\, du\, d\Omega.$$

A simple transformation gives

$$n(u, \mu) = n(u, \mathbf p_1, \mathbf p) = \frac{(M + m)^2}{8\pi M m}\, e^{-u}\, \delta\!\left( \mu - \frac{M + m}{2m}\, e^{-u/2} + \frac{M - m}{2m}\, e^{u/2} \right).$$

Integrating (30.19) over all angles, we find

$$\psi_0(u) = \int_0^u du'\, \psi_0(u') \int n(u - u', \mu)\, d\Omega + Q(u). \tag{30.22}$$


* A.Akhiezer and I.Pomeranchuk, Nekotorye voprosy teorii yadra (Some problems 
of nuclear theory) (Gostekhizdat, Moscow, 1950). 










The internal integral with the $\delta$-function is easily calculated if it is taken into
account that the neutron energy after the elastic collision, $E'$, lies in the interval
(see §43 of Part I)

$$E \ge E' \ge \left( \frac{M - m}{M + m} \right)^2 E \equiv E_{\min}.$$

Hence introducing the quantity

$$u_M = \ln (E/E_{\min}) = 2 \ln [(M + m)/(M - m)]$$

and integrating over angles in the $\delta$-function, we find

$$n_0(u - u') = \int n(u - u', \mu)\, d\Omega = \begin{cases} \dfrac{(M + m)^2}{4mM}\, e^{-(u - u')} & \text{for } u - u' < u_M, \\[2mm] 0 & \text{for } u - u' > u_M. \end{cases} \tag{30.23}$$

Substituting (30.23) into (30.22), we have

$$\psi_0(u) = \int_0^u du'\, \psi_0(u')\, n_0(u - u') + Q(u). \tag{30.24}$$
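
A quick numerical check of the angle-integrated kernel may be helpful here. The sketch below assumes the form $n_0(u) = \frac{(M+m)^2}{4mM} e^{-u}$ on $0 < u < u_M$ with $u_M = 2\ln[(M+m)/(M-m)]$, and verifies by midpoint quadrature that it integrates to unity, as a transition probability must. The function names, the unit neutron mass and the choice $M = 12$ are illustrative assumptions.

```python
import math


def lethargy_width(M: float, m: float = 1.0) -> float:
    """u_M = 2 ln[(M + m)/(M - m)]; requires M > m."""
    return 2.0 * math.log((M + m) / (M - m))


def n0(u: float, M: float, m: float = 1.0) -> float:
    """Angle-integrated kernel: exponential on (0, u_M), zero outside."""
    if u < 0.0 or u > lethargy_width(M, m):
        return 0.0
    return (M + m) ** 2 / (4.0 * m * M) * math.exp(-u)


def kernel_norm(M: float, m: float = 1.0, steps: int = 20000) -> float:
    """Midpoint-rule estimate of the integral of n0 over (0, u_M)."""
    h = lethargy_width(M, m) / steps
    return sum(n0((i + 0.5) * h, M, m) for i in range(steps)) * h


print("u_M for M/m = 12:", lethargy_width(12.0))
print("integral of n0  :", kernel_norm(12.0))
```

The normalization follows analytically from $1 - e^{-u_M} = 4mM/(M+m)^2$; the quadrature only confirms it.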


Eq. (30.24) can be solved in closed form by means of a Laplace transformation.
Namely, we multiply (30.24) by $e^{-zu}$ and integrate over all $u$ in the
range $0 < u < \infty$. We then find

$$\int_0^\infty \psi_0(u)\, e^{-zu}\, du = \int_0^\infty e^{-zu}\, du \int_0^u du'\, \psi_0(u')\, n_0(u - u') + \int_0^\infty e^{-zu}\, Q(u)\, du. \tag{30.25}$$

Changing the order of integration in the double integral, we obtain

$$\int_0^\infty du'\, \psi_0(u')\, e^{-zu'} \int_{u'}^\infty e^{-z(u - u')}\, n_0(u - u')\, du = \int_0^\infty du'\, \psi_0(u')\, e^{-zu'} \int_0^\infty e^{-zu}\, n_0(u)\, du. \tag{30.25'}$$




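Denoting the Laplace transforms by

$$\tilde\psi(z) = \int_0^\infty \psi_0(u)\, e^{-zu}\, du, \tag{30.26}$$

$$\tilde n(z) = \int_0^\infty n_0(u)\, e^{-zu}\, du, \tag{30.27}$$

$$\tilde Q(z) = \int_0^\infty Q(u)\, e^{-zu}\, du \tag{30.28}$$

and taking into account (30.25'), we finally obtain in place of (30.25)

$$\tilde\psi(z) = \tilde\psi(z)\, \tilde n(z) + \tilde Q(z). \tag{30.29}$$

Whence for the transform $\tilde\psi$ we find

$$\tilde\psi(z) = \frac{\tilde Q(z)}{1 - \tilde n(z)}. \tag{30.30}$$

Inverting the Laplace transformation, we find that the neutron distribution is
determined in the form

$$\psi_0(u) = \frac{1}{2\pi i} \int_{\sigma - i\infty}^{\sigma + i\infty} \tilde\psi(z)\, e^{zu}\, dz = \frac{1}{2\pi i} \int_{\sigma - i\infty}^{\sigma + i\infty} \frac{\tilde Q(z)\, e^{zu}\, dz}{1 - \tilde n(z)}. \tag{30.31}$$

Let us find the transform $\tilde n(z)$. According to (30.23) and (30.27) we have

$$\tilde n(z) = \frac{(M + m)^2}{4mM} \int_0^{u_M} e^{-(1 + z)u}\, du = \frac{(M + m)^2}{4mM}\, \frac{1 - \exp[-u_M(1 + z)]}{1 + z} = \frac{1}{1 - \exp(-u_M)}\, \frac{1 - \exp[-u_M(1 + z)]}{1 + z}. \tag{30.32}$$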


Substituting (30.32) into (30.31), we arrive at a somewhat cumbersome formula
for the neutron energy distribution (more precisely, the distribution
over the logarithm of the energy):

$$\psi_0(u) = \frac{1}{2\pi i} \int_{\sigma - i\infty}^{\sigma + i\infty} \frac{\tilde Q(z)\, e^{zu}\, dz}{1 - [1 - \exp(-u_M)]^{-1}\, [1 - \exp(-u_M(1 + z))]/(1 + z)}. \tag{30.33}$$


The integration is carried out with respect to a straight line parallel to the 
imaginary axis and lying more to the right than all the poles of the integrand. 
The contour can be closed by a semicircle of infinitely large radius lying on 
the left of the straight line. Then the theorem of residues can be applied to 
the integral. As a rule, $\tilde Q(z)$ can be considered a function having no poles.
Then the poles of the integrand are points defined by the condition

$$(1 - \exp(-u_M))\, (1 + z) = 1 - \exp(-u_M(1 + z)). \tag{30.34}$$

It is obvious that this transcendental equation has a root $z_1 = 0$ as well as an
infinite set of roots $z_k$ having a negative real part.

Finding the residue of the integrand of (30.33) and applying the theorem
of residues, we get

$$\psi_0(u) = \sum_k \frac{[1 - \exp(-u_M(1 + z_k))]\, \exp(u z_k)\, \tilde Q(z_k)}{1 - \exp(-u_M) - u_M \exp[-u_M(1 + z_k)]}. \tag{30.35}$$


For large values of $u$, i.e. for neutron energies considerably smaller than the
energies with which they are emitted by the sources, formula (30.35) is substantially
simplified. Since $\operatorname{Re} z_k < 0$ for all the other roots, all the terms of the sum
(30.35) for $u \gg 1$ are exponentially small in comparison with the first term
corresponding to $z_1 = 0$.

Hence instead of (30.35) one can write

$$\psi_0(u) = \frac{[1 - \exp(-u_M)]\, \tilde Q(0)}{1 - \exp(-u_M) - u_M \exp(-u_M)}. \tag{30.36}$$


By definition (30.28)

$$\tilde Q(0) = \int_0^\infty Q(u)\, du = Q_0,$$

where $Q_0$ is the total number of neutrons emitted by the source per second.
We see from (30.36) that $\psi_0(u)$ does not depend on its argument. It is this
simplicity of the distribution function in the variable $u$ that makes this variable
convenient.
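
The approach to a constant can be seen numerically. The sketch below is one possible discretization, not the method of the text: it solves the convolution equation $\psi_0(u) = \int_0^u \psi_0(u')\, n_0(u-u')\, du' + Q(u)$ for a monoenergetic source $Q(u) = Q_0\,\delta(u)$ by marching on a uniform grid in $u$. The singular uncollided term $Q_0\,\delta(u)$ is left out of the returned value, the exponential kernel is sampled with trapezoid weights and renormalized to unit sum so that no neutrons are lost, and the grid step, the range of $u$ and the mass ratio $M/m = 12$ are all assumptions made for the example.

```python
import math


def mean_log_loss(M: float, m: float = 1.0) -> float:
    """Mean of u over the normalized kernel exp(-u) on (0, u_M)."""
    alpha = ((M - m) / (M + m)) ** 2          # alpha = exp(-u_M)
    u_M = -math.log(alpha)
    return (1.0 - alpha - alpha * u_M) / (1.0 - alpha)


def plateau_value(M: float, m: float = 1.0, u_max: float = 3.0,
                  h: float = 1e-3, Q0: float = 1.0) -> float:
    """Collided part of psi_0 at u = u_max for a source Q0 * delta(u)."""
    alpha = ((M - m) / (M + m)) ** 2
    u_M = -math.log(alpha)
    n_steps = int(round(u_max / h))
    n_kernel = int(u_M / h)
    # Trapezoid weights of exp(-u) on [0, u_M], renormalized to sum to 1.
    K = [math.exp(-j * h) for j in range(n_kernel + 1)]
    K[0] *= 0.5
    K[-1] *= 0.5
    total = sum(K)
    K = [k / total for k in K]
    # c[i] = collisions per source neutron falling into lethargy bin i.
    c = [0.0] * (n_steps + 1)
    for i in range(n_steps + 1):
        rhs = Q0 * K[i] if i <= n_kernel else 0.0
        for j in range(max(0, i - n_kernel), i):
            rhs += c[j] * K[i - j]
        c[i] = rhs / (1.0 - K[0])             # the j = i term is implicit
    return c[-1] / h                          # convert bin content to density


print("psi_0 at u = 3 :", plateau_value(12.0))
print("Q0 / xi        :", 1.0 / mean_log_loss(12.0))
```

Under these assumptions the two printed numbers should agree to within the discretization error, illustrating that far from the source energy the distribution in $u$ flattens to $Q_0$ divided by the mean lethargy gain per collision.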

The neutron energy distribution is of the form

$$\psi_0 = \frac{Q_0}{\xi},$$

where the quantity $\xi$ is by definition equal to

$$\xi = \frac{1 - \exp(-u_M) - u_M \exp(-u_M)}{1 - \exp(-u_M)}.$$

This quantity, as shown by simple considerations, represents the mean logarithmic
neutron energy loss in one collision.
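
The "simple considerations" can be checked directly. The sketch below assumes that the lethargy gain $u$ in one collision is distributed as $e^{-u}/(1 - e^{-u_M})$ on $0 < u < u_M$, with $e^{-u_M} = [(M-m)/(M+m)]^2$, and compares the mean of $u$ obtained by quadrature with the closed form $\xi = [1 - e^{-u_M} - u_M e^{-u_M}]/[1 - e^{-u_M}]$. Function names and the sample mass ratios are illustrative.

```python
import math


def xi_closed_form(M: float, m: float = 1.0) -> float:
    """xi = (1 - a - a*u_M) / (1 - a), with a = exp(-u_M)."""
    a = ((M - m) / (M + m)) ** 2
    u_M = -math.log(a)
    return (1.0 - a - a * u_M) / (1.0 - a)


def xi_quadrature(M: float, m: float = 1.0, steps: int = 20000) -> float:
    """Mean lethargy gain per collision by the midpoint rule."""
    a = ((M - m) / (M + m)) ** 2
    u_M = -math.log(a)
    h = u_M / steps
    total = 0.0
    for i in range(steps):
        u = (i + 0.5) * h
        total += u * math.exp(-u)
    return total * h / (1.0 - a)


for mass_ratio in (2.0, 12.0, 238.0):
    print(mass_ratio, xi_closed_form(mass_ratio), xi_quadrature(mass_ratio))
```

For heavy nuclei $\xi$ becomes small (roughly $2m/M$), which is the statement that the energy transfer in each collision is small.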


§31. The spatial distribution of neutrons 


Let us now consider the important problem of the neutron spatial distribu- 
tion. 

We assume that the change in neutron energy in scattering on nuclei can 
be disregarded. This can be done, for example, for neutrons moderated down 
to an energy ~ kT (i.e. for thermal neutrons). If the energy change is dis- 
regarded, then the energy in eq. (30.4) can be considered to be fixed. If the 
sources emit neutrons into the medium steadily, then a stationary neutron 
distribution will be established in space. The distribution function can be as- 
sumed to depend only on the coordinates and direction of motion of the 
neutrons. The latter can be characterized by the unit vector

$$\mathbf v_1 = \mathbf v / v. \tag{31.1}$$
The neutron distribution function satisfies the equation 


oft, vi) v v EN ,, 40) 
var IED STWO DAV Nae Paro (31.2) 


The integration in the collision integral is carried out only with respect to 
the angle, since the absolute value of the momentum does not change in a 
collision. 







Dividing (31.2) by v/l, and introducing the notation 


l l 
a h e 
mie Ona (31.3) 
we have 
OF m=, i D : 
ait f= fwv vy) saa’ + E (31.4) 


Eq. (31.4) is similar to the equation for the distribution function of the mole-
cules of a light gas diffusing in a heavy gas (§24). However, an important dif-
ference is the presence of particle sources and the absorption of the particles 
in the medium. 

In the cases where the scattering can be considered to be isotropic,
$w(\mathbf v_1, \mathbf v_1') = 1$, eq. (31.4) allows an exact solution. Proceeding from the exact
solution of eq. (31.4), one can find the neutron density distribution in space. 
Comparing it with the distribution obtained in the diffusion approximation 
found in §24, we can estimate the accuracy of the latter. 

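To obtain the exact solution, it is convenient to use the method of expansion
in a Fourier integral. Writing

$$\psi(\mathbf k, \mathbf v_1) = \frac{1}{(2\pi)^3} \int e^{-i\mathbf k \cdot \mathbf r}\, f(\mathbf r, \mathbf v_1)\, dV, \qquad s(\mathbf k) = \frac{1}{(2\pi)^3} \int e^{-i\mathbf k \cdot \mathbf r}\, s(\mathbf r)\, dV,$$

we have

$$(1 + i\, \mathbf k \cdot \mathbf v_1\, l_t)\, \psi(\mathbf k, \mathbf v_1) = \frac{\alpha}{4\pi} \int \psi(\mathbf k, \mathbf v_1')\, d\Omega' + \frac{s(\mathbf k)}{4\pi}$$

or

$$\psi(\mathbf k, \mathbf v_1) = \frac{s(\mathbf k) + \alpha \int \psi(\mathbf k, \mathbf v_1')\, d\Omega'}{4\pi\, (1 + i\, \mathbf k \cdot \mathbf v_1\, l_t)}. \tag{31.5}$$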


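Integrating (31.5) over angles, we obtain

$$\int \psi(\mathbf k, \mathbf v_1)\, d\Omega = \frac{s(\mathbf k) + \alpha \int \psi(\mathbf k, \mathbf v_1')\, d\Omega'}{4\pi} \int \frac{d\Omega}{1 + i\, \mathbf k \cdot \mathbf v_1\, l_t}. \tag{31.6}$$

The last integral is directly calculable

$$\int \frac{d\Omega}{1 + i\, \mathbf k \cdot \mathbf v_1\, l_t} = 2\pi \int_0^\pi \frac{\sin\theta\, d\theta}{1 + i k l_t \cos\theta} = \frac{2\pi}{i k l_t} \ln \frac{1 + i k l_t}{1 - i k l_t} = \frac{4\pi}{k l_t} \arctan k l_t.$$

Substituting this value into (31.6), we get

$$\int \psi(\mathbf k, \mathbf v_1)\, d\Omega = \frac{s(\mathbf k)}{(k l_t / \arctan k l_t) - \alpha}. \tag{31.7}$$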
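
The angular integral used on the way from (31.6) to (31.7) is easy to confirm numerically. The sketch below assumes the identity $\int d\Omega / (1 + i x \cos\theta) = (4\pi/x) \arctan x$ with $x = k l_t$, and compares a midpoint-rule evaluation in $\mu = \cos\theta$ with the closed form. Function names and sample values of $x$ are illustrative.

```python
import math


def angular_integral_numeric(x: float, steps: int = 20000) -> complex:
    """2*pi times the integral over mu in (-1, 1) of 1/(1 + i*x*mu)."""
    h = 2.0 / steps
    total = 0j
    for i in range(steps):
        mu = -1.0 + (i + 0.5) * h
        total += 1.0 / (1.0 + 1j * x * mu)
    return 2.0 * math.pi * total * h


def angular_integral_closed(x: float) -> float:
    """Closed form (4*pi/x) * arctan(x)."""
    return 4.0 * math.pi * math.atan(x) / x


for x in (0.3, 2.0, 10.0):
    print(x, angular_integral_numeric(x), angular_integral_closed(x))
```

The imaginary part of the numerical result cancels by symmetry in $\mu$, which is why the closed form is real.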


Now we note that the integral on the left-hand side of (31.6) represents none 
other than the Fourier component of the neutron density 


Mo) = far, vD a= —— fuck, vel anak- 


apd etek f væ vag, (31.8) 


where N(r) is the number of neutrons per unit volume of the medium. Sub- 
stituting (31.7) into (31.8), we find 


Ake dk 


re 31. 
kl,/arctan kl, —a GY) 


N(x) = feik- (va, vpae) dk = 


In order to obtain a concrete expression for the neutron density, let us
consider the case of a point source

$$s(\mathbf r) = s_0\, \delta(\mathbf r).$$

Then

$$s(\mathbf k) = \frac{s_0}{(2\pi)^3}$$


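and

$$N(r) = \frac{l_t}{v}\, \frac{s_0}{(2\pi)^3} \int \frac{e^{ikr\cos\theta}\, k^2\, dk\, \sin\theta\, d\theta\, d\varphi}{k l_t / \arctan k l_t - \alpha} = \frac{l_t}{v}\, \frac{s_0}{ir\,(2\pi)^2} \int_{-\infty}^{\infty} \frac{e^{ikr}\, k\, dk}{k l_t / \arctan k l_t - \alpha}. \tag{31.10}$$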


The calculation of the integral can be carried out conveniently in the complex
plane. Let us consider the integral over the contour shown in fig. VI.5,

$$I = \int \frac{e^{i\beta z}\, z\, dz}{z / \arctan z - \alpha}. \tag{31.11}$$

The integrand in (31.11) has a pole of the first order at the point

$$z_1 / \arctan z_1 = \alpha \tag{31.12}$$

and a branch point

$$z = i.$$


The contour passes by the branch point, the integrals over the large and small 
circles reduce to zero and there remains only the residue at the point z, and 


the integral with respect to the imaginary axis 


eifz z dz 


joo 
I= 2ni Res (z,) + J Jaena? 


Fig. VI.S 




Estimates* show that at large distances from the source ($r \gg l_t$) the second
term is small in comparison with the first, and we shall simply drop the integral
over the imaginary axis.

We denote the root of eq. (31.12) by $i/L$, i.e. we set

$$i l_t / L = \alpha \arctan (i l_t / L). \tag{31.13}$$


Calculating the residue at the point (31.13), and substituting the value of / in- 
to (31.10), we obtain 


lL os (Gla 7) be erl 
oS t ee 


< (31.14) 
D Qn? w+ (I?/L?y—-1 7 


Let us consider the case of weak absorption, $l_t \approx l_{sc}$. The arc tangent in eq.
(31.13) can be expanded in a series, and its approximate solution can be
written in the form

$$L^2 = \tfrac{1}{3}\, l_{sc}\, l_c. \tag{31.15}$$

Obviously, here $L \gg l_{sc}$. Then (31.14) assumes the form

$$N(r) = \frac{3 s_0}{4\pi\, l_{sc}\, v}\, \frac{e^{-r/L}}{r} \qquad \text{for } r \gg l_{sc}. \tag{31.16}$$


It is interesting to compare the spatial particle distribution (31.16) with
the particle distribution in the diffusion approximation. In the diffusion approximation
the solution of Boltzmann’s equation leads, according to (24.17),
to the coefficient of diffusion $D = \tfrac{1}{3}\, l_{sc}\, v$. The distribution function $N(\mathbf r)$ in
this case satisfies the equation

$$D\, \nabla^2 N(\mathbf r) - \frac{N(\mathbf r)}{\tau} = -s_0\, \delta(\mathbf r),$$

where $\tau = l_c / v$. Its solution is exactly the same as (31.16) for $l_{sc} \ll l_c$. Thus in
the case of weak absorption it turns out that the exact solution of the kinetic
equation is, to a high degree of accuracy, the same as the solution of the equa-

* A.D.Galanin, Teoriya yadernykh reaktorov (Nuclear reactor theory) (Atomizdat, 
Moscow, 1957). 




tion of the diffusion approximation at large distances from the source, $r \gg l_{sc}$.
A numerical comparison of the solutions of the kinetic and diffusion equations
shows that the two solutions are practically the same at distances from
the source larger than $2 l_{sc}$.
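
That the screened form $N(r) \propto e^{-r/L}/r$ solves the diffusion equation away from the source can be verified with a finite-difference check. The sketch below assumes $D = l_{sc} v/3$, $\tau = l_c/v$, $L^2 = D\tau$ and $N(r) = s_0 e^{-r/L}/(4\pi D r)$, and evaluates the residual $D\,\nabla^2 N - N/\tau$ at $r > 0$ using the radial Laplacian $\nabla^2 N = r^{-1}\, d^2(rN)/dr^2$. All parameter values and names are illustrative.

```python
import math


def diffusion_density(r: float, s0: float, l_sc: float, l_c: float, v: float) -> float:
    """N(r) = s0 * exp(-r/L) / (4*pi*D*r), with D = l_sc*v/3 and L^2 = D*l_c/v."""
    D = l_sc * v / 3.0
    L = math.sqrt(D * l_c / v)
    return s0 * math.exp(-r / L) / (4.0 * math.pi * D * r)


def relative_residual(r: float, s0: float = 1.0, l_sc: float = 1.0,
                      l_c: float = 100.0, v: float = 1.0, dr: float = 1e-3) -> float:
    """|D*lap(N) - N/tau| / (N/tau) at radius r > 0 (central differences)."""
    D = l_sc * v / 3.0
    tau = l_c / v

    def g(x: float) -> float:
        return x * diffusion_density(x, s0, l_sc, l_c, v)

    laplacian = (g(r + dr) - 2.0 * g(r) + g(r - dr)) / (dr * dr * r)
    N = diffusion_density(r, s0, l_sc, l_c, v)
    return abs(D * laplacian - N / tau) / (N / tau)


for radius in (1.0, 3.0, 10.0):
    print(radius, relative_residual(radius))
```

The residual is at the level of the finite-difference error, so the profile balances diffusion against capture everywhere except at the point source itself.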

In the opposite limiting case of very strong absorption, when $l_c \approx l_{sc}$, the
equation leads to the value

$$L \approx 1.05\, l_t.$$

This means that $N(r)$ decreases substantially at a distance $\sim l_t$ from the
source.
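
The transcendental equation for $L$ can be solved numerically for any ratio of the mean free paths. The sketch below assumes the relation $i l_t/L = \alpha \arctan(i l_t/L)$, which for $x = l_t/L$ is the real equation $x = \alpha\, \operatorname{artanh} x$, with $\alpha = l_t/l_{sc}$, and finds the root by bisection. The function name, the restriction $0.1 \le \alpha < 1$ and the sample values are assumptions made for the example.

```python
import math


def diffusion_length_ratio(alpha: float, iterations: int = 200) -> float:
    """Return L / l_t, where x = l_t / L is the root of x = alpha * atanh(x)."""
    if not 0.1 <= alpha < 1.0:
        raise ValueError("this sketch only handles 0.1 <= alpha < 1")

    def f(x: float) -> float:
        return x - alpha * math.atanh(x)

    lo, hi = 1e-12, 1.0 - 1e-15      # f(lo) > 0 and f(hi) < 0 in this range
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if f(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 1.0 / (0.5 * (lo + hi))


# Weak absorption: compare with L^2 = l_sc * l_c / 3 (lengths in units of l_t).
alpha = 0.99
l_sc, l_c = 1.0 / alpha, 1.0 / (1.0 - alpha)
print("exact L/l_t    :", diffusion_length_ratio(alpha))
print("sqrt(l_sc l_c/3):", math.sqrt(l_sc * l_c / 3.0))

# Strong absorption, capture and scattering paths equal (alpha = 1/2).
print("alpha = 0.5    :", diffusion_length_ratio(0.5))
```

For $\alpha$ close to unity the root reproduces the diffusion estimate, while for comparable capture and scattering paths $L$ is only slightly larger than $l_t$.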
In conclusion we stress that the results obtained are of a general character. 


They equally apply to all particles moving in matter and undergoing scattering 
and capture, provided the interaction with the scattering centres has the 
character of a short-range interaction. 


§32. The kinetic equation for a plasma disregarding collisions 


The direct application of Boltzmann’s equation to a plasma calls for a cer- 
tain amount of caution. Charged particles in a plasma interact according to 
Coulomb’s law, so that the forces of interaction between them decrease rela- 
tively slowly with increasing distance. Hence it is necessary to consider those 
changes which must be introduced into the calculation of the collision inte- 
gral in order to take into account the specific properties of the Coulomb in- 
teraction. However, we shall defer this consideration to §33, and in the 
meantime we note that although the presence of long-range forces may to a 
certain degree complicate the problem, it also in one way simplifies it. Name- 
ly, since the forces of interaction slowly decrease with increasing distance, 
collective motions in which relatively large groups of particles take part must 
arise in a system of charged particles. 

One may speak of collective excitations in a plasma in which the system as 
a whole is involved. In considering such large-scale motions one can disregard 
non-uniformities (fluctuations) in the system and pair collisions between 
particles. 

From the theory of scattering in a Coulomb force field (see §43 of Part I
and §86 of Part V) we know that a considerable deflection of particles occurs
for minimum values of the impact parameter

$$\rho_{\min} \sim e^2 / m v^2 \sim e^2 / \bar\varepsilon,$$

where $\bar\varepsilon \sim kT$ is the mean energy of particles with temperature $T$. If the following
inequality holds

$$\rho_{\min} \ll \bar r \sim n^{-1/3}, \tag{32.1}$$

where $\bar r$ is the mean distance between particles and $n$ is their density, then the
role of pair collisions becomes relatively unimportant in comparison with the
Coulomb interaction of particles at large distances of the order of $\bar r$. In this
case the character of the distribution function for a system of particles with
Coulomb interactions will be mainly determined not by pair collisions, but by
the average forces acting on the particles. It is useful to note that inequality
(32.1) is equivalent to the inequality

$$\frac{e^2 n^{1/3}}{kT} \ll 1. \tag{32.1'}$$


This means that for the collision integral to be small it is necessary that the 
plasma parameter be small (see §41 of Part IV). This requirement is in its 
turn equivalent to the approximation of Debye’s theory for an equilibrium 
plasma. Thus in the Debye approximation in which the equilibrium proper- 
ties of a plasma are usually described it is possible to take into account col- 
lective interactions and to disregard pair interactions at small distances. 

Inequality (32.1) or the equivalent inequality (32.1’) is valid for low den- 
sity and high temperature plasmas. 
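
To get a feeling for when the inequality holds, one can compare the two lengths for a given density and temperature. The sketch below assumes Gaussian (CGS) units, $\rho_{\min} \approx e^2/kT$ and $\bar r \approx n^{-1/3}$; the rounded constants, the function names and the sample plasma parameters are illustrative assumptions.

```python
E_CHARGE_ESU = 4.803e-10      # elementary charge, esu (rounded)
K_BOLTZMANN_ERG = 1.381e-16   # Boltzmann constant, erg/K (rounded)


def coulomb_impact_parameter(temperature_k: float) -> float:
    """rho_min ~ e^2 / kT, in cm."""
    return E_CHARGE_ESU ** 2 / (K_BOLTZMANN_ERG * temperature_k)


def mean_spacing(density_cm3: float) -> float:
    """Mean interparticle distance ~ n^(-1/3), in cm."""
    return density_cm3 ** (-1.0 / 3.0)


def pair_collision_ratio(density_cm3: float, temperature_k: float) -> float:
    """rho_min / r_mean = e^2 n^(1/3) / kT; must be << 1 to drop pair collisions."""
    return coulomb_impact_parameter(temperature_k) / mean_spacing(density_cm3)


# Hypothetical low-density, hot plasma: n = 1e12 cm^-3, T = 1e4 K.
print(pair_collision_ratio(1e12, 1e4))
```

The ratio grows as $n^{1/3}$ and falls as $1/T$, so rarefied hot plasmas satisfy the condition most easily.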

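One can arrive at the same result by means of another argument based on
the approximation of the relaxation time $\tau$. If, in a plasma, one considers
non-stationary processes with frequency $\omega$, then for

$$\omega \gg 1/\tau$$

one can disregard the collision integral. Dropping the collision integral in the
kinetic equation, one can write

$$\frac{\partial f_a}{\partial t} + \mathbf v \cdot \frac{\partial f_a}{\partial \mathbf r} + \frac{\mathbf F_a}{m_a} \cdot \frac{\partial f_a}{\partial \mathbf v} = 0, \tag{32.2}$$

where the index $a$ denotes the type of particle (electron, ion), and $\mathbf F_a$ is the
total force acting on a particle of type $a$

$$\mathbf F_a = e_a \left( \mathbf E + \frac{1}{c}\, [\mathbf v \times \mathbf H] \right). \tag{32.3}$$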







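Here we assume that $\mu = 1$. The fields $\mathbf E$ and $\mathbf H$ in (32.3) represent the fields
of all the other particles (internal fields) acting on the particle plus the applied
external fields. For simplicity we shall first discuss the case where external
fields are absent. The fields $\mathbf E$ and $\mathbf H$ satisfy the system of Maxwell-Lorentz
equations

$$\nabla \times \mathbf H = \frac{1}{c}\, \frac{\partial \mathbf E}{\partial t} + \frac{4\pi}{c}\, \mathbf j, \qquad \nabla \times \mathbf E = -\frac{1}{c}\, \frac{\partial \mathbf H}{\partial t}, \qquad \nabla \cdot \mathbf E = 4\pi\rho, \qquad \nabla \cdot \mathbf H = 0. \tag{32.4}$$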


We stress that in our approximation we cannot consider the plasma as a con- 
tinuous medium and use Maxwell’s equations. 

The charge density and current density involved in the system of Maxwell-Lorentz
equations (32.4) represent the mean values of these quantities taken
with respect to the distribution functions $f_a$:

$$\rho = \sum_a e_a \int f_a(\mathbf r, \mathbf v, t)\, d\mathbf v, \qquad \mathbf j = \sum_a e_a \int \mathbf v\, f_a(\mathbf r, \mathbf v, t)\, d\mathbf v. \tag{32.5}$$

Thus the set of field equations and distribution functions forms a closed system;
the fields $\mathbf E$ and $\mathbf H$ are found from given mean values of the charge and
current densities. The latter depend on the distribution function $f_a$. Thus it
can be said that the fields $\mathbf E$ and $\mathbf H$ are defined by the distribution function
$f_a$, which in its turn, according to (32.2), depends on the values of the fields $\mathbf E$
and $\mathbf H$ at each point of the plasma at each instant of time. In other words, a
spatial and velocity distribution of particles is established in the system such 
that the corresponding fields maintain this distribution. Each particle moves 
in a field produced by all the particles except the given one. Of course, in this 
respect all the particles are equivalent. Thus a self-consistent field is estab- 
lished in the plasma (see §41 of Part IV and §70 of Part V). 

As a rule, one can disregard the thermal motion of ions because of their 
large mass and seek the distribution function just for the electrons. Then the 
whole system of ions forms a positively charged background compensating 
for the electric charge of the electrons. 




Eq. (32.2) is called the kinetic equation with a self-consistent field. It 
should be noted that the qualitative reasoning underlying its derivation can 
be supplemented by more strict quantitative considerations. It then turns out 
that the kinetic equation with a self-consistent field is obtained from a set 
of equations for correlation functions by expanding in terms of a small plasma
parameter $\bar r / l_D$ *.

The collision integral turns out to be a quantity of the next order of small 
quantities in the expansion in terms of this parameter. 

The self-consistent field approximation for non-equilibrium processes in a 
plasma allows one to discover a number of phenomena which occur in it. 


§33. The dispersion and damping of plasma waves 


We have already considered the theory of plasma waves in Part IV. We ob- 
tained the law of dispersion for plasma waves in the approximation where the 
plasma can be considered to be a continuous medium (i.e. in the hydrody- 
namic approximation). Here, however, we can considerably refine the macro- 
scopic theory and, in particular, find the law of dispersion and damping of 
plasma waves. 

Let us consider the high-frequency natural oscillations of a spatially uni- 
form plasma in the absence of an external electromagnetic field. The kinetic 
equation for the electron distribution function in the self-consistent field ap- 
proximation is of the form 


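$$\frac{\partial f}{\partial t} + \mathbf v \cdot \frac{\partial f}{\partial \mathbf r} + \frac{e}{m} \left( \mathbf E + \frac{1}{c}\, [\mathbf v \times \mathbf H] \right) \cdot \frac{\partial f}{\partial \mathbf v} = 0. \tag{33.1}$$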


Heavy ions are considered to be at rest. Assuming the field to be weak, we 
shall try to find the solution of the system of equations (33.1) and (32.4) in 
the form 


$$f = f_0 + f_1, \tag{33.2}$$
where fo is the electron distribution function in the absence of oscillations, 


* Note. See, for example, N.N.Bogoliubov, Problems of dynamic theory in statistical
physics, in ‘Studies in statistical mechanics’, Ed. J.de Boer and G.E.Uhlenbeck, Vol. 1
(North-Holland, Amsterdam, 1962). K.P.Gurov, Osnovaniya kineticheskoi teorii (Foundations
of kinetic theory) (Nauka, Moscow, 1966).







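and $f_1 \ll f_0$. Substituting this expression into (33.1), we easily find for $f_1$

$$\frac{\partial f_1}{\partial t} + \mathbf v \cdot \frac{\partial f_1}{\partial \mathbf r} + \frac{e}{m} \left( \mathbf E + \frac{1}{c}\, [\mathbf v \times \mathbf H] \right) \cdot \frac{\partial f_0}{\partial \mathbf v} = 0. \tag{33.3}$$

Here $f_1$ is a function of coordinates, velocities and time, i.e.

$$f_1 = f_1(\mathbf r, \mathbf v, t).$$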


The kinetic equation without the collision integral can be simplified consider- 
ably. In this approximation it can be assumed that all the particles move in 
defined trajectories under the action of the self-consistent and external fields. 
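Hence, if $\mathbf r_0$ is the coordinate of the particle at the instant of time $t = 0$, then
one can write

$$\mathbf r = \mathbf r_0 + \int_0^t \mathbf v(\tau)\, d\tau, \tag{33.4}$$

where $\mathbf v(t)$ is defined by the equation of motion of the particle,

$$\mathbf v(t) = \mathbf v_0 + \int_0^t \frac{\mathbf F}{m}\, d\tau. \tag{33.5}$$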
0 


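We introduce as new variables the values of $\mathbf r_0$ and $\mathbf v_0$. Then

$$\left( \frac{\partial f_1}{\partial t} \right)_{\mathbf r_0, \mathbf v_0} = \left( \frac{\partial f_1}{\partial t} \right)_{\mathbf r, \mathbf v} + \left( \frac{\partial \mathbf r}{\partial t} \right)_{\mathbf r_0, \mathbf v_0} \cdot \frac{\partial f_1}{\partial \mathbf r} + \left( \frac{\partial \mathbf v}{\partial t} \right)_{\mathbf r_0, \mathbf v_0} \cdot \frac{\partial f_1}{\partial \mathbf v} = \frac{\partial f_1}{\partial t} + \mathbf v \cdot \frac{\partial f_1}{\partial \mathbf r} + \frac{\mathbf F}{m} \cdot \frac{\partial f_1}{\partial \mathbf v}.$$

Here we have assumed that the derivatives $(\partial \mathbf r/\partial t)_{\mathbf r_0, \mathbf v_0}$ and $(\partial \mathbf v/\partial t)_{\mathbf r_0, \mathbf v_0}$ are
the same as the velocity and the force for unperturbed motion. This is accurate
to within quantities of the second order of smallness. Since the quantities
mentioned are multiplied by the derivatives of the perturbed distribution
function $f_1$, such a replacement is legitimate. Then the kinetic equation (33.3)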




takes the very simple form

$$\frac{\partial f_1(\mathbf r_0, \mathbf v_0, t)}{\partial t} = -\frac{e\, \mathbf E(\mathbf r_0, t)}{m} \cdot \frac{\partial f_0}{\partial \mathbf v(t)}. \tag{33.6}$$

For an isotropic plasma $f_0$ does not depend on angles. Hence in the expression
for the force only the electric force $e\mathbf E$* need be retained. This equation
can be integrated directly. If it is assumed that $f_1(\mathbf r_0, t) \to 0$ for $t \to -\infty$,
then, integrating (33.6), we have





t 
= e FRNA 
fi m Je E(t, r’): a jé “Mm ef E(r - yo ) dt. (33.7) 


We stress once more that the possibility of using eqs. (33.4) and (33.5) and of 
introducing new variables is associated with the absence of collisions. Colli- 
sions perturb the motion of particles and deprive one of the possibility of 
using the concept of motion along trajectories. Hence the solution (33.7) is 
valid only for a plasma without collisions. 

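Let us expand the field $\mathbf{E}$ in a Fourier integral. Then for $f_1$ we have

$$f_1 = -\frac{e}{m}\int d\mathbf{k}\int d\omega\int_{-\infty}^{t}\mathbf{E}(\mathbf{k}, \omega)\,e^{i(\mathbf{k}\cdot\mathbf{r} - \omega t)}\,e^{-i(\mathbf{k}\cdot\mathbf{v} - \omega)(t - t')}\cdot\frac{\partial f_0}{\partial \mathbf{v}}\,dt'. \tag{33.8}$$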


The correction to the distribution function due to an individual field harmon- 
ic is of the form 


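$$f_1^{(\mathbf{k},\omega)} = -\frac{e}{m}\,e^{i(\mathbf{k}\cdot\mathbf{r} - \omega t)}\int_{-\infty}^{t}\mathbf{E}(\mathbf{k}, \omega)\cdot\frac{\partial f_0}{\partial \mathbf{v}(t')}\,e^{-i(\mathbf{k}\cdot\mathbf{v} - \omega)(t - t')}\,dt'. \tag{33.9}$$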


Knowing the correction to the distribution function, one can find the corres- 
ponding electric current j 


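$$\mathbf{j} = en\int\mathbf{v}f_1\,d\mathbf{v} = -\frac{e^2n}{m}\,e^{i(\mathbf{k}\cdot\mathbf{r} - \omega t)}\int d\mathbf{v}\int_{-\infty}^{t}\mathbf{v}(t)\left(\mathbf{E}(\mathbf{k}, \omega)\cdot\frac{\partial f_0}{\partial \mathbf{v}(t')}\right)e^{-i(\mathbf{k}\cdot\mathbf{v} - \omega)(t - t')}\,dt'. \tag{33.10}$$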


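* The product $[\mathbf{v}\times\mathbf{H}]\cdot\dfrac{\partial f_0}{\partial \mathbf{v}} \propto [\mathbf{v}\times\mathbf{H}]\cdot\mathbf{v} = 0$.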







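We introduce the new variable $\tau = t - t'$. Then

$$\mathbf{j} = -\frac{e^2n}{m}\,e^{i(\mathbf{k}\cdot\mathbf{r} - \omega t)}\int d\mathbf{v}\int_0^{\infty}\mathbf{v}(t)\left(\mathbf{E}(\mathbf{k}, \omega)\cdot\frac{\partial f_0}{\partial \mathbf{v}(t - \tau)}\right)e^{i(\omega - \mathbf{k}\cdot\mathbf{v})\tau}\,d\tau. \tag{33.11}$$

Since the time $t$ is not specified, one can set $t = 0$ in the integral, and the
current generated by one field harmonic turns out to be equal to

$$\mathbf{j} = -\frac{e^2n}{m}\,e^{i(\mathbf{k}\cdot\mathbf{r} - \omega t)}\int d\mathbf{v}\int_0^{\infty}\mathbf{v}(0)\left(\mathbf{E}(\mathbf{k}, \omega)\cdot\frac{\partial f_0}{\partial \mathbf{v}(-\tau)}\right)e^{i(\omega - \mathbf{k}\cdot\mathbf{v})\tau}\,d\tau. \tag{33.12}$$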


We stress that here $f_0$ denotes the electron distribution function in the plasma
in the absence of oscillations. Generally speaking, $f_0$ is not an equilibrium
distribution function.
If, however, the plasma is considered to be an equilibrium plasma, then $f_0$
represents a Maxwell distribution (normalized to unity), and

$$\frac{\partial f_0}{\partial v_i} = \frac{\partial f_M}{\partial v_i} = -\frac{mv_i}{kT}f_M. \tag{33.13}$$


Then formula (33.11) takes the form 


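$$j_i = \frac{e^2n}{kT}\,E_j(\mathbf{k}, \omega)\,e^{i(\mathbf{k}\cdot\mathbf{r} - \omega t)}\int d\mathbf{v}\,f_M\int_0^{\infty}v_i(0)\,v_j(-\tau)\,e^{i(\omega - \mathbf{k}\cdot\mathbf{v})\tau}\,d\tau. \tag{33.14}$$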


In §36 we shall discuss this expression, which is of a general character and
relates the current to the velocity correlation at instants of time separated by
the interval $\tau$. However, in the case considered, where external electric and
magnetic fields are absent, the particles move with a constant velocity, so that
$\mathbf{v}(0) = \mathbf{v}(\tau) = \text{const}$. Hence we shall drop the value of the argument of the
velocity in subsequent formulae.

Formula (33.12) gives the following expression for the electrical conductivity
tensor defined by the relation $j_i(\omega) = \sigma_{ik}(\omega)E_k(\omega)$:

$$\sigma_{ij} = -\frac{e^2n}{m}\int d\mathbf{v}\int_0^{\infty}v_i\frac{\partial f_0}{\partial v_j}\,e^{i(\omega - \mathbf{k}\cdot\mathbf{v})\tau}\,d\tau. \tag{33.15}$$


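According to the general formula (33.20) of Part IV, the dielectric permeability
tensor $\varepsilon_{ij}$ is of the form

$$\varepsilon_{ij} = \delta_{ij} + \frac{4\pi i}{\omega}\sigma_{ij} = \delta_{ij} - i\frac{4\pi e^2n}{m\omega}\int d\mathbf{v}\int_0^{\infty}v_i\frac{\partial f_0}{\partial v_j}\,e^{-i(\mathbf{k}\cdot\mathbf{v} - \omega)\tau}\,d\tau \tag{33.16}$$

or, for an equilibrium plasma,

$$\varepsilon_{ij} = \delta_{ij} + i\frac{4\pi e^2n}{\omega kT}\int d\mathbf{v}\,f_M\int_0^{\infty}v_iv_j\,e^{i(\omega - \mathbf{k}\cdot\mathbf{v})\tau}\,d\tau = \delta_{ij} + i\frac{4\pi e^2n}{\omega kT}\,\mathrm{Av}\left\{\int_0^{\infty}v_iv_j\,e^{i(\omega - \mathbf{k}\cdot\mathbf{v})\tau}\,d\tau\right\}, \tag{33.17}$$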


where Av { } denotes averaging over the Maxwell distribution. 

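We choose the direction of the vector $\mathbf{k}$ to be the $z$-axis. It is then easily
seen that in an isotropic plasma, which we are considering, only three components
of the tensor $\varepsilon_{ij}$ are different from zero: those perpendicular to the
direction of propagation of the wave

$$\varepsilon_{tr} = \varepsilon_{xx} = \varepsilon_{yy} = 1 + i\frac{4\pi e^2n}{\omega kT}\left\langle\int_0^{\infty}v_x^2\,e^{i(\omega - kv_z)\tau}\,d\tau\right\rangle = 1 + i\frac{4\pi e^2n}{m\omega}\langle I_0\rangle \tag{33.18}$$

and the component parallel to this direction

$$\varepsilon_l = \varepsilon_{zz} = 1 + i\frac{4\pi e^2n}{\omega kT}\left\langle\int_0^{\infty}v_z^2\,e^{i(\omega - kv_z)\tau}\,d\tau\right\rangle = 1 + i\frac{4\pi e^2n}{\omega kT}\langle I_2\rangle, \tag{33.19}$$

where $\langle\ \rangle$ denotes the average over the velocity component $v_z$, so that

$$\langle I_0\rangle = \left(\frac{m}{2\pi kT}\right)^{1/2}\int_{-\infty}^{\infty}\exp\left(-\frac{mv_z^2}{2kT}\right)\left(\int_0^{\infty}e^{i(\omega - kv_z)\tau}\,d\tau\right)dv_z, \tag{33.20}$$

$$\langle I_2\rangle = \left(\frac{m}{2\pi kT}\right)^{1/2}\int_{-\infty}^{\infty}v_z^2\exp\left(-\frac{mv_z^2}{2kT}\right)\left(\int_0^{\infty}e^{i(\omega - kv_z)\tau}\,d\tau\right)dv_z. \tag{33.21}$$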


Knowing the longitudinal and transverse components of the dielectric constant
tensor, one can find the law of dispersion of the longitudinal and transverse
waves (see §33 of Part IV). Namely, the law of dispersion of the longitudinal
waves is defined by the condition

$$\varepsilon_l = 0, \tag{33.22}$$

whereas the law of dispersion of the transverse waves is obtained from the
relation

$$\varepsilon_{tr} - \frac{k^2c^2}{\omega^2} = 0. \tag{33.23}$$

We shall begin with a consideration of the dispersion of longitudinal waves.
Taking into account (33.22) and (33.19), we find the transcendental equation
connecting $\omega$ and $k$:

$$1 + i\frac{4\pi e^2n}{\omega kT}\langle I_2\rangle = 0. \tag{33.24}$$





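Let us find, first of all, the integral with respect to the variable $\tau$. For this
integral to have a definite meaning, it must be considered as the limit

$$\int_0^{\infty}\exp[i(\omega - kv_z)\tau]\,d\tau = \lim_{\gamma\to 0}\int_0^{\infty}e^{-\gamma\tau}\exp[i(\omega - kv_z)\tau]\,d\tau.$$

An arbitrarily small value of $\gamma$ ensures the convergence of the integral at the
upper limit. We then obtain

$$I_0 = \int_0^{\infty}\exp[i(\omega - kv_z)\tau]\,d\tau = \lim_{\gamma\to 0}\frac{i}{\omega - kv_z + i\gamma}. \tag{33.25}$$

We now have to calculate the mean value of (33.25), i.e. the integral

$$\langle I_0\rangle = \left(\frac{m}{2\pi kT}\right)^{1/2}\int_{-\infty}^{\infty}e^{-mv^2/2kT}\,\frac{i\,dv}{\omega + i\gamma - kv} = \frac{i}{\pi^{1/2}kv_0}\int_{-\infty}^{\infty}\frac{e^{-x^2}}{z - x}\,dx, \tag{33.26}$$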


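where $z = (\omega + i\gamma)/kv_0$, $x = v/v_0$, $v_0 = (2kT/m)^{1/2}$. Instead of $v_z$ we have written simply
$v$. We make use of the identity valid for $\operatorname{Im} z > 0$

$$\frac{i}{z - x} = \int_0^{\infty}e^{i(z - x)\alpha}\,d\alpha.$$

We then find

$$\langle I_0\rangle = \frac{1}{\pi^{1/2}kv_0}\int_{-\infty}^{\infty}e^{-x^2}\int_0^{\infty}e^{i(z - x)\alpha}\,d\alpha\,dx = \frac{1}{\pi^{1/2}kv_0}\int_0^{\infty}e^{iz\alpha - \frac14\alpha^2}\,d\alpha\int_{-\infty}^{\infty}e^{-(x + \frac12 i\alpha)^2}\,dx =$$

$$= \frac{1}{kv_0}\int_0^{\infty}e^{iz\alpha - \frac14\alpha^2}\,d\alpha = \frac{e^{-z^2}}{kv_0}\int_0^{\infty}e^{-(\frac12\alpha - iz)^2}\,d\alpha = \frac{2e^{-z^2}}{kv_0}\left(\int_0^{\infty}e^{-t^2}\,dt - \int_0^{-iz}e^{-t^2}\,dt\right) =$$

$$= \frac{\pi^{1/2}}{kv_0}\,e^{-z^2}\left(1 + \frac{2i}{\pi^{1/2}}\int_0^{z}e^{w^2}\,dw\right) = \frac{\pi^{1/2}}{kv_0}\,e^{-z^2} + \frac{2i}{kv_0}\,e^{-z^2}\int_0^{z}e^{w^2}\,dw.$$

In this last formula we can pass to the limit $\gamma \to 0$. We then finally obtain

$$\langle I_0\rangle = \frac{\pi^{1/2}}{kv_0}\,e^{-(\omega/kv_0)^2} + \frac{2i}{kv_0}\,e^{-(\omega/kv_0)^2}\int_0^{\omega/kv_0}e^{w^2}\,dw. \tag{33.27}$$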


It is obvious that $\omega/k = c_{ph}$ is the phase velocity of propagation of the waves,
so that $z = \omega/kv_0$ represents the ratio of this velocity to the mean velocity of
thermal motion of the electrons.
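
The closed form leading to (33.27) can be checked numerically against the velocity average (33.26). The following is a minimal sketch, not part of the original derivation: it works with the dimensionless quantity $kv_0\langle I_0\rangle$, keeps $\operatorname{Im} z$ finite instead of passing to the limit $\gamma \to 0$, and uses a plain Simpson rule. The function names and the test value of $z$ are illustrative assumptions.

```python
import cmath
import math


def simpson(f, a, b, n=4000):
    """Composite Simpson rule; works for complex-valued integrands (n even)."""
    h = (b - a) / n
    total = f(a) + f(b)
    for j in range(1, n):
        total += (4 if j % 2 else 2) * f(a + j * h)
    return total * h / 3


def i0_direct(z):
    """k*v0*<I0> from the velocity average (33.26), valid for Im z > 0."""
    integrand = lambda x: cmath.exp(-x * x) / (z - x)
    return 1j / math.sqrt(math.pi) * simpson(integrand, -8.0, 8.0)


def i0_closed(z):
    """k*v0*<I0> from the closed form that leads to (33.27)."""
    # integral of exp(w^2) from 0 to z, taken along the straight line w = z*s
    line_integral = z * simpson(lambda s: cmath.exp((z * s) ** 2), 0.0, 1.0)
    return cmath.exp(-z * z) * (math.sqrt(math.pi) + 2j * line_integral)


# Re z = omega/(k v0); Im z = gamma/(k v0) is kept finite in this sketch
z = 1.5 + 0.5j
print(abs(i0_direct(z) - i0_closed(z)))
```

The two evaluations should agree to within the quadrature error for any $z$ in the upper half-plane; the real part of the result is the term responsible for the damping discussed below.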







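For $c_{ph} \gg v_0$ we can write

$$\int_0^{z}e^{w^2}\,dw = e^{z^2}\int_0^{z}e^{-2zt + t^2}\,dt = e^{z^2}\int_0^{z}e^{-2zt}\sum_{n=0}^{\infty}\frac{t^{2n}}{n!}\,dt \approx \frac{e^{z^2}}{2z}\left(1 + \frac{1}{2z^2} + \frac{3}{4z^4} + \dots\right).$$

Hence

$$\langle I_0\rangle \approx \frac{\pi^{1/2}}{kv_0}\,e^{-(\omega/kv_0)^2} + \frac{i}{\omega}\left[1 + \frac{(kv_0)^2}{2\omega^2} + \dots\right]. \tag{33.28}$$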
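
The quality of the expansion used for $c_{ph} \gg v_0$ can be judged numerically. The sketch below, given as an illustration under stated assumptions rather than as part of the text, compares $e^{-z^2}\int_0^z e^{w^2}dw$ evaluated by Simpson's rule with the first three terms of the asymptotic series, $(1/2z)(1 + 1/2z^2 + 3/4z^4)$. The function names are hypothetical.

```python
import math


def dawson(z, n=20000):
    """exp(-z^2) * integral_0^z exp(w^2) dw by Simpson's rule (n even)."""
    h = z / n
    # the exponents are combined so that exp() never overflows for large z
    f = lambda w: math.exp(w * w - z * z)
    total = f(0.0) + f(z)
    for j in range(1, n):
        total += (4 if j % 2 else 2) * f(j * h)
    return total * h / 3


def dawson_asymptotic(z):
    """First three terms of the large-z expansion."""
    return (1 + 1 / (2 * z**2) + 3 / (4 * z**4)) / (2 * z)


for z in (3.0, 5.0, 8.0):
    print(z, dawson(z), dawson_asymptotic(z))
```

The relative difference shrinks rapidly as $z = \omega/kv_0$ grows, which is the regime in which (33.28) is meant to be used.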


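To calculate the mean value in formula (33.21) we make use of the property
of the integral $I_0$ which follows from its definition (33.25)

$$(\omega - kv_z)I_0 = i$$

or

$$v_zI_0 = \frac{\omega I_0 - i}{k}.$$

Analogously

$$(\omega - kv_z)^2I_0 = i(\omega - kv_z)$$

or

$$I_2 = v_z^2I_0 = \frac{\omega^2I_0}{k^2} - \frac{i\omega}{k^2} - \frac{ikv_z}{k^2}. \tag{33.29}$$

Averaging (33.29), we find

$$\langle I_2\rangle = \frac{\omega^2}{k^2}\langle I_0\rangle - \frac{i\omega}{k^2}.$$

Substituting the value of $\langle I_0\rangle$ from (33.28) for $c_{ph} \gg v_0$, we finally obtain

$$\langle I_2\rangle \approx \frac{\omega^2}{k^2}\left[\frac{\pi^{1/2}}{kv_0}\,e^{-(\omega/kv_0)^2} + \frac{i}{\omega}\,\frac{(kv_0)^2}{2\omega^2}\right]. \tag{33.30}$$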

Substituting the value of $\langle I_2\rangle$ into (33.24), we arrive at the following law of
dispersion for longitudinal waves:

$$1 + 2i\pi^{1/2}\,\frac{\omega_0^2\,\omega}{(kv_0)^3}\,e^{-(\omega/kv_0)^2} - \frac{\omega_0^2}{\omega^2} = 0, \tag{33.31}$$

where $\omega_0 = (4\pi e^2n/m)^{1/2}$ is the plasma frequency (Langmuir frequency).
In the same approximation the solution of eq. (33.31) reads

$$\omega \approx \omega_0(1 - i\delta), \tag{33.32}$$


where the decrement of damping $\delta$ is equal to

$$\delta = \pi^{1/2}\left(\frac{\omega_0}{kv_0}\right)^3e^{-(\omega_0/kv_0)^2} = \frac{\pi^{1/2}}{2\sqrt{2}\,(kl_D)^3}\exp\left[-\frac{1}{2(kl_D)^2}\right]. \tag{33.33}$$

Here $l_D$ is the Debye length.

In §42 of Part IV, where a plasma was considered in the continuous-medium
approximation, we established that limiting waves of frequency $\omega_0$
and phase velocity $c_{ph} = \omega_0/k$ exist in the plasma. The damping of plasma
waves is a new result. At first sight this result may seem paradoxical. We have
seen before that dissipative processes are associated with molecular collisions
and the resultant momentum transfer. The damping of plasma waves (the so-called
Landau damping) found above is of a different nature.

Electrons in a plasma may have velocity components $v_z$ both smaller and
larger than the phase velocity of the wave, $c_{ph}$. In the first case the particles
are acted upon by the field of a wave moving faster than themselves. Transferring
momentum to the particles, the wave carries them along. On the contrary,
those particles which are moving more rapidly than the wave lose their
momentum, transferring it to the wave. Only particles whose velocity $v_z = c_{ph}$
are in resonance with the wave. They move in phase with the wave, neither
losing nor gaining momentum. The longitudinal wave tends to distort the
Maxwell distribution, producing a peak corresponding to a velocity $v_z$ equal
to the phase velocity $c_{ph}$.




It is known, however, that in an ensemble of particles having a Maxwell 
distribution the number of particles with velocity lower than a given velocity 
is larger than the number of particles with velocity larger than the given veloc- 
ity. Hence the number of particles carried along by the wave exceeds the 
number of particles transferring momentum to the wave. As a result of this, 
a damping rather than an intensification of longitudinal waves takes place. 

Formula (33.33) shows that the damping $\delta$ is small for wavelengths
substantially exceeding the Debye length. On the contrary, the existence of plasma
waves with $\lambda < l_D$ is impossible: the damping factor for such waves
becomes larger than one.
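
To get a feeling for how quickly the decrement (33.33) falls off at long wavelengths, one can tabulate it. The sketch below is an illustration only. It assumes that the Debye length is $l_D = (kT/4\pi e^2n)^{1/2}$, so that $v_0 = \sqrt{2}\,\omega_0l_D$; this is the relation that makes the two forms of (33.33) coincide. The function names are hypothetical, and the formula is applied only in the region $kl_D \ll 1$ for which it was derived.

```python
import math


def decrement_from_velocity_ratio(x):
    """First form of (33.33): delta = sqrt(pi) * x^3 * exp(-x^2), x = omega0/(k v0)."""
    return math.sqrt(math.pi) * x**3 * math.exp(-x * x)


def decrement_from_debye(k_ld):
    """Second form of (33.33), written in terms of k * l_D."""
    return (math.sqrt(math.pi) / (2 * math.sqrt(2) * k_ld**3)
            * math.exp(-1 / (2 * k_ld**2)))


for k_ld in (0.1, 0.2, 0.3):
    # assumption: v0 = sqrt(2) * omega0 * l_D, i.e. omega0/(k v0) = 1/(sqrt(2) k l_D)
    x = 1 / (math.sqrt(2) * k_ld)
    print(k_ld, decrement_from_velocity_ratio(x), decrement_from_debye(k_ld))
```

The exponential factor dominates: reducing $kl_D$ from 0.3 to 0.1 lowers $\delta$ by many orders of magnitude.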

This fact shows once more that longitudinal waves in a plasma represent a 
collective effect associated with the Coulomb interaction of charged particles. 

Completely analogous calculations, based on the use of eq. (33.23) and of
the expression for $\langle I_0\rangle$, lead to the following expression for the frequency of
transverse waves

$$\omega^2 = \omega_0^2 + k^2c^2 > \omega_0^2. \tag{33.34}$$

Their phase velocity turns out to be larger than the velocity of light in
vacuum

$$c_{ph} = \frac{\omega}{k} = c\left(1 + \frac{\omega_0^2}{c^2k^2}\right)^{1/2}. \tag{33.35}$$


Hence in order to calculate the damping factor of transverse waves, one 
would have to take into account relativistic effects. However, it is clear that 
the damping effect is small, since the phase velocity of the waves is so large 
in comparison with the velocities of any electrons in the plasma that all the 
electrons can be considered to be at rest. 

We shall not dwell here on the problem of the behaviour of a plasma in an
external electromagnetic field, which is very important in practice. The
method of integrating the kinetic equation in the presence of an external
magnetic field does not differ in principle from that presented above, but
requires more cumbersome calculations.

For this reason we do not consider the problem of plasma waves taking 
into account the motion of the heavy ions. 


§34 KINETIC EQUATION FOR A PLASMA 179 
§34. The kinetic equation for a plasma taking into account collisions 


The consideration of a plasma in the self-consistent field approximation 
turns out to be inadequate for describing a number of plasma processes. 

Such processes are, first of all, relaxation processes (the establishment of 
a Maxwell distribution) as well as the levelling of the mean energies of elec- 
trons and ions. The self-consistent field approximation is also inadequate for 
calculating kinetic coefficients; coefficients of diffusion, of viscosity and so 
on. 

For a sufficiently rarefied plasma one may take into account only pair col- 
lisions and make use of the ordinary Boltzmann kinetic equation but with a 
modified collision integral. This modification is associated with the proper- 
ties of the Coulomb interaction. Since Coulomb interaction forces decrease 
slowly with increasing distance, the major contribution to the collision inte- 
gral is given by collisions in which the particles are scattered at small angles. 
Indeed, for the Coulomb interaction the cross section diverges for small scat- 
tering angles (see §43 of Part I and §86 of Part V). 

In small-angle scattering the change of momentum of the colliding particles 
is small. Hence it can be said that the main role is played by collisions in 
which a relatively small momentum transfer takes place. This can be used to 
transform the collision integral. 

It is convenient to introduce new variables into the expression for the col- 
lision integral 


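$$I = \int w(\mathbf{p}_3, \mathbf{p}_2; \mathbf{p}_1, \mathbf{p})\,[f(\mathbf{p}_2)f_1(\mathbf{p}_3) - f(\mathbf{p})f_1(\mathbf{p}_1)]\,d\mathbf{p}_1\,d\Omega. \tag{34.1}$$

We assume that the change in the momentum of the first particle is

$$\mathbf{p}_2 - \mathbf{p} = \mathbf{q}. \tag{34.2}$$

From the momentum conservation law it follows that

$$\mathbf{p}_3 - \mathbf{p}_1 = -\mathbf{q}.$$

We carry out a change of variables, setting

$$\mathbf{p}_2 \to \tfrac12(\mathbf{p}_2 + \mathbf{p}); \quad \mathbf{p}_3 \to \tfrac12(\mathbf{p}_3 + \mathbf{p}_1); \quad \mathbf{p}_1 \to \mathbf{p}_3 - \mathbf{p}_1; \quad \mathbf{p} \to \mathbf{p}_2 - \mathbf{p}.$$

Then, obviously,

$$w(\mathbf{p}_3, \mathbf{p}_2; \mathbf{p}_1, \mathbf{p}) \to w\left(\tfrac12(\mathbf{p}_3 + \mathbf{p}_1), \tfrac12(\mathbf{p}_2 + \mathbf{p}); \mathbf{p}_3 - \mathbf{p}_1, \mathbf{p}_2 - \mathbf{p}\right) = w\left(\mathbf{p} + \tfrac12\mathbf{q}, \mathbf{p}_1 - \tfrac12\mathbf{q}; -\mathbf{q}, \mathbf{q}\right).$$

Since the transition probability is invariant under the replacement $\mathbf{q} \to -\mathbf{q}$,
one can simply write, by virtue of the principle of detailed balance,

$$w(\mathbf{p}_3, \mathbf{p}_2; \mathbf{p}_1, \mathbf{p}) \to w\left(\mathbf{p} + \tfrac12\mathbf{q}, \mathbf{p}_1 - \tfrac12\mathbf{q}, \mathbf{q}\right).$$

For such a change of variables we have

$$f(\mathbf{p}_2)f_1(\mathbf{p}_3) = f(\mathbf{p} + \mathbf{q})f_1(\mathbf{p}_1 - \mathbf{q}). \tag{34.3}$$

Then the collision integral assumes the form

$$I = \int w\left(\mathbf{p} + \tfrac12\mathbf{q}, \mathbf{p}_1 - \tfrac12\mathbf{q}, \mathbf{q}\right)[f(\mathbf{p} + \mathbf{q})f_1(\mathbf{p}_1 - \mathbf{q}) - f(\mathbf{p})f_1(\mathbf{p}_1)]\,d\mathbf{p}_1\,d\Omega. \tag{34.4}$$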


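We now make use of the fact that the relative momentum change $\mathbf{q}$ in a collision
is small, and expand the distribution function $f$ and the transition probability
$w$ in series in powers of $\mathbf{q}$, retaining in the expansion terms no higher
than the second order of small quantities. We then obtain

$$f(\mathbf{p} + \mathbf{q}) \approx f(\mathbf{p}) + \frac{\partial f}{\partial p^i}q_i + \frac12\frac{\partial^2f}{\partial p^i\partial p^k}q_iq_k,$$

$$f_1(\mathbf{p}_1 - \mathbf{q}) \approx f_1(\mathbf{p}_1) - \frac{\partial f_1}{\partial p_1^i}q_i + \frac12\frac{\partial^2f_1}{\partial p_1^i\partial p_1^k}q_iq_k,$$

$$f(\mathbf{p} + \mathbf{q})f_1(\mathbf{p}_1 - \mathbf{q}) - f(\mathbf{p})f_1(\mathbf{p}_1) \approx \left(f_1\frac{\partial f}{\partial p^i} - f\frac{\partial f_1}{\partial p_1^i}\right)q_i + \frac12\left(f_1\frac{\partial^2f}{\partial p^i\partial p^k} + f\frac{\partial^2f_1}{\partial p_1^i\partial p_1^k} - 2\frac{\partial f}{\partial p^i}\frac{\partial f_1}{\partial p_1^k}\right)q_iq_k,$$

$$w\left(\mathbf{p} + \tfrac12\mathbf{q}, \mathbf{p}_1 - \tfrac12\mathbf{q}, \mathbf{q}\right) \approx w(\mathbf{p}, \mathbf{p}_1, \mathbf{q}) + \frac12\left(\frac{\partial w}{\partial p^i} - \frac{\partial w}{\partial p_1^i}\right)q_i.$$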


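Substituting these expressions into the collision integral, we find

$$I = \int w\left(f_1\frac{\partial f}{\partial p^i} - f\frac{\partial f_1}{\partial p_1^i}\right)q_i\,d\mathbf{p}_1\,d\Omega + \frac12\int\frac{\partial w}{\partial p^i}\left(f_1\frac{\partial f}{\partial p^k} - f\frac{\partial f_1}{\partial p_1^k}\right)q_iq_k\,d\mathbf{p}_1\,d\Omega -$$

$$- \frac12\int\frac{\partial w}{\partial p_1^i}\left(f_1\frac{\partial f}{\partial p^k} - f\frac{\partial f_1}{\partial p_1^k}\right)q_iq_k\,d\mathbf{p}_1\,d\Omega + \frac12\int w\left(f_1\frac{\partial^2f}{\partial p^i\partial p^k} + f\frac{\partial^2f_1}{\partial p_1^i\partial p_1^k} - 2\frac{\partial f}{\partial p^i}\frac{\partial f_1}{\partial p_1^k}\right)q_iq_k\,d\mathbf{p}_1\,d\Omega. \tag{34.5}$$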


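By virtue of the evenness of the function $w$ with respect to the change
$\mathbf{q} \to -\mathbf{q}$ and the oddness of the integrand the first integral reduces to zero.
The third integral can be transformed by integrating by parts:

$$\int\frac{\partial w}{\partial p_1^i}\left(f_1\frac{\partial f}{\partial p^k} - f\frac{\partial f_1}{\partial p_1^k}\right)q_iq_k\,d\mathbf{p}_1\,d\Omega = -\int w\left(\frac{\partial f_1}{\partial p_1^i}\frac{\partial f}{\partial p^k} - f\frac{\partial^2f_1}{\partial p_1^i\partial p_1^k}\right)q_iq_k\,d\mathbf{p}_1\,d\Omega. \tag{34.6}$$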


Since the distribution function decreases rapidly with increasing argument,
the integrated terms reduce to zero when $|\mathbf{p}_1| \to \infty$. Substituting (34.6) into
(34.5), we find for the collision integral

$$I = \frac12\int\left[\frac{\partial w}{\partial p^i}\left(f_1\frac{\partial f}{\partial p^k} - f\frac{\partial f_1}{\partial p_1^k}\right) + w\left(f_1\frac{\partial^2f}{\partial p^i\partial p^k} - \frac{\partial f}{\partial p^i}\frac{\partial f_1}{\partial p_1^k}\right)\right]q_iq_k\,d\mathbf{p}_1\,d\Omega =$$

$$= \frac12\frac{\partial}{\partial p^i}\int w\left(f_1\frac{\partial f}{\partial p^k} - f\frac{\partial f_1}{\partial p_1^k}\right)q_iq_k\,d\mathbf{p}_1\,d\Omega = -\frac{\partial j_i}{\partial p^i}, \tag{34.7}$$

where the following notation is introduced:

$$j_i = \frac12\int w\left(f\frac{\partial f_1}{\partial p_1^k} - f_1\frac{\partial f}{\partial p^k}\right)q_iq_k\,d\mathbf{p}_1\,d\Omega. \tag{34.8}$$


We have reduced the collision integral to the divergence of the vector $j_i$
representing a flow in momentum space. The meaning of this result will become
clear if the results of §10 are taken into account. When a variable, in
the given case the momentum, changes by small amounts, then the change in
the distribution function amounts to a flow in the corresponding space, in
our case in momentum space.

The kinetic equation assumes the form

$$\frac{\partial f}{\partial t} + \mathbf{v}\cdot\frac{\partial f}{\partial \mathbf{r}} + \mathbf{F}\cdot\frac{\partial f}{\partial \mathbf{p}} = -\frac{\partial j_i}{\partial p^i}. \tag{34.9}$$
This equation is called Landau’s kinetic equation. It is obvious that Landau’s 
kinetic equation is a particular case of the equation of slow processes (the 
Fokker—Planck equation). In the case given, the slow process is the momen- 
tum exchange between particles via the Coulomb interaction. 

A further simplification of the expression for momentum flow is obtained 
if integration over angles is carried out. For this we make use of the fact that 
the main role is played by distant collisions in which small-angle deflections 
occur. 

We introduce the tensor 


$$\alpha_{ik} = \frac12\int w\,q_iq_k\,d\Omega. \tag{34.10}$$

By means of this tensor the expression for $j_i$ can be written in the form

$$j_i = \int\alpha_{ik}\left(f\frac{\partial f_1}{\partial p_1^k} - f_1\frac{\partial f}{\partial p^k}\right)d\mathbf{p}_1. \tag{34.11}$$

In §43 of Part I we calculated the change in momentum for a small-angle
deflection. If the direction of relative velocity of two colliding particles is
chosen to be the $x$-axis, then we obviously have

$$q_x = 0, \qquad |\mathbf{q}| = \frac{2e_1e_2}{\rho v_{rel}}, \tag{34.12}$$


where $\rho$ is the impact parameter, and $e_1$, $e_2$ are the charges of the particles.
In the general case the vector $\mathbf{q}$ can be written in the form

$$\mathbf{q} = \frac{2e_1e_2}{\rho^2v_{rel}}\,\boldsymbol{\rho}. \tag{34.13}$$

Here it is obvious that the vector $\boldsymbol{\rho}$ is perpendicular to the relative velocity
vector $\mathbf{v}_{rel}$. Using (34.13), we write (34.10) as follows:

$$\alpha_{ik} = \frac{2(e_1e_2)^2}{v_{rel}^2}\int\frac{w\,\rho_i\rho_k}{\rho^4}\,d\Omega. \tag{34.14}$$
V el p 


The probability $w$ of a collision with scattering into angle $d\Omega$ can be written
in the form

$$w\,d\Omega = dw = v_{rel}\,\rho\,d\rho\,d\varphi, \tag{34.15}$$

where $\varphi$ is the azimuthal scattering angle (the angle specifying the direction of
$\boldsymbol{\rho}$ in the plane perpendicular to the vector $\mathbf{v}_{rel}$).
Taking into account this value of $w$ and (34.14), we obtain

$$\alpha_{ik} = \frac{2(e_1e_2)^2}{v_{rel}}\int\frac{\rho_i\rho_k}{\rho^3}\,d\rho\,d\varphi.$$

In the coordinate system which we have chosen it is obvious that one can
write

$$\rho_x = 0; \quad \rho_y = \rho\sin\varphi; \quad \rho_z = \rho\cos\varphi,$$

so that for the components $\alpha_{ik}$ we finally get

$$\alpha_{xx} = \alpha_{xy} = \alpha_{xz} = \alpha_{yz} = 0, \qquad \alpha_{yy} = \alpha_{zz} = \frac{2\pi(e_1e_2)^2}{v_{rel}}\int\frac{d\rho}{\rho}. \tag{34.16}$$

The integral over impact parameters diverges logarithmically at the upper as
well as the lower limit. The limits of integration (the values of the parameters
$\rho_{max}$ and $\rho_{min}$) can be determined from the following consideration. When $\rho$
exceeds the Debye length $l_D$, charged particles are screened and essentially do
not interact. Hence the upper limit can be set as $\rho_{max} \approx l_D$.





The lower limit is determined from the condition that scattering angles be
not too large. Namely, if the kinetic energy $\tfrac12\mu v_{rel}^2$ is large in comparison with
the potential energy $e_1e_2/\rho$, then the deflections are relatively small.

The limit of the region of small deflections is defined by the condition

$$\tfrac12\mu v_{rel}^2 \approx e_1e_2/\rho_{min}$$

or

$$\rho_{min} \approx 2e_1e_2/\mu v_{rel}^2.$$

Thus, finally,

$$\alpha_{yy} = \alpha_{zz} = \frac{2\pi(e_1e_2)^2}{v_{rel}}\ln\frac{\mu v_{rel}^2\,l_D}{2e_1e_2}.$$

It should be noted that the quantity whose logarithm is to be taken is very
large, so that the numerical value of the logarithm itself is not very sensitive
to the definition of the parameters $\rho_{max}$ and $\rho_{min}$.
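
The weak sensitivity of the logarithm to the cut-offs is easy to see on a numerical example. The sketch below is illustrative only and rests on several assumptions that are not part of the text: Gaussian units, a Debye length $l_D = (kT/4\pi e^2n)^{1/2}$, the rough estimate $\mu v_{rel}^2 \approx 3kT$ in the expression for $\rho_{min}$, and hypothetical plasma parameters.

```python
import math

E_CHARGE = 4.80320e-10   # electron charge, statcoulomb (Gaussian units)
K_B = 1.380649e-16       # Boltzmann constant, erg/K


def debye_length(n, temperature):
    """Assumed form l_D = sqrt(kT / (4 pi e^2 n)); n in cm^-3, T in K."""
    return math.sqrt(K_B * temperature / (4 * math.pi * n * E_CHARGE**2))


def rho_min(temperature, z=1):
    """rho_min = 2 e1 e2 / (mu v_rel^2), with the rough estimate mu v_rel^2 ~ 3kT."""
    return 2 * z * E_CHARGE**2 / (3 * K_B * temperature)


def coulomb_logarithm(n, temperature, z=1):
    """ln(rho_max / rho_min) with rho_max taken equal to the Debye length."""
    return math.log(debye_length(n, temperature) / rho_min(temperature, z))


# hypothetical plasma: n = 1e14 cm^-3, T = 1e6 K
print(coulomb_logarithm(1e14, 1e6))
# doubling rho_max (here by lowering n by a factor 4) adds only ln 2
print(coulomb_logarithm(0.25e14, 1e6) - coulomb_logarithm(1e14, 1e6))
```

For these parameters the logarithm is of order ten, so that a change of the cut-off by a factor of two alters it by only a few per cent.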

In an arbitrary coordinate system, $\alpha_{ik}$ is written in the form

$$\alpha_{ik} = 2\pi(e_1e_2)^2\,\frac{v_{rel}^2\delta_{ik} - v_{rel}^iv_{rel}^k}{v_{rel}^3}\,\ln\frac{\rho_{max}}{\rho_{min}}. \tag{34.17}$$
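
The tensor structure of (34.17) can be verified directly. The sketch below, given purely as an illustration with hypothetical function names, builds the velocity-dependent factor $(v_{rel}^2\delta_{ik} - v_{rel}^iv_{rel}^k)/v_{rel}^3$ and checks two properties: its contraction with $\mathbf{v}_{rel}$ vanishes, and for $\mathbf{v}_{rel}$ directed along the $x$-axis it reduces to the diagonal form of (34.16).

```python
def landau_tensor(v):
    """(v^2 delta_ik - v_i v_k) / v^3, the velocity-dependent factor of (34.17)."""
    v2 = sum(c * c for c in v)
    v3 = v2 ** 1.5
    return [[((v2 if i == k else 0.0) - v[i] * v[k]) / v3 for k in range(3)]
            for i in range(3)]


def contract(a, v):
    """Contraction a_ik v_k over the second index."""
    return [sum(a[i][k] * v[k] for k in range(3)) for i in range(3)]


v_rel = [1.0, -2.0, 0.5]
# all three components vanish up to rounding error
print(contract(landau_tensor(v_rel), v_rel))
# for v along x: the xx entry is zero, the yy and zz entries equal 1/|v|
print(landau_tensor([2.0, 0.0, 0.0]))
```

The vanishing contraction expresses the fact that small-angle collisions change only the direction of the relative momentum, not its component along $\mathbf{v}_{rel}$.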


We finally obtain

$$j_i = -2\pi(e_1e_2)^2\ln\frac{\rho_{max}}{\rho_{min}}\int\frac{v_{rel}^2\delta_{ik} - v_{rel}^iv_{rel}^k}{v_{rel}^3}\left(f_1\frac{\partial f}{\partial p^k} - f\frac{\partial f_1}{\partial p_1^k}\right)d\mathbf{p}_1. \tag{34.18}$$

In plasmas which consist of particles of several kinds, the Landau equations
should be written for the distribution function of the particles of each
kind. We note, first of all, that for an equilibrium state the Landau equation
allows a solution in the form of a Maxwell distribution

$$f = \text{const}\cdot e^{-p^2/2mkT}. \tag{34.19}$$


§35 EQUILIBRIUM IN AN ELECTRON-—ION PLASMA 185 


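Indeed, on substituting (34.19) into the Landau equation its left-hand side
reduces to zero. On the right-hand side we have

$$\int\frac{v_{rel}^2\delta_{ik} - v_{rel}^iv_{rel}^k}{v_{rel}^3}\left(f\frac{\partial f_1}{\partial p_1^k} - f_1\frac{\partial f}{\partial p^k}\right)d\mathbf{p}_1 = \int\frac{v_{rel}^2\delta_{ik} - v_{rel}^iv_{rel}^k}{v_{rel}^3}\left(\frac{p^k}{mkT} - \frac{p_1^k}{m_1kT}\right)ff_1\,d\mathbf{p}_1 =$$

$$= \frac{1}{kT}\int\frac{v_{rel}^2\delta_{ik} - v_{rel}^iv_{rel}^k}{v_{rel}^3}\,v_{rel}^k\,ff_1\,d\mathbf{p}_1.$$

However, it is clear that

$$\alpha_{ik}v_{rel}^k = 0.$$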

Thus the solution of the Landau equation for the case of equilibrium is a 
Maxwell distribution. It can be shown that the H-theorem follows from 
Landau’s equation. Hence Maxwell’s distribution is the only distribution es- 
tablished in an equilibrium plasma. Finally, the equations of plasma hydro- 
dynamics follow from Landau’s equation, and the corresponding kinetic 
coefficients can be found from it. However, we cannot dwell on these rather 
cumbersome calculations. 


§35. The establishment of equilibrium in an electron—ion plasma 


Owing to the substantial difference between the masses of ions and elec- 
trons, an electron—ion plasma represents a classical example of a system 
which can be in a state of incomplete equilibrium. 

Collisions between ions lead to the establishment of an equilibrium Maxwell
distribution with a certain temperature $T^{(i)}$ in time $\tau_i$. Analogously, in
time $\tau_e$ an equilibrium electron distribution will be established which, according
to what was said in the preceding section, is also a Maxwell distribution.
However, the value of electron temperature $T^{(e)}$ will differ from that of ion
temperature $T^{(i)}$, namely $T^{(e)} > T^{(i)}$.

A plasma with different ion and electron temperatures is in a state of
incomplete equilibrium. After the lapse of a time $\tau \gg \tau_i, \tau_e$, total equilibrium
with a common temperature for the two kinds of particles will be established
in the plasma. This equilibrium is established by means of energy transfer




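from the electrons to the ions. The relaxation time $\tau$ is defined by the
equality

$$\frac{d\varepsilon^{(i)}}{dt} = Q^{(ei)}, \tag{35.1}$$

where $\varepsilon^{(i)} = \tfrac32kT^{(i)}$ is the mean ion energy, and $Q^{(ei)}$ is the energy flow from
electrons to ions.

Obviously, we have

$$\frac{d\varepsilon^{(i)}}{dt} = \int\varepsilon^{(i)}\frac{\partial f^{(i)}}{\partial t}\,d\mathbf{p}^{(i)},$$

since the ion distribution function $f^{(i)}$ does not depend on coordinates. According
to (34.9) the distribution function satisfies the kinetic equation

$$\frac{\partial f^{(i)}}{\partial t} = -\frac{\partial j_k^{(ei)}}{\partial p_k^{(i)}}. \tag{35.2}$$

The collision integral $I_{ii} = 0$, since there is equilibrium between the ions.
Substituting (35.2) into (35.1) and integrating by parts, we have

$$\frac{d\varepsilon^{(i)}}{dt} = -\int\varepsilon^{(i)}\frac{\partial j_k^{(ei)}}{\partial p_k^{(i)}}\,d\mathbf{p}^{(i)} = \int v_k^{(i)}j_k^{(ei)}\,d\mathbf{p}^{(i)}. \tag{35.3}$$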


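According to (34.11), the momentum flow from the electron component
to the ion component can be written in the form

$$j_k^{(ei)} = \int\alpha_{kj}\left(f^{(i)}\frac{\partial f^{(e)}}{\partial p_j^{(e)}} - f^{(e)}\frac{\partial f^{(i)}}{\partial p_j^{(i)}}\right)d\mathbf{p}^{(e)}. \tag{35.4}$$

Since $f^{(e)}$ and $f^{(i)}$ are equilibrium distribution functions with temperatures
$T^{(e)}$ and $T^{(i)}$, we find

$$j_k^{(ei)} = \int\alpha_{kj}\,f^{(i)}f^{(e)}\left(\frac{v_j^{(i)}}{kT^{(i)}} - \frac{v_j^{(e)}}{kT^{(e)}}\right)d\mathbf{p}^{(e)}. \tag{35.5}$$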


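We introduce the relative velocity

$$v_j^{rel} = v_j^{(e)} - v_j^{(i)}.$$

Then

$$j_k^{(ei)} = \int\alpha_{kj}\,f^{(i)}f^{(e)}\left[v_j^{(i)}\left(\frac{1}{kT^{(i)}} - \frac{1}{kT^{(e)}}\right) - \frac{v_j^{rel}}{kT^{(e)}}\right]d\mathbf{p}^{(e)}.$$

However, we have

$$\alpha_{kj}v_j^{rel} \propto \left(v_{rel}^2\delta_{kj} - v_k^{rel}v_j^{rel}\right)v_j^{rel} = 0.$$

Hence we find

$$j_k^{(ei)} = \frac{2\pi e^4Z^2}{k}\ln\frac{\rho_{max}}{\rho_{min}}\left(\frac{1}{T^{(i)}} - \frac{1}{T^{(e)}}\right)\int f^{(i)}f^{(e)}\,\frac{v_{rel}^2\delta_{kj} - v_k^{rel}v_j^{rel}}{v_{rel}^3}\,v_j^{(i)}\,d\mathbf{p}^{(e)}.$$

Since the velocity of the electrons is large in comparison with the velocity of
the ions, we can set

$$v^{rel} \approx v^{(e)}.$$

And finally,

$$j_k^{(ei)} = \frac{2\pi e^4Z^2}{k}\,\frac{T^{(e)} - T^{(i)}}{T^{(i)}T^{(e)}}\,\ln\frac{\rho_{max}}{\rho_{min}}\int f^{(i)}f^{(e)}\,\frac{(v^{(e)})^2\delta_{kj} - v_k^{(e)}v_j^{(e)}}{(v^{(e)})^3}\,v_j^{(i)}\,d\mathbf{p}^{(e)}. \tag{35.6}$$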
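Substituting $j_k^{(ei)}$ from (35.6) into (35.3), we obtain

$$\frac{d\varepsilon^{(i)}}{dt} = \int v_k^{(i)}j_k^{(ei)}\,d\mathbf{p}^{(i)} = \frac{2\pi e^4Z^2}{k}\,\frac{T^{(e)} - T^{(i)}}{T^{(i)}T^{(e)}}\,\ln\frac{\rho_{max}}{\rho_{min}}\iint f^{(i)}f^{(e)}\,\frac{(v^{(e)})^2\delta_{kj} - v_k^{(e)}v_j^{(e)}}{(v^{(e)})^3}\,v_k^{(i)}v_j^{(i)}\,d\mathbf{p}^{(i)}\,d\mathbf{p}^{(e)}. \tag{35.7}$$

Obviously, we have

$$v_k^{(i)}v_j^{(i)}(v^{(e)})^2\delta_{kj} = (v^{(i)})^2(v^{(e)})^2.$$

Furthermore, in view of the spherical symmetry of $f^{(i)}$ and $f^{(e)}$, the terms
$v_k^{(i)}v_j^{(i)}v_k^{(e)}v_j^{(e)}$ give zero on integration for $k \neq j$.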


Hence, finally, 





c0 472 : 2 3 - 
CEO ZA TOE TO nen f COP 10) O ap© ap , 
dt KLE OTOU min» yoo ° 
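Subsequent calculations reduce to a simple averaging over the Maxwell ion
and electron distributions:

$$\left\langle(v^{(i)})^2\right\rangle = \frac{3kT^{(i)}}{m^{(i)}}, \qquad \left\langle\frac{1}{v^{(e)}}\right\rangle = \left(\frac{2m^{(e)}}{\pi kT^{(e)}}\right)^{1/2}.$$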


Thus for $d\varepsilon^{(i)}/dt$ we get





de® _ N2e4Z2, Pmax TO-TY 3kT® | mO \F 
dt" k Emn TOTO mO (ears) 


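or

$$\frac{dT^{(i)}}{dt} = \frac{T^{(e)} - T^{(i)}}{\tau},$$

where the relaxation time is

$$\tau = \frac{3m^{(i)}(kT^{(e)})^{3/2}}{4(2\pi)^{1/2}(m^{(e)})^{1/2}NZ^2e^4\ln(\rho_{max}/\rho_{min})}.$$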
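
The equation $dT^{(i)}/dt = (T^{(e)} - T^{(i)})/\tau$ describes an exponential approach of the ion temperature to the electron temperature. The sketch below illustrates this under simplifying assumptions that are not made in the text: $T^{(e)}$ and $\tau$ are held constant (in reality $\tau$ depends on $T^{(e)}$, and the electrons cool as the ions heat up), and all numerical values are hypothetical.

```python
import math


def ion_temperature(t, t_ion0, t_electron, tau):
    """Analytic solution of dT_i/dt = (T_e - T_i)/tau with T_e and tau held fixed."""
    return t_electron + (t_ion0 - t_electron) * math.exp(-t / tau)


def ion_temperature_euler(t, t_ion0, t_electron, tau, steps=100000):
    """The same equation integrated with explicit Euler steps, as a cross-check."""
    dt = t / steps
    temp = t_ion0
    for _ in range(steps):
        temp += dt * (t_electron - temp) / tau
    return temp


# hypothetical values: T_i(0) = 1e3 K, T_e = 1e4 K, time measured in units of tau
print(ion_temperature(2.0, 1e3, 1e4, 1.0))
print(ion_temperature_euler(2.0, 1e3, 1e4, 1.0))
```

After a time of a few $\tau$ the temperature difference has decreased by the corresponding power of $e$, which is the sense in which $\tau$ is called the relaxation time.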





It is useful to compare this time with the relaxation time of each of the
plasma components. The calculation of the latter involves great difficulties,
but a rough estimate can be found from the following considerations. Setting
$f = f_0 + f'$ in the kinetic equation, we have


Ap ie 


of Tei) Tei 










where in order of magnitude 





Urs a Das afo 4 
Te m3(KT)? dp? Yrel 


Thus 
T 2 
© ~ m(kT©)2 s 
O ~ mip rż 
T me(kT™) . 
The ratio of the electron relaxation time to the ion relaxation time is 


7® m® 


i.e. is indeed very small. 





The Time Correlation Function Method 
and Onsager’s Theory 


§36. The response of a system to a dynamic perturbation. 
Classical calculation 


So far we have discussed only one approach to solving kinetic problems; 
the method of the kinetic equation. The complexity of the basic kinetic equa- 
tion makes it necessary to pass over to Boltzmann’s equation for the one- 
particle distribution function for the actual solving of kinetic problems. 

As we have seen in the examples given and as will be particularly clear in 
the chapter devoted to solid-state theory, Boltzmann’s equation is a powerful 
method of investigating non-equilibrium processes. However, it allows one to 
obtain concrete results only for a limited class of systems. 

Another method of solving kinetic problems has recently been developed. 
In this method it has been possible to formulate physical kinetics in the same 
way as statistical physics. 

Let us consider a macroscopic system in a state of statistical equilibrium. 
The equilibrium properties of this system are described by the equilibrium 
density matrix or, in the classical approximation, by the Gibbs distribution. 
We now suppose that a small perturbation is switched on at the instant of
time $t \to -\infty$.

In principle there are two different classes of perturbation. The first of
these is associated with the application of an external force field to the
system; for example a time-dependent electric or magnetic field. We shall call
such perturbations dynamic. When a dynamic perturbation is applied, the
total Hamiltonian can be written in the form

$$H = H_0 + H'(t),$$

where $H'(t)$ describes that part of the Hamiltonian which is associated with
the action of the external force. Thus dynamic perturbations are of a microscopic
nature. They change the Hamiltonian of each particle of the system.

Perturbations of another class, often called thermal perturbations, are of 
macroscopic character and make sense only with respect to the system as a 
whole or to a macroscopic part of it. For example, when thermal or diffusion 
contact is established between bodies having different temperatures or differ- 
ent compositions, the states of each of the bodies undergo perturbation. 
However, such a perturbation cannot be related to the change of the Hamil- 
tonian of the individual particles. 

We shall subsequently discuss the action of dynamic and thermal pertur- 
bations on an equilibrium macroscopic system, restricting ourselves first to 
the quasi-classical approximation and then carrying out the quantum calcu- 
lation. 

It turns out that all kinetic coefficients, and consequently, also all trans- 
port coefficients, can be expressed in terms of one and the same quantity, the 
time correlation function. Thus in physical kinetics the time correlation func- 
tion plays the same role as that played by the partition function in statistical 
physics. 

Therefore let us consider a classical quasi-closed system characterized by
the Gibbs distribution function

$$\rho_0 = \frac{1}{Z}\,e^{-\beta H_0} \tag{36.1}$$

in an equilibrium state.
The perturbed Hamiltonian function is written in the form

$$H = H_0 + H'(t),$$

where $H'(t) \ll H_0$. To simplify the calculations, we set

$$H' = -A[p(t), q(t)]\,\delta(t).$$







192 TIME CORRELATION FUNCTION METHOD Ch. 4 
This means that at the instant of time $t = 0$ the system is acted upon by a
certain impulse, whereas at $t < 0$ and $t > 0$ the system is not influenced by
any external action. We shall find the changes caused in the system by a small
dynamic perturbation.
The variation of the distribution function in time is given by the general
formula (15.2)

$$\frac{\partial\rho}{\partial t} = \{H; \rho\}. \tag{36.2}$$

Assuming that the perturbed distribution function can be written in the form

$$\rho = \rho_0 + \rho', \tag{36.3}$$

where $\rho'$ is a small addition due to the perturbation ($\rho' \ll \rho_0$), we find

$$\frac{\partial\rho'}{\partial t} = \{H'; \rho_0\} + \{H_0; \rho'\} = -\{A(p, q); \rho_0\}\,\delta(t) + \{H_0; \rho'\} \tag{36.4}$$

or

$$\frac{d\rho'}{dt} = -\{A(p, q); \rho_0\}\,\delta(t).$$

Integrating with respect to time, we obtain

$$\rho' = \rho'(-\infty) - \{A(0); \rho_0\} = -\{A(0); \rho_0\}, \tag{36.5}$$

where $A(0)$ denotes the expression $A[p(t), q(t)]$ in which the values of the
coordinates and momenta of the particles at the instant of time $t = 0$ are
taken.

By definition of the Poisson bracket and by virtue of (36.1), it is easy to
find

$$\{A(0); \rho_0\} = \sum\left(\frac{\partial A}{\partial q}\frac{\partial\rho_0}{\partial p} - \frac{\partial A}{\partial p}\frac{\partial\rho_0}{\partial q}\right) = -\frac{\rho_0}{kT}\sum\left(\frac{\partial A}{\partial q}\frac{\partial H_0}{\partial p} - \frac{\partial A}{\partial p}\frac{\partial H_0}{\partial q}\right).$$

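But for any mechanical quantity depending on coordinates and momenta one
can write

$$\dot A = \frac{dA}{dt} = \sum\left(\frac{\partial A}{\partial q}\frac{dq}{dt} + \frac{\partial A}{\partial p}\frac{dp}{dt}\right) = \sum\left(\frac{\partial A}{\partial q}\frac{\partial H_0}{\partial p} - \frac{\partial A}{\partial p}\frac{\partial H_0}{\partial q}\right).$$

Hence we obtain finally

$$\rho' = \dot A(0)\,\rho_0/kT. \tag{36.6}$$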


Knowing the change in the distribution function, one can find the change in
the mean value of any quantity, $B(t)$, describing a macroscopic subsystem,
caused by the perturbation

$$\langle\Delta B(t)\rangle = \int[B(t)(\rho_0 + \rho') - B(t)\rho_0]\,d\Gamma = \frac{1}{kT}\int\dot A(0)B(t)\rho_0\,d\Gamma = \frac{1}{kT}\langle\dot A(0)B(t)\rangle. \tag{36.7}$$

Formula (36.7) defines the change in the mean value of an arbitrary quantity
$B$ under the action of unit impulse. We denote this change by $\varphi_{BA}$ and
call it the response of the system

$$\varphi_{BA}(t) = \frac{1}{kT}\langle\dot A(0)B(t)\rangle. \tag{36.8}$$
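
Formula (36.8) can be tried out on the simplest system, a one-dimensional harmonic oscillator with $A = B = q$. The sketch below is an illustration, not part of the text: the oscillator, its parameters and the function name are hypothetical. It averages $\dot A(0)B(t) = (p/m)\,q(t)$ over the Gibbs distribution on a grid of initial conditions, using the unperturbed trajectory $q(t) = q\cos\omega t + (p/m\omega)\sin\omega t$, and compares the result with $\sin\omega t/m\omega$, the familiar displacement of an oscillator after a unit impulse.

```python
import math

M, OMEGA, KT = 1.0, 2.0, 0.5   # hypothetical mass, frequency and temperature kT


def response_qq(t, n=200):
    """phi_BA(t) of (36.8) for A = B = q, by midpoint quadrature over (q, p)."""
    sq = math.sqrt(KT / (M * OMEGA**2))   # thermal spread of q
    sp = math.sqrt(M * KT)                # thermal spread of p
    c, s = math.cos(OMEGA * t), math.sin(OMEGA * t)
    num = norm = 0.0
    for a in range(n):
        q = (-8 + 16 * (a + 0.5) / n) * sq
        for b in range(n):
            p = (-8 + 16 * (b + 0.5) / n) * sp
            weight = math.exp(-(p * p / (2 * M) + M * OMEGA**2 * q * q / 2) / KT)
            q_t = q * c + p / (M * OMEGA) * s   # unperturbed trajectory q(t)
            num += weight * (p / M) * q_t       # dA/dt at t = 0 is p/m for A = q
            norm += weight
    return num / norm / KT


t = 0.4
print(response_qq(t), math.sin(OMEGA * t) / (M * OMEGA))
```

The temperature drops out of the result, as it should for a linear system: the factor $1/kT$ in (36.8) is compensated by $\langle p^2\rangle = mkT$.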
Then (36.7) can be written in the form

$$\langle\Delta B\rangle = \int\varphi_{BA}(t - t')\,\delta(t')\,dt'. \tag{36.9}$$


Let us now consider the very general case where the perturbation acting on
the system is of the form

$$H' = -A[p(t), q(t)]\,F(t), \tag{36.10}$$

where $F(t)$ is a certain given function of time. In the linear approximation,
i.e. when the perturbation is small, the change in mean values under the action
of the perturbation (36.10) can be considered as a superimposition of
impulse perturbations, and instead of (36.9) one can write

$$\langle\Delta B\rangle = \int_{-\infty}^{t}\varphi_{BA}(t - t')\,F(t')\,dt', \tag{36.11}$$

where the response $\varphi_{BA}$, as before, is given by formula (36.8), so that

$$\langle\Delta B\rangle = \frac{1}{kT}\int_{-\infty}^{t}\langle\dot A(0)B(t - t')\rangle\,F(t')\,dt'. \tag{36.12}$$

However, it is necessary that $F(t)$ satisfy the very general requirement

$$F(t \to -\infty) \to 0, \tag{36.13}$$

which means that before switching on the perturbation the system was in an
equilibrium state.
An important particular case is that of a periodic perturbation. In order to
satisfy condition (36.13), it can be assumed that

$$F(t) = \operatorname{Re}\lim_{\varepsilon\to 0}e^{\varepsilon t + i\omega t}. \tag{36.14}$$

Formula (36.14) defines a function which is practically periodic for all values
of time $t$ except for $t \to -\infty$, when it reduces to zero. Then from (36.11) we
obtain

$$\langle\Delta B\rangle = \operatorname{Re}\chi_{BA}\,e^{i\omega t}, \tag{36.15}$$

where

$$\chi_{BA} = \lim_{\varepsilon\to 0}\int_0^{\infty}e^{-\varepsilon t - i\omega t}\,\varphi_{BA}(t)\,dt. \tag{36.16}$$


This last quantity, representing the Fourier component of the response
$\varphi_{BA}$, is called the generalized complex susceptibility. We shall see below that
in the case of the action of an electromagnetic field, this definition of sus- 
ceptibility is the same as that given in electrodynamics. The relations ob- 
tained have the same degree of generality as the relations of classical statis- 
tical physics. For small departures of the quasi-closed system from an equili- 
brium state, formula (36.6) defines the change in its distribution function, 
and formula (36.12) defines the corresponding change in the mean values. 
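
The connection between the convolution (36.11) and the susceptibility (36.16) can be illustrated on a model response. In the sketch below the response function $\varphi_{BA}(t) = e^{-t/\tau}$ is a hypothetical choice made only for the example, as are the function names and the numerical values; for it (36.16) gives $\chi_{BA} = \tau/(1 + i\omega\tau)$. The code evaluates (36.11) for $F(t') = \cos\omega t'$ and compares the result with $\operatorname{Re}\chi_{BA}e^{i\omega t}$ from (36.15).

```python
import cmath
import math

TAU = 1.0   # hypothetical relaxation time of the model response


def simpson(f, a, b, n=20000):
    """Composite Simpson rule; works for complex-valued integrands (n even)."""
    h = (b - a) / n
    total = f(a) + f(b)
    for j in range(1, n):
        total += (4 if j % 2 else 2) * f(a + j * h)
    return total * h / 3


def phi(t):
    """Hypothetical response function phi_BA(t) = exp(-t/TAU) for t >= 0."""
    return math.exp(-t / TAU)


def susceptibility(omega):
    """chi_BA(omega) from (36.16); the upper limit is cut off at 40*TAU."""
    return simpson(lambda t: cmath.exp(-1j * omega * t) * phi(t), 0.0, 40.0 * TAU)


def delta_b(t, omega):
    """<Delta B>(t) from (36.11) for F(t') = cos(omega t')."""
    # substitution s = t - t': the lower limit t' -> -infinity becomes s -> infinity
    return simpson(lambda s: phi(s) * math.cos(omega * (t - s)), 0.0, 40.0 * TAU)


omega, t = 2.0, 0.7
chi = susceptibility(omega)
print(chi, TAU / (1 + 1j * omega * TAU))
print(delta_b(t, omega), (chi * cmath.exp(1j * omega * t)).real)
```

The imaginary part of $\chi_{BA}$ describes the phase lag of the response behind the driving force.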

Relations of the type (36.11) are accurate in the sense that they do not
depend on the actual physical properties of the non-equilibrium system
considered. Thus formula (36.12) defines the deviation of the mean values of
quantities characterizing the system from their equilibrium values when the
system is acted upon by dynamic perturbations. It turns out that the quantity
characterizing the response of the system to a dynamic perturbation is the
correlation function $\varphi_{BA} = (1/kT)\langle\dot A(0)B(t)\rangle$. Since the averaging in $\varphi_{BA}$ is
carried out over the states of the equilibrium system, formula (36.12) allows
one to find the mean values for the non-equilibrium system from the
characteristics of the equilibrium system. The correlation function or, more precisely,
the response function for the non-equilibrium systems considered plays the
same role as the distribution function for the equilibrium systems.

However, it should be kept in mind that the distribution function is of 
universal character. On the contrary, the response function depends on the 
nature of the perturbation (the quantity A(0)). 

Before going on to obtain formula (36.12) in quantum statistics and to a
discussion of the applications of the general theory, we shall make one more
remark on the essence of formula (36.12). 

Formula (36.12), as well as formulae for the calculation of mean values in
statistics, makes sense only for sufficiently large systems ($N \to \infty$, $V \to \infty$,
finite $N/V$). Furthermore, it is necessary to consider the time $t$ to be
arbitrarily large, i.e. to know the response of the system to a perturbation a
sufficiently long time after switching on the perturbation.

The function $\varphi_{BA}$ is invariant under the replacement $t \to -t$:


$$\varphi_{BA} = \frac{1}{kT}\langle \dot{A}(0) B(t) \rangle = \frac{1}{kT}\langle \dot{A}(0) B(-t) \rangle,$$


since $B(t) = B(-t)$. On the contrary, $\langle \Delta B \rangle$ for sufficiently long times is not
invariant under this replacement. This is obvious from simple physical
reasoning. If $t^*$ is the time lapse after switching on the perturbation, and $t > t^*$,
then $\langle \Delta B(-t) \rangle$ is the response of the system to a perturbation which has not
yet acted on it! Thus the expression for $\langle \Delta B \rangle$ turns out to be irreversible for
sufficiently long times and for sufficiently large systems.

The relations obtained can be rewritten in a more convenient form, if use 
is made of the condition 


$$\frac{d}{dt'}\langle A(t') B(t'+t) \rangle = \langle \dot{A}(0) B(t) \rangle + \langle A(0) \dot{B}(t) \rangle = 0, \tag{36.17}$$


which means that the mean equilibrium rate of change of correlation of two 
arbitrary quantities is equal to zero. This condition expresses the stationary 
character of the processes considered. 




Taking into account (36.17), one can write 
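$$\varphi_{BA}(t) = \frac{1}{kT}\langle \dot{A}(0) B(t) \rangle = -\frac{1}{kT}\langle A(0) \dot{B}(t) \rangle \tag{36.18}$$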


and, correspondingly, 
$$\langle \Delta B \rangle = -\frac{1}{kT}\int_0^t \langle A(0) \dot{B}(t-t') \rangle F(t')\, dt'. \tag{36.19}$$


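We Fourier transform formula (36.19). We write


$$F(\omega) = \int_0^\infty F(t)\, e^{-i\omega t}\, dt,$$

$$I(\omega) = \int_0^\infty \langle \Delta B \rangle\, e^{-i\omega t}\, dt = -\frac{1}{kT}\int_0^\infty e^{-i\omega t}\, dt \int_0^t \langle A(0) \dot{B}(t-t') \rangle F(t')\, dt' =$$

$$= -\frac{1}{kT}\int_0^\infty\!\!\int_0^\infty du\, \langle A(0) \dot{B}(u) \rangle F(t-u)\, e^{-i\omega(t-u)}\, e^{-i\omega u}\, dt,$$

where $F(t) = 0$ for $t < 0$.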
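Then (36.19) can be written as follows
$$I(\omega) = \gamma(\omega) F(\omega), \tag{36.20}$$


where $\gamma(\omega)$ is of the form


$$\gamma(\omega) = -\frac{1}{kT}\int_0^\infty e^{-i\omega u} \langle A(0) \dot{B}(u) \rangle\, du. \tag{36.21}$$
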

Formula (36.20) defines the generalized transport coefficient relating the
Fourier components of the ‘force’ $F$ to the change in the quantity $B$ caused
by it.

It is natural to call $I(\omega)$ the generalized flux corresponding to the force
$F(\omega)$. The mean flux $I(\omega)$ turns out to be connected with the ‘force’ $F(\omega)$
by a relation which is a generalization of empirical relations well-known from 
electrodynamics, i.e. Ohm’s law, relations for the electric and magnetic sus- 
ceptibility, and so on. 
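
The passage from the time-domain convolution to a product of Fourier components can be checked on a simple model. The following sketch assumes, purely for illustration, an exponential kernel standing in for the response function and an exponentially decaying force switched on at $t = 0$; the decay rates `a`, `b` and the frequency `omega` are hypothetical.

```python
import numpy as np

def one_sided_ft(f_vals, t, omega):
    """int_0^T f(t) exp(-i*omega*t) dt by the trapezoidal rule (uniform grid)."""
    g = f_vals * np.exp(-1j * omega * t)
    dt = t[1] - t[0]
    return dt * (g.sum() - 0.5 * (g[0] + g[-1]))

a, b = 1.5, 0.4                      # hypothetical decay rates
t = np.linspace(0.0, 80.0, 800_001)
kernel = np.exp(-a * t)              # stands in for the response function
force = np.exp(-b * t)               # force acting for t >= 0 only
# causal convolution int_0^t kernel(t - t') force(t') dt', known in closed form
response = (np.exp(-b * t) - np.exp(-a * t)) / (a - b)

omega = 0.9
lhs = one_sided_ft(response, t, omega)                              # transform of the response
rhs = one_sided_ft(kernel, t, omega) * one_sided_ft(force, t, omega)  # coefficient times force
```

Within the quadrature error, `lhs` and `rhs` coincide, which is the content of a relation of the type $I(\omega) = \gamma(\omega) F(\omega)$ for a causal linear response.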

Formula (36.21) can be generalized to the case of vector forces or several 
forces acting on the system. Instead of (36.21), one can at once write the 
equality 


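$$I_i(\omega) = \gamma_{ik}(\omega) F_k(\omega),$$


where $i, k$ run over the corresponding sequence of values (for example
$k = x, y, z$ or $k = 1, 2, 3, \ldots$). The tensor $\gamma_{ik}$ is called the tensor of transport
coefficients


$$\gamma_{ik}(\omega) = -\frac{1}{kT}\int_0^\infty e^{-i\omega u} \langle A_k(0) \dot{B}_i(u) \rangle\, du.$$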


§37. The response of a system to an external dynamic perturbation. Quantum 
calculation 


Let us now consider the response of a system to an external dynamic per- 
turbation by means of the quantum equation for the density matrix (2.3). 

Let a quantum-mechanical system described by the density matrix $\hat{\rho}$ be in
a reservoir and subjected to the action of an arbitrary external field U(t) 
depending on time. We shall assume the external field to be sufficiently weak 
to be considered a small perturbation. Furthermore, we shall assume that the 
applied field satisfies the requirement 


$$U(t) \to 0 \quad \text{for} \quad t \to -\infty. \tag{37.1}$$


We write, first of all, the equation for the statistical operator in the ab- 
sence of an external field 


$$i\hbar \frac{\partial \hat{\rho}}{\partial t} + [\hat{\rho}, \hat{H}] = 0. \tag{37.2}$$





When an external field $U(t)$ is applied, the equation for the statistical operator
assumes the form


$$i\hbar \frac{\partial \hat{\rho}}{\partial t} + [\hat{\rho}, \hat{H} + U(t)] = 0. \tag{37.3}$$


It is convenient to pass from eq. (37.3) to an integral equation. For this we
shall consider the term $[\hat{\rho}, U]$ to be a known quantity. Then formally (37.3)
will represent a linear non-homogeneous equation of first order with respect
to the function $\hat{\rho}$. If this equation is supplemented by the initial condition


$$\hat{\rho}(t \to -\infty) \to \hat{\rho}_0, \tag{37.4}$$
then the solution can at once be written in the form


$$\hat{\rho} = \hat{\rho}_0 + \frac{i}{\hbar}\int_{-\infty}^{t} e^{-i\hat{H}(t-x)/\hbar}\, [\hat{\rho}(x), U(x)]\, e^{i\hat{H}(t-x)/\hbar}\, dx. \tag{37.5}$$


By direct substitution one can see that (37.5) satisfies eq. (37.3) and the 
initial condition (37.4). The integral equation (37.5) contains the small per- 
turbation U(x) and can be solved by iteration (successive approximations). 
Setting 


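$$\hat{\rho} = \hat{\rho}_0 + \hat{\rho}' + \hat{\rho}'' + \ldots, \tag{37.6}$$
we have


$$\hat{\rho}' = \frac{i}{\hbar}\int_{-\infty}^{t} e^{-i\hat{H}(t-x)/\hbar}\, [\hat{\rho}_0, U(x)]\, e^{i\hat{H}(t-x)/\hbar}\, dx = \frac{i}{\hbar}\int_0^\infty e^{-i\hat{H}t'/\hbar}\, [\hat{\rho}_0, U(t-t')]\, e^{i\hat{H}t'/\hbar}\, dt', \tag{37.7}$$


where we set $t' = t - x$. The higher order corrections $\hat{\rho}''$, $\hat{\rho}'''$, ... can be
obtained in an analogous way.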
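
The first-order correction of the type (37.7) can be evaluated directly for a small model. The sketch below is an illustration under assumptions that are not in the text: a two-level system with hypothetical energies, a perturbation proportional to $\sigma_x$ switched on adiabatically, units with $\hbar = 1$, and the prefactor $i/\hbar$ in front of the integral. It checks two properties that any such correction must have, namely zero trace and Hermiticity.

```python
import numpy as np

hbar = 1.0
E = np.array([0.0, 1.0])                 # hypothetical level energies
kT = 0.5
p = np.exp(-E / kT)
p /= p.sum()
rho0 = np.diag(p).astype(complex)        # equilibrium density matrix
sigma_x = np.array([[0, 1], [1, 0]], dtype=complex)
LAM, OMEGA, EPS = 1e-2, 0.8, 0.05        # hypothetical field parameters

def U(t):
    """Weak field switched on adiabatically: U -> 0 as t -> -infinity."""
    return LAM * np.exp(EPS * t) * np.cos(OMEGA * t) * sigma_x

def rho_first_order(t, s_max=400.0, n=40_001):
    """Trapezoidal estimate of (i/hbar) int_0^inf e^{-iHs}[rho0, U(t-s)]e^{iHs} ds."""
    s = np.linspace(0.0, s_max, n)
    ds = s[1] - s[0]
    total = np.zeros((2, 2), dtype=complex)
    for j, sj in enumerate(s):
        prop = np.diag(np.exp(-1j * E * sj / hbar))   # exp(-iHs/hbar), H diagonal here
        Ut = U(t - sj)
        comm = rho0 @ Ut - Ut @ rho0                  # commutator [rho0, U]
        w = 0.5 if j in (0, n - 1) else 1.0           # trapezoid end-point weights
        total += w * (prop @ comm @ prop.conj().T)
    return (1j / hbar) * ds * total

rho1 = rho_first_order(0.0)
```

In this model only the non-diagonal elements of `rho1` are different from zero, which illustrates that the linear response involves the non-diagonal part of the density matrix.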

In the first approximation, formula (37.7) gives an answer to the problem
posed. It represents the quantum generalization of the classical formula
(36.6). According to (1.8), the change in the mean value of any quantity
described by an operator $\hat{B}$ is of the form


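$$\langle \Delta B \rangle = \mathrm{Tr}\,(\hat{\rho}\hat{B}) - \mathrm{Tr}\,(\hat{\rho}_0\hat{B}) = \frac{i}{\hbar}\int_0^\infty \mathrm{Tr}\,\{ e^{-i\hat{H}t'/\hbar}\, [\hat{\rho}_0, U(t-t')]\, e^{i\hat{H}t'/\hbar}\, \hat{B} \}\, dt'. \tag{37.8}$$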


Formula (37.8) represents a general expression for the response of a system 
to a small dynamic perturbation. It is approximate in the sense that it takes 
into account only the first order perturbation. However, in other respects it
is applicable to arbitrary systems for any interaction between the particles
forming them. As distinct from the kinetic equation, formula (37.8) gives the 
probability amplitude, i.e. contains the diagonal as well as the non-diagonal 
matrix elements of the density matrix. 


§38. The response of a system to a thermal perturbation 


A more complex problem is the response of a system to a thermal per- 
turbation which does not have the character of an external field acting on the 
system. Since the action of a thermal perturbation cannot be written in the 
form of an additional term in the Hamiltonian function of an individual par- 
ticle, the preceding calculations are inapplicable to this case. It turns out, 
however, that under certain restrictions for thermal perturbations a law of 
linear response of the type (36.6) can be obtained. 

Let us consider a macroscopic system in a non-equilibrium state. We shall 
characterize the state of this system by a set of macroscopic parameters $x_i$.
We shall determine the value of these parameters assuming their equilibrium
values to be the origin.

For a non-equilibrium system the values of the parameters $x_i$ will change in
time, so that $\dot{x}_i \neq 0$. We shall restrict ourselves to non-equilibrium systems
sufficiently close to equilibrium. This means that the parameters $x_i$ can be
considered to be small quantities. 

We introduce into consideration the time scale

$$\tau_{\mathrm{micro}} \ll \tau \ll \tau_{\mathrm{macro}}.$$

Here $\tau_{\mathrm{micro}}$ are times on a microscopic scale characterizing the changes of
state of microscopic parts of the system, and $\tau_{\mathrm{macro}}$ are times on a
macroscopic scale in which a state of total statistical equilibrium is established in
the system.

The subsequent discussion will be based on the hypothesis of the existence
in the system of the so-called local statistical equilibrium which is established
in an intermediate time of the order of $\tau$.

A system in a state of local statistical equilibrium is described by the
distribution function $\rho^{\mathrm{loc}}$ of the form of the local Gibbs distribution


$$\rho^{\mathrm{loc}} = \frac{1}{Z}\exp\{-[E(p,q) + A_i x_i]/kT\}. \tag{38.1}$$
The parameters $x_i$, which are functions of time, change in times of the order
of $t \sim \tau_{\mathrm{macro}}$, and for $t \gg \tau_{\mathrm{macro}}$ reduce to zero. Then $\rho^{\mathrm{loc}} \to \rho^{\mathrm{eq}}$, where $\rho^{\mathrm{eq}}$ is
the equilibrium Gibbs distribution. Since the $x_i$ are small, in the local
equilibrium Gibbs distribution they are taken at the value $t = 0$.

The mean values of the parameters $x_i$ can be calculated by the ordinary
formulae


$$\bar{x}_i = \frac{1}{Z}\int x_i \exp[-(E(p,q) + A_k x_k)/kT]\, d\Gamma = -kT\,\frac{\partial \ln Z}{\partial A_i}. \tag{38.2}$$





The local Gibbs distribution cannot be derived from arbitrary general theo- 
retical propositions. It must be considered as a hypothesis. The validity of 
this hypothesis is confirmed by numerous experimental facts. The dissatis- 
faction which arises from such a statement of the problem may be somewhat 
reduced by recalling the fact that the equilibrium Gibbs distribution is also to 
a certain degree a hypothesis, even if substantiated and corroborated by num- 
erous plausible considerations. 

The parameters $A_i$, conjugate to $x_i$, also change in times $t \sim \tau_{\mathrm{macro}}$.

The local equilibrium Gibbs distribution allows one to determine the mean 
energy and entropy of the system (see §21 and §26 of Part III) 


men] DRG i 38.3 
e pp (E, x), (38.3) 
$$S(x_i) = k \ln \Omega(x_i) = \frac{\bar{E}}{T} + k \ln Z + \frac{A_i \bar{x}_i}{T}. \tag{38.4}$$


The entropy is a function of the parameters $x_i$. In a state of equilibrium,
when $x_i = 0$, i.e. in a time of the order of $\tau_{\mathrm{macro}}$, the entropy reaches the
maximum value


$$S(0) = \frac{\bar{E}}{T} + k \ln Z. \tag{38.5}$$


Formula (38.4) allows one to express the parameters $A_i$ in terms of $S(x_i)$.
Instead of $A_i$ it is convenient to introduce the quantities


$$X_i = A_i/kT \tag{38.6}$$


so that 
zn E 
S(x) = ET InZ+ X;x;. (38.7) 


We shall assume that the parameters $X_i$ as well as the mean values $\bar{x}_i$ change in
times $t \sim \tau_{\mathrm{macro}}$. The values $X_i = 0$ correspond to a state of statistical
equilibrium.

The Gibbs distribution, the mean energy, and the entropy of a system
change in time. However, as can be seen from definition (38.3), the
characteristic time of change of these quantities is of the order of $\tau_{\mathrm{macro}}$.

Formula (38.5) allows one to express the parameters $X_i$ in terms of $S$:


_ as 


Assuming the local Gibbs distribution to characterize the state of the system
at $t = 0$, one can follow the development of the system in time. We shall
confine ourselves to times $t \ll \tau_{\mathrm{macro}}$.

The development of a system can be characterized by the quantity


$$I_i(t) = \bar{\dot{x}}_i,$$


representing the mean rate of change of the parameter $x_i$. Making use of
(38.1), we find


$$I_i = \int \dot{x}_i(t)\, \rho^{\mathrm{loc}}(0)\, d\Gamma = \frac{1}{Z}\int \dot{x}_i(t) \exp[-(E(p,q) + A_k(0) x_k)/kT]\, d\Gamma. \tag{38.8}$$




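Since the parameters $x_i$ are considered to be small, one can write to a first
approximation


$$I_i = \frac{1}{Z}\int \dot{x}_i(t)\, e^{-E(p,q)/kT}\, d\Gamma - \frac{1}{kT}\sum_k A_k(0)\, \frac{1}{Z}\int \dot{x}_i(t)\, x_k(0)\, e^{-E(p,q)/kT}\, d\Gamma =$$


$$= \langle \dot{x}_i(t) \rangle - \frac{1}{kT}\sum_k A_k(0) \langle x_k(0) \dot{x}_i(t) \rangle = \langle \dot{x}_i(t) \rangle - \sum_k X_k(0) \langle x_k(0) \dot{x}_i(t) \rangle.$$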


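For an equilibrium state, it is obvious that
$$\langle \dot{x}_i(t) \rangle = 0. \tag{38.9}$$
Further, we transform the mean value
$$\langle x_k(0) \dot{x}_i(t) \rangle = \langle x_k(-t) \dot{x}_i(0) \rangle = \Big\langle \Big( x_k(0) - \int_{-t}^{0} \dot{x}_k(\alpha)\, d\alpha \Big) \dot{x}_i(0) \Big\rangle = \langle x_k(0) \dot{x}_i(0) \rangle - \int_{-t}^{0} \langle \dot{x}_i(0) \dot{x}_k(\alpha) \rangle\, d\alpha.$$


At the initial instant of time the values of the parameters $x_k(0)$ and their rates
of change $\dot{x}_i(0)$ are independent of each other. Hence their correlation
reduces to zero
$$\langle x_k(0) \dot{x}_i(0) \rangle = 0$$
so that we have
$$\langle x_k(0) \dot{x}_i(t) \rangle = -\int_{-t}^{0} \langle \dot{x}_i(0) \dot{x}_k(\alpha) \rangle\, d\alpha = -\int_0^t \langle \dot{x}_i(0) \dot{x}_k(\alpha) \rangle\, d\alpha. \tag{38.10}$$
Hence


$$I_i = \sum_k L_{ik} X_k(0) = \sum_k L_{ik} X_k(t). \tag{38.11}$$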


The replacement of $X_k(0)$ by $X_k(t)$ is possible because the thermodynamic
forces change in times $\sim \tau_{\mathrm{macro}}$, whereas the time $t$ in formula (38.11) is
limited by the inequality
$$t \ll \tau_{\mathrm{macro}}.$$


$L_{ik}$ denotes the quantities
$$L_{ik} = \int_0^t \langle \dot{x}_i(0) \dot{x}_k(\alpha) \rangle\, d\alpha. \tag{38.12}$$


The symmetry of $L_{ik}$ relative to the transposition of indices is obvious. We
see that the response to a thermal perturbation is expressed by a linear law.
It is the thermodynamic forces $X_k$ that give rise to the thermodynamic fluxes
$I_i$. Thus the coefficients $L_{ik}$ are kinetic coefficients.

The fluxes are characterized by the set of symmetric coefficients $L_{ik}$. The
latter are defined by the correlation functions taken with respect to an equi- 
librium state of the system. This relation is of a general character, and in this 
sense the time correlation functions are the basic characteristics of kinetic 
processes in non-equilibrium systems. 

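As an example, let us consider the coefficient of diffusion. It is expressed in
terms of the mean square diffusion path in space:


$$D = \lim_{\Delta t \to 0} \frac{\langle (\Delta R)^2 \rangle}{6\,\Delta t}. \tag{38.13}$$
We write the displacement $\Delta R$ in the form
$$\Delta R = \int_0^{\Delta t} v\, dt.$$
Then
$$D = \lim_{\Delta t \to 0} \frac{1}{6\,\Delta t}\int_0^{\Delta t} dt' \int_0^{\Delta t} dt''\, \langle v(t')\, v(t'') \rangle. \tag{38.14}$$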


The mean value of the velocities does not depend on the zero of time. Hence
one can write


$$\langle v(t')\, v(t'') \rangle = \langle v(0)\, v(t' - t'') \rangle. \tag{38.15}$$


Substituting (38.15) into (38.14) and carrying out one integration, we find
directly


$$D = \lim_{\Delta t \to 0} \frac{1}{3}\int_0^{\Delta t} \Big( 1 - \frac{t''}{\Delta t} \Big) \langle v(0)\, v(t'') \rangle\, dt''. \tag{38.16}$$


If the interval $\Delta t$ is macroscopically small but nevertheless large in comparison
with the characteristic times of molecular processes, then (38.16) can be
written in the form


$$D \approx \lim_{\Delta t \to 0} \frac{1}{3}\int_0^{\Delta t} dt''\, \langle v(0)\, v(t'') \rangle. \tag{38.17}$$
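
The role of the weight factor in (38.16) and the passage to (38.17) can be seen on a model velocity correlation function. The sketch below assumes, for illustration only, an exponentially decaying correlation $\langle v(0) \cdot v(t) \rangle = 3(kT/m) e^{-\gamma t}$; the numerical values of `kT_over_m` and `gamma` are hypothetical. For this model the integral of the correlation function gives $D = kT/(m\gamma)$.

```python
import numpy as np

kT_over_m = 1.0   # hypothetical value of kT/m
gamma = 5.0       # hypothetical decay rate; 1/gamma is the molecular time

def vacf(t):
    """Assumed model <v(0).v(t)> = 3 (kT/m) exp(-gamma t)."""
    return 3.0 * kT_over_m * np.exp(-gamma * t)

def trapezoid(y, x):
    """Composite trapezoidal rule."""
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))

def D_weighted(window, n=200_001):
    """Finite-window estimate with the weight (1 - t/window), cf. (38.16)."""
    t = np.linspace(0.0, window, n)
    return trapezoid((1.0 - t / window) * vacf(t), t) / 3.0

def D_unweighted(window, n=200_001):
    """Estimate with the weight dropped, cf. (38.17)."""
    t = np.linspace(0.0, window, n)
    return trapezoid(vacf(t), t) / 3.0

D_exact = kT_over_m / gamma
```

When the window is long compared with $1/\gamma$, the weighted and unweighted estimates agree with `D_exact`; for windows comparable with $1/\gamma$ the weighted estimate falls noticeably below it.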


§39. The calculation of kinetic coefficients. The connection with Boltzmann’s 
equation 


The general expression for the response of a system to a dynamic pertur- 
bation, found in the preceding section, allows one in principle to find any 
kinetic coefficients for small perturbations of the system. As an example, we 
shall confine ourselves to the calculation of the action of an electric field and 
to the calculation of the electric conductivity*. 

We shall consider a quasi-gaseous system made up of a set of charged 
particles which do not interact with one another (particles of the first kind) 
and a set of particles of the second kind whose state does not change under 
the action of an electric field. For example, particles of the second kind may 
be neutral, or charged but too heavy for their state to be perturbed by a weak 
field. They may be charged and form a crystal lattice the state of which is 
also not affected by a weak field, etc. 

We have already considered such systems previously by means of Boltzmann’s
equation (see, for example, §28). However, it should be stressed that
now we do not assume that the interaction between particles of the first and
second kind is weak. On the contrary, the interaction may be arbitrarily
strong, and may be described by any law. The Hamiltonian $\hat{H}$ contained in
the formulae of the preceding section involves this interaction. The whole set
of particles of the first and second kind forms a macroscopic subsystem which
as a whole undergoes a weak interaction with the reservoir surrounding it.


* See M. Lax, Phys. Rev. 109 (1958) 1921.

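We suppose that the subsystem is acted upon by the uniform electric field


$$\mathbf{E} = \lim_{\alpha \to 0} \mathbf{E}_0\, e^{i\omega t} e^{\alpha t}.$$


Then the term $U$ of the total Hamiltonian assumes the form


$$U(t) = -\lim_{\alpha \to 0} e\,\mathbf{E}_0 \cdot \mathbf{r}\, e^{i\omega t} e^{\alpha t}. \tag{39.1}$$


Substituting (39.1) into (37.7), we obtain


$$\hat{\rho}'(t) = -\frac{i}{\hbar}\, e\mathbf{E}_0 \cdot \int_0^\infty e^{-i\hat{H}t'/\hbar}\, [\hat{\rho}_0, \hat{\mathbf{r}}]\, e^{i\hat{H}t'/\hbar}\, e^{\alpha(t-t')} e^{i\omega(t-t')}\, dt'.$$


Passing to the limit $\alpha \to 0$, we have


$$\hat{\rho}'(t) = -\frac{i}{\hbar}\, e\mathbf{E}_0 \cdot e^{i\omega t} \int_0^\infty e^{-i\hat{H}t'/\hbar}\, [\hat{\rho}_0, \hat{\mathbf{r}}]\, e^{i\hat{H}t'/\hbar}\, e^{-i\omega t'}\, dt'. \tag{39.2}$$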
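Making use of operators in the momentum representation, we get


$$[\hat{\rho}_0, \hat{\mathbf{r}}] = \hat{\rho}_0 \hat{\mathbf{r}} - \hat{\mathbf{r}} \hat{\rho}_0 = i\hbar \Big( \hat{\rho}_0 \frac{\partial}{\partial \mathbf{p}} - \frac{\partial}{\partial \mathbf{p}} \hat{\rho}_0 \Big) = -i\hbar\, \frac{\partial \hat{\rho}_0}{\partial \mathbf{p}}. \tag{39.3}$$
Hence


$$\hat{\rho}'(t) = -e\mathbf{E}_0 \cdot e^{i\omega t} \int_0^\infty e^{-i\hat{H}t'/\hbar}\, \frac{\partial \hat{\rho}_0}{\partial \mathbf{p}}\, e^{i\hat{H}t'/\hbar}\, e^{-i\omega t'}\, dt'. \tag{39.4}$$


The electric current can be written in the form


$$\mathbf{j} = ne\, \mathrm{Tr}\,(\hat{\mathbf{v}} \hat{\rho}'). \tag{39.5}$$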


The value of the trace does not depend on the choice of representation. We
choose a representation in which the velocity operator is diagonal. Then the
expression for the current can be written as follows:


$$\mathbf{j} = ne \int \mathbf{v}\, \langle \mathbf{v} | \hat{\rho}' | \mathbf{v} \rangle\, d\mathbf{v}, \tag{39.6}$$


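where the matrix $\langle \mathbf{v} | \hat{\rho}' | \mathbf{v} \rangle$ is given by the formula


$$\langle \mathbf{v} | \hat{\rho}' | \mathbf{v} \rangle = -\frac{e\mathbf{E}_0 e^{i\omega t}}{m} \cdot \int_0^\infty e^{-i\omega t'}\, \langle \mathbf{v} |\, e^{-i\hat{H}t'/\hbar}\, \frac{\partial \hat{\rho}_0}{\partial \mathbf{v}}\, e^{i\hat{H}t'/\hbar}\, | \mathbf{v} \rangle\, dt' =$$


$$= -\frac{e\mathbf{E}_0 e^{i\omega t}}{m} \cdot \int_0^\infty e^{-i\omega t'} \iint \langle \mathbf{v} | e^{-i\hat{H}t'/\hbar} | \mathbf{v}' \rangle\, \langle \mathbf{v}' | \frac{\partial \hat{\rho}_0}{\partial \mathbf{v}} | \mathbf{v}'' \rangle\, \langle \mathbf{v}'' | e^{i\hat{H}t'/\hbar} | \mathbf{v} \rangle\, dt'\, d\mathbf{v}'\, d\mathbf{v}''. \tag{39.7}$$


Thus the electric conductivity is of the form


$$\sigma = -\frac{ne^2}{m} \int_0^\infty e^{-i\omega t'}\, dt' \int \mathbf{v}\, d\mathbf{v} \int d\mathbf{v}' \int d\mathbf{v}''\, \langle \mathbf{v} | e^{-i\hat{H}t'/\hbar} | \mathbf{v}' \rangle\, \langle \mathbf{v}' | \frac{\partial \hat{\rho}_0}{\partial \mathbf{v}} | \mathbf{v}'' \rangle\, \langle \mathbf{v}'' | e^{i\hat{H}t'/\hbar} | \mathbf{v} \rangle. \tag{39.8}$$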


The expression obtained is as general as formula (39.2) for the change of the
density matrix. It contains the diagonal ($\mathbf{v}' = \mathbf{v}''$) as well as the non-diagonal
($\mathbf{v}' \neq \mathbf{v}''$) matrix elements of the equilibrium density matrix $\hat{\rho}_0$. Therefore
formula (39.8) has a wide range of applicability.

For example, this formula defines the electric conductivity of a liquid 
metal or a conductor very strongly alloyed with admixtures, when the ap- 
proximation of Boltzmann’s equation is inadequate. 

It is very important to compare the theory of response to an external
dynamic perturbation based on the exact equation for the density matrix with
Boltzmann’s kinetic equation. To do this it is easiest to compare the kinetic
coefficient we have found, i.e. the electric conductivity (39.8), with an
analogous expression obtained by means of Boltzmann’s equation. For this we
impose a further restriction on the generality of formula (39.8). Namely, we
assume that the interaction of charged particles (particles of the first kind) with
particles of the second kind is weak. This means that in the first approximation
the system of charged particles can be considered to be an ideal gas.
Then the density matrix $\hat{\rho}_0$ can be applied to an individual particle, if one sets


$$\hat{\rho}_0 = \prod_i \exp(-m\hat{v}_i^2/2kT) \Big/ \prod_i Z_i, \tag{39.9}$$


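where $\hat{H}_0$ does not involve the interaction of the given particle with all the
other particles of the first and second kind.

In the $v$-representation, in which expression (39.9) is written, the density
matrix $\hat{\rho}_0$ is diagonal. Hence


$$\langle \mathbf{v}' | \frac{\partial \hat{\rho}_0}{\partial \mathbf{v}} | \mathbf{v}'' \rangle = \frac{\partial \rho_0(\mathbf{v}')}{\partial \mathbf{v}'}\, \delta(\mathbf{v}' - \mathbf{v}'') \tag{39.10}$$


and, correspondingly,


$$\sigma = -\frac{ne^2}{m} \int_0^\infty e^{-i\omega t'}\, dt' \int \mathbf{v}\, d\mathbf{v} \int d\mathbf{v}'\, \langle \mathbf{v} | e^{-i\hat{H}_0 t'/\hbar} | \mathbf{v}' \rangle \int \frac{\partial \rho_0(\mathbf{v}')}{\partial \mathbf{v}'}\, \delta(\mathbf{v}' - \mathbf{v}'')\, \langle \mathbf{v}'' | e^{i\hat{H}_0 t'/\hbar} | \mathbf{v} \rangle\, d\mathbf{v}'' =$$


$$= -\frac{ne^2}{m} \int_0^\infty e^{-i\omega t'}\, dt' \int \mathbf{v}\, d\mathbf{v} \int W(\mathbf{v}, \mathbf{v}', t')\, \frac{\partial \rho_0(\mathbf{v}')}{\partial \mathbf{v}'}\, d\mathbf{v}', \tag{39.11}$$


where


$$W(\mathbf{v}, \mathbf{v}', t) = |\langle \mathbf{v} | e^{-i\hat{H}_0 t/\hbar} | \mathbf{v}' \rangle|^2. \tag{39.12}$$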





It is obvious that $W(\mathbf{v}, \mathbf{v}', t)$ represents the probability that a particle which
has velocity $\mathbf{v}$ at the instant of time $t = 0$ will acquire in time $t$ a velocity
$\mathbf{v}'$. We compare the expression obtained with the general solution of
Boltzmann’s equation (28.7) in an external force field.
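
That the squared modulus of a matrix element of the evolution operator behaves as a transition probability can be verified on a finite-dimensional stand-in. In the sketch below the continuous velocity variable is replaced, as an assumption made only for illustration, by `n` discrete states, and the Hamiltonian by a random Hermitian matrix; units with $\hbar = 1$ are used.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 6                                        # hypothetical number of discrete states
M = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
H = 0.5 * (M + M.conj().T)                   # random Hermitian stand-in for the Hamiltonian
hbar = 1.0

def transition_probabilities(t):
    """W[a, b] = |<a| exp(-iHt/hbar) |b>|^2 via the spectral decomposition of H."""
    w, V = np.linalg.eigh(H)
    U = (V * np.exp(-1j * w * t / hbar)) @ V.conj().T
    return np.abs(U) ** 2

W = transition_probabilities(0.7)
```

Because the evolution operator is unitary, every row and every column of `W` sums to unity, and at $t = 0$ the matrix reduces to the unit matrix, as a set of transition probabilities should.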

This comparison allows one to convince oneself of their complete identity,
provided that $\rho_0$ is replaced by $f_0$. The meaning of the density matrix of a
free particle, $\rho_0$, is indeed the same as the one-particle distribution function.
Thus we see that Boltzmann’s equation for a homogeneous quasi-gaseous
system indeed follows from the exact equation for the density matrix.

The above reasoning allows a generalization to systems of particles inter- 
acting with one another. However, we cannot dwell on this here*. 


§40. Onsager’s theory 


For small departures from an equilibrium state, non-equilibrium processes 
in a closed system can be described starting from some very general considera- 
tions first stated by Onsager. 

We shall characterize the state of a closed system by macroscopic
parameters $x_i$. These parameters are functions of time.

For small departures from an equilibrium state, the parameters character- 
izing the state of the system can be considered to have a thermodynamic 
meaning. In other words, the $x_i$ should be understood to be the differences
between the values of thermodynamic quantities in a given non-equilibrium 
state and those in an equilibrium state. We recall that in an equilibrium state 
all thermodynamic quantities have values equal to their means. 

It is clear that for large departures from an equilibrium state, thermody- 
namic concepts make no sense. However, as we have seen above, for small 
departures from equilibrium, use can be made of thermodynamic quantities, 
which are then not equal to their mean values. For this it is only necessary 
that an incomplete local equilibrium exist at each point of the body. For 
small values of $x_i$ all the quantities characterizing the state of the system and
its rate of change can be expanded in series of powers of $x_i$. In these series
one need retain only the first terms, so that one can write


$$\dot{x}_i = a_{ik} x_k, \tag{40.1}$$
$$S = S_0 - \tfrac{1}{2} \beta_{ik} x_i x_k, \tag{40.2}$$
$$\dot{S} = -\beta_{ik} \dot{x}_i x_k. \tag{40.2a}$$


* See M. Lax, Phys. Rev. 109 (1958) 1921.


Formula (40.1) shows that all processes near an equilibrium state are slow. 
The entropy of a system in a non-equilibrium state is expressed by the qua- 
dratic form (40.2). From the minimum condition it follows that 


$$\beta_{ik} = \beta_{ki}. \tag{40.2b}$$


The increase of entropy, $\dot{S}$, in unit time is also small. It is obvious that all the
formulae given above can be applied to changes of state of the system in
limited times $t$. Namely, on the one hand, these times must be very large in
comparison with microscopic times $\tau_{\mathrm{micro}}$ in order that one may speak of the
change of macroscopic quantities.

On the other hand, the system must be in a non-equilibrium state. If total
equilibrium is established in it in relaxation time $\tau_{\mathrm{macro}}$, then the following
inequality must be fulfilled

$$\tau_{\mathrm{micro}} \ll t \ll \tau_{\mathrm{macro}}.$$


We denote by $I_i$ the microscopic flux

$$I_i = \dot{x}_i$$

and by $X_i$ the microscopic thermodynamic force


$$X_i = \partial S/\partial x_i = -\beta_{ik} x_k.$$


Then the preceding relations can be written in the form


$$I_i = \dot{x}_i = a_{ik} x_k = -a_{ij} \beta^{-1}_{jk} X_k = \gamma_{ik} X_k, \tag{40.3}$$
$$S = S_0 + \tfrac{1}{2} X_i x_i, \tag{40.3a}$$
$$\dot{S} = I_i X_i. \tag{40.3b}$$
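
The relations between fluxes, forces and the rate of change of entropy can be verified by direct computation. The following sketch is illustrative only: the matrices `beta` and `kinetic` are random symmetric positive-definite stand-ins for the entropy quadratic form and for the coefficients connecting fluxes with forces, and the deviation `x` is a hypothetical small vector.

```python
import numpy as np

rng = np.random.default_rng(0)

def random_spd(n):
    """Random symmetric positive-definite matrix (illustrative stand-in)."""
    a = rng.normal(size=(n, n))
    return a @ a.T + n * np.eye(n)

n = 3
beta = random_spd(n)             # quadratic form of the entropy, S = S0 - x.beta.x/2
kinetic = random_spd(n)          # assumed symmetric matrix connecting fluxes and forces
x = 1e-2 * rng.normal(size=n)    # small deviation from equilibrium

X = -beta @ x                    # thermodynamic forces X_i = dS/dx_i
I = kinetic @ X                  # fluxes I_i = dx_i/dt
S_dot_flux_force = I @ X         # rate of entropy change as sum of flux times force
S_dot_direct = -(beta @ x) @ I   # time derivative of the quadratic form
S_excess = 0.5 * X @ x           # S - S0 expressed through the forces
```

With a positive-definite kinetic matrix the rate of change of entropy is positive for any non-zero deviation, while `S_excess` is negative, in agreement with the entropy having a maximum at equilibrium.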


Our further treatment will be based on the following hypothesis of
Onsager: a macroscopic non-equilibrium state near equilibrium can be considered
as a fluctuation. Changes in time of states of a macroscopic equilibrium
system undergoing fluctuations obey the same laws. Let, for example,
non-uniform concentration and temperature distributions be produced in a
macroscopic system. Then fluxes described by the corresponding macroscopic 
transport laws will arise in the system. If in an equilibrium system there occur 
concentration and temperature fluctuations as a result of which the same 
concentration and temperature distributions are produced, then according to 
Onsager’s hypothesis these fluctuations will resolve according to the same 
laws as those governing the levelling of concentrations or temperatures in the 
macroscopic system. 

The particle flux and heat flow will be defined by the laws of diffusion 
and thermal conduction irrespective of how the corresponding concentration 
and temperature differences arose; either as a result of a spontaneous fluctua- 
tion in the equilibrium system, or as a result of external actions which 
brought the system into the non-equilibrium state. 

Thus, according to Onsager’s hypothesis, the relation between fluxes and 
forces, i.e. the macroscopic law 


$$\bar{I}_i = \sum_k L_{ik} \bar{X}_k, \tag{40.4}$$


is equally applicable to non-equilibrium systems and to the processes of reso- 
lution of fluctuations. 

The mean macroscopic fluxes and forces $\bar{I}_i$ and $\bar{X}_k$ are obtained by
averaging the microscopical fluxes and forces $I_i$ and $X_k$, and the coefficients $L_{ik}$
and $\gamma_{ik}$ are the same. It turns out that the kinetic coefficients $L_{ik}$ can be
expressed in terms of the time correlation function. On the basis of Onsager’s
hypothesis one can write

$$I_i(t) = \frac{dx_i}{dt} = L_{ik} X_k(t). \tag{40.5}$$

We multiply (40.5) by $x_l(0)$, so that
$$x_l(0)\, \dot{x}_i = x_l(0)\, L_{ik} X_k(t). \tag{40.6}$$


We now introduce into consideration an ensemble of identical systems
differing in the given initial values of the parameters $x_i(0)$. We denote the mean
over this ensemble by the symbol $\langle\ \rangle$.

We find then


$$\langle x_l(0)\, \dot{x}_i(t) \rangle = L_{ik} \langle x_l(0)\, X_k(t) \rangle. \tag{40.7}$$




An ensemble of equilibrium closed systems at the initial instant of time forms 
a Gibbs ensemble. For this ensemble, the probability distribution can be 
written in the form 


$$W(x_1, \ldots, x_n)\, dx_1 \ldots dx_n = C \exp[\Delta S(x_1, \ldots, x_n)/k]\, dx_1 \ldots dx_n. \tag{40.8}$$


Then the mean values of the quantities involved in formula (40.7) can be 
written as follows: 


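$$\langle x_l(0)\, \dot{x}_i(t) \rangle = C \int x_l(0)\, \frac{dx_i(t)}{dt}\, e^{\Delta S/k}\, dx_1 \ldots dx_n$$


and, correspondingly,
$$\langle x_l(0)\, X_k(t) \rangle = C \int x_l(0)\, X_k(t)\, e^{\Delta S/k}\, dx_1 \ldots dx_n. \tag{40.9}$$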


It should be stressed that, through this, the problem of determining mean
values in a non-equilibrium system for which the probability distribution is
unknown turns out, owing to Onsager’s hypothesis, to be reduced to the
problem of calculating ensemble means for a Gibbs ensemble of closed
systems with a probability distribution given by formula (40.8).

Making use of the quasi-ergodic hypothesis, one can write 


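$$\overline{x_l(0)\, \frac{dx_i(t)}{dt}} = \langle x_l(0)\, \dot{x}_i(t) \rangle.$$


Then (40.7) is written in the form
$$\langle x_l(0)\, \dot{x}_i(t) \rangle = L_{ik} \langle x_l(0)\, X_k(t) \rangle. \tag{40.10}$$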


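We transform the left-hand side of (40.10) in the same way as in a preceding
section (see (38.9)):


$$\langle x_l(0)\, \dot{x}_i(t) \rangle = \langle x_l(-t)\, \dot{x}_i(0) \rangle = \Big\langle \Big( x_l(0) + \int_0^{-t} \dot{x}_l(\alpha)\, d\alpha \Big) \dot{x}_i(0) \Big\rangle = -\int_0^t \langle \dot{x}_l(\alpha)\, \dot{x}_i(0) \rangle\, d\alpha. \tag{40.11}$$
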
The right-hand side can be transformed in the following way. Since the mean
force changes in times $t \sim \tau_{\mathrm{macro}}$, whereas we consider the development of
the system in times $t \ll \tau_{\mathrm{macro}}$, we can approximately write


$$X_k(t) = X_k(0).$$
Our problem is to find the mean


$$\langle x_l(0)\, X_k(0) \rangle = \Big\langle x_l(0)\, \frac{\partial S}{\partial x_k} \Big\rangle. \tag{40.12}$$


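To calculate the mean we can, on the basis of Onsager’s hypothesis, make use
of the Gibbs distribution


$$\Big\langle x_l(0)\, \frac{\partial S}{\partial x_k} \Big\rangle = C \int x_l\, \frac{\partial S}{\partial x_k}\, e^{\Delta S/k}\, dx_1 \ldots dx_n.$$
Since $(\partial S/\partial x_k)\, e^{\Delta S/k} = k\, \partial e^{\Delta S/k}/\partial x_k$, the integral with respect to $x_k$ can be taken by parts


$$C \int x_l\, \frac{\partial e^{\Delta S/k}}{\partial x_k}\, dx_k = -C \int \frac{\partial x_l}{\partial x_k}\, e^{\Delta S/k}\, dx_k = -\delta_{lk},$$


since the integrand tends to zero rapidly at the limits (the last equality holds after the remaining integrations, by virtue of the normalization of $W$). Thus, finally,
$$\Big\langle x_l(0)\, \frac{\partial S}{\partial x_k} \Big\rangle = -k\, \delta_{lk}. \tag{40.13}$$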
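
The result (40.13) can be checked by sampling from a Gaussian distribution of fluctuations. The sketch below is a Monte Carlo illustration under assumptions not contained in the text: a two-parameter system, a hypothetical symmetric positive-definite matrix `beta` for the entropy quadratic form, and units in which Boltzmann's constant `k_B` equals unity.

```python
import numpy as np

k_B = 1.0                                    # Boltzmann constant in illustrative units
beta = np.array([[2.0, 0.6],
                 [0.6, 1.5]])                # hypothetical symmetric positive-definite form
cov = k_B * np.linalg.inv(beta)              # covariance of W ~ exp(-x.beta.x / 2k)

rng = np.random.default_rng(0)
x = rng.multivariate_normal(np.zeros(2), cov, size=400_000)
X = -x @ beta                                # forces X_i = dS/dx_i = -beta_ik x_k
corr = x.T @ X / len(x)                      # sample estimate of <x_l X_k>
```

Up to statistical noise, `corr` is equal to minus `k_B` times the unit matrix: each parameter is correlated only with its own conjugate force.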


Substituting (40.11) and (40.13) into (40.7), we find


$$L_{ik} = \frac{1}{k} \int_0^t \langle \dot{x}_i(0)\, \dot{x}_k(\alpha) \rangle\, d\alpha = L_{ki}.$$


We see that Onsager’s hypothesis leads to exactly the same expression for 
the kinetic coefficients as the hypothesis of a local Gibbs distribution (see 
(38.12)). This proves the equivalence of the two hypotheses. 

The symmetry of the kinetic coefficients $L_{ik} = L_{ki}$ has a profound
meaning. If, for example, two parameters change and two fluxes arise in the
system, then from the symmetry property it follows that
$$I_1 = L_{11} X_1 + L_{12} X_2, \tag{40.14}$$
$$I_2 = L_{22} X_2 + L_{12} X_1. \tag{40.15}$$


Formulae (40.14) and (40.15) show that the force $X_1$ gives a contribution to
the flux $I_2$, and the force $X_2$ gives the same contribution to the flux $I_1$.
Generalization to a large number of forces and fluxes offers no difficulty.

A great number of such cross fluxes is known in physics. As an example, 
we can point to thermal diffusion and the converse effect of the occurrence 
of a temperature gradient when gases at the same temperature are mixed. 
Other examples will be given below. 

The symmetry relation allows one to establish the general relation between 
such cross processes. The use of the symmetry relation makes it possible to 
describe a great number of connected effects. Agreement of the theory with 
experimental data is a conclusive proof of Onsager’s hypothesis. 


§41. Discussion of Onsager’s relations 


In this and the following sections we shall discuss some consequences of 
the symmetry of kinetic coefficients. Analysis of these consequences is the 
basic subject matter of the thermodynamics of irreversible processes. We shall 
show that from Onsager’s relations one can draw conclusions of general char- 
acter as well as obtaining results of more practical interest. The latter mainly 
consist of establishing relations between different non-equilibrium processes. 

We shall begin with some general consequences. We note, first of all, that 
Onsager’s principle can be obtained from Onsager’s hypothesis on the basis of 
the general theory of fluctuations. In view of the importance of this principle, 
we also present this more usual derivation. 

According to the principle of microscopic reversibility, fluctuations in a 
closed system are reversible in time, so that for the correlation function one 
can write 


$$\langle x_i(t)\, x_l(t + \tau) \rangle = \langle x_i(t)\, x_l(t - \tau) \rangle.$$


On the other hand, changing the zero time on the right-hand side, one can 
write 


$$\langle x_i(t)\, x_l(t + \tau) \rangle = \langle x_i(t + \tau)\, x_l(t) \rangle. \tag{41.1}$$





The symbol $\langle\ \rangle$ in formula (41.1) denotes an average over the ensemble.
Averaging once more over time $t$, we have


$$\overline{\langle x_i(t)\, x_l(t + \tau) \rangle} = \overline{\langle x_i(t + \tau)\, x_l(t) \rangle}.$$


Since the two averages are independent and equivalent, subtracting from this
equality $\langle x_i(t)\, x_l(t) \rangle$, we get


$$\langle x_i(t)\, [x_l(t + \tau) - x_l(t)] \rangle = \langle x_l(t)\, [x_i(t + \tau) - x_i(t)] \rangle.$$
Dividing by $\tau$ and passing to the limit $\tau \to 0$, we have
$$\langle x_i(t)\, \dot{x}_l(t) \rangle = \langle x_l(t)\, \dot{x}_i(t) \rangle. \tag{41.2}$$

On the basis of Onsager’s hypothesis, relation (40.5) holds for fluctuations as
well as for macroscopic processes. Its substitution into the last equality gives


$$\langle x_i\, L_{lm} X_m \rangle = \langle x_l\, L_{im} X_m \rangle. \tag{41.3}$$


But, according to (40.13), we have $\langle x_i X_m \rangle = -k\delta_{im}$, $\langle x_l X_m \rangle = -k\delta_{lm}$, so that
(41.3) gives $L_{lm}\delta_{im} = L_{im}\delta_{lm}$. Hence $L_{li} = L_{il}$. In this derivation of
the principle of symmetry of kinetic coefficients use was made only of the
principle of microscopic reversibility and Onsager’s hypothesis (formulae
(40.4) and (41.2)). However, here the meaning of the coefficients $L_{ik}$ is not
brought to light.

In deriving Onsager’s principle we have actually assumed that the system is
not placed in a magnetic field and does not rotate. As a matter of fact, if the
system is in a magnetic field, the change of sign of the time $t \to -t$ must be
accompanied by the replacement $\mathbf{H} \to -\mathbf{H}$: for the principle of microscopic
reversibility to be fulfilled, the Lorentz force must not change sign. Only then does
the equality $x(\tau) = x(-\tau)$ hold under time reversal. Exactly the same applies to the
angular velocity of a rotating body. Reproducing the preceding calculations,
one can easily find Onsager’s principle to read


$$L_{ik}(\mathbf{H}) = L_{ki}(-\mathbf{H}). \tag{41.4}$$


This equality always holds if the two parameters $x_i$ and $x_k$ are such that under
time reversal one of them changes sign whereas the other does not.

Of other general consequences of Onsager’s theory we point out the proof
of the existence of a dissipative function for mechanical systems performing
slow motion. One has to add to the equation of motion the components
of the force $X_i$, which for small velocities can be expanded in a series in
which only the first term need be retained

$$X_i = -\beta_{ik} \dot{q}_k. \tag{41.5}$$
According to (40.2b), the tensor $\beta_{ik}$ is symmetric

$$\beta_{ik} = \beta_{ki}.$$
There is no zero term in the expansion, since the system at rest is not acted
upon by dissipative forces. Thus the generalized equation of motion is of the
form


$$m\ddot{q}_i = -\frac{\partial U}{\partial q_i} + X_i = -\frac{\partial U}{\partial q_i} - \beta_{ik} \dot{q}_k.$$


By virtue of the symmetry of $\beta_{ik}$, this equality can be written in the form


$$m\ddot{q}_i = -\frac{\partial U}{\partial q_i} - \frac{\partial f}{\partial \dot{q}_i}, \tag{41.6}$$
where

$$f = \tfrac{1}{2} \beta_{ik} \dot{q}_i \dot{q}_k. \tag{41.7}$$


The quadratic form $f$ represents the dissipative function. In mechanics, as a
rule, the existence of a dissipative function is not proved. However, if the
system were acted upon by several frictional forces, then without the
principle of symmetry it would be impossible to introduce the dissipative
function. Also there is no analogue of the dissipative function for motion in a
magnetic field. Although in this case the force is also proportional to the
velocity, the tensor $\beta_{ik}$ is antisymmetric.


§42. Non-equilibrium processes in a one-component system 

We shall discuss in detail the theory of non-equilibrium processes in a
one-component system in order to elucidate the method of finding thermodynamic
forces and fluxes and applications of Onsager’s relations. The character of
the results to be obtained will enable us to understand more clearly the merits
and shortcomings of the thermodynamics of irreversible processes. Let us
consider first of all a one-component closed system consisting of two
subsystems with temperatures $T_1$ and $T_2$, pressures $p_1$ and $p_2$, internal energies
$E_1$ and $E_2$ and numbers of particles $N_1$ and $N_2$ per unit volume. Suppose that
energy exchange and particle exchange between the subsystems proceed
without the appearance of hydrodynamic fluxes. For example, the exchange could
be carried out through a porous diaphragm. Let $\delta E_i$ and $\delta N_i$ be the changes in
the energy and number of particles of one of the subsystems ($i = 1, 2$). Then
the change of entropy is


_ (28. as. 
5S;= (se), tEn (er ),,%i 
32s 2 as 32s 2 
—— p 2 = F 3 - À 
+} les ) CE) + 2 ap ay EAN + ( mn ) 6N) ] De 
i “Ni i Ei 


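The total change of entropy of the closed system is equal to

$$\delta S = \delta S_1 + \delta S_2 = \frac12\,\frac{\partial^2 S}{\partial E^2}(\delta E)^2 + \frac12\,\frac{\partial^2 S}{\partial N^2}(\delta N)^2 + \frac{\partial^2 S}{\partial E\,\partial N}\,\delta N\,\delta E = \frac12\left[\frac{\partial}{\partial E}\left(\frac1T\right)(\delta E)^2 - \frac{\partial}{\partial N}\left(\frac{\mu}{T}\right)(\delta N)^2 + 2\,\frac{\partial}{\partial N}\left(\frac1T\right)\delta N\,\delta E\right]. \tag{42.1}$$

The linear terms dropped out owing to the laws of conservation of energy and particle number ($\delta E_1 = -\delta E_2 \equiv \delta E$, $\delta N_1 = -\delta N_2 \equiv \delta N$). According to (42.1), the rate of increase of entropy is equal to

$$\delta\dot S = \delta\dot E\left[\frac{\partial}{\partial E}\left(\frac1T\right)\delta E + \frac{\partial}{\partial N}\left(\frac1T\right)\delta N\right] - \delta\dot N\left[\frac{\partial}{\partial N}\left(\frac{\mu}{T}\right)\delta N + \frac{\partial}{\partial E}\left(\frac{\mu}{T}\right)\delta E\right] = \delta\dot E\,\delta\!\left(\frac1T\right) - \delta\dot N\,\delta\!\left(\frac{\mu}{T}\right). \tag{42.2}$$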


On the other hand, by virtue of (40.3b) one can write

$$\delta\dot S = I_E X_E + I_N X_N, \tag{42.2a}$$

where $I_E$ and $I_N$ are the energy flux and particle flux, and $X_E$ and $X_N$ are the corresponding thermodynamic forces. Hence we find

$$I_E = \delta\dot E, \qquad X_E = \delta\!\left(\frac1T\right) = -\frac{\delta T}{T^2},$$
$$I_N = \delta\dot N, \qquad X_N = -\delta\!\left(\frac{\mu}{T}\right) = -\frac{v\,\delta p}{T} + \frac{h\,\delta T}{T^2}, \tag{42.3}$$

where $v$ and $h$ are the volume and enthalpy per particle. Taking into account the symmetry of kinetic coefficients, according to (40.4) the fluxes can be written in the form

$$I_E = L_{11}X_E + L_{12}X_N, \qquad I_N = L_{12}X_E + L_{22}X_N.$$

Substituting in the value of the forces, we find

$$I_E = \frac{L_{12}h - L_{11}}{T^2}\,\delta T - \frac{L_{12}v}{T}\,\delta p, \tag{42.4}$$
$$I_N = \frac{L_{22}h - L_{12}}{T^2}\,\delta T - \frac{L_{22}v}{T}\,\delta p. \tag{42.5}$$

Here $\delta T$ and $\delta p$ denote the changes in temperature and pressure in passing from one subsystem to the other:

$$\delta T = T_1 - T_2, \qquad \delta p = p_1 - p_2.$$


It is obvious that the entire treatment makes sense only for small values of $\delta T$ and $\delta p$. The transport of energy and of particles leads to the appearance of cross effects characterized by the kinetic coefficient $L_{12}$.

An important particular case is that of a process without particle transport ($I_N = 0$). Then formula (42.5) shows that in a stationary state a pressure difference and a temperature difference connected by the relation

$$\frac{\delta p}{\delta T} = \frac{h - L_{12}/L_{22}}{vT} \tag{42.6}$$

are established between the subsystems. Formula (42.6) defines the so-called thermomolecular pressure. In another important particular case, that in which the temperature difference $\delta T = 0$, formula (42.4) shows that an energy flux

$$I_E = -\frac{L_{12}v}{T}\,\delta p = \frac{L_{12}}{L_{22}}\,I_N = E^* I_N \tag{42.7}$$

flows between the subsystems. This effect, called the thermomechanical effect, shows that a certain energy $E^*$ is transported together with the particles:

$$E^* = L_{12}/L_{22}. \tag{42.8}$$

By means of (42.8) one can write

$$\delta p = \frac{h - E^*}{vT}\,\delta T. \tag{42.9}$$





Formulae (42.6), (42.7) and (42.9) are identical in character with thermodynamic relations. They establish very general and often not apparent relations between processes and the measurable quantities characterizing them; in the case given, $\delta p$, $\delta T$ and $E^*$. However, the values of these quantities must be determined experimentally or calculated by the formulae of kinetic theory. In particular, the value of $E^*$ can easily be calculated for an ideal gas. It is easily seen that for an ideal gas $E^* = h$, so that in this case $\delta p = 0$ for $\delta T \neq 0$.
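The two particular cases can be illustrated by a short numerical sketch. It assumes the forces in the form $X_E = -\delta T/T^2$, $X_N = -v\,\delta p/T + h\,\delta T/T^2$ and symmetric kinetic coefficients, so that the energy of transfer is $E^* = L_{12}/L_{22}$; the function names are hypothetical and any numerical values of the coefficients are arbitrary.

```python
def fluxes(L11, L12, L22, h, v, T, dT, dp):
    """Energy and particle fluxes from the linear laws with L12 = L21 (assumed form)."""
    X_E = -dT / T**2                      # force conjugate to the energy flux
    X_N = -v * dp / T + h * dT / T**2     # force conjugate to the particle flux
    I_E = L11 * X_E + L12 * X_N
    I_N = L12 * X_E + L22 * X_N
    return I_E, I_N


def thermomolecular_ratio(L12, L22, h, v, T):
    """dp/dT established in the stationary state without particle transport (I_N = 0)."""
    return (h - L12 / L22) / (v * T)
```

With these definitions, choosing `dp = thermomolecular_ratio(...) * dT` makes the particle flux vanish, setting `dT = 0` gives `I_E / I_N = L12 / L22`, and the choice `L12 = h * L22` (the ideal-gas case $E^* = h$) makes the thermomolecular ratio zero.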

For systems for which the value of $E^*$ cannot be calculated, it must be determined experimentally. Experiment completely confirms relation (42.9). We note that the determination of forces and fluxes carried out above is not single-valued.

It is clear that expression (42.2a) can be resolved not only in the way it was done in formula (42.3) but also in other ways. It can easily be shown†, however, that this will not affect the final formulae, in particular (42.9).

† See S. de Groot, Thermodynamics of irreversible processes (North-Holland, Amsterdam, 1952). In presenting the thermodynamics of irreversible processes we follow this monograph.

Let us consider stationary non-equilibrium processes in a certain closed system. In stationary processes the state functions of the system do not depend explicitly on time. It turns out that the entropy production in stationary processes has a minimum value. In order not to complicate our formulae in proving this statement, we shall confine ourselves to the case of two cross processes. The entropy production in a body can be written in the form

$$P = \int \dot S\,dV = \int \sum_i I_i X_i\,dV = \int \sum_{i,k} L_{ik}X_iX_k\,dV = \int \left(L_{11}X_1^2 + L_{12}X_1X_2 + L_{21}X_2X_1 + L_{22}X_2^2\right)dV = \int \left(L_{11}X_1^2 + 2L_{12}X_1X_2 + L_{22}X_2^2\right)dV. \tag{42.10}$$

Substituting the values (42.3) for $X_1$ and $X_2$, we have

$$P = \int \left\{L_{11}\left[\nabla\!\left(\frac1T\right)\right]^2 - 2L_{12}\left[\nabla\!\left(\frac1T\right)\right]\cdot\left[\nabla\!\left(\frac{\mu}{T}\right)\right] + L_{22}\left[\nabla\!\left(\frac{\mu}{T}\right)\right]^2\right\}dV.$$

The condition for minimum entropy production reads

$$\delta P = \delta\int \dot S\,dV = 0.$$

The variable quantities are the forces $X_1 = \nabla(1/T)$ and $X_2 = -\nabla(\mu/T)$. Calculation of the variation gives

$$\left[L_{11}\nabla^2\!\left(\frac1T\right) - L_{12}\nabla^2\!\left(\frac{\mu}{T}\right)\right]\delta\!\left(\frac1T\right) = 0, \qquad \left[L_{12}\nabla^2\!\left(\frac1T\right) - L_{22}\nabla^2\!\left(\frac{\mu}{T}\right)\right]\delta\!\left(\frac{\mu}{T}\right) = 0, \tag{42.11}$$

or

$$\nabla\cdot\left\{L_{11}\nabla\!\left(\frac1T\right) - L_{12}\nabla\!\left(\frac{\mu}{T}\right)\right\} = 0, \qquad \nabla\cdot\left\{L_{12}\nabla\!\left(\frac1T\right) - L_{22}\nabla\!\left(\frac{\mu}{T}\right)\right\} = 0. \tag{42.12}$$

Formulae (42.11) and (42.12) express the conditions for stationarity. Thus in a stationary state, entropy production has a minimum value.

We stress that the signs in the conditions for symmetry are essential. Hence the proof given is not valid in the presence of a magnetic field.
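The statement can be illustrated on a discrete analogue with two forces, which is a simplification of the field-theoretic argument above and not the derivation given in the text. Assume the entropy production $\sigma = L_{11}X_1^2 + 2L_{12}X_1X_2 + L_{22}X_2^2$ with $L_{12} = L_{21}$ and $L_{22} > 0$, and let $X_1$ be held fixed by external conditions. Minimizing $\sigma$ with respect to the free force $X_2$ then gives exactly the state in which the conjugate flux $I_2 = L_{12}X_1 + L_{22}X_2$ vanishes, i.e. the stationary state. The names below are hypothetical.

```python
def entropy_production(L11, L12, L22, X1, X2):
    """sigma = L11*X1^2 + 2*L12*X1*X2 + L22*X2^2 (symmetric coefficients assumed)."""
    return L11 * X1**2 + 2 * L12 * X1 * X2 + L22 * X2**2


def minimizing_force(L12, L22, X1):
    """Value of X2 at which d(sigma)/dX2 = 0 for fixed X1 (requires L22 > 0)."""
    return -L12 * X1 / L22


def conjugate_flux(L12, L22, X1, X2):
    """Flux I_2 = L21*X1 + L22*X2, written with L21 = L12."""
    return L12 * X1 + L22 * X2
```

If the coefficients were not symmetric, the derivative of $\sigma$ would contain $L_{12} + L_{21}$ while the flux contains only $L_{21}$, and the minimum would no longer coincide with the stationary state.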

Among the other general consequences of Onsager’s theory we mention 
the proof of the existence of a dissipative function for mechanical systems 
undergoing slow motion. 




§43. Non-equilibrium processes in many-component systems (diffusion, 
thermodiffusion, thermoelectric effects) 


As another example, let us consider the phenomenon of thermal diffusion. 
In contrast to the preceding example, we now have to consider a many- 
component system whose properties depend on spatial coordinates. We shall 
confine ourselves to the case of chemically non-reacting components, and for 
simplicity of our formulae we shall assume the mixture to be two-component 
and isotropic. If, in the system, there are temperature and concentration 
gradients, then heat and diffusion flows as well as a cross flow, thermal dif- 
fusion, will arise in it. 

We have already considered thermal diffusion in §24 in the case of a mix- 
ture of ideal gases. Comparison of the results of calculations by means of the 
kinetic theory of gases and the theory of irreversible processes is instructive. 

We shall make use of the definitions of forces and fluxes of formulae 
(40.4) and (42.3). 

It should be stressed that, in contrast to the first example, in a system 
having properties varying continuously in space the motion of particles is al- 
ways accompanied by the appearance of mass flow. The total mass flow must 
be equal to zero in a stationary state of the system. 

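Let $\mathbf I_1$ and $\mathbf I_2$ be the fluxes of the two components:

$$\mathbf I_1 = L_{11}\mathbf X_1 + L_{12}\mathbf X_2 + L_{1q}\mathbf X_q, \qquad \mathbf I_2 = L_{21}\mathbf X_1 + L_{22}\mathbf X_2 + L_{2q}\mathbf X_q,$$

where the forces $\mathbf X_1$, $\mathbf X_2$ and $\mathbf X_q$ are given by formulae identical with (42.3) for each component:

$$\mathbf X_k = -\nabla\!\left(\frac{\mu_k}{T}\right), \qquad \mathbf X_q = \nabla\!\left(\frac1T\right).$$

The equality

$$\mathbf I_1 + \mathbf I_2 = 0$$

for arbitrary values of forces gives

$$L_{11} = -L_{21}; \qquad L_{12} = -L_{22}; \qquad L_{1q} = -L_{2q}.$$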


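Hence, since $L_{12} = L_{21}$ by Onsager's relation,

$$\mathbf I_1 = -L_{11}\nabla\!\left(\frac{\mu_1 - \mu_2}{T}\right) + L_{1q}\nabla\!\left(\frac1T\right) = -L_{11}\,\frac{\nabla(\mu_1 - \mu_2)}{T} + \left\{L_{1q} - L_{11}(\mu_1 - \mu_2)\right\}\nabla\!\left(\frac1T\right).$$

From the basic thermodynamic equality we have

$$\nabla(\mu_1 - \mu_2) = \frac{\partial(\mu_1 - \mu_2)}{\partial T}\,\nabla T + \frac{\partial(\mu_1 - \mu_2)}{\partial p}\,\nabla p + \frac{\partial(\mu_1 - \mu_2)}{\partial c_1}\,\nabla c_1.$$

In the presence of mechanical equilibrium $\nabla p = 0$. Making use of the Gibbs–Duhem relation

$$c_1\,d\mu_1 + c_2\,d\mu_2 = 0,$$

we finally find

$$\frac1T\,\nabla(\mu_1 - \mu_2) = -(s_1 - s_2)\,\frac{\nabla T}{T} + \frac{1}{c_2T}\left(\frac{\partial\mu_1}{\partial c_1}\right)\nabla c_1 = (s_1 - s_2)\,T\,\nabla\!\left(\frac1T\right) + \frac{1}{c_2T}\left(\frac{\partial\mu_1}{\partial c_1}\right)\nabla c_1.$$

Then for the flux $\mathbf I_1$ we obtain

$$\mathbf I_1 = -\frac{L_{11}}{c_2T}\left(\frac{\partial\mu_1}{\partial c_1}\right)\nabla c_1 - \frac{L_{1q} - L_{11}(h_1 - h_2)}{T^2}\,\nabla T. \tag{43.1}$$

If we introduce the coefficient of diffusion

$$D_{12} = \frac{L_{11}}{c_2T}\,\frac{\partial\mu_1}{\partial c_1} \tag{43.2}$$

and the coefficient of thermal diffusion

$$D_T = \frac{L_{1q} - L_{11}(h_1 - h_2)}{T^2(1 - c_1)c_1}, \tag{43.3}$$

then (43.1) gives

$$\mathbf I_1 = -D_{12}\nabla c_1 - D_T\,c_1(1 - c_1)\nabla T. \tag{43.4}$$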


The sign of the coefficients of diffusion and thermal diffusion is defined by 
the requirement that the rise of entropy be positive. The relations obtained 
can be applied to any binary mixtures in the isotropic (liquid or gaseous) 
phase, for any values of the concentration, but for small values of tempera- 
ture and concentration gradients. Similarly one can find expressions for the 
flux in the presence of external forces (for example, when there is ion migra- 
tion in an electric field, viscosity, or chemical reactions between the compo- 
nents). 

Generalization to the case of many-component systems offers no difficulty. 

We stress that in our derivation we have assumed that there is no thermo- 
mechanical effect in a system with continuously varying parameters (i.e. that 
$\nabla p = 0$). This constancy of pressure always exists when there is no mean flux of matter, $\mathbf I_1 + \mathbf I_2 = 0$, and the system is described by macroscopic laws. Comparing the result obtained with the conclusions of §24, we see clearly that their interrelation is the same as that between the results of thermodynamic and statistical descriptions. From the kinetic theory of gases the molecular meaning of all the quantities was found, but only for the simplest case of an ideal gas of small concentration ($c_1 \ll 1$). In the thermodynamics of irreversible processes one obtains very general relations, but they contain constants
whose meaning and values remain unknown and which must be determined 
from experimental data. 

As the last and perhaps the least trivial example of the application of the thermodynamics of irreversible processes, we cite the theory of thermoelectric phenomena. Let us consider phenomena arising in a thermocouple made of two different metallic conductors whose junctions are maintained at different temperatures $T_1$ and $T_2$. An e.m.f. is applied to the circuit, so that there are a heat flux and an electric current in the system. We shall assume that the processes are of a stationary character. We can write for the fluxes the general expressions

$$I_e = L_{11}\,\delta\!\left(\frac{\mu}{T}\right) + L_{12}\,\delta\!\left(\frac1T\right), \tag{43.5}$$
$$I_E = L_{21}\,\delta\!\left(\frac{\mu}{T}\right) + L_{22}\,\delta\!\left(\frac1T\right). \tag{43.6}$$


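In an electric field (see §59 of Part III)

$$\mu = \mu_0 + e\varphi,$$

where $\varphi$ is the field potential. Therefore (43.5) and (43.6) are conveniently written in the form

$$I_e = -L'_{11}\,\frac{\delta\varphi}{T} - L'_{12}\,\frac{\delta T}{T^2}, \tag{43.7}$$
$$I_E = -L'_{21}\,\frac{\delta\varphi}{T} - L_{22}\,\frac{\delta T}{T^2}, \tag{43.8}$$

where $\delta\varphi$ is the e.m.f., and $\delta T = T_1 - T_2$.
Onsager's relation reads

$$L'_{12} = L'_{21}.$$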


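The coefficients in eqs. (43.7) and (43.8) can be expressed in terms of the electrical conductivity $\sigma$ and the thermal conductivity $\lambda$. Namely, setting $\delta T = 0$ we find

$$I_e = -L'_{11}\,\frac{\delta\varphi}{T} = \sigma\,\delta\varphi, \tag{43.9}$$

so that $\sigma = -L'_{11}/T$. Setting $I_e = 0$, $\delta T \neq 0$, we have

$$0 = \sigma\,\delta\varphi - L'_{12}\,\frac{\delta T}{T^2}, \tag{43.10}$$
$$I_E = -L'_{12}\,\frac{\delta\varphi}{T} - L_{22}\,\frac{\delta T}{T^2}. \tag{43.11}$$

Hence

$$\lambda = \frac{(L'_{12})^2 - L'_{11}L_{22}}{L'_{11}T^2}.$$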


Consider the particular cases of formulae (43.9) and (43.11). Let the thermojunction be heated in an open circuit ($I_e = 0$). Then a thermoelectromotive force

$$\delta\varphi = -\frac{L'_{12}}{L'_{11}T}\,\delta T \tag{43.12}$$

arises in the circuit. The appearance of a potential difference in the open circuit is called the Seebeck effect. When current flows between different conductors under isothermal conditions ($\delta T = 0$), energy transport occurs and a certain amount of heat, called the Peltier heat, is released. Setting $\delta T = 0$ in (43.9) and (43.11) we obtain

$$\frac{I_E}{I_e} = \frac{L'_{12}}{L'_{11}} = \Pi_{12}, \tag{43.13}$$

where $\Pi_{12}$ is the heat released when the current $I_e = 1$ flows.
Comparing (43.13) and (43.12) and making use of Onsager's reciprocity relation, we find

$$\frac{\delta\varphi}{\delta T} = -\frac{\Pi_{12}}{T}. \tag{43.14}$$
This formula is called the second Thomson relation. It contains only quanti- 
ties which may be directly measured and is in good agreement with experi- 
ment. It should be noted that for its derivation it is necessary to use the reci- 
procity relation. In the book by de Groot mentioned above it is shown very 
clearly how the implicit use of this relation allowed Thomson to obtain for- 
mula (43.14) from thermodynamic considerations which are inapplicable ex- 
plicitly to the phenomena in question. 
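The chain of reasoning leading to the second Thomson relation can be followed numerically. The sketch below assumes the linear relations in the form $I_e = -L'_{11}\delta\varphi/T - L'_{12}\delta T/T^2$ and $I_E = -L'_{12}\delta\varphi/T - L_{22}\delta T/T^2$, with the reciprocity relation already built in through the single coefficient $L'_{12}$; the function names are illustrative and the numerical values of the coefficients are arbitrary.

```python
def currents(L11p, L12p, L22, T, dphi, dT):
    """Electric current I_e and energy flux I_E from the assumed linear relations."""
    I_e = -L11p * dphi / T - L12p * dT / T**2
    I_E = -L12p * dphi / T - L22 * dT / T**2
    return I_e, I_E


def seebeck_coefficient(L11p, L12p, T):
    """dphi/dT in an open circuit (I_e = 0)."""
    return -L12p / (L11p * T)


def peltier_coefficient(L11p, L12p):
    """Energy carried per unit current under isothermal conditions (dT = 0)."""
    return L12p / L11p
```

For any choice of coefficients, `seebeck_coefficient(...)` equals `-peltier_coefficient(...) / T`. If two different cross coefficients were used in the two fluxes, the two quantities would be unrelated, which is why the reciprocity relation is indispensable for the derivation.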

In conclusion, we note that the fact that thermal conductivity is an essentially positive quantity requires that the inequality $L'_{11}L_{22} - (L'_{12})^2 > 0$ be fulfilled. The proof of this inequality is given in the book by de Groot.


§44. The fluctuation-dissipation theorem 


In §29 of Part IV we have already discussed the fluctuation-dissipation 
theorem and pointed out that the region of applicability of this theorem is 
much greater than follows from the proof given there. The quantum- 
mechanical derivation of the fluctuation theorem which we are going to 
develop shows that the theorem is applicable for frequencies and tempera- 
tures where quantum effects assume a fundamental importance. 

There are many different ways in which the fluctuation-dissipation theo- 
rem can be proved. We have chosen the simplest and most direct of them*. 

Let a system of particles in a reservoir be acted upon by a dynamic per- 
turbation (an external force) varying in time according to a harmonic law. 

For concreteness we assume that the perturbing force is

$$\mathbf F = e\mathbf E_0\,e^{i\omega t}, \tag{44.1}$$

where $\mathbf E_0$ is a uniform electric field. To the perturbation (44.1) there corresponds a term in the Hamiltonian of the form

$$\hat H' = e\mathbf E_0\cdot\mathbf r\,e^{i\omega t}. \tag{44.2}$$

It should be stressed that the subsequent proof is valid for any pair of quantities in the Hamiltonian for which the perturbation is of the form

$$\hat H' = \hat\xi\,F(t). \tag{44.3}$$

Here $\hat\xi$ is the operator corresponding to a certain parameter $\xi$. As to the assumptions that the field is uniform and varies according to a harmonic law, one can always expand (44.3) in a Fourier integral and consider the action of each harmonic. Hence we do not actually reduce the generality when assuming a perturbation of the form (44.3).

The time-dependent perturbation $\hat H'$ gives rise to transitions in the system. As a result of this, energy is absorbed in the system, being dissipated into heat.

* L.D. Landau and E.M. Lifshitz, Statistical physics (Pergamon Press, Oxford, 1958).

The energy absorbed per unit time, i.e. the absorbed power, can be written in the form

$$Q = \Big\langle \sum_{m>n}\hbar\omega_{mn}W_{mn} - \sum_{m<n}\hbar\omega_{nm}W_{mn}\Big\rangle, \tag{44.4}$$

where $W_{mn}$ is the probability per unit time of the transition $n \to m$ and $\hbar\omega_{mn} = \varepsilon_m - \varepsilon_n$, the states being numbered in order of increasing energy. The first term represents the absorbed power, and the second, the emitted power. The averaging is carried out over the equilibrium Gibbs distribution. Since the probabilities of direct transitions are equal to those of inverse transitions and are given by formula (56.8) of Part V, we obtain


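$$Q = \frac{\pi}{2\hbar}\sum_m |H'_{mn}|^2\,\omega_{mn}\left[\delta(\omega - \omega_{mn}) + \delta(\omega + \omega_{mn})\right] = \frac{\pi e^2E_0^2}{2\hbar}\sum_m |r_{mn}|^2\,\omega_{mn}\left[\delta(\omega - \omega_{mn}) + \delta(\omega + \omega_{mn})\right]$$

for a system initially in the state $n$; here $\hbar\omega_{mn} = \varepsilon_m - \varepsilon_n$ and $r_{mn}$ is the matrix element of the coordinate along the field.

The Gibbs statistical averaging gives

$$Q = \frac{\pi e^2E_0^2}{2\hbar Z}\left\{\sum_n\sum_m \exp(-\varepsilon_n/kT)\,|r_{mn}|^2\,\omega_{mn}\,\delta(\omega - \omega_{mn}) + \sum_n\sum_m \exp(-\varepsilon_n/kT)\,|r_{mn}|^2\,\omega_{mn}\,\delta(\omega + \omega_{mn})\right\}.$$

Interchanging the summation indices in the second term, we get

$$Q = \frac{\pi e^2E_0^2}{2\hbar Z}\sum_n\sum_m |r_{mn}|^2\,\omega_{mn}\,\delta(\omega - \omega_{mn})\left\{\exp(-\varepsilon_n/kT) - \exp(-\varepsilon_m/kT)\right\} = \frac{\pi e^2E_0^2\,\omega}{2\hbar Z}\left(1 - e^{-\hbar\omega/kT}\right)\sum_n\sum_m \exp(-\varepsilon_n/kT)\,|r_{mn}|^2\,\delta(\omega - \omega_{mn}). \tag{44.5}$$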


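Instead of the dissipated power $Q$, one often finds the imaginary part of the dielectric susceptibility, related to the dissipated power by formula (31.38) of Part IV:

$$Q = \frac{\omega\,\varepsilon^{\rm Im}}{8\pi}\,E_0^2. \tag{44.6}$$

From (44.5) and (44.6) we obtain

$$\varepsilon^{\rm Im}(\omega) = \frac{4\pi^2e^2}{\hbar Z}\left(1 - e^{-\hbar\omega/kT}\right)\sum_n\sum_m \exp(-\varepsilon_n/kT)\,|r_{mn}|^2\,\delta(\omega - \omega_{mn}). \tag{44.7}$$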


We have supplied the imaginary part $\varepsilon^{\rm Im}$ with the argument $\omega$ in order to emphasize its frequency dependence. It turns out that the quantities $\varepsilon^{\rm Im}(\omega)$ or $Q$ can be expressed in terms of the time correlation function of the operator $\hat r$, more precisely in terms of the Fourier component of this function. The classical definition of the correlation function must be somewhat modified, since the values of the operator in the Heisenberg representation, taken at different instants of time, are in general not commutative and $\hat r(t)\,\hat r(t + \tau) \neq \hat r(t + \tau)\,\hat r(t)$. On the other hand, the times $t$ and $t + \tau$ are completely equivalent. Therefore it is natural to define the correlation function of the time-dependent quantum-mechanical operator $\hat r$ as the symmetrized product $\hat r(t)\,\hat r(t + \tau)$ averaged after Gibbs, i.e.

$$\tfrac12\left\langle \hat r(t)\,\hat r(t + \tau) + \hat r(t + \tau)\,\hat r(t)\right\rangle = \tfrac12\,{\rm Tr}\left\{\hat\rho_0\left[\hat r(t)\,\hat r(t + \tau) + \hat r(t + \tau)\,\hat r(t)\right]\right\} = \frac{1}{2Z}\,{\rm Tr}\left\{\exp(-\hat H_0/kT)\left[\hat r(t)\,\hat r(t + \tau) + \hat r(t + \tau)\,\hat r(t)\right]\right\}.$$

The time dependence of the operator is given by formula (49.8) of Part V. Hence

$$\left[\hat r(t)\,\hat r(t + \tau)\right]_{nn} = \sum_m \left(e^{i\hat H_0t/\hbar}\,\hat r\,e^{-i\hat H_0t/\hbar}\right)_{nm}\left(e^{i\hat H_0(t + \tau)/\hbar}\,\hat r\,e^{-i\hat H_0(t + \tau)/\hbar}\right)_{mn} = \sum_m \exp(i\omega_{mn}\tau)\,|r_{mn}|^2.$$
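Thus one can write

$$\tfrac12\left\langle \hat r(t)\,\hat r(t + \tau) + \hat r(t + \tau)\,\hat r(t)\right\rangle = \frac{1}{2Z}\sum_n\sum_m \exp(-\varepsilon_n/kT)\,|r_{mn}|^2\left(\exp(i\omega_{mn}\tau) + \exp(i\omega_{nm}\tau)\right). \tag{44.8}$$

In order to relate the correlation function directly to the dissipated power, one has to eliminate factors of the type $\exp(i\omega_{mn}\tau)$ in the last sum. For this we pass from the correlation function to its Fourier component.

According to the Wiener–Khinchin theorem (see §5 of Part III), we have

$$g(\omega) = \frac{1}{2\pi}\int_{-\infty}^{\infty}\tfrac12\left\langle \hat r(t)\,\hat r(t + \tau) + \hat r(t + \tau)\,\hat r(t)\right\rangle e^{-i\omega\tau}\,d\tau = \frac{1}{4\pi Z}\sum_n\sum_m \exp(-\varepsilon_n/kT)\,|r_{mn}|^2\int_{-\infty}^{\infty}\left\{\exp[i(\omega_{mn} - \omega)\tau] + \exp[i(\omega_{nm} - \omega)\tau]\right\}d\tau = \frac{1}{2Z}\sum_n\sum_m \exp(-\varepsilon_n/kT)\,|r_{mn}|^2\left[\delta(\omega - \omega_{mn}) + \delta(\omega - \omega_{nm})\right].$$

Interchanging the summation indices in the second term, we finally obtain

$$g(\omega) = \frac{1}{2Z}\sum_n\sum_m \exp(-\varepsilon_n/kT)\left(1 + \exp[-(\varepsilon_m - \varepsilon_n)/kT]\right)|r_{mn}|^2\,\delta(\omega - \omega_{mn}) = \frac{1 + e^{-\hbar\omega/kT}}{2Z}\sum_n\sum_m \exp(-\varepsilon_n/kT)\,|r_{mn}|^2\,\delta(\omega - \omega_{mn}). \tag{44.9}$$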


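Comparing (44.7) and (44.9) we see that the following direct relation exists between $\varepsilon^{\rm Im}$ and $g(\omega)$:

$$\varepsilon^{\rm Im}(\omega) = \frac{8\pi^2e^2}{\hbar}\,\frac{1 - e^{-\hbar\omega/kT}}{1 + e^{-\hbar\omega/kT}}\,g(\omega) = \frac{8\pi^2e^2}{\hbar}\,\tanh\!\left(\frac{\hbar\omega}{2kT}\right)g(\omega). \tag{44.10}$$

Formula (44.10) represents the quantum generalization of Nyquist's formula. In the classical limit $\hbar\omega \ll kT$ it goes over into the classical Nyquist formula.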
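The temperature-dependent factor connecting the two quantities is easy to examine numerically. The sketch below, with a hypothetical function name, checks the identity $(1 - e^{-x})/(1 + e^{-x}) = \tanh(x/2)$ for $x = \hbar\omega/kT$ and shows that for $x \ll 1$ the factor reduces to $x/2 = \hbar\omega/2kT$, so that Planck's constant cancels against the prefactor $1/\hbar$ and a purely classical relation remains.

```python
import math


def quantum_factor(x):
    """(1 - exp(-x)) / (1 + exp(-x)) with x = hbar*omega/(k*T)."""
    return (1.0 - math.exp(-x)) / (1.0 + math.exp(-x))


for x in (1e-4, 0.1, 1.0, 5.0):
    # Columns: x, the factor, tanh(x/2), and the classical approximation x/2.
    print(x, quantum_factor(x), math.tanh(x / 2), x / 2)
```

For large $x$ the factor tends to unity, which corresponds to the region where quantum effects dominate.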

We recall that in the case of thermodynamic fluctuations the fluctuation-dissipation theorem allows one to relate the spectral density of the fluctuations to the directly measured quantity $\varepsilon^{\rm Im}(\omega)$.





Solid-state Theory 


§45. A solid body as a quantum-mechanical system 


Solid-state theory turns out to be one of those fields of physics where the 
application of quantum-mechanical concepts has proved to be particularly 
fruitful. Only on this basis has it been possible to formulate the principles of 
solid-state theory, to create the theory of the kinetic and equilibrium proper- 
ties of metals, semiconductors and dielectrics, to understand the essential dif- 
ferences between them, and to explain numerous and peculiar phenomena in 
solid bodies which for a long time seemed paradoxical (for example, super- 
conductivity or ferromagnetism). 

At the present time solid-state quantum theory has reached a stage of 
development such that it allows one to predict new, detailed and original 
phenomena in solid bodies. 

At first sight the fact that quantum effects can manifest themselves in 
solid bodies, which are macroscopic objects, appears to be paradoxical. How- 
ever, it should be recalled that every monocrystal represents in essence a 
gigantic molecule (see §49 of Part III). Hence the thermal, electrical and 
other properties of solid bodies are underlaid by quantum effects. We saw this 
in chapter 7 of Part III, where some qualitative foundations of crystal-lattice 
theory were presented. In this book we shall naturally confine ourselves to 
the presentation of the general principles of solid-state quantum theory. 




For a more detailed acquaintance with the subject, the reader is referred to 
more specialized textbooks*. 

Any macroscopic solid body represents a system of an enormous number of particles interacting strongly with each other. It is therefore clear that for the construction of solid-state theory it is necessary not only to combine quantum-mechanical and statistical descriptions, but also essentially to simplify and schematize the picture of the interactions existing between the particles of the solid body. This means that it is necessary to create a solid-body model which is sufficiently adequate for the object and yields its basic features, but does not take into account minor and inessential details.

Our further considerations will be based on the following model: a solid 
body represents a set of ions and valence electrons. By ions we understand 
atomic nuclei together with all the electrons in closed shells. The interaction 
of the electrons of the closed shells with the nucleus is so strong that the 
proximity of other atoms and the crystal form have no significant effect on it. 

We shall abstract from the internal structure of ions and assume that the electrons of the closed shells of each ion interact only with their own nucleus.
We shall consider ions to be point particles possessing masses M (identical in 
the case of elementary substances and different in the case of chemical com- 
pounds). 

It is clear, however, that when ions draw together to distances of the order 
of their own size, the valence electrons of a given atom enter into a strong in- 
teraction with neighbouring nuclei and their electron shells. This interaction 
ensures the chemical binding between ions, which in the case of crystals is 
called the cohesive force. Hence valence electrons cannot be considered to be 
localized in a given atom, and in certain cases they may move throughout the 
crystal. The presence of valence electrons is not a necessity for a solid body. 
For example, they are absent in crystals of elements of the zeroth group of 
the periodic table. In such crystals the binding between the atoms forming 
the lattice arises from the van der Waals force. In covalent crystals, there are 
not only ions at lattice points but also neutral atoms. Therefore, instead of 
speaking of ions, one frequently speaks of nuclei at crystal lattice points. 
However, in the overwhelming majority of phenomena occurring in solid 
bodies, electrons play a fundamental role. Therefore we shall consider the 
most general case where the crystal contains ions and valence electrons. 


* We particularly recommend: J.M.Ziman, Principles of the theory of solids (Cam- 
bridge University Press, Cambridge, 1964); R.Peierls, Quantum theory of solids (Claren- 
don Press, Oxford, 1955); C.Kittel, Quantum theory of solids (Wiley, New York, 1963). 




Let $\mathbf R_i$ denote the coordinates of the ions (nuclei), and $\mathbf r_k$ the coordinates of the valence electrons. Then the Hamiltonian of the system of ions and electrons can be written in the form

$$\hat H = -\sum_i \frac{\hbar^2}{2M}\nabla_i^2 - \sum_k \frac{\hbar^2}{2m}\nabla_k^2 + \frac12\sum_{k\neq l}\frac{e^2}{|\mathbf r_k - \mathbf r_l|} + \sum_{i>j} U(\mathbf R_i - \mathbf R_j) + \sum_{i,k} U(\mathbf r_k - \mathbf R_i). \tag{45.1}$$

The first sum (over all the ions) represents the kinetic energy of the ions. The second sum gives the kinetic energy of all the valence electrons. The three last terms describe respectively the Coulomb interaction between the electrons, the interaction between the ions, and the interaction between the ions and the electrons. These interactions depend on the distance between the corresponding particles. In the first two interaction sums the particle indices must be different, while in the third interaction sum they may be the same and the factor ½ is absent.

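Solid-state theory is based on the assumption that it is possible to use adiabatic perturbation theory. The set of valence electrons is considered to be a fast subsystem, and the set of nuclei and associated closed-shell electrons a slow subsystem. In this approximation the total wave function of the system, $\Psi(\mathbf r_1, \mathbf r_2, \dots; \mathbf R_1, \mathbf R_2, \dots)$, can be written in the form

$$\Psi(\mathbf r_1, \mathbf r_2, \dots; \mathbf R_1, \mathbf R_2, \dots) = \Psi_i(\mathbf R_1, \mathbf R_2, \dots, \mathbf R_N)\,\Psi_e(\mathbf r_1, \mathbf r_2, \dots; \mathbf R_1, \mathbf R_2, \dots, \mathbf R_N), \tag{45.2}$$

where $\Psi_i$ is the wave function of the system of ions, and $\Psi_e$ is the wave function of the system of electrons. In the adiabatic approximation these wave functions satisfy the equations (see (57.7) and (57.8) of Part V)

$$\left[-\frac{\hbar^2}{2m}\sum_k \nabla_k^2 + \frac12\sum_{k\neq l}\frac{e^2}{|\mathbf r_k - \mathbf r_l|} + \sum_{i,k} U(\mathbf r_k - \mathbf R_i)\right]\Psi_e = E_e(\mathbf R_1, \mathbf R_2, \dots, \mathbf R_N)\,\Psi_e, \tag{45.3}$$

$$\left[-\frac{\hbar^2}{2M}\sum_i \nabla_i^2 + \sum_{i>j} U(\mathbf R_i - \mathbf R_j) + E_e(\mathbf R_1, \mathbf R_2, \dots, \mathbf R_N)\right]\Psi_i = E\,\Psi_i. \tag{45.4}$$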







In this approximation the equation for the wave function of the fast subsys- 
tem does not involve the momenta (derivatives with respect to coordinates) 
of particles of the slow subsystem. This means that the electrons move relative 
to given positions of the heavy ions. To any change of the latter there corres- 
ponds a redistribution of electrons which always manage to readjust them- 
selves to the changed arrangement of ions. Hence the potential energy of in- 
teraction between the electrons and ions involves the coordinates of the ions 
only as parameters. The energy of the valence electrons is in its turn a con- 
stituent part of the potential energy of the system of ions. 

The application of the adiabatic approximation to the treatment of solids 
has been recently substantiated and the limits of its applicability determined. 

In the adiabatic approximation, one succeeds to a certain degree in separating the motion of the electrons and ions. Knowing the energy of interaction of the electrons and ions and considering the positions of the ions to be given, one can in principle solve eq. (45.3) and find the energy $E_e$ of the system of electrons, which depends, of course, on the positions of the ions. This value of $E_e$ must then be used in solving eq. (45.4).

It is clear, however, that the actual carrying out of this program is associated with insuperable difficulties, since each of eqs. (45.3) and (45.4) contains a macroscopically large number of variables. Therefore modern solid-state theory is associated with the consideration of simplified models, reflecting the most important features of the behaviour of real solids.


§46. The crystal lattice 


We begin with the discussion of the properties of the slow subsystem, i.e. 
with the properties of the system of ions. Eq. (45.4) completely defines the 
behaviour of this system. Its solution must, first of all, give an answer to the 
question as to how a solid body, i.e. a body with a regular arrangement of 
heavy particles at crystal lattice points, is formed. It is clear, however, that the exact solution of this equation is out of the question. Hence the problem of calculating the cohesive forces keeping the crystal together, and of elucidating the role played in them by van der Waals and exchange forces as well as by the valence electrons existing in the system, is the most difficult in solid-state theory. It cannot as yet be considered solved, although very important relevant results have been obtained.

We cannot discuss this problem here, and shall confine ourselves only to qualitative considerations. It is clear, first of all, that the interaction between ions (or atoms in the case of covalent crystals) ensures, in principle, the existence of a crystal structure. To the minimum energy of the system there corresponds their regular arrangement in space at certain distances from one another. In the case of covalent crystals an exchange interaction, which gives rise to attraction at large distances and very strong repulsion at small distances, takes place between atoms. In the case of ionic crystals, the attraction is of an electrostatic character, whereas the repulsion is due to a complex interaction between atomic residues which arises at small distances. In metals, a substantial contribution to the cohesive forces is given by valence electrons which considerably weaken the repulsion between ions. As a result of the interaction, atoms or ions are distributed at distances corresponding to the minimum energy. These distances are similar to those between atoms in molecules (1–2 Å). Correspondingly, the cohesive energy per atom is of the same order of magnitude as the energy of chemical binding, and varies from 1 eV (for metals) up to 10 eV (for ionic crystals of the sodium chloride type) per particle.

The regular arrangement of ions or atoms at crystal lattice points plays a 
very important role and determines many physical properties of solids. There- 
fore in what follows we shall need the elements of crystal lattice geometry. 
The basic geometric property of a crystal lattice is its translational symmetry. 
We shall neglect the finite size of the crystal and shall assume that it fills all 
space. The property of translational symmetry allows one to introduce the 
three basis vectors $\mathbf a$, $\mathbf b$, $\mathbf c$. When the crystal is translated in space by the vector

$$\mathbf n = n_1\mathbf a + n_2\mathbf b + n_3\mathbf c, \tag{46.1}$$

where $n_1$, $n_2$ and $n_3$ are integers, it coincides with itself (here it is assumed that the origin of the vector $\mathbf n$ coincides with one of the lattice points).

The basis vectors define the main crystallographic directions in which 
translations can be carried out. This can be formulated also in a somewhat 
different way. From the three basis vectors a,b,c we construct a paral- 
lelepiped. This parallelepiped is called a unit cell. Then the entire crystal of 
infinite size is obtained by infinite reproduction of unit cells. It should be 
noted that the choice of basis vectors is not itself quite unambiguous. This is 
most easily seen in fig. VI.6 in which a simple cubic lattice is shown. One can 
choose as basis vectors the three vectors a, b, c or the three vectors a’, b’, c’. 

We see that if the atoms of the lattice are of one type, then all the lattice points are obtained by the translation of one lattice point by the vector $\mathbf n$ for different values of the integers $n_1$, $n_2$ and $n_3$. In this case the unit cell is said to contain one atom (at a lattice point). Such a lattice is called a Bravais lattice.





Fig. VI.6. Simple cubic lattice with two possible choices of the basis vectors.


If the unit cell contains two or more atoms, then the crystal is said to have a lattice with a basis. A lattice with a basis represents a set of simple lattices displaced relative to one another. For example, a unit cell of the sodium chloride crystal contains a sodium atom and a chlorine atom. The sodium and chlorine atoms form simple cubic lattices shifted with respect to each other by one half of the cube edge. The magnitudes of the basis vectors are called lattice constants. The lattice constant represents the distance (in the given main crystallographic direction) between nearest neighbours.

We shall not dwell on the discussion of the possible symmetry elements 
and crystal classes. We only point out that translation transformations are not 
the only symmetry transformations which bring the crystal into correspond- 
ence with itself. Other such symmetry transformations are rotations and 
reflections. 

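In what follows we shall frequently encounter the concept of the reciprocal lattice. The reciprocal lattice is a lattice constructed on the basis vectors $\mathbf a_1$, $\mathbf b_1$ and $\mathbf c_1$ defined by the formulae

$$\mathbf a_1 = 2\pi\,\frac{\mathbf b\times\mathbf c}{\mathbf a\cdot(\mathbf b\times\mathbf c)}, \qquad \mathbf b_1 = 2\pi\,\frac{\mathbf c\times\mathbf a}{\mathbf a\cdot(\mathbf b\times\mathbf c)}, \qquad \mathbf c_1 = 2\pi\,\frac{\mathbf a\times\mathbf b}{\mathbf a\cdot(\mathbf b\times\mathbf c)}. \tag{46.2}$$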



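The reciprocal lattice vector $\mathbf K$ is connected with the basis vectors $\mathbf a_1$, $\mathbf b_1$ and $\mathbf c_1$ by the relation

$$\mathbf K = \mathbf a_1m_1 + \mathbf b_1m_2 + \mathbf c_1m_3, \tag{46.3}$$

where $m_1$, $m_2$ and $m_3$ are integers.
It is obvious that

$$\mathbf K\cdot\mathbf a = 2\pi m_1, \qquad \mathbf K\cdot\mathbf b = 2\pi m_2, \qquad \mathbf K\cdot\mathbf c = 2\pi m_3. \tag{46.4}$$

The importance of the reciprocal lattice vector is associated with the fact that the following equality always holds:

$$\mathbf K\cdot\mathbf n = 2\pi(n_1m_1 + n_2m_2 + n_3m_3) = 2\pi N, \tag{46.5}$$

where $N$ is an integer.
Because of this, the frequently encountered expression $e^{i\mathbf K\cdot\mathbf r}$ reduces to unity for $\mathbf r = \mathbf n$, i.e. when the vector $\mathbf r$ coincides with one of the lattice points:

$$e^{i\mathbf K\cdot\mathbf n} = e^{i2\pi N} = 1. \tag{46.6}$$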
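These relations can be verified directly for any non-degenerate set of basis vectors. The following sketch assumes the definitions $\mathbf a_1 = 2\pi(\mathbf b\times\mathbf c)/[\mathbf a\cdot(\mathbf b\times\mathbf c)]$ and its cyclic permutations; the helper names (`dot`, `cross`, `combine`, `reciprocal_basis`) are illustrative only.

```python
import math


def dot(u, w):
    """Scalar product of two 3-vectors."""
    return sum(x * y for x, y in zip(u, w))


def cross(u, w):
    """Vector product of two 3-vectors."""
    return (u[1] * w[2] - u[2] * w[1],
            u[2] * w[0] - u[0] * w[2],
            u[0] * w[1] - u[1] * w[0])


def reciprocal_basis(a, b, c):
    """Reciprocal basis vectors a1, b1, c1 built from the direct basis a, b, c."""
    volume = dot(a, cross(b, c))  # signed volume of the unit cell
    scale = 2 * math.pi / volume
    return (tuple(scale * x for x in cross(b, c)),
            tuple(scale * x for x in cross(c, a)),
            tuple(scale * x for x in cross(a, b)))


def combine(coefficients, vectors):
    """Integer combination k1*v1 + k2*v2 + k3*v3 of three vectors."""
    return tuple(sum(k * v[i] for k, v in zip(coefficients, vectors)) for i in range(3))
```

For example, with the non-orthogonal basis `a, b, c = (0, .5, .5), (.5, 0, .5), (.5, .5, 0)`, the vector `K = combine((2, -1, 3), reciprocal_basis(a, b, c))` and the lattice vector `n = combine((1, 4, -2), (a, b, c))` give `dot(K, n) / (2 * math.pi)` equal, to rounding accuracy, to the integer $n_1m_1 + n_2m_2 + n_3m_3$.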


§47. Lattice vibrations 


Ions or atoms located at crystal lattice points are in thermal motion and 
vibrate about equilibrium positions. In solid crystals at temperatures below 
the melting point the amplitude of these vibrations is small in comparison 
with the lattice constant. 

We denote the position of the lattice point (equilibrium position) of the
$n$th ion by $n$, and the displacement of the $n$th ion by $\xi_n$, so that

$$R_n = n + \xi_n. \tag{47.1}$$


Then proceeding in exactly the same way as we did in §50 of Part III in
the classical approximation, we can expand the potential energy of interac-
tion $U(R_n - R_{n'})$ in a series in powers of small displacements. We saw in §50
of Part III that the character of the motion of a lattice depends essentially on
its structure. The mere presence of a cell with a basis (for example, the
presence of two kinds of particles having different masses) gives rise to new
vibrational modes.







Since we are in the first place interested in the fundamental theoretical as- 
pects of the subject, we shall not consider complicated calculations and shall 
restrict ourselves to crystals with a simple Bravais lattice. Moreover, for sim- 
plicity of notation we shall drop the vector indices, as if the crystal were a 
one-dimensional chain. Then near the equilibrium positions one can write 
that 


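$$U(R_n - R_{n'}) = U(n - n') + \frac{1}{2}\sum_{n,n'}\frac{\partial^2 U}{\partial R_n\,\partial R_{n'}}\,\xi_n\xi_{n'} + \frac{1}{6}\sum_{n,n',n''}\frac{\partial^3 U}{\partial R_n\,\partial R_{n'}\,\partial R_{n''}}\,\xi_n\xi_{n'}\xi_{n''} + \dots \tag{47.2}$$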


Retaining only the first two terms of the expansion, the Hamiltonian can be 
written in the form 


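$$H = H_0 + \sum_n \frac{p_n^2}{2M} + \frac{1}{2}\sum_{n,n'} A_{nn'}\,\xi_n\xi_{n'}, \tag{47.3}$$

where $H_0$ involves all terms which do not depend on displacements, $M$ is the
mass of an ion, $p_n$ is its momentum, and $A_{nn'}$ denotes the second derivative of $U$
entering (47.2).
The equation of motion reads

$$M\ddot{\xi}_n = -\sum_{n'} A_{nn'}\,\xi_{n'}. \tag{47.4}$$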


The solutions of the equation of motion which satisfy the translational sym- 
metry conditions are the functions 


$$\xi_n \sim e^{i(fn - \omega t)}. \tag{47.5}$$


Substituting (47.5) into (47.4) gives 


$$M\omega^2 = \sum_{n'} A_{nn'}\,e^{if(n'-n)}. \tag{47.6}$$


We recall that the values of the wave number are determined from boundary
conditions. If conditions of periodicity over the length $L = Na$ are chosen to
be the boundary conditions, then

$$f = \frac{2\pi m}{aN}\qquad (m = 1, 2, \dots, N). \tag{47.7}$$

The wave number $f$ in the lattice is not determined uniquely but to within
the quantity $2\pi/a$; that is, the displacement does not change under the replace-
ment $f \to f + 2\pi/a$. We shall normalize the wave numbers to the interval

$$-\pi/a \le f < \pi/a. \tag{47.8}$$
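
The statements above can be illustrated on a small ring of ions. The following is a minimal sketch, assuming nearest-neighbour coupling only ($A_{nn} = 2\kappa$, $A_{n,n\pm1} = -\kappa$), for which (47.6) gives $M\omega^2 = 2\kappa(1-\cos fa)$; the coupling constant `KAPPA`, the numerical values and the function names are hypothetical and do not come from the text.

```python
import cmath
import math

M = 1.0        # ion mass (hypothetical units)
KAPPA = 1.0    # nearest-neighbour force constant (assumption of this sketch)
A_LAT = 1.0    # lattice constant a
N_IONS = 8     # number of ions on the ring, L = N a

def omega_squared(f):
    """Dispersion law following from (47.6) for nearest-neighbour coupling."""
    return (2 * KAPPA / M) * (1 - math.cos(f * A_LAT))

def displacement(f, n):
    """Spatial part of the solution (47.5) at the lattice point n*a."""
    return cmath.exp(1j * f * n * A_LAT)

def residual(f):
    """Largest violation of M w^2 xi_n = sum_n' A_nn' xi_n' over the ring."""
    w2 = omega_squared(f)
    worst = 0.0
    for n in range(N_IONS):
        right = (2 * KAPPA * displacement(f, n)
                 - KAPPA * displacement(f, (n + 1) % N_IONS)
                 - KAPPA * displacement(f, (n - 1) % N_IONS))
        worst = max(worst, abs(M * w2 * displacement(f, n) - right))
    return worst

def fold(f):
    """Reduce f to the interval (47.8), -pi/a <= f < pi/a."""
    g = 2 * math.pi / A_LAT
    return (f + g / 2) % g - g / 2

# Allowed wave numbers (47.7) for periodic boundary conditions.
allowed = [2 * math.pi * m / (A_LAT * N_IONS) for m in range(1, N_IONS + 1)]
residuals = [residual(f) for f in allowed]
folded = [fold(f) for f in allowed]
```

Every entry of `residuals` should vanish to rounding accuracy, and replacing $f$ by $f + 2\pi/a$ should leave `displacement(f, n)` unchanged for integer `n`, which is why the values can be reduced to the interval (47.8) without loss.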
In a one-dimensional chain the reciprocal lattice vector is equal to $2\pi/a$.

Without reproducing the calculations of §50 of Part III we pass to normal coordi-
nates, introducing the variables $p_f$ and $q_f$ by the relations

$$p_n = \left(\frac{M}{N}\right)^{1/2}\sum_f e^{ifn}\,p_f, \tag{47.9}$$

$$\xi_n = \frac{1}{(MN)^{1/2}}\sum_f e^{ifn}\,q_f. \tag{47.10}$$


Since $p_n$ and $\xi_n$ are real quantities, the following equalities hold:

$$p_f^{*} = p_{-f},\qquad q_f^{*} = q_{-f}. \tag{47.11}$$


Substituting (47.9) and (47.10) into (47.3) and taking into account (47.6), 
we find 


$$H = H_0 + \frac{1}{2}\sum_f\left(p_f p_f^{*} + \omega_f^2\,q_f q_f^{*}\right). \tag{47.12}$$


The second term of (47.12) represents the Hamiltonian of a system of inde-
pendent oscillators with frequencies $\omega_f$. Transition to the quantum Hamil-
tonian is carried out, as usual, by replacing $p_f$ and $q_f$ by operators satisfying
the commutation relations (26.2)-(26.4) of Part V.

Comparing expressions (47.5) and (47.12) with the corresponding formu-
lae for the electromagnetic field operators (see §100 of Part V), we see that
they are formally identical.

As in the case of the electromagnetic field, it is convenient to introduce
the creation operator $b_f^{+}$ and the annihilation operator $b_f$ satisfying the com-
mutation relations (101.3) of Part V. Making use of the results of §101 of
Part V, we can immediately write the quantized lattice energy

$$E = \sum_f\left(b_f^{+}b_f + \tfrac{1}{2}\right)\hbar\omega_f = \sum_f\left(n_f + \tfrac{1}{2}\right)\hbar\omega_f. \tag{47.13}$$
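
For a single mode, the content of (47.13) can be made concrete with finite matrices. The following is a minimal sketch, assuming the standard matrix elements $b\,|n\rangle = \sqrt{n}\,|n-1\rangle$ in a basis cut off at a finite number of phonons; the cutoff `DIM`, the frequency `OMEGA` and the units with $\hbar = 1$ are assumptions of the example, and the commutator can only be reproduced below the cutoff.

```python
import math

HBAR = 1.0    # units with hbar = 1 (assumption of this sketch)
OMEGA = 2.0   # hypothetical frequency of the mode
DIM = 6       # hypothetical cutoff: states |0>, ..., |DIM-1>

def annihilation(dim):
    """Matrix of b in the occupation-number basis: b|n> = sqrt(n)|n-1>."""
    b = [[0.0] * dim for _ in range(dim)]
    for n in range(1, dim):
        b[n - 1][n] = math.sqrt(n)
    return b

def transpose(x):
    """Transpose of a real matrix (Hermitian conjugate for real entries)."""
    return [list(row) for row in zip(*x)]

def matmul(x, y):
    """Plain matrix product of two square lists of lists."""
    size = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(size)) for j in range(size)]
            for i in range(size)]

b = annihilation(DIM)
b_dag = transpose(b)
number = matmul(b_dag, b)          # b+ b, diagonal with entries n
b_b_dag = matmul(b, b_dag)         # b b+, diagonal with entries n + 1 below the cutoff

# Diagonal of the commutator b b+ - b+ b; equal to 1 except in the last (cut-off) state.
commutator_diag = [b_b_dag[n][n] - number[n][n] for n in range(DIM)]

# Energies (n + 1/2) * hbar * omega of the mode, as in (47.13).
energies = [(number[n][n] + 0.5) * HBAR * OMEGA for n in range(DIM)]
```

The last diagonal element of the commutator deviates from unity only because the basis is truncated; it is an artefact of the sketch, not a property of the operators.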


Just as in the case of the electromagnetic field, a system of quantized waves
can be looked upon as a system of independent quantum particles (bosons)
called phonons. Formula (47.13) shows that the energy of a phonon with
wave number $f$ is equal to $\hbar\omega_f$.

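In passing to a three-dimensional crystal the general situation becomes, in
essence, only a little more complex. The displacement $\boldsymbol{\xi}_{\mathbf{n}}$ can be written in
the form of the vector

$$\boldsymbol{\xi}_{\mathbf{n}} = \frac{1}{(MN)^{1/2}}\sum_{\mathbf{f}}\sum_{j=1}^{3}\mathbf{e}_{\mathbf{f}j}\,q_{\mathbf{f}j}\,e^{i\mathbf{f}\cdot\mathbf{n}}, \tag{47.14}$$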


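where $\mathbf{e}_{\mathbf{f}j}$ are the three polarization vectors: the longitudinal polarization
vector $\mathbf{e}_{\mathbf{f}1}$ directed parallel to the vector $\mathbf{f}$, and the transverse polarization
vectors $\mathbf{e}_{\mathbf{f}2}$ and $\mathbf{e}_{\mathbf{f}3}$ directed perpendicular to the vector $\mathbf{f}$. From the fact
that $\boldsymbol{\xi}_{\mathbf{n}}$ is real it follows that $q_{\mathbf{f}j}^{*} = q_{-\mathbf{f}j}$. The wave vector $\mathbf{f}$ is defined to within
the reciprocal lattice vector $\mathbf{K}$.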

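The Hamiltonian of a three-dimensional lattice is of the form

$$H = \frac{1}{2}\sum_{\mathbf{f},j}\left(p_{\mathbf{f}j}\,p_{\mathbf{f}j}^{*} + \omega_{\mathbf{f}j}^2\,q_{\mathbf{f}j}\,q_{\mathbf{f}j}^{*}\right). \tag{47.15}$$

For a given value of $\mathbf{f}$ there are three phonons with frequencies $\omega_{\mathbf{f}1}$, $\omega_{\mathbf{f}2}$ and
$\omega_{\mathbf{f}3}$.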

In contrast to photons, for which there is a linear relation between the
frequency $\omega$ and the wave number $f$, for phonons the dispersion law is always
more complicated.
For a linear chain the dispersion law is defined by formula (47.6); for a
three-dimensional crystal there are different dispersion laws for different po-
larizations, i.e. $\omega_{\mathbf{f}j} = \omega(\mathbf{f}, j)$.
Only for small values of the wave vectors in crystals with cubic symmetry
can one set

$$\omega = \omega_f = cf, \tag{47.6'}$$

where $c$ is the velocity of sound.


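To carry out the quantization it is convenient to introduce the phonon
creation operator $b_{\mathbf{f}j}^{+}$ and the phonon annihilation operator $b_{\mathbf{f}j}$, setting

$$q_{\mathbf{f}j} = \left(\frac{\hbar}{2\omega_{\mathbf{f}j}}\right)^{1/2} b_{\mathbf{f}j},\qquad q_{\mathbf{f}j}^{*} = \left(\frac{\hbar}{2\omega_{\mathbf{f}j}}\right)^{1/2} b_{\mathbf{f}j}^{+}. \tag{47.16}$$

The commutation rules for the phonon creation and annihilation operators
(101.3) of Part V must be written in the form

$$b_{\mathbf{f}j}\,b_{\mathbf{f}'j'}^{+} - b_{\mathbf{f}'j'}^{+}\,b_{\mathbf{f}j} = \delta_{\mathbf{f}\mathbf{f}'}\,\delta_{jj'}. \tag{47.17}$$

Then

$$E = \sum_{\mathbf{f}}\sum_{j}\left(b_{\mathbf{f}j}^{+}b_{\mathbf{f}j} + \tfrac{1}{2}\right)\hbar\omega_{\mathbf{f}j} = \sum_{\mathbf{f}}\sum_{j}\left(n_{\mathbf{f}j} + \tfrac{1}{2}\right)\hbar\omega_{\mathbf{f}j}. \tag{47.18}$$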


Generalization of the theory to the case of lattices with a basis does not in- 
troduce any essentially new points. 

In the most general case it can be stated that the thermal excitation of the 
lattice is described by a system of elementary independent particles — pho- 
nons. Thus the description of collective excitations (waves) in a crystal lattice 
is very similar to that of the field of electromagnetic radiation in a cavity. 
However, it is necessary to stress that this analogy is to a certain degree for- 
mal. Whereas photons are as real as any other particles: electrons, mesons or 
protons, phonons are a fictitious, formally introduced concept. As a matter 
of fact the Hamiltonian in the form (47.3), allowing the introduction of nor- 
mal coordinates and the transformation into the form (47.12), is not exact. It 
was obtained as a result of disregarding terms of the third and higher order in the
small quantities $\xi_n$ in formula (47.2), i.e. in the harmonic approximation. Taking
into account anharmonicity makes it impossible to bring the Hamiltonian
to the form (47.3), i.e. to a sum of quadratic terms. Hence the possibility it- 
self of introducing the concept of phonons is closely associated with the ap- 
proximate consideration of the thermal motion of a crystal lattice. In contrast 
to true Bose particles (photons), phonons are said to be quasiparticles. The 
difference between quasiparticles (phonons) and true particles (photons) 
manifests itself particularly clearly by the fact that phonons cannot be as- 
signed a definite value of momentum. Indeed, the momentum of a free parti- 
cle, as well as its energy, runs over a continuous, in no way bounded sequence 
of values. The energy and momentum are conserved in any interaction. For 
the phonon, the role of momentum is formally played by the quantity $\hbar\mathbf{f}$.
However, the value of this quantity lies in the definite interval (47.8). More-
over, the value of the vector $\hbar\mathbf{f}$ is not determined unambiguously but only to
within the quantity $\hbar\mathbf{K}$, where $\mathbf{K}$ is the reciprocal lattice vector. Therefore the
vector $\hbar\mathbf{f}$ is said to be a quasi-momentum. In contrast to true momentum, the
value of the quasi-momentum need not be conserved in interactions of the
phonon with other phonons or electrons. The crystal lattice can, in an arbi-
trary way, give to or take away from the phonon the quasi-momentum $\hbar\mathbf{K}$.

In spite of the fictitious character of the quasiparticles (phonons) their in- 
troduction has turned out to be very fruitful. It represents the basis for the 
consideration of all processes occurring in solids. Moreover, the introduction 
of elementary excitations as certain quasiparticles is an approach typical of 
modern physics in describing excited states in systems with a large number of 
particles. 

In a number of problems, the excitation corresponding to the collective
motion of a macroscopic system can be described by means of certain coor-
dinates $X_i$ so that the total Hamiltonian of the system, $\hat{H}$, turns out to be
equal to

$$\hat{H} = \sum_i \hat{H}_i,\qquad\text{where}\quad \hat{H}_i = aP_i^2 + f(X_i).$$

Then the energy of the system is also written in the form of a sum

$$E = \sum_i E_i.$$


It very often turns out to be possible to find variables for which the Hamil-
tonian $\hat{H}_i$ has a form similar to that of the Hamiltonian for an oscillator. It is
then said that the collective motion of the system can be decomposed into a
set of elementary excitations. In this case the eigenvalues of the operator $\hat{H}_i$
corresponding to the $i$th degree of freedom of the collective motion can be
looked upon as the energy of a certain quasiparticle. The case discussed above,
of the lattice vibrations of a crystal as a whole, leads to one of the forms of
quasiparticles: phonons. Each quasiparticle corresponds to one degree of
freedom of the collective motion of the system. The introduction of quasi- 
particles has the advantage of allowing the motion of a system of real inter- 
acting particles to be compared to a system of non-interacting or weakly in- 
teracting quasiparticles. 

The energy spectrum of the set of quasiparticles is the same as that of the
real system. The possibility of introducing quasiparticles has arisen in a num-
ber of problems associated with the theory of systems of many particles and
mainly in solid-state theory. Hence, at present there is in theoretical physics a
multitude of quasiparticles, for example polarons (electrons in polar crystals
surrounded by a ‘cloud’ of phonons), excitons (electron-hole pairs formed in 
a semiconductor), magnons (elementary excitations in ferromagnetic solids), 
plasmons etc. These quasiparticles are not truly existing particles. However, 
their formal introduction reflects the character of the processes occurring in 
many-particle systems, and allows one to make use of a convenient and well 
elaborated mathematical apparatus. 

The concept of phonons is basic in considering the motion of electrons in 
a crystal lattice. We shall see that the interaction of the electrons with the 
vibrating lattice and their scattering on the atoms of the lattice gives rise to 
electrical resistance. This interaction, as will be seen from the calculations, 
can formally be considered as a result of the collisions between electrons and 
phonons which, as a gas, fill the volume of the solid body. These collisions 
obey the laws of conservation of momentum and energy in the system of 
electrons and phonons. 

For the concept of quasiparticles to make sense, it is necessary that their
lifetime be sufficiently large. If $\varepsilon$ is the mean energy of a quasiparticle, and
$\Delta\varepsilon$ the uncertainty in its energy, then the following inequality must be ful-
filled

$$\varepsilon \gg \Delta\varepsilon \sim \hbar/\tau, \tag{47.19}$$

where $\tau$ is the lifetime of the quasiparticle.

The processes determining the lifetime of a quasiparticle can be any pro- 
cesses of absorption and scattering. In the case of phonons, the origins of ab- 
sorption and scattering can be impurities in the crystal and the anharmonicity 
corresponding to collisions between phonons. 

In a pure crystal the number of impurities is small. At low temperatures 
the anharmonicity is also small. Hence at low temperatures sufficiently far
from the melting point the lifetime of phonons turns out to be very large and 
the concept of phonons as quasiparticles makes sense. 


§48. Wave function of an electron moving in a periodic field 


Before studying the motion of a system of electrons in an external field we 
shall consider the motion of an electron in the periodic field of a crystal lat- 
tice. In succeeding sections it will be shown how this solution can be used for 
investigating the entire system of electrons. 

For an electron moving in a crystal lattice the Schrödinger equation is of
the form

$$\left[-\frac{\hbar^2}{2m}\nabla^2 + U\right]\psi_{\mathbf{k}} = E_{\mathbf{k}}\,\psi_{\mathbf{k}}. \tag{48.1}$$


In this equation the potential energy U is a periodic function with a period 
equal to the crystal lattice spacing. It satisfies the periodicity condition 


U(r +n)= U(r), (48.2) 


where the vector $\mathbf{n}$ is defined by formula (46.1). We introduce the translation
operator $\hat{T}_{\mathbf{n}}$ defined in the following way:

$$\hat{T}_{\mathbf{n}}\,\psi(\mathbf{r}) = \psi(\mathbf{r} + \mathbf{n}). \tag{48.3}$$


Since the potential energy is a periodic function, it is obvious that the opera-
tor $\hat{T}_{\mathbf{n}}$ commutes with the Hamiltonian $\hat{H} = -(\hbar^2/2m)\nabla^2 + U$. Hence it fol-
lows that the wave functions of stationary states can be chosen in such a way
as to be the eigenfunctions of the two operators $\hat{H}$ and $\hat{T}_{\mathbf{n}}$. This means that
the following equations must hold

$$\hat{T}_{\mathbf{n}}\,\psi(\mathbf{r}) = a_{\mathbf{n}}\,\psi(\mathbf{r}),\qquad \hat{H}\,\psi(\mathbf{r}) = E\,\psi(\mathbf{r}). \tag{48.4}$$


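Here $a_{\mathbf{n}}$ is an eigenvalue of the operator $\hat{T}_{\mathbf{n}}$. It is easily shown that the values
of $a_{\mathbf{n}}$ are in modulus equal to one. Indeed, the probabilities of finding the
electron at points $\mathbf{r}$ and $\mathbf{r}+\mathbf{n}$ must be the same by virtue of the periodic prop-
erties of the lattice, i.e.

$$|\psi(\mathbf{r} + \mathbf{n})|^2 = |\psi(\mathbf{r})|^2.$$

Using (48.4), we obtain

$$|a_{\mathbf{n}}\,\psi(\mathbf{r})|^2 = |\psi(\mathbf{r})|^2,$$

which proves our statement.

Without restricting the generality, the coefficient $a_{\mathbf{n}}$ can be written in the
form

$$a_{\mathbf{n}} = e^{i\mathbf{k}\cdot\mathbf{n}}. \tag{48.5}$$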


For reasons which will become clear from what follows, $\mathbf{k}$ is said to be the
effective wave vector of an electron moving in a periodic field. Below we
shall for brevity frequently call $\mathbf{k}$ the wave vector.

Consequently, for the eigenfunctions of the operators $\hat{H}$ and $\hat{T}_{\mathbf{n}}$, in cor-
respondence with (48.4), the following relation must hold:

$$\psi(\mathbf{r} + \mathbf{n}) = e^{i\mathbf{k}\cdot\mathbf{n}}\,\psi(\mathbf{r}). \tag{48.6}$$


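In the most general case, the function satisfying condition (48.6) can be
written in the form

$$\psi(\mathbf{r}) = e^{i\mathbf{k}\cdot\mathbf{r}}\,u_{\mathbf{k}}(\mathbf{r}), \tag{48.7}$$

where $u_{\mathbf{k}}(\mathbf{r})$ is a periodic function with a period equal to the lattice period

$$u_{\mathbf{k}}(\mathbf{r}) = u_{\mathbf{k}}(\mathbf{r} + \mathbf{n}). \tag{48.8}$$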
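
Relations (48.6)-(48.8) are easily checked in one dimension. The following is a minimal sketch, assuming an arbitrarily chosen periodic amplitude `u`; the amplitude, the lattice constant and the sample values of $k$, $x$ and $n$ are hypothetical and serve only as an illustration.

```python
import cmath
import math

A_LAT = 1.0   # lattice period a (hypothetical units)

def u(x):
    """Hypothetical periodic amplitude with the period of the lattice, cf. (48.8)."""
    return 1.0 + 0.3 * math.cos(2 * math.pi * x / A_LAT)

def psi(x, k):
    """Modulated plane wave of the form (48.7) in one dimension."""
    return cmath.exp(1j * k * x) * u(x)

def translation_eigenvalue(k, n):
    """Eigenvalue (48.5) of the translation by n lattice periods."""
    return cmath.exp(1j * k * n * A_LAT)

k, x, n = 1.1, 0.37, 3     # arbitrary sample values
K = 2 * math.pi / A_LAT    # reciprocal lattice vector of the chain

# Relation (48.6): translating by n periods multiplies psi by exp(i k n a).
bloch_defect = abs(psi(x + n * A_LAT, k) - translation_eigenvalue(k, n) * psi(x, k))

# The probability density has the period of the lattice.
density_defect = abs(abs(psi(x + n * A_LAT, k)) ** 2 - abs(psi(x, k)) ** 2)

# Adding K to k does not change the translation eigenvalue.
ambiguity_defect = abs(translation_eigenvalue(k + K, n) - translation_eigenvalue(k, n))
```

All three defects should vanish to rounding accuracy; the last one shows that $k$ and $k + 2\pi/a$ label the same behaviour under translations.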


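Substituting the wave function $\psi$ defined by (48.7) into the Schrödinger
equation, we find the equation for $u_{\mathbf{k}}(\mathbf{r})$

$$\left\{-\frac{\hbar^2}{2m}(\nabla + i\mathbf{k})^2 + (U - E)\right\}u_{\mathbf{k}}(\mathbf{r}) = 0. \tag{48.9}$$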


Eq. (48.9) shows that, besides other quantum numbers, the energy of the 
electron also depends on the wave vector k. 

If we wanted to take into account surface effects, we should require that 
the wave function decrease outside the crystal. At large distances from the 
crystal it must tend to zero. However, we shall not take up a study of surface
effects here, but replace the boundary condition mentioned by a periodicity
condition. For a cubic crystal with edge $L$ it is of the form $e^{ik_iL} = 1$.
This corresponds to the condition $\psi(x = 0) = \psi(x = L)$ in the cyclic planes.
Hence it follows that the vector components $k_i$ take on values $k_i = (2\pi/L)\,l_i$,
where $l_i$ runs over a sequence of integer values.

However, taking into account that the dimensions L are very large, one 
can also approximately assume k to be a continuously varying quantity under 
the requirement of cyclicity. Another feature of the vector $\mathbf{k}$ is the ambiguity
of its definition. If the reciprocal lattice vector $\mathbf{K}$ is added to the vector
$\mathbf{k}$, then in formula (48.6), defining $\mathbf{k}$, one can write

$$\psi(\mathbf{r} + \mathbf{n}) = e^{i(\mathbf{k}+\mathbf{K})\cdot\mathbf{n}}\,\psi(\mathbf{r}) = e^{i\mathbf{k}\cdot\mathbf{n}}\,\psi(\mathbf{r}),$$

since, in correspondence with (46.6), $e^{i\mathbf{K}\cdot\mathbf{n}} = 1$. Thus the replacement of $\mathbf{k}$ by
$\mathbf{k}+\mathbf{K}$ has no effect on the definition of this vector. The wave vector $\mathbf{k}$ is said
to be defined to within the reciprocal lattice vector $\mathbf{K}$.

Formula (48.7) shows that the wave function of an electron moving in a 
periodic field is of the form of a modulated plane wave, i.e. a plane wave with 
amplitude $u_{\mathbf{k}}(\mathbf{r})$ varying in space. Thus there is an analogy between the wave
function of a particle in a periodic field and the wave function of a free par- 
ticle. This analogy justifies the term effective wave vector k. At the same 
time it should be stressed that an electron moving in a periodic field is a 
bound and not a free particle. Its momentum has no definite constant value. 

We shall frequently normalize $\psi(\mathbf{r})$ by the condition

$$\int_G |\psi(\mathbf{r})|^2\,dV = N, \tag{48.10}$$

where the integral is taken over the so-called basic region of the crystal, i.e.
over the region bounded by the nearest cyclic planes. $N$ is the number of
particles in the basic region of the crystal. If the wave function is written in
the form

$$\psi = e^{i\mathbf{k}\cdot\mathbf{r}}\,u_{\mathbf{k}}(\mathbf{r}), \tag{48.11}$$

then (48.10) goes over into the following normalization condition:

$$\int_{\tau_0} |u_{\mathbf{k}}|^2\,dV = 1. \tag{48.12}$$

The integration in this formula is carried out over a unit cell.


§49. The energy spectrum of an electron moving in a periodic field 


Expression (48.7) for the wave function is of a general character. In order
to get some idea of the concrete form of the function $u_{\mathbf{k}}$ and about the ener-
gy spectrum $E(\mathbf{k})$ of an electron in a crystal lattice, it is necessary to make
certain assumptions about the character of the potential energy of an electron
in a crystal. We shall assume that the wave function and the energy spectrum
of an electron in a certain isolated $j$th atom are known, i.e. that the solution
of the following equation is known:

$$\left[-\frac{\hbar^2}{2m}\nabla^2 + \bigl(U_j - E(n)\bigr)\right]\varphi_j = 0, \tag{49.1}$$

where $U_j$ is the potential energy of an electron in the $j$th atom, $n$ is the quan-
tum number, and $E(n)$ are the eigenvalues of the energy of the electron in the
atom. If a crystal lattice is formed of atoms of a given type, then owing to the 
action of the fields of neighbouring atoms on the electron its wave function 
and energy levels will change. We shall, however, assume that this action is 
sufficiently weak, so that (1) the action of only the nearest neighbours need 
be taken into account, (2) the shift of energy levels due to the action of 
neighbouring atoms is small in comparison with the spacings between neigh- 
bouring energy levels of an isolated atom. 

The solution of the Schrodinger equation for an electron in a crystal lat- 
tice obtained on the basis of these assumptions is said to be the solution in 
the tight binding approximation. 

To characterize an electron in a crystal lattice consisting of $N$ atoms in the
zeroth order approximation, i.e. without taking into account the action of
neighbours, one can write $N$ wave functions satisfying the Schrödinger equa-
tion (49.1) for one and the same value of the energy. Therefore the state of
the electron is $N$-fold degenerate (roughly speaking, the electron can be local-
ized at any of the $N$ atoms). The interaction of the electron with the
neighbouring atoms removes this degeneracy.

As is done in the perturbation theory of degenerate states, we seek the
wave function in the zeroth order approximation in the form

$$\psi = \sum_j C_j\,\varphi_j, \tag{49.2}$$

where $C_j$ are constants, and $\varphi_j$ is the solution of eq. (49.1) for the $j$th atom.
The summation is carried out over all atoms of the crystal. We understand $\mathbf{j}$
to be the radius vector going from the origin to the nucleus of the $j$th atom.
To simplify the calculations we shall assume that all the electrons are in 
the non-degenerate s-state. 
We shall denote the energy of an electron in an atom by $E(s)$. Then the
energy of an electron in the crystal will be of the form

$$E(\mathbf{k}, s) = E(s) + E^{(1)}(\mathbf{k}, s). \tag{49.3}$$


It is different from the energy of an electron in an atom and depends on the
wave vector $\mathbf{k}$. Substituting the wave function (49.2) and the energy (49.3)
into the Schrödinger equation for an electron moving in a crystal, we have

$$\sum_j C_j\,\nabla^2\varphi_j + \frac{2m}{\hbar^2}\Bigl[E(s) + E^{(1)}(\mathbf{k}, s) - \sum_l U_l(\mathbf{r})\Bigr]\sum_j C_j\,\varphi_j = 0. \tag{49.4}$$

Here we have written the potential energy $U$ of an electron in a crystal lattice
in the form of the sum $\sum_l U_l(\mathbf{r})$, where $U_l(\mathbf{r})$ is the potential energy of an elec-
tron in the field of the $l$th atom and the summation is carried out over all
atoms of the lattice.

Since the potential energy in eq. (49.4) cannot be divided into an unper- 
turbed part and a perturbation, the solution cannot be carried out by means 
of ordinary perturbation theory. Our purpose is to find an approximate solu- 
tion of eq. (49.4). The electron in an unperturbed state is considered to be 
localized at the jth atom. The interaction of the electron with the atoms of 
the lattice which are nearest to the jth atom is a small perturbation. 

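Let us separate out the major term of the potential energy of the electron, the
energy of interaction with the $j$th atom, writing

$$\sum_l U_l(\mathbf{r}) = {\sum_l}' U_l(\mathbf{r}) + U_j(\mathbf{r}) = U' + U_j(\mathbf{r}),$$

where the prime denotes that the term with $l = j$ is omitted in the sum. Then
we can write

$$\sum_j C_j\Bigl\{\nabla^2\varphi_j + \frac{2m}{\hbar^2}\bigl[E(s) - U_j(\mathbf{r})\bigr]\varphi_j\Bigr\} + \frac{2m}{\hbar^2}\sum_j C_j\bigl(E^{(1)}(\mathbf{k}, s) - U'\bigr)\varphi_j = 0.$$

By virtue of (49.1) the bracket reduces to zero, and we arrive at the equation

$$\sum_j C_j\bigl[E^{(1)}(\mathbf{k}, s) - U'\bigr]\varphi_j = 0. \tag{49.5}$$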


By virtue of our assumption of the weak action of neighbouring atoms on
the electron 'localized' at the $j$th atom, it can be assumed approximately that
the wave functions $\varphi_j$ do not overlap, i.e. we can set

$$\int\varphi_h^{*}\,\varphi_j\,dV = \begin{cases}1 & \text{for } j = h,\\ 0 & \text{for } j \ne h.\end{cases}$$

Multiplying (49.5) by $\varphi_h^{*}$ and integrating over the volume of the crystal,
we have

$$\sum_j\int\varphi_h^{*}\,C_j\bigl(E^{(1)}(\mathbf{k}, s) - U'\bigr)\varphi_j\,dV = 0.$$


Carrying out the integration, we obtain

$$C_h\bigl[E^{(1)}(\mathbf{k}, s) + \alpha\bigr] + \sum_{j\ne h}\beta(\mathbf{j}-\mathbf{h})\,C_j = 0, \tag{49.6}$$

where the following notation is introduced:

$$\alpha = -\int\varphi_h^{*}\,U'\,\varphi_h\,dV,\qquad \beta(\mathbf{j}-\mathbf{h}) = -\int\varphi_h^{*}\,U'\,\varphi_j\,dV. \tag{49.7}$$

The integrals involved in $\alpha$ and $\beta$ depend only on the difference between $\mathbf{j}$ and
$\mathbf{h}$ but not on these quantities taken separately. Taking into account the small
degree of overlap of the wave functions, we can confine ourselves in the sum
of quantities $\beta(\mathbf{j}-\mathbf{h})$ to the terms corresponding to the neighbouring atoms
which are nearest to the $h$th atom:

$$\sum_{j\ne h}\beta(\mathbf{j}-\mathbf{h})\,C_j = \sum_{\mathbf{n}}\beta(\mathbf{n})\,C_{\mathbf{h}+\mathbf{n}}.$$

In this sum the number of terms is equal to the number of nearest neigh-
bours of the $h$th atom. In the particular case of a cubic lattice, the number
of nearest neighbours is 6. Furthermore, it is clear that in this case $\beta(\mathbf{n}) = \beta(-\mathbf{n})$.

Giving $\mathbf{h}$ all its possible values, we thus arrive at an infinite set of linear
algebraic equations:

$$\bigl(E^{(1)}(\mathbf{k}, s) + \alpha\bigr)C_{\mathbf{h}} + \sum_{\mathbf{n}}\beta(\mathbf{n})\,C_{\mathbf{h}+\mathbf{n}} = 0. \tag{49.8}$$


We shall try to find the solution of this system in the form

$$C_{\mathbf{h}} = e^{i\mathbf{k}\cdot\mathbf{h}}. \tag{49.9}$$

Substituting (49.9) into (49.8) and taking into account that an atom in a
cubic lattice has 6 nearest neighbours, we obtain

$$E^{(1)}(\mathbf{k}, s) = -\alpha - \sum_{\mathbf{n}}\beta(\mathbf{n})\,e^{i\mathbf{k}\cdot\mathbf{n}} = -\alpha - 2\beta\,(\cos k_x a + \cos k_y a + \cos k_z a). \tag{49.10}$$
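
That the coefficients (49.9) together with the energy (49.10) satisfy the system (49.8) can be confirmed by direct substitution. The following is a minimal sketch, assuming a simple cubic lattice with the same overlap integral $\beta$ for all six nearest neighbours; the values of `ALPHA`, `BETA`, the lattice constant and the sample wave vector and site are hypothetical.

```python
import cmath
import math

ALPHA = 0.1   # hypothetical value of the integral alpha of (49.7)
BETA = 0.25   # hypothetical value of the nearest-neighbour integral beta of (49.7)
A_LAT = 1.0   # cube edge a

# Integer offsets of the six nearest neighbours in a simple cubic lattice.
NEIGHBOURS = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]

def band_correction(k):
    """E^(1)(k, s) according to (49.10)."""
    return -ALPHA - 2 * BETA * sum(math.cos(ki * A_LAT) for ki in k)

def coefficient(k, site):
    """C_h = exp(i k . h) of (49.9); site holds integer coordinates in units of a."""
    return cmath.exp(1j * A_LAT * sum(ki * si for ki, si in zip(k, site)))

def residual(k, site):
    """Left-hand side of (49.8) at the given site; should vanish."""
    neighbour_sum = sum(
        coefficient(k, tuple(s + d for s, d in zip(site, step)))
        for step in NEIGHBOURS
    )
    return (band_correction(k) + ALPHA) * coefficient(k, site) + BETA * neighbour_sum

k_sample = (0.4, -1.3, 2.2)   # arbitrary wave vector
site_sample = (2, -1, 5)      # arbitrary lattice site
defect = abs(residual(k_sample, site_sample))
```

The quantity `defect` should vanish to rounding accuracy for any wave vector and any site, because the neighbour sum equals $C_{\mathbf{h}}\cdot 2(\cos k_x a + \cos k_y a + \cos k_z a)$.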
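Correspondingly, the wave function is of the form

$$\psi = \sum_j C_j\,\varphi_j = \sum_j e^{i\mathbf{k}\cdot\mathbf{j}}\,\varphi_j = e^{i\mathbf{k}\cdot\mathbf{r}}\sum_j e^{-i\mathbf{k}\cdot(\mathbf{r}-\mathbf{j})}\,\varphi_j.$$

For $\varphi_j$ we can by definition write

$$\varphi_j(\mathbf{r}) = \varphi_s(\mathbf{r} - \mathbf{j}),$$

where the index $s$ denotes that we consider an electron in the s-state. For the
wave function $\psi$ we have

$$\psi = e^{i\mathbf{k}\cdot\mathbf{r}}\sum_j e^{-i\mathbf{k}\cdot(\mathbf{r}-\mathbf{j})}\,\varphi_s(\mathbf{r} - \mathbf{j}). \tag{49.11}$$

Comparing with (48.7), we obtain

$$u_{\mathbf{k}}(\mathbf{r}) = \sum_j e^{-i\mathbf{k}\cdot(\mathbf{r}-\mathbf{j})}\,\varphi_s(\mathbf{r} - \mathbf{j}). \tag{49.12}$$

The function $u_{\mathbf{k}}$ possesses the needed properties of periodicity.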

We now pass to a discussion of the expression for the electron energy de-
fined by the quantities $\alpha$ and $\beta$. We note, first of all, that the wave function
$\varphi_j$ of an electron localized at a given atom $j$ has nodes only inside the atom.
Outside the atom it decreases monotonically. We shall further show that the
quantity $U'$ involved in the expressions for $\alpha$ and $\beta$ is negative.

Indeed, since the potential energy of an electron in a crystal lattice $U(\mathbf{r})$
and in an isolated atom $U_j$ is negative (the electron is bound in the atom), one
can write

$$U' = U(\mathbf{r}) - U_j = -|U(\mathbf{r})| + |U_j|.$$

It is clear that the addition to the isolated atom of other atoms of the same
nature (attracting the electron) leads to an additional increase in the binding
of the electron in the system. Hence $|U(\mathbf{r})| > |U_j|$, which proves our state-
ment. From the monotonic character of $\varphi_h$ and $\varphi_j$ and the negative sign of
the quantity $U'$ it follows that the two integrals $\alpha$ and $\beta$ defined by formulae
(49.7) are essentially positive.

So, the addition to the energy of the s-state turns out to be negative; the
electron is more strongly bound in the system of atoms than in the isolated
atom. For small values of $k$ one can expand $\cos k_i a$ in a series and write

$$E(\mathbf{k}, s) = E(s) - \alpha - 6\beta + \beta a^2 k^2 = E_{\min} + \frac{p^2}{2m^{*}} = E_{\min} + \frac{\hbar^2 k^2}{2m^{*}}, \tag{49.13}$$

where $E_{\min}$ is a constant quantity independent of $\mathbf{k}$, corresponding to the mini-
mum value of the energy

$$E_{\min} = E(s) - \alpha - 6\beta,$$

and $m^{*}$ is equal to

$$m^{*} = \frac{\hbar^2}{2\beta a^2} = \frac{\hbar^2}{\partial^2 E/\partial k^2}. \tag{49.14}$$
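
Formula (49.14) can be compared with a direct numerical differentiation of the band energy following from (49.3) and (49.10). The following is a minimal sketch, assuming units with $\hbar = 1$ and hypothetical values of $E(s)$, $\alpha$, $\beta$ and $a$; the finite-difference step `STEP` is likewise an arbitrary choice.

```python
import math

HBAR = 1.0    # units with hbar = 1 (assumption of this sketch)
A_LAT = 1.0   # cube edge a
ALPHA = 0.1   # hypothetical alpha
BETA = 0.25   # hypothetical beta
E_S = -5.0    # hypothetical atomic level E(s)
STEP = 1e-3   # step of the finite-difference second derivative

def energy(kx, ky, kz):
    """E(k, s) = E(s) + E^(1)(k, s), with E^(1) taken from (49.10)."""
    return E_S - ALPHA - 2 * BETA * (
        math.cos(kx * A_LAT) + math.cos(ky * A_LAT) + math.cos(kz * A_LAT)
    )

e_bottom = energy(0.0, 0.0, 0.0)   # band bottom, E(s) - alpha - 6*beta
e_top = energy(math.pi / A_LAT, math.pi / A_LAT, math.pi / A_LAT)   # all cosines equal -1

# Central second difference of E with respect to k_x at k = 0.
curvature = (energy(STEP, 0.0, 0.0) - 2 * e_bottom + energy(-STEP, 0.0, 0.0)) / STEP ** 2

m_eff_numeric = HBAR ** 2 / curvature
m_eff_formula = HBAR ** 2 / (2 * BETA * A_LAT ** 2)   # formula (49.14)
band_width = e_top - e_bottom                         # equals 12*beta for this band
```

The two effective masses should agree up to the truncation error of the finite difference; a larger $\beta$ (stronger overlap) gives a wider band and a smaller effective mass.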





Formula (49.13) shows that for small k the dependence of the electron 
energy on k is of the same form as for a free particle. Here the role of mass is 
played by the effective mass $m^{*}$, and the quantity $\mathbf{p} = \hbar\mathbf{k}$ is naturally called
the momentum of the particle. However, it should be kept in mind that p is 
not a real but an effective momentum. This is seen from the fact that the 
vector k and along with it the vector p are defined only to within the recip- 
rocal lattice vector. Therefore the quantity p is said to be a quasi-momentum. 
Just as the momentum conservation law holds for a free particle, the value of 
quasi-momentum is conserved for a bound particle in the periodic field of the 
crystal. 

For large values of $k$ the expansion (49.13) loses its validity. When the
following equalities are fulfilled

$$k_x a = \pi,\qquad k_y a = \pi,\qquad k_z a = \pi,$$

the quantities $\cos k_i a$ are equal to $-1$ and the energy of the electron takes on a maximum
value

$$E\left(\frac{\pi}{a}, \frac{\pi}{a}, \frac{\pi}{a}, s\right) = E_{\max}.$$

Fig. VI.7 (axes: $E$ versus $k_x$; the band width $\Delta E$ is marked)


The behaviour of the energy $E(\mathbf{k}, s)$ is shown schematically in fig. VI.7.
Since $E(\mathbf{k}, s)$ depends on three variables, fig. VI.7 shows the dependence of
the energy on $k_x$ for fixed values of $k_y$ and $k_z$.

We see that from the discrete energy level corresponding to the s-state of 
the electron in the atom a band of allowed energy values arises in the crystal. 
It should be stressed that the formation of the band of allowed energies is 
due to the interaction of the electron ‘localized’ at a given atom with the 
field of other atoms of the lattice. This interaction removes the degeneracy 
which exists in a system of N atoms and an electron. Since the electron can 
be ‘localized’ at any of the atoms, the multiplicity of the degeneracy is very 
large and the degenerate energy level, owing to the interaction mentioned, 
splits into a large number of sublevels. In that approximation in which the 
wave vector can be assumed to run over a continuous sequence of values (see 
§48), the set of sublevels can be considered as a continuous band of allowed 
energy values. 

The allowed energy band arising from the ground energy level of the atom 
is called the lower energy band. The width of the lower energy band is equal 
to $\Delta E = E_{\max} - E_{\min} = 12\beta$. Exactly the same calculation can be carried out
for states arising from the first excited state of an electron in an isolated 
atom. We shall assume it to be a p-state. The energy of the electron E(k, p) is 
then given by a formula of the same type as for E(k, s). However, it turns 
out* that one of the overlap integrals (integrals (49.7)) is negative, while the 
other two are positive. 

Thus the first excited energy band of an electron moving in a crystal lattice
arises from the energy level corresponding to the p-state in an isolated
atom. Subsequent energy levels of an isolated atom turn into corresponding
bands in the crystal.

*See A.H.Wilson, The theory of metals (Cambridge University Press, Cambridge,
1953).

In complete accordance with assumption (2) (at the beginning of the sec- 
tion), the condition for qualitative applicability of the calculation given is the 
requirement that the width of the bands be small in comparison with the 
spacing between energy levels in the unperturbed atom. Since the width of 
bands is defined by integrals of the type (49.7), this requirement is equivalent 
to small overlap of the wave functions of the electron in neighbouring atoms.

Fig. VI.8 (allowed energy bands plotted against $k_x$)


The energy spectrum in a crystal in the tight binding approximation has 
the character of relatively narrow bands of allowed energies separated by 
wide bands of forbidden energies (fig. VI.8). This approximation holds, in 
fact, for relatively deep atomic electrons. For outer electrons the wave func- 
tions of neighbouring atoms are so strongly overlapped that the tight binding 
approximation is not valid. In other words, the probability of finding the elec- 
tron near any point of the crystal lattice has almost the same value. 

In this case it again turns out to be possible to carry out the calculation of
the energy spectrum of the electron, now in the approximation of nearly free
electrons. Namely, in the zero order approximation we do not take into account
the action of the periodic field of the crystal lattice at all, and consider the
electron to be free. Its normalized wave function and its energy are of the
form

$$\psi_0 = V^{-1/2}\,e^{i\mathbf{k}\cdot\mathbf{r}},\qquad E^{0}(\mathbf{k}) = \frac{\hbar^2 k^2}{2m}.$$





Here unperturbed states are, as a rule, degenerate. This is particularly obvious 
in the example of a cubic crystal. As a matter of fact, in correspondence with 
the foregoing (see §48), the vector components $k_x$, $k_y$, $k_z$ can take on the
following values: $k_i = (2\pi/L)\,l_i$, where $l_i$ takes on integer values.

Thus if there is a vector $\mathbf{k}$ directed along the x-axis and equal to $2\pi l_0/L$,
where $l_0$ is a certain integer, then there are a number of other vectors having
the same magnitude but differing in their directions. To find the perturbed
wave function, we have to make use of perturbation theory for degenerate
states. We set


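$$\psi = \sum_{\mathbf{k}} C_{\mathbf{k}}\,\psi_0(\mathbf{k}), \tag{49.15}$$

where the summation is carried out over all vectors $\mathbf{k}$ whose ends lie on the
sphere $|\mathbf{k}| = \text{const}$. To determine the first order correction to the energy ac-
cording to perturbation theory we have from formula (54.4) of Part V

$$\bigl|\,U_{\mathbf{k},\mathbf{k}'} - E^{(1)}\delta_{\mathbf{k},\mathbf{k}'}\bigr| = 0, \tag{49.16}$$

where the matrix element $U_{\mathbf{k},\mathbf{k}'}$ is equal to

$$U_{\mathbf{k},\mathbf{k}'} = \frac{1}{V}\int e^{i(\mathbf{k}'-\mathbf{k})\cdot\mathbf{r}}\,U(\mathbf{r})\,dV.$$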


Let us at first suppose that among the vectors $\mathbf{k}$ over which the summation
in (49.15) is carried out and whose ends lie on the surface of the sphere
$|\mathbf{k}| = \text{const}$ there are no vectors satisfying the relation

$$\mathbf{k}' - \mathbf{k} = \mathbf{K}. \tag{49.17}$$

In this case all integrals $U_{\mathbf{k},\mathbf{k}'}$ are small except that of the diagonal matrix ele-
ment $U_{\mathbf{k},\mathbf{k}}$. This can be demonstrated in the following way. The matrix element

$$U_{\mathbf{k},\mathbf{k}'} = \frac{1}{V}\int U\,e^{i(\mathbf{k}'-\mathbf{k})\cdot\mathbf{r}}\,dV$$


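is determined by integrating over all the space of the crystal. It is convenient
to pass to integration over unit cells:

$$U_{\mathbf{k},\mathbf{k}'} = \frac{1}{V}\sum_{\mathbf{n}}\int_{\tau_{\mathbf{n}}} U\,e^{i(\mathbf{k}'-\mathbf{k})\cdot\mathbf{r}}\,dV.$$

We transform the integral over the $n$th unit cell by introducing the vari-
able $\mathbf{r} = \mathbf{n} + \mathbf{r}'$, where $\mathbf{n}$ is the vector directed from the origin to the $n$th
atom of the cell. Then we obtain

$$\int_{\tau_{\mathbf{n}}} U(\mathbf{r})\,e^{i(\mathbf{k}'-\mathbf{k})\cdot\mathbf{r}}\,dV = e^{i(\mathbf{k}'-\mathbf{k})\cdot\mathbf{n}}\int_{\tau_0} e^{i(\mathbf{k}'-\mathbf{k})\cdot\mathbf{r}'}\,U(\mathbf{r}' + \mathbf{n})\,dV'.$$

Making use of the property of periodicity of the function $U$, we find that the
right-hand side of the last equality is of the form

$$e^{i(\mathbf{k}'-\mathbf{k})\cdot\mathbf{n}}\int_{\tau_0} U(\mathbf{r}')\,e^{i(\mathbf{k}'-\mathbf{k})\cdot\mathbf{r}'}\,dV'.$$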


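For the matrix element $U_{\mathbf{k},\mathbf{k}'}$ we have

$$U_{\mathbf{k},\mathbf{k}'} = \frac{1}{V}\sum_{\mathbf{n}} e^{i(\mathbf{k}'-\mathbf{k})\cdot\mathbf{n}}\int_{\tau_0} U(\mathbf{r})\,e^{i(\mathbf{k}'-\mathbf{k})\cdot\mathbf{r}}\,dV = \frac{1}{V}\int_{\tau_0} U(\mathbf{r})\,e^{i(\mathbf{k}'-\mathbf{k})\cdot\mathbf{r}}\,dV\,\sum_{\mathbf{n}} e^{i(\mathbf{k}'-\mathbf{k})\cdot\mathbf{n}}. \tag{49.18}$$

In this formula the integration is carried out over a unit cell, while the sum-
mation is carried out over all lattice points.
The sum $B = \sum_{\mathbf{n}} e^{i(\mathbf{k}'-\mathbf{k})\cdot\mathbf{n}}$ for arbitrary $\mathbf{k}' - \mathbf{k}$ is equal to

$$B = \sum_{\mathbf{n}} e^{i(\mathbf{k}'-\mathbf{k})\cdot\mathbf{n}} = \begin{cases} N, & \text{if } \mathbf{k}' - \mathbf{k} = \mathbf{K},\\ 0, & \text{if } \mathbf{k}' - \mathbf{k} \ne \mathbf{K}, \end{cases} \tag{49.19}$$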

where N is the number of lattice points. 
The upper equality (49.19) is obtained at once after making use of for- 
mula (46.2). The lower relation can be obtained if n is replaced by nta, 







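where $\mathbf{a}$ is any vector of the lattice. For large values of $N$ the quantity $B$ cannot change when this substitution is made, since such a replacement is equivalent to a displacement of all lattice points


$$B = e^{i(\mathbf{k}'-\mathbf{k})\cdot\mathbf{a}} \sum_{\mathbf{n}} e^{i(\mathbf{k}'-\mathbf{k})\cdot\mathbf{n}} = \sum_{\mathbf{n}} e^{i(\mathbf{k}'-\mathbf{k})\cdot\mathbf{n}} .$$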


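If $\mathbf{k}' - \mathbf{k} \neq \mathbf{K}$, the factor $e^{i(\mathbf{k}'-\mathbf{k})\cdot\mathbf{a}}$ differs from unity for at least one lattice vector $\mathbf{a}$, so that the equality can hold only if $B = 0$.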
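The behaviour of the lattice sum (49.19) can be checked numerically. The following is a minimal sketch, assuming a one-dimensional chain of $N$ points with spacing $a$ and periodic boundary conditions, so that the allowed wave numbers are multiples of $2\pi/Na$. The function name `lattice_sum` and all parameter values are illustrative and are not taken from the text.

```python
import cmath
import math


def lattice_sum(delta_k, a, n_cells):
    """Return B = sum_n exp(i * delta_k * n * a) over a chain of n_cells points."""
    return sum(cmath.exp(1j * delta_k * n * a) for n in range(n_cells))


a = 1.0                                      # lattice spacing (arbitrary units)
n_cells = 200                                # number of lattice points N
K = 2.0 * math.pi / a                        # shortest reciprocal lattice vector
allowed_step = 2.0 * math.pi / (n_cells * a) # spacing of allowed wave numbers

# Case k' - k = K (here 3K): every term equals 1, so |B| is close to N.
b_match = lattice_sum(3.0 * K, a, n_cells)

# Case k' - k != K: the terms are evenly spread round the unit circle and cancel.
b_mismatch = lattice_sum(7.0 * allowed_step, a, n_cells)

print(abs(b_match), abs(b_mismatch))
```

The first printed magnitude should be close to $N$ and the second close to zero, up to rounding error, which is the content of (49.19) for this simple chain.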
We now see that the only matrix element differing from zero in (49.16) is the diagonal one, which is equal to


$$U_{\mathbf{k},\mathbf{k}} = \frac{1}{V}\int U(\mathbf{r})\, dV = \bar{U} .$$


The quantity $\bar{U}$ represents the mean potential in the space of the lattice.

Evaluating determinant (49.16) and solving the equation leads to the unique root $E^{(1)} = \bar{U}$. In correspondence with eq. (54.3) of Part V, the coefficients $C_{\mathbf{k}}$ can be chosen in the following way:


1) $C_{\mathbf{k}_1} = 1$, $C_{\mathbf{k}_2} = C_{\mathbf{k}_3} = \dots = 0$;
2) $C_{\mathbf{k}_2} = 1$, $C_{\mathbf{k}_1} = C_{\mathbf{k}_3} = \dots = 0$, and so on.


We can now write the expressions for the wave function and energy. They 
are of the form 


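$$\psi_0(\mathbf{k}) = V^{-1/2} e^{i\mathbf{k}\cdot\mathbf{r}}, \qquad E(\mathbf{k}) = E^{0}(\mathbf{k}) + U_{\mathbf{k},\mathbf{k}} = E^{0}(\mathbf{k}) + \bar{U} .$$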


The formulae found are in the same form as the results which would have been obtained if we had straight away used perturbation theory without degeneracy. This is due to the fact that the solution of eq. (49.16) is the unique root $E^{(1)} = \bar{U}$.

The next correction can therefore be obtained in the same way. We make use of the usual expression of non-degenerate perturbation theory


$$E(\mathbf{k}) = E^{0}(\mathbf{k}) + \bar{U} + \sum_{\mathbf{k}' \neq \mathbf{k}} \frac{|U_{\mathbf{k},\mathbf{k}'}|^{2}}{E^{0}(\mathbf{k}) - E^{0}(\mathbf{k}')} . \tag{49.20}$$


In correspondence with (49.18) and (49.19), the matrix elements are different from zero only if $\mathbf{k} - \mathbf{k}' = \mathbf{K}$. Expression (49.20) can now be written in the form


$$E(\mathbf{k}) = E^{0}(\mathbf{k}) + \bar{U} - \sum_{\mathbf{K}} \frac{|U_{\mathbf{k},\mathbf{k}'}|^{2}}{(\hbar^{2}/2m)(K^{2} - 2\mathbf{k}\cdot\mathbf{K})} . \tag{49.21}$$


The third term of formula (49.21) is a small correction, so that the spacing 
between the nearest levels is for practical purposes defined by the formula 
for free electrons. If, however, in one of the terms of the sum in formula 
(49.21) the denominator becomes zero, i.e. if for a certain K the following 
equality holds 


$$k'^{2} - k^{2} = K^{2} - 2\mathbf{k}\cdot\mathbf{K} = 0, \tag{49.22}$$


then this formula makes no sense. The correction to the energy E(k) turns 
out to be considerable, and the use of perturbation theory is inadmissible. 
The presence of large terms in (49.21) allows one to assume the existence of 
large spacings between neighbouring levels. 

Relation (49.22) means that for certain vectors k and k’ the following 
equality is fulfilled 


$$\mathbf{k}' - \mathbf{k} = \mathbf{K}, \qquad |\mathbf{k}'| = |\mathbf{k}| . \tag{49.22'}$$


Suppose that this equality holds only for one value of $\mathbf{K}$, i.e. for a definite value of $\mathbf{k}$ and $\mathbf{k}'$. This means that the states of the electron have almost equal energies $E(\mathbf{k})$ and $E(\mathbf{k} + \mathbf{K})$, i.e. are two-fold degenerate. In this case, to find the correction to the energy, one has to make use of perturbation theory for two-fold degenerate levels. Thus eq. (49.16) is now written in the form


$$\begin{vmatrix} \bar{U} - E^{(1)} & U_{\mathbf{k},\mathbf{k}'} \\ U_{\mathbf{k}',\mathbf{k}} & \bar{U} - E^{(1)} \end{vmatrix} = 0 .$$


Evaluating the determinant and solving the equation, we find for the correc- 
tion to the energy 


$$E^{(1)} = \bar{U} \pm |U_{\mathbf{k},\mathbf{k}'}| .$$


This means that the energy undergoes a discontinuity and that the spacing between neighbouring levels is equal to $2|U_{\mathbf{k},\mathbf{k}'}|$. It is obvious that the discontinuity in the energy occurs when equalities (49.22') are simultaneously
fulfilled. Formulae (49.22') express the well-known Bragg reflection condi- 
tions. They single out in the crystal a definite direction of selective reflection 
of incident rays. Indeed, multiplying the first equality (49.22’) by the lattice 
vectors a, b, c, we see that formulae (49.22’) and (36.18) of Part IV are iden- 
tical. 

Thus, for a cubic crystal, in the approximation of nearly free electrons the energy as a function of the wave vector has the following behaviour. It increases with increasing $k$ according to (49.21). When $k$ reaches the values $k_x = k_y = k_z = \pm\pi/a$, Bragg reflection occurs. The energy undergoes a discontinuity, the width of the forbidden zone being equal to $2|U_{\mathbf{k},\mathbf{k}'}|$. After this the energy is again expressed by formula (49.21). For $k_x = k_y = k_z = \pm 2\pi/a$ the energy again undergoes a discontinuity, and so on.
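The opening of the gap at the zone boundary can be illustrated with a small numerical sketch. It assumes a one-dimensional model in units where $\hbar^2/2m = 1$ and $a = 1$, and it uses the general two-level secular equation for the states $k$ and $k - K$. This is a slight generalization of the exactly degenerate case treated in the text, to which it reduces at $k = \pi/a$. The function name and the coupling value are hypothetical.

```python
import math


def two_level_energies(k, coupling, mean_potential=0.0, a=1.0):
    """Roots of the 2x2 secular equation for plane waves k and k - K.

    Units: hbar^2 / 2m = 1, so the free-electron energy is E0(k) = k^2.
    `coupling` plays the role of the off-diagonal matrix element U_{k,k'}.
    """
    K = 2.0 * math.pi / a
    e1 = k ** 2
    e2 = (k - K) ** 2
    centre = 0.5 * (e1 + e2) + mean_potential
    half_split = math.hypot(0.5 * (e1 - e2), abs(coupling))
    return centre - half_split, centre + half_split


u = 0.3  # illustrative value of |U_{k,k'}|

# At the zone boundary k = pi/a the two free-electron energies coincide.
lower, upper = two_level_energies(math.pi, u)
gap_at_boundary = upper - lower

# Far from the boundary the lower root follows second-order perturbation theory.
k_far = 0.2 * math.pi
lower_far, _ = two_level_energies(k_far, u)
second_order = k_far ** 2 - u ** 2 / ((k_far - 2.0 * math.pi) ** 2 - k_far ** 2)

print(gap_at_boundary, lower_far, second_order)
```

At the boundary the splitting equals twice the coupling, while away from it the exact root and the second-order estimate differ only in higher orders of the coupling.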








Fig. VI.9


Fig. VI.9 shows schematically the energy spectrum of a nearly free elec- 
tron. The qualitative picture of the energy spectrum of a nearly free electron 
is the same as that of a strongly bound electron. A difference lies in the fact 
that in the first case broad allowed zones and narrow bands of forbidden ener- 
gies are obtained, in contrast to narrow allowed zones and broad bands of for- 
bidden energies in the second case. In real crystals the limiting cases of tight 
binding and nearly free electrons as well as the whole range of intermediate 
cases are found. However, in all cases the energy spectrum of an electron in 
the crystal has the character of allowed energy zones separated by regions of 
forbidden energy. 

The quantitative calculation of the form of the dependence of E(k) on k 




for real crystals is a complex problem. A number of approximate methods 
have been devised for the calculation of the bands. Their description may be 
found in the specialist literature*. 

In the next section it will be shown that the general result obtained in the 
one-particle model, i.e. for the case of the motion of one electron in a crystal 
lattice, also remains valid in the case of a crystal containing a large number of 
electrons. Thus the energy spectrum of electrons in a real crystal has the 
character of bands**. 

The motion of an electron in a metal lattice or a semiconductor lattice can 
be characterized by its mean (group) velocity, which can be found by the 
general formula 


$$\bar{\mathbf{v}} = \frac{\partial \omega}{\partial \mathbf{k}} = \frac{\partial E(\mathbf{k})}{\partial(\hbar\mathbf{k})} .$$


If the energy of a quasi-free electron is substituted for $E(\mathbf{k})$, then we arrive at the formula $\bar{v}_x = \hbar k_x/m$. In the case of a strongly bound electron the mean velocity is of the form $\bar{v}_x = (2\beta a/\hbar)\sin k_x a$.

We see that the mean velocity of the electron differs from zero in both cases. A strongly bound electron moves with a mean velocity which is smaller, the smaller $\beta$, i.e. the smaller the width of the band.
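The two velocity formulae can be checked by differentiating the dispersion law numerically. The sketch below works in units with $\hbar = 1$ and assumes the tight-binding dispersion $E(k) = E_0 - 2\beta\cos ka$, which is the form that reproduces the velocity quoted above. All function names and parameter values are illustrative.

```python
import math

HBAR = 1.0  # units with hbar = 1 (assumption made for this sketch)


def free_energy(k, m=1.0):
    """Quasi-free electron: E = (hbar k)^2 / 2m."""
    return (HBAR * k) ** 2 / (2.0 * m)


def tight_binding_energy(k, beta=0.5, a=1.0, e0=0.0):
    """Assumed tight-binding dispersion E = E0 - 2 beta cos(k a)."""
    return e0 - 2.0 * beta * math.cos(k * a)


def group_velocity(energy, k, dk=1e-6):
    """Central-difference estimate of v = dE / d(hbar k)."""
    return (energy(k + dk) - energy(k - dk)) / (2.0 * dk * HBAR)


k = 0.7
v_free = group_velocity(free_energy, k)           # compare with hbar k / m
v_bound = group_velocity(tight_binding_energy, k) # compare with (2 beta a / hbar) sin(k a)

print(v_free, HBAR * k / 1.0)
print(v_bound, 2.0 * 0.5 * 1.0 / HBAR * math.sin(k * 1.0))
```

Halving `beta` halves the tight-binding velocity at every `k`, in line with the remark that a narrower band means a slower electron.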

We now assume that the crystal is placed in an external electric field of strength $\boldsymbol{\mathcal{E}}$. The energy of an electron in the crystal will be equal to


$$E = E(\mathbf{k}) - e\,\boldsymbol{\mathcal{E}}\cdot\mathbf{r} .$$
The mean acceleration of the electron 


$$\frac{d\bar{\mathbf{p}}}{dt} = -\frac{\partial E}{\partial \mathbf{r}} = e\,\boldsymbol{\mathcal{E}}$$


turns out to be the same as that of a free particle. The meaning of this result 
is easily understood if the expression for energy balance is set up. The mean 
rate at which work is done by the field on the electron is equal to 


* A.H.Wilson, The theory of metals (Cambridge University Press, Cambridge, 1953);
F.Seitz, The modern theory of solids (McGraw-Hill, New York, 1940); J.M.Ziman,
Principles of the theory of solids (Cambridge University Press, Cambridge, 1964);
H.Jones, The theory of Brillouin zones and electronic states in crystals (North-Holland,
Amsterdam, 1960).
** We shall dwell neither on certain essential difficulties of the band theory of crys- 
tals nor on certain general limits on its applicability, particularly to polar crystals. 




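$$e\,\boldsymbol{\mathcal{E}}\cdot\bar{\mathbf{v}} = \frac{e\,\boldsymbol{\mathcal{E}}}{\hbar}\cdot\frac{\partial E}{\partial \mathbf{k}} .$$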


On the other hand, this work only goes into an increase of the energy of the 
electron 


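$$\frac{dE}{dt} = \frac{\partial E(\mathbf{k})}{\partial \mathbf{k}}\cdot\frac{d\mathbf{k}}{dt} .$$

Equating the two expressions for the rate at which work is done, we arrive at the expression

$$\frac{d\mathbf{k}}{dt} = \frac{e\,\boldsymbol{\mathcal{E}}}{\hbar} .$$
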
Here we assume that the increase of energy of the electron takes place within 
the limits of a given band. The absence of energy loss for the electron leads 
to its acceleration by the field. In other words, the electron moves in an ideal 
lattice without any resistance. In §55 it will be shown that the basic origins 
of resistance are in the thermal vibrations of the lattice, disturbing the regular- 
ity of the arrangement of atoms on the lattice points. 

In considering the motion of electrons in a periodic field we have disre- 
garded the size of the ions themselves, assuming them to be crystal lattice 
points. At first sight it seems surprising that such a rough approximation and 
the model of nearly free electrons resulting from it is a good approximation 
to reality. The size of ions is not very small as compared with the spacing be- 
tween them, the radius of an ion usually amounting to about one half of the 
lattice spacing. 

In the region of space occupied by the ion the wave function of the elec- 
tron must behave like the wave function inside the atom. The electron inside 
the atom has a large kinetic energy and, correspondingly, a small wavelength. 
Hence its wave function undergoes oscillations. 

It is clear that such a behaviour differs very much from the smooth behav- 
iour of the wave function of a nearly free electron. It turns out, however, that 
if we wish to describe the behaviour of the wave function of an electron in a 
region outside the ion and are not interested in its detailed behaviour over all 
space, we can make use of a method similar to that applied in scattering 
theory for small energies. We have seen in §93 of Part V that in this case the 
behaviour of a particle outside the region where its potential is large (outside 
the nucleus or atom) does not depend on its behaviour inside this region. 
Therefore the true wave function, which is the solution of the Schrodinger 
equation 




$$\left( -\frac{\hbar^{2}}{2m}\nabla^{2} + U \right)\Psi = E\Psi, \tag{49.23}$$


can be replaced by the fictitious wave function $\psi$ satisfying the equation


$$\left( -\frac{\hbar^{2}}{2m}\nabla^{2} + U_{ps} \right)\psi = E\psi \tag{49.24}$$


with the same energy and a changed potential energy. The operator $U_{ps}$ is called the pseudopotential. If $U_{ps}$ is constructed in such a way that outside the region $r < R$ it is the same as $U$, i.e. $U_{ps} = U$ for $r > R$, then outside the ion the functions $\Psi$ and $\psi$ satisfy one and the same Schrödinger equation. By choosing the pseudopotential $U_{ps}$ properly, one can make the fictitious wave function $\psi$ extrapolate smoothly into the region $r < R$. The smoothed wave function $\psi$ will describe the motion of the electron in the crystal. Outside the ion it is the same as $\Psi$, and the action of the field on $\psi$ in the region $r < R$ is a weak perturbation.

The choice of the pseudopotential is not quite an unambiguous procedure. Correspondingly, $\psi$ may also be of a somewhat different form. For $\psi$, use is often made of the expression


$$\psi = \sum_{\mathbf{k}} a_{\mathbf{k}}\, e^{i\mathbf{k}\cdot\mathbf{r}} + \sum_{n,l} b_{nl}\,\varphi_{nl}(\mathbf{r}), \tag{49.25}$$


where $\varphi_{nl}(\mathbf{r})$ are wave functions for the internal states of the electron in the atom. These functions are strongly localized, and for $r > R$ actually reduce to zero. Hence there $\psi$ is the same as the wave function of a quasi-free particle. The coefficients can be chosen from the condition of orthogonality of $\psi$ and $\varphi_{nl}$.

On substituting (49.25) into the Schrödinger equation, one can choose the expression for the pseudopotential $U_{ps}$.

In the light of the above it becomes clear why the behaviour of electrons in a lattice is sufficiently well described by the model of nearly free electrons.

In the cases where one deals with the properties of electrons associated with their motion in the crystal as a whole, the behaviour of the electrons can be described by the fictitious wave function $\psi$. This function in regions $r < R$ depends only slightly on the weak field of the crystal lattice, whereas for $r > R$ it is the same as the true wave function. On the whole, $\psi$ displays a smooth behaviour, changing at distances of the order of the lattice constant, as is to be expected for quasi-free particles.







§50. A system of electrons in a solid 


We can now go on to a discussion of the properties of a system of electrons 
in a solid. As we shall see below, the number of non-localized electrons 
moving throughout the volume of a crystal can vary over a very wide range. 
For metals it is very large, amounting to about $10^{23}$ electrons per cm$^{3}$. For such high densities the criterion of degeneracy of the electron gas (see §80 of Part III), $V(2\pi m k T)^{3/2} / [N(2\pi\hbar)^{3}] \ll 1$, turns out to be fulfilled up to temperatures of the order of 4-5 thousand degrees. This means that for such a high
density the electrons always form a degenerate Fermi gas. 

If, however, the kinetic energy of a degenerate Fermi gas is compared with 
its Coulomb interaction energy, then, as was shown in §79 of Part III, we 
have the relation 


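$$\frac{E_{max}}{e^{2}/r} \approx \frac{h^{2}}{2m}\left(\frac{3N}{8\pi V}\right)^{2/3}\frac{(V/N)^{1/3}}{e^{2}} \sim \left(\frac{N}{V}\right)^{1/3}\frac{\hbar^{2}}{e^{2}m} \sim 1$$


for $N/V \sim 10^{23}$ cm$^{-3}$.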
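The estimate above is easy to reproduce. The sketch below is a rough order-of-magnitude check in Gaussian (CGS) units, assuming the free-electron expression $E_{max} = (\hbar^2/2m)(3\pi^2 N/V)^{2/3}$ for the limiting energy and taking the mean distance between electrons as $(V/N)^{1/3}$. The constants are rounded values and the function names are illustrative.

```python
import math

# Rounded physical constants in Gaussian (CGS) units.
HBAR = 1.0546e-27      # erg s
M_E = 9.109e-28        # g
E_CHARGE = 4.803e-10   # esu
ERG_PER_EV = 1.602e-12


def fermi_energy(density):
    """Limiting (Fermi) energy of a free degenerate electron gas, in erg."""
    return HBAR ** 2 / (2.0 * M_E) * (3.0 * math.pi ** 2 * density) ** (2.0 / 3.0)


def coulomb_energy(density):
    """e^2 / r with r taken as the mean spacing (V/N)^(1/3), in erg."""
    return E_CHARGE ** 2 * density ** (1.0 / 3.0)


density = 1.0e23  # electrons per cm^3, the value quoted in the text
fermi_ev = fermi_energy(density) / ERG_PER_EV
ratio = fermi_energy(density) / coulomb_energy(density)

print(fermi_ev, ratio)
```

With these inputs the limiting energy comes out at several electron volts and the ratio is of order unity, so the kinetic and Coulomb energies are indeed comparable.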

Thus the Coulomb interaction between electrons in a metal is strong, and 
one should speak of an electron fluid rather than of an electron gas filling the 
metal. However, since this fluid is formed by electrons moving on a back- 
ground of positively charged ions, as a matter of fact it is more correct to 
speak of a plasma filling the volume of the solid body. The most important 
difference between the plasma of a solid and a gas plasma is the fact that the 
former is degenerate and that quantum effects play an important role in it. 
Nevertheless, as we shall now see, the most important properties of a degener- 
ate quantum plasma and a classical plasma turn out to be similar. This at least 
applies to the properties of screening and the existence of plasma waves. In 
order not to complicate the calculations, we shall confine ourselves to the ap- 
proximation in which the charge of positive ions can be considered to be uni- 
formly ‘spread’ throughout the crystal and to form a positive background. 
Taking into account the discrete distribution of positive charges does not in- 
troduce any essential change in the result obtained. 

Let us consider a system of free electrons, with the distribution function $f(\varepsilon)$, and a positive background in a certain volume acted upon by an external variable field. We shall assume that the applied external field produces in the system the potential


$$\varphi = \lim_{\delta \to 0} \int\!\!\int \varphi(\mathbf{q}, \omega)\, e^{i(\mathbf{q}\cdot\mathbf{r} - \omega t)}\, e^{\delta t}\, d\mathbf{q}\, d\omega . \tag{50.1}$$




Here $\varphi(\mathbf{q}, \omega)$ represents the Fourier component. The factor $e^{\delta t}$ means that the field is adiabatically (slowly) applied during the time interval from $t \to -\infty$ to $t = 0$. For such a case the field will not give rise to transitions in the system up to the instant of time $t = 0$. For $t > 0$, transitions will occur in the system or, in other words, a response to the applied perturbation will arise. This response is conveniently characterized by the dielectric constant $\varepsilon(\mathbf{q}, \omega)$. To calculate $\varepsilon(\mathbf{q}, \omega)$ it is convenient to express it in terms of the density of the charge induced in the system by the external field.

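Using formula (31.12) of Part IV, the obvious expressions for Ohm's law and the continuity equation in Fourier components


$$\mathbf{j}(\mathbf{q}, \omega) = \sigma(\mathbf{q}, \omega)\,\mathbf{E}(\mathbf{q}, \omega), \qquad i\mathbf{q}\cdot\mathbf{j}(\mathbf{q}, \omega) = i\omega\rho(\mathbf{q}, \omega),$$


and the relation $\mathbf{E}(\mathbf{q}, \omega) = -i\mathbf{q}\,\varphi(\mathbf{q}, \omega)$, where $\varphi(\mathbf{q}, \omega)$ is the Fourier component of the scalar potential, we obtain for the dielectric permeability


$$\varepsilon(\mathbf{q}, \omega) = 1 + \frac{4\pi i\,\sigma(\mathbf{q}, \omega)}{\omega} = 1 + \frac{4\pi i\,\mathbf{j}\cdot\mathbf{E}}{\omega E^{2}} = 1 + \frac{4\pi i\,\mathbf{q}\cdot\mathbf{E}}{q^{2}E^{2}}\,\rho(\mathbf{q}, \omega) = 1 - \frac{4\pi\rho(\mathbf{q}, \omega)}{q^{2}\varphi(\mathbf{q}, \omega)} . \tag{50.2}$$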


Our next problem is to calculate the Fourier component of the charge density $\rho(\mathbf{q}, \omega)$*.

With this end in view, we shall consider a free electron in the initial state with wave vector $\mathbf{k}$ and energy $\varepsilon(\mathbf{k})$.

The wave function of an electron in an unperturbed state can be written in the form


$$\psi_{\mathbf{k}} = V^{-1/2}\, e^{i\mathbf{k}\cdot\mathbf{r}}\, e^{-i\varepsilon(\mathbf{k})t/\hbar} .$$


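Let us consider the perturbation produced by an external field of frequency $\omega$, i.e. let us assume that the perturbed Hamiltonian is of the form


$$H' = e\varphi(\mathbf{q}, \omega)\, e^{\delta t}\left( e^{i(\mathbf{q}\cdot\mathbf{r} - \omega t)} + e^{-i(\mathbf{q}\cdot\mathbf{r} - \omega t)} \right) .$$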


* J.M.Ziman, Principles of the theory of solids (Cambridge University Press, Cambridge, 1964) ch. 5.







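For a field applied in such a way, no transitions will be caused in the system for times up to $t = 0$. For $t > 0$, under the action of the perturbation the free electron assumes the wave function


$$\psi(t) = \psi_{\mathbf{k}} + c_1(t)\,\psi_{\mathbf{k}+\mathbf{q}} + c_2(t)\,\psi_{\mathbf{k}-\mathbf{q}} . \tag{50.3}$$

In first order of perturbation theory (see §55 of Part V), we have for $c_1$ and $c_2$


$$c_1(t) = \frac{1}{i\hbar}\int_{-\infty}^{t} (H')_{\mathbf{k},\mathbf{k}+\mathbf{q}}\, e^{-i[\varepsilon(\mathbf{k}) - \varepsilon(\mathbf{k}+\mathbf{q})]t/\hbar}\, dt .$$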


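The matrix element is taken with respect to the time-independent electron gas wave functions. It is equal to


$$(H')_{\mathbf{k},\mathbf{k}+\mathbf{q}} = e\varphi(\mathbf{q}, \omega)\,\frac{1}{V}\int e^{-i(\mathbf{k}+\mathbf{q})\cdot\mathbf{r}}\, e^{i(\mathbf{q}\cdot\mathbf{r} - \omega t)}\, e^{\delta t}\, e^{i\mathbf{k}\cdot\mathbf{r}}\, dV = e\varphi(\mathbf{q}, \omega)\, e^{\delta t}\, e^{-i\omega t} .$$

Analogously

$$(H')_{\mathbf{k},\mathbf{k}-\mathbf{q}} = e\varphi(\mathbf{q}, \omega)\, e^{\delta t}\, e^{i\omega t} .$$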


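Thus


$$c_1(t) = \frac{e\varphi(\mathbf{q}, \omega)}{i\hbar}\int_{-\infty}^{t} e^{\delta t}\, e^{-i\{\varepsilon(\mathbf{k}) - \varepsilon(\mathbf{k}+\mathbf{q}) + \hbar\omega\}t/\hbar}\, dt .$$


The factor $e^{\delta t}$ ensures the convergence of the integrals at the lower limit. Carrying out the integration, and then passing to the limit $\delta \to 0$, we finally find for $c_1(t)$


$$c_1(t) = e\varphi(\mathbf{q}, \omega)\,\frac{e^{-i\{\varepsilon(\mathbf{k}) - \varepsilon(\mathbf{k}+\mathbf{q}) + \hbar\omega\}t/\hbar}}{\varepsilon(\mathbf{k}) - \varepsilon(\mathbf{k}+\mathbf{q}) + \hbar\omega} . \tag{50.4}$$


Analogously

$$c_2(t) = e\varphi(\mathbf{q}, \omega)\,\frac{e^{-i\{\varepsilon(\mathbf{k}) - \varepsilon(\mathbf{k}-\mathbf{q}) - \hbar\omega\}t/\hbar}}{\varepsilon(\mathbf{k}) - \varepsilon(\mathbf{k}-\mathbf{q}) - \hbar\omega} . \tag{50.5}$$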





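Let us now find the electron density induced by a perturbation in a system of free electrons. It is obvious that for one electron the induced charge density can be written in the form


$$\Delta\rho = e\left\{ |\psi(t)|^{2} - |\psi_{\mathbf{k}}|^{2} \right\} \approx e\left\{ c_1\psi_{\mathbf{k}+\mathbf{q}}\psi^{*}_{\mathbf{k}} + c_1^{*}\psi^{*}_{\mathbf{k}+\mathbf{q}}\psi_{\mathbf{k}} + c_2\psi_{\mathbf{k}-\mathbf{q}}\psi^{*}_{\mathbf{k}} + c_2^{*}\psi^{*}_{\mathbf{k}-\mathbf{q}}\psi_{\mathbf{k}} \right\} .$$


Here we have dropped the terms of the second order in the small quantities $c_1$ and $c_2$. Substituting the values of $c_1$ and $c_2$ from (50.4) and (50.5), we find, for a crystal of unit volume,


$$\Delta\rho = e^{2}\varphi(\mathbf{q}, \omega)\,\frac{e^{-i\{\varepsilon(\mathbf{k}) - \varepsilon(\mathbf{k}+\mathbf{q}) + \hbar\omega\}t/\hbar}}{\varepsilon(\mathbf{k}) - \varepsilon(\mathbf{k}+\mathbf{q}) + \hbar\omega}\; e^{i(\mathbf{k}+\mathbf{q})\cdot\mathbf{r}}\, e^{-i\mathbf{k}\cdot\mathbf{r}}\, e^{i\varepsilon(\mathbf{k})t/\hbar}\, e^{-i\varepsilon(\mathbf{k}+\mathbf{q})t/\hbar} + \dots =$$

$$= e^{2}\varphi(\mathbf{q}, \omega)\left( e^{-i(\mathbf{q}\cdot\mathbf{r} - \omega t)} + e^{i(\mathbf{q}\cdot\mathbf{r} - \omega t)} \right) \times \left[ \frac{1}{\varepsilon(\mathbf{k}) - \varepsilon(\mathbf{k}+\mathbf{q}) + \hbar\omega} + \frac{1}{\varepsilon(\mathbf{k}) - \varepsilon(\mathbf{k}-\mathbf{q}) - \hbar\omega} \right] . \tag{50.6}$$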


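The total charge density induced in a system of electrons by an external field is equal to


$$\rho(\mathbf{q}, \omega) = \int \Delta\rho\, f(\mathbf{k})\, d\mathbf{k} = e^{2}\varphi(\mathbf{q}, \omega)\left( e^{-i(\mathbf{q}\cdot\mathbf{r} - \omega t)} + e^{i(\mathbf{q}\cdot\mathbf{r} - \omega t)} \right) \times \left[ \int \frac{f(\mathbf{k})\, d\mathbf{k}}{\varepsilon(\mathbf{k}) - \varepsilon(\mathbf{k}+\mathbf{q}) + \hbar\omega} + \int \frac{f(\mathbf{k})\, d\mathbf{k}}{\varepsilon(\mathbf{k}) - \varepsilon(\mathbf{k}-\mathbf{q}) - \hbar\omega} \right] =$$

$$= e^{2}\varphi(\mathbf{q}, \omega)\left( e^{-i(\mathbf{q}\cdot\mathbf{r} - \omega t)} + e^{i(\mathbf{q}\cdot\mathbf{r} - \omega t)} \right) \times \left[ \int \frac{f(\mathbf{k})\, d\mathbf{k}}{\varepsilon(\mathbf{k}) - \varepsilon(\mathbf{k}+\mathbf{q}) + \hbar\omega} + \int \frac{f(\mathbf{k}'+\mathbf{q})\, d\mathbf{k}'}{\varepsilon(\mathbf{k}'+\mathbf{q}) - \varepsilon(\mathbf{k}') - \hbar\omega} \right] =$$

$$= -e^{2}\varphi(\mathbf{q}, \omega)\left( e^{-i(\mathbf{q}\cdot\mathbf{r} - \omega t)} + e^{i(\mathbf{q}\cdot\mathbf{r} - \omega t)} \right) \int \frac{f(\mathbf{k}) - f(\mathbf{k}+\mathbf{q})}{\varepsilon(\mathbf{k}+\mathbf{q}) - \varepsilon(\mathbf{k}) - \hbar\omega}\, d\mathbf{k} . \tag{50.7}$$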


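In the second integral we have made the replacement of variables $\mathbf{k} \to \mathbf{k}' + \mathbf{q}$ and then put $\mathbf{k}' \to \mathbf{k}$. Substituting this value of $\rho$ into (50.2), we find


$$\varepsilon(\mathbf{q}, \omega) = 1 + \frac{4\pi e^{2}}{q^{2}}\int \frac{f(\mathbf{k}) - f(\mathbf{k}+\mathbf{q})}{\varepsilon(\mathbf{k}+\mathbf{q}) - \varepsilon(\mathbf{k}) - \hbar\omega}\, d\mathbf{k} . \tag{50.8}$$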


The dielectric constant represents a basic characteristic of matter. Using for- 
mula (50.8), one can find the response of the system to an arbitrary perturba- 
tion of the plasma. 

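Let us examine the limiting cases $\omega \to 0$ (constant field) and $\hbar\omega \gg \varepsilon(\mathbf{k}) - \varepsilon(\mathbf{k}+\mathbf{q})$ (high-frequency field). For $\omega \to 0$


$$\varepsilon(\mathbf{q}, 0) \approx 1 + \frac{4\pi e^{2}}{q^{2}}\int \frac{f(\mathbf{k}) - f(\mathbf{k}+\mathbf{q})}{\varepsilon(\mathbf{k}+\mathbf{q}) - \varepsilon(\mathbf{k})}\, d\mathbf{k} . \tag{50.9}$$


Here we shall confine ourselves to the case of long wavelengths $q \ll k$. Then we find


$$f(\mathbf{k}) - f(\mathbf{k}+\mathbf{q}) = -\frac{\partial f}{\partial\varepsilon}\,\frac{\partial\varepsilon}{\partial\mathbf{k}}\cdot\mathbf{q}, \qquad \varepsilon(\mathbf{k}) - \varepsilon(\mathbf{k}+\mathbf{q}) = -\frac{\partial\varepsilon}{\partial\mathbf{k}}\cdot\mathbf{q} . \tag{50.10}$$

Hence

$$\varepsilon(\mathbf{q}, 0) = 1 + \frac{4\pi e^{2}}{q^{2}}\int \left( -\frac{\partial f}{\partial\varepsilon} \right)\rho(\varepsilon)\, d\varepsilon = 1 + \frac{4\pi e^{2}}{q^{2}}\,\rho(\varepsilon_F), \tag{50.11}$$

where $\rho(\varepsilon_F)$ is the density of states at the Fermi surface,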
— ay 


Here we have made use of the property of the Fermi distribution f(e), which 
for a low temperature has the form of a step function (see §80 of Part III). 

For values of q which are not small the calculation of e(q, w) is more 
complex, but introduces no qualitative change into the character of this quan- 


tity*. 


* See, for example, D.Pines, Elementary excitations in solids (W.A.Benjamin, New York and Amsterdam, 1963).




Let us now see what the form of the field of a point charge is in a medium 
with dielectric constant (50.11). If the charge is at rest or moving slowly, 
then there corresponds to its potential in vacuum e/r the Fourier component 


$$\varphi_0(\mathbf{q}) = 4\pi e^{2}/q^{2} . \tag{50.13}$$


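The potential of a charge in a medium can be written in the form


$$\varphi(\mathbf{r}) = \int \frac{\varphi_0(\mathbf{q})}{\varepsilon(\mathbf{q}, 0)}\, e^{i\mathbf{q}\cdot\mathbf{r}}\, d\mathbf{q} = \frac{e^{2}}{r}\, e^{-r/l_D}, \tag{50.14}$$


where the Debye length $l_D$ is equal to


$$l_D = \left[ 4\pi e^{2}\rho(\varepsilon_F) \right]^{-1/2} = \left( \frac{\varepsilon_F}{6\pi N e^{2}} \right)^{1/2} . \tag{50.15}$$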


Formula (50.14) shows that, as in a classical plasma, the field of a charge in a quantum plasma turns out to be screened. However, instead of the Debye radius $1/\kappa$, the screening length for a degenerate quantum plasma is given by formula (50.15).

Estimates show that $l_D$ is about 1-2 Å. Thus the screening effect of the
plasma ensures the fall-off of the interaction forces at distances of the order 
of the mean distance between electrons in the metal. The screening of the 
electron charge has an obvious meaning: near any given electron the concen- 
tration of other electrons is reduced. This decrease in concentration is due to 
the pure Coulomb interaction as well as to exchange forces. The latter have 
figured implicitly in the calculation of the dielectric constant; that is, the use 
of the Fermi distribution corresponds to taking into account the Pauli prin- 
ciple and the exchange interactions associated with it. 

A slow movement of an electron in a plasma is accompanied by a ‘running 
away’ of electrons from its path. Thus, not the electron but a whole group of 
particles, carrying a total charge equal to the charge of the electron, moves in 
the plasma. This system, consisting of the electron and a cloud of electrons 
moving together with it or away from it on a positive background formed by 
ions, can be considered as a quasiparticle with charge e and effective mass m*. 
As will be seen from what follows, a contribution to the effective mass is 
given not only by the interaction between the electrons but also with the 
phonons of the lattice. 

The existence of screening is of great importance for understanding the 
phenomena which occur in metals. Namely, owing to the screening, the 










forces of interaction between electrons at relatively small distances are sub- 
stantially reduced. 

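We shall return to the discussion of this problem somewhat later, and in the meanwhile we shall consider the form assumed by $\varepsilon(\mathbf{q}, \omega)$ at high frequencies.

For this we transform (50.8) into a somewhat different form, writing for $\omega \to \infty$


$$\frac{1}{\varepsilon(\mathbf{k}) - \varepsilon(\mathbf{k}+\mathbf{q}) + \hbar\omega} + \frac{1}{\varepsilon(\mathbf{k}) - \varepsilon(\mathbf{k}-\mathbf{q}) - \hbar\omega} = \frac{2\varepsilon(\mathbf{k}) - \varepsilon(\mathbf{k}+\mathbf{q}) - \varepsilon(\mathbf{k}-\mathbf{q})}{[\varepsilon(\mathbf{k}) - \varepsilon(\mathbf{k}+\mathbf{q}) + \hbar\omega]\,[\varepsilon(\mathbf{k}) - \varepsilon(\mathbf{k}-\mathbf{q}) - \hbar\omega]} \approx -\frac{2\varepsilon(\mathbf{k}) - \varepsilon(\mathbf{k}+\mathbf{q}) - \varepsilon(\mathbf{k}-\mathbf{q})}{(\hbar\omega)^{2}} .$$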





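Hence, if $\omega \to \infty$, we obtain for $\varepsilon(\mathbf{q}, \omega)$

$$\varepsilon(\mathbf{q}, \omega) \to 1 - \frac{\omega_p^{2}}{\omega^{2}}, \tag{50.16}$$

where $\omega_p$ denotes the quantity


$$\omega_p^{2} = -\frac{4\pi e^{2}}{\hbar^{2}q^{2}}\int f(\mathbf{k})\left[ 2\varepsilon(\mathbf{k}) - \varepsilon(\mathbf{k}+\mathbf{q}) - \varepsilon(\mathbf{k}-\mathbf{q}) \right] d\mathbf{k} . \tag{50.17}$$

Expanding $\varepsilon(\mathbf{k}+\mathbf{q})$ in a series, we can write

$$2\varepsilon(\mathbf{k}) - \varepsilon(\mathbf{k}+\mathbf{q}) - \varepsilon(\mathbf{k}-\mathbf{q}) = -q^{2}\frac{\partial^{2}\varepsilon}{\partial k^{2}} = -\frac{\hbar^{2}q^{2}}{m^{*}},$$


so that

$$\omega_p^{2} = 4\pi N e^{2}/m^{*} .$$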


Here we have made use of formula (49.14) for determining the effective mass. Comparing $\omega_p$ with the frequency of plasma oscillations of the classical plasma, (46.16) of Part IV, we see that they are identical. The dispersion equation, defining the relation between the frequency and wave vector, according to (33.18) of Part IV is of the form $\varepsilon(\mathbf{q}, \omega) = 0$. Here we assume the plasma waves to be longitudinal (see §33 of Part IV). This gives $\omega = \pm\omega_p$. Thus in a quantum plasma, as in a classical plasma, undamped (more precisely, undamped in our approximation) plasma oscillations can exist.
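Two points of this derivation lend themselves to a quick numerical check. The sketch below first evaluates $\hbar\omega_p$ from $\omega_p^2 = 4\pi N e^2/m$ in Gaussian units, assuming the bare electron mass in place of $m^*$ and the density $10^{23}$ cm$^{-3}$ quoted earlier. It then compares the exact sum of the two energy denominators with its high-frequency limit for a free-electron dispersion in dimensionless units $\hbar = m = 1$. All names and sample values are illustrative.

```python
import math

# Rounded physical constants in Gaussian (CGS) units.
HBAR = 1.0546e-27      # erg s
M_E = 9.109e-28        # g (bare mass, used here instead of m*)
E_CHARGE = 4.803e-10   # esu
ERG_PER_EV = 1.602e-12


def plasma_frequency(density, mass=M_E):
    """omega_p = sqrt(4 pi N e^2 / m), in rad/s."""
    return math.sqrt(4.0 * math.pi * density * E_CHARGE ** 2 / mass)


def free_energy(k):
    """Free-electron dispersion in units hbar = m = 1."""
    return 0.5 * k * k


def exact_bracket(k, q, omega):
    """1/(e(k)-e(k+q)+w) + 1/(e(k)-e(k-q)-w) with hbar = 1."""
    first = free_energy(k) - free_energy(k + q) + omega
    second = free_energy(k) - free_energy(k - q) - omega
    return 1.0 / first + 1.0 / second


def high_frequency_bracket(k, q, omega):
    """Limiting form -[2e(k) - e(k+q) - e(k-q)] / w^2."""
    numerator = 2.0 * free_energy(k) - free_energy(k + q) - free_energy(k - q)
    return -numerator / omega ** 2


plasmon_energy_ev = HBAR * plasma_frequency(1.0e23) / ERG_PER_EV
exact = exact_bracket(0.5, 0.1, 50.0)
approx = high_frequency_bracket(0.5, 0.1, 50.0)

print(plasmon_energy_ev, exact, approx)
```

With these inputs the plasmon energy comes out at roughly ten electron volts, far above thermal energies, and the two bracket values agree to within a fraction of a percent once the frequency greatly exceeds the energy differences.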

The meaning of this result is the same as in classical plasma theory: if a sufficiently large group of negatively charged particles of mass $m^{*}$ is displaced with respect to a positively charged background, then the displaced particles will begin to oscillate about an equilibrium position with the plasma frequency.

If we did not confine ourselves to the first term of the expansion in the denominator of (50.8), then instead of (50.16) we would obtain a more complex dispersion equation from which it follows that plasma oscillations with frequencies different from $\omega_p$ are possible. The quanta of plasma oscillations are called plasmons.

Let us now see what the energy conditions for exciting plasma oscillations are. In order that an electron may give rise to collective oscillations in a system, i.e. emit a quantum with frequency $\omega_p$ and wave number $q$, it is necessary that the following energy conservation law be fulfilled:


$$\frac{\hbar^{2}k^{2}}{2m} - \frac{\hbar^{2}(\mathbf{k} - \mathbf{q})^{2}}{2m} = \hbar\omega_p$$

or

$$\frac{\hbar^{2}}{2m}\left( 2\mathbf{k}\cdot\mathbf{q} - q^{2} \right) = \hbar\omega_p . \tag{50.18}$$


Since for electrons the wave vector $|\mathbf{k}|$ does not exceed its value at the Fermi surface, $k_F$, we see that condition (50.18) is not fulfilled for small values of $q$. This means that plasma oscillations in a system of free electrons can be brought about only for $q > \omega_p m/\hbar k_F = \omega_p/v_F$, where $v_F$ is the velocity at the Fermi surface. The energy of a quantum of plasma oscillations is $\hbar\omega_p \sim 20$ eV. This means that an individual electron with only thermal energy cannot give rise to plasma oscillations in a metal. This is not surprising, since plasma oscillations correspond to the collective motion of large groups of particles. If, however, a charged particle, for example an electron, possessing a sufficiently large energy, enters a metal, then such a particle may give rise to the plasma oscillations considered here.

This fact has made it possible to carry out, successfully, experiments in
which oscillations are excited in the plasma of solids. However, it should not 
be concluded from the above that plasma oscillations are absent in a system 




of electrons in a metal. In contrast to the classical plasma, zero-point oscillations having energy $\hbar\omega_p$ may exist in a quantum plasma. These are zero-point quanta or plasmons. Large groups of electrons take part in a correlated way in zero-point oscillations.

Let us estimate the order of magnitude of the size of such a region. The wavelength involving a group of particles, $\lambda_{min} = 2\pi/q$, must be larger than the mean screening distance $l_D$. Using (50.18) and (50.15), we obtain $\lambda_{min} > v_F/\omega_p$. No shorter plasma waves can be excited.

Thus it can be said that electrons at relatively large distances, larger than $\lambda_{min}$, are involved in plasma waves.

We have now disposed of the essential information about a system of elec- 
trons in a solid, and are ready to make a more substantial evaluation of the 
role of the interactions between them in the general behaviour of the system. 
We write the part of the Hamiltonian characterizing the interaction between 
electrons in the form of two terms 


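$$\sum_{i \neq j} \frac{e^{2}}{|\mathbf{r}_i - \mathbf{r}_j|} = \sum_{\substack{i \neq j \\ |\mathbf{r}_i - \mathbf{r}_j| > \lambda_{min}}} \frac{e^{2}}{|\mathbf{r}_i - \mathbf{r}_j|} + \sum_{\substack{i \neq j \\ |\mathbf{r}_i - \mathbf{r}_j| < \lambda_{min}}} \frac{e^{2}}{|\mathbf{r}_i - \mathbf{r}_j|} .$$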


The first sum involves only terms in which the distance between the particles is larger than $\lambda_{min}$. At low electron energies $\varepsilon < \varepsilon_F$, this interaction gives rise
to zero-point plasma oscillations. This sum brings a certain constant term into 
the total Hamiltonian. The second sum involves pairs inside the screening 
sphere. Such pairs interact with each other according to (50.14). Hence the 
corresponding term gives only a very small contribution to the total Hamil- 
tonian. Calculation shows, for example, that the change in thermal capacity 
of the electron gas due to the screened interaction amounts to only a few 
percent. Thus the general conclusion which can be drawn from the above con- 
siderations is that, owing to the screening effect, the effect of the interaction 
between electrons on the properties of the solid turns out to be relatively 
small. This applies particularly to electrons having an energy close to the 
Fermi energy. These electrons always have the possibility of pushing the 
neighbouring electron cloud away from them and of moving together with 
this ‘positive hole’. 

Calculation shows that such a system has a considerable lifetime. This fact allows one to consider the system (electron + positive hole) to be a quasiparticle. The long lifetime is associated with the Pauli principle, which prevents the electrons constituting the cloud from changing their state. The appearance of mobile quasiparticles is closely related to the thermal excitation of the system, when transitions of electrons into unoccupied states take place in it. Hence a system of electrons at $T \neq 0$ can be considered as an electron fluid in which elementary thermal excitations are moving. These excitations (quasiparticles) move independently of one another, have charge $(-e)$, mass $m^{*}$ and spin $\tfrac{1}{2}$, and obey Fermi statistics. For brevity we shall call these quasiparticles the electrons.

We stress that disregarded factors, for instance the discrete distribution of 
positive charge, have no effect on the qualitative conclusions obtained. 

The interaction of the electron (quasiparticle) with phonons, as we shall see below, leads to the formation around it of a cloud or 'jacket' of phonons, which moves together with it, changing its mass so that $m^{*} \to m^{**}$. The interaction between electrons is directly manifested in a number of effects, for example the difference in the velocity of sound in a metal in comparison with a dielectric. The character of this interaction shows why the model of an ideal gas of free electrons correctly gives the basic properties of a system of electrons in metals. The screening substantially reduces the interaction between electrons. The major effect of the interaction amounts to a change in the effective mass.


§51. Models of a metal, a semiconductor and a dielectric 


We can now go on to a discussion of the properties of a system of elec- 
trons in a solid. 

In a qualitative description of the behaviour of a system of electrons we 
can, on the basis of the results of the preceding section, replace it by a system 
of quasiparticles (fermions). In what follows we shall call these quasiparticles 
electrons, which, however, should not lead to misunderstanding. The number of non-localized electrons per cm$^{3}$ can be determined by direct measurements (see below). For the so-called 'good' metals such as the alkali and alkaline-earth elements, silver, copper, gold and a number of others, the number of non-localized electrons per cm$^{3}$ is approximately equal to the number of atoms per cm$^{3}$, i.e. the number of such electrons per atom is equal to one.

For a number of other metals the number of ‘free’ electrons per atom 
turns out to be considerably less than one. Furthermore, a marked anisotropy 
is displayed, so that the properties of the system of electrons turn out to be 
different in different crystallographic directions. As an example, we mention 
bismuth, which exhibits a pronounced anisotropy in its electric and magnetic 
properties. 

For metalloids the number of ‘free’ electrons per atom is so small that the 







system of electrons forms a non-degenerate ideal gas. Finally, in a number of 
good dielectrics, for example in NaCl or solid oxygen, there are practically no 
free electrons. Although the density of free electrons is of decisive importance 
in the behaviour of the system of electrons in a solid, it would be incorrect 
to assume that the overall difference between metals, semiconductors and in- 
sulators amounts to the variation of this characteristic and is of a quantitative 
character. As a matter of fact, it has a profound meaning. This is seen, for ex- 
ample, from the qualitative difference in the mechanism of electric conduc- 
tion or in the magnetic properties of metals, semiconductors and dielectrics. 
The profound difference in their physical properties is associated with the dif- 
ferent character of the energy spectrum of their electrons. 

We have seen that the electron spectrum in a crystal has the form of alter- 
nating bands of allowed and forbidden energies. It should be recalled that in 
each allowed energy band there is a limited number of states. Namely, there 
are 2N states in each band, where N is the number of unit cells per unit vol- 
ume, and the factor 2 is associated with the two electrons occupying each 
state. If all the states of a band of allowed energies are occupied by electrons 
in pairs, then the electrons cannot change state and go from one state to an- 
other under the action of an applied external field. Such a body behaves as an 
insulator. But if in the allowed band only a fraction of the states are occupied 
by electrons, then transitions between states are possible and the body be- 
haves as a metal. 
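The counting argument of this paragraph can be put in a few lines of code. The sketch is deliberately crude: it assumes a single isolated band that holds two states per unit cell and does not overlap any other band, which is exactly the situation described above. The function names are hypothetical.

```python
def band_filling(electrons_per_cell):
    """Occupied fraction of one isolated band holding 2 states per unit cell."""
    states_per_cell = 2  # the factor 2 of the 2N states in the band
    return electrons_per_cell / states_per_cell


def classify(electrons_per_cell):
    """Metal if the band is only partly filled, insulator if it is full."""
    if band_filling(electrons_per_cell) < 1.0:
        return "metal"
    return "insulator"


print(classify(1), classify(2))
```

One electron per cell leaves the band half filled, so transitions between states are possible. Two electrons per cell fill it completely, and in this simplified picture, which ignores band overlap, the body behaves as an insulator.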

Let us consider schematically the formation of a metal by separated non- 
interacting atoms. In these atoms let there be one valence electron in the 
highest occupied energy level. Its state will be two-fold degenerate, since the 
energy of an electron does not depend on the orientation of its spin. In a sys- 
tem of N independent non-interacting atoms the corresponding energy level 
will be 2N-fold degenerate. When the atoms are drawn together and an interaction
is established between them, the level splits into 2N adjacent levels
forming a continuous band. One half of these energy levels will be occupied 
by electron pairs, while the other half will be vacant. Thus this band of energy 
levels, called the conduction band, arises from the lower energy states of the 
valence electron. The excited states of the valence electron, splitting into a 
large number of levels, form other bands which are not occupied by electrons. 
In certain cases the broadening of bands is so large that they overlap each 
other. Between the allowed energy bands there are forbidden energy bands. 
The crystal formed will possess metallic properties, since the unoccupied 
states are directly adjacent to the occupied states. Such is the case for the 
alkali metals, copper, silver and some other metals. 

Suppose now that the atom has two outer electrons in one energy state
with oppositely oriented spins. The states of the electrons in the atom will be
non-degenerate. The corresponding state of a system of N non-interacting
atoms will be 2N-fold degenerate. When the crystal is formed this state splits
into 2N close levels occupied by electron pairs. A crystal with such an arrangement
of levels is an insulator. In it the occupied energy levels are separated
from the unoccupied ones by a region of forbidden energies with an interval
Δε. In order that a thermal excitation may bring an electron from an occupied
into an unoccupied state, the thermal energy kT must be of the order of
Δε. The same applies also to the excitation of electrons by an electric field.
The values of the corresponding temperature or field strength turn out to be 
very large. For ordinary temperatures and fields, the electrons remain in oc- 
cupied levels and cannot carry a current. Thus a dielectric differs from a metal 
not by the total number of electrons, but by the character of the arrangement 
of the allowed energy bands. 
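
As a rough numerical illustration of the estimate kT ~ Δε, the sketch below converts an assumed gap width into the temperature at which kT equals it. The gap values (5 eV for an insulator, 0.7 eV for a semiconductor) are illustrative assumptions, not figures taken from the text.

```python
# Temperature at which kT equals the width of the forbidden gap.
K_BOLTZMANN_EV = 8.617e-5  # Boltzmann constant in eV/K


def gap_temperature(gap_ev: float) -> float:
    """Return T (in kelvin) for which k*T equals the gap width gap_ev (in eV)."""
    return gap_ev / K_BOLTZMANN_EV


# Assumed, order-of-magnitude gap widths (hypothetical examples).
for name, gap in [("insulator", 5.0), ("semiconductor", 0.7)]:
    print(f"{name}: gap = {gap} eV, kT reaches the gap at T ~ {gap_temperature(gap):.0f} K")
```

For a gap of several electron-volts the required temperature is tens of thousands of kelvin, which is why such a crystal does not conduct at ordinary temperatures.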

However, it should not be concluded from the above that atoms with two
outer electrons necessarily form an insulator-type crystal when they are combined.

In addition to the case discussed, a band structure is possible in
which, owing to the overlap of the bands arising from the normal and excited
states of the atom, the unoccupied band is directly adjacent to the occupied
one. Materials of this type are metals, such as the alkaline-earth metals,
lead and a number of others. Current theory does not enable one to predict
which of these two cases will result when atoms with the given properties
are combined.

It goes without saying that the division of crystals into metals and dielec- 
trics covers only two limiting cases. The whole range of intermediate proper- 
ties between metals and dielectrics belongs to the semiconductors. In the case 
of semiconductors, the width of the forbidden zone is relatively small and be- 
comes comparable with the energy of thermal excitation at a relatively low 
temperature. We shall dwell in more detail on the energy spectrum of semi- 
conductors in §67. 


§52. Magnetic properties of metals. The paramagnetism of an electron gas 


It turns out that a number of important results can be obtained from the 
simplest model of a metal in which it is considered as a potential well with 
infinitely high walls, filled with a gas of free electrons. In particular, this 
crude scheme proves to be sufficient for discussing the magnetic properties of 
metals. 

It turns out that the magnetic properties of metals are determined in the
first place by the behaviour of the non-localized electrons. The interaction of
the electrons with the lattice has a relatively small effect on the magnetic 
properties of metals. 

Let us consider the behaviour of a degenerate electron gas placed in a magnetic
field. In such a system, two basic effects appear. One of these is associated
with the fact that electrons have spin, and the second with the quantization
of the electron orbital motion in a magnetic field. We shall begin with
the first effect.

When an external magnetic field is applied, preferential orientation of the 
spin moments takes place in the field. As a result of this, a magnetization 
arises in the system. 

To calculate the magnetic susceptibility, we shall write, first of all, the
expression for the free energy of the free electrons in a magnetic field.
According to the Gibbs-Helmholtz formula (30.12) of Part III, we have

$$F = -T \int_0^T \frac{E}{T^2}\, dT, \tag{52.1}$$

where

$$E = \sum_{\pm} \int \frac{\varepsilon\, d\gamma}{(2\pi\hbar)^3 \left[ e^{(\varepsilon - \mu \pm \mu_0 H)/kT} + 1 \right]}. \tag{52.2}$$

The summation is carried out over all (i.e. over two) possible orientations of
the spin magnetic moment $\mu_0$ with respect to the field H. We recall that in
the absence of a magnetic field the energy levels were degenerate and that the
factor 2, instead of the summation, stood in the expression for the energy
(80.2) of Part III.

Thus

$$E = \int \frac{\varepsilon\, d\gamma}{(2\pi\hbar)^3 \left[ e^{(\varepsilon - \mu + \mu_0 H)/kT} + 1 \right]} + \int \frac{\varepsilon\, d\gamma}{(2\pi\hbar)^3 \left[ e^{(\varepsilon - \mu - \mu_0 H)/kT} + 1 \right]}. \tag{52.3}$$





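On substituting (52.3) into (52.1), we arrive at the calculation of integrals of
the type

$$-T \int \frac{d\gamma}{(2\pi\hbar)^3} \int_0^T \frac{\varepsilon\, dT}{T^2 \left[ e^{(\varepsilon - \mu + \mu_0 H)/kT} + 1 \right]} = -kT \int \frac{d\gamma}{(2\pi\hbar)^3} \int_{1/kT}^{\infty} \frac{\varepsilon\, du}{e^{(\varepsilon - \mu + \mu_0 H)u} + 1}$$

$$= -kT \int \frac{d\gamma}{(2\pi\hbar)^3} \int_{1/kT}^{\infty} \frac{(\varepsilon - \mu + \mu_0 H)\, du}{e^{(\varepsilon - \mu + \mu_0 H)u} + 1} - kT \int \frac{d\gamma}{(2\pi\hbar)^3} \int_{1/kT}^{\infty} \frac{(\mu - \mu_0 H)\, du}{e^{(\varepsilon - \mu + \mu_0 H)u} + 1}$$

$$= N(\mu - \mu_0 H) - kT \int \ln\left[ 1 + e^{-(\varepsilon - \mu + \mu_0 H)/kT} \right] \frac{d\gamma}{(2\pi\hbar)^3},$$

where $u = 1/kT$ and N is the number of particles. Hence for the free energy we can write

$$F = \mu N - kT \sum_{\pm} \int \ln\left[ 1 + e^{-(\varepsilon - \mu \pm \mu_0 H)/kT} \right] \frac{d\gamma}{(2\pi\hbar)^3}. \tag{52.4}$$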


Let us consider the case of a weak magnetic field, for which the following 
inequality holds: 


$$\frac{\mu_0 H}{kT} \ll 1. \tag{52.5}$$


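Expanding (52.4) in a series in powers of this small quantity, we obtain

$$F \approx \mu N - kT \left[ 2 \int \frac{d\gamma}{(2\pi\hbar)^3} \ln\left( 1 + e^{-(\varepsilon - \mu)/kT} \right) + (\mu_0 H)^2 \frac{\partial^2}{\partial \mu^2} \int \frac{d\gamma}{(2\pi\hbar)^3} \ln\left( 1 + e^{-(\varepsilon - \mu)/kT} \right) \right]. \tag{52.6}$$

We calculate the integral

$$\int \frac{d\gamma}{(2\pi\hbar)^3} \ln\left( 1 + e^{-(\varepsilon - \mu)/kT} \right) = \frac{2\pi (2m)^{3/2} V}{(2\pi\hbar)^3} \int_0^\infty \varepsilon^{1/2} \ln\left( 1 + e^{-(\varepsilon - \mu)/kT} \right) d\varepsilon.$$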








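Integrating by parts twice, we have

$$\int_0^\infty \varepsilon^{1/2} \ln\left( 1 + e^{-(\varepsilon - \mu)/kT} \right) d\varepsilon = \frac{2}{3}\, \varepsilon^{3/2} \ln\left( 1 + e^{-(\varepsilon - \mu)/kT} \right) \Big|_0^\infty + \frac{2}{3kT} \int_0^\infty \frac{\varepsilon^{3/2}\, d\varepsilon}{e^{(\varepsilon - \mu)/kT} + 1}$$

$$= -\frac{4}{15kT} \int_0^\infty \varepsilon^{5/2}\, \frac{\partial}{\partial \varepsilon} \left[ \frac{1}{e^{(\varepsilon - \mu)/kT} + 1} \right] d\varepsilon \approx \frac{4}{15kT}\, \mu^{5/2}$$

by virtue of the property of the Fermi distribution (see (80.4) of Part III).
Hence

$$\int \frac{d\gamma}{(2\pi\hbar)^3} \ln\left( 1 + e^{-(\varepsilon - \mu)/kT} \right) = \frac{8\pi (2m)^{3/2} V \mu^{5/2}}{15 (2\pi\hbar)^3 kT},$$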

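and for the free energy we get

$$F = N\mu - \frac{16\pi (2m)^{3/2} V \mu^{5/2}}{15 (2\pi\hbar)^3} - (\mu_0 H)^2\, V\, \frac{8\pi (2m)^{3/2}}{15 (2\pi\hbar)^3}\, \frac{\partial^2 \mu^{5/2}}{\partial \mu^2}. \tag{52.7}$$
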
According to (18.3) of Part IV, the magnetic susceptibility related to the
orientation of spins is equal to

$$\chi_s = -\frac{1}{VH} \frac{\partial F}{\partial H} = \frac{4\pi (2m)^{3/2} \mu_0^2\, \mu^{1/2}}{(2\pi\hbar)^3}. \tag{52.8}$$

The formula for the susceptibility can be written in another form:

$$\chi_s = \frac{\mu_0^2\, n_{\mathrm{eff}}}{kT}, \tag{52.9}$$


where for $n_{\mathrm{eff}}$, the effective number of unpaired electrons, we have made use
of formula (80.17) of Part III.

This last expression has an obvious meaning. Namely, (52.9) is the same as
the susceptibility of a gas of $n_{\mathrm{eff}}$ particles which freely orient themselves in a
magnetic field and have intrinsic magnetic moment $\mu_0$.

We see that the electron gas has a spin paramagnetism independent of tem- 
perature. It should be stressed that this result is closely associated with the 
degeneracy of the electron gas. 
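
The size of this temperature-independent spin susceptibility can be checked with a short numerical sketch. It evaluates (52.8) in Gaussian units and compares it with the equivalent form 3nμ0²/2μ, which follows from the zero-temperature relation between the electron density n and the chemical potential μ. The density used (2.5·10²² cm⁻³, roughly that of an alkali metal) and the function names are illustrative assumptions.

```python
import math

# Gaussian (CGS) constants.
HBAR = 1.0546e-27      # erg s
M_E = 9.1094e-28       # g
E_CHARGE = 4.8032e-10  # esu
C_LIGHT = 2.9979e10    # cm/s
MU_0 = E_CHARGE * HBAR / (2 * M_E * C_LIGHT)  # Bohr magneton, erg/G


def fermi_energy(n):
    """Chemical potential at T = 0 (erg) for electron density n (cm^-3)."""
    return HBAR**2 / (2 * M_E) * (3 * math.pi**2 * n) ** (2 / 3)


def chi_spin(n):
    """Spin susceptibility per unit volume according to formula (52.8)."""
    mu = fermi_energy(n)
    return (4 * math.pi * (2 * M_E) ** 1.5 * MU_0**2 * math.sqrt(mu)
            / (2 * math.pi * HBAR) ** 3)


n = 2.5e22  # assumed electron density, cm^-3
chi = chi_spin(n)
chi_alt = 1.5 * n * MU_0**2 / fermi_energy(n)  # same quantity written as 3 n mu_0^2 / (2 mu)
print(f"mu = {fermi_energy(n) / 1.602e-12:.2f} eV, chi_s = {chi:.2e}")
```

The two expressions agree identically, and the result is a dimensionless volume susceptibility of order 10⁻⁶, far smaller than the classical value nμ0²/kT at room temperature.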




§53. The diamagnetism of an electron gas 


It turns out, however, that in addition to spin paramagnetism an electron 
gas also displays orbital diamagnetism. 

Orbital diamagnetism, discovered by L.Landau, does not have such an ob- 
vious origin as spin paramagnetism. To calculate the magnetic susceptibility 
associated with orbital motion, it is necessary to find the free energy of the 
electron gas in a magnetic field. For this it is, in turn, necessary to find the 
energy of a free electron in a uniform magnetic field. 

Let us consider the solution of the Pauli equation in the simplest case of 
the motion of a free electron in a uniform magnetic field. We choose the 
direction of the field to be the z-axis, and write the vector potential in the 
form 


$$A_x = -Hy, \qquad A_y = A_z = 0.$$


For our purposes it is more convenient to write the vector potential in this 
form than in the form (19.16) of Part I. It is clear that the two expressions 
for the vector potential are equivalent. 

The Pauli equation for stationary motion can be written as 


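$$\left[ \frac{1}{2m} \left( \hat{p}_x + \frac{e}{c} H y \right)^2 + \frac{\hat{p}_y^2}{2m} + \frac{\hat{p}_z^2}{2m} - \frac{e}{mc}\, \hat{s}_z H \right] \Psi = E \Psi. \tag{53.1}$$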


From eq. (53.1) it follows, first of all, that the coordinate wave function and 
spin wave function are independent. 

An operator of the form const × $\hat{s}_z$ does not act on the variables x, y, z, so
that eq. (53.1) can be satisfied by a wave function of the form


$$\Psi = \varphi(s_z)\, \psi(x, y, z). \tag{53.2}$$


Since the coordinates x and z are not involved explicitly in eq. (53.1), we can 
try to find its solution in the form 


$$\psi(x, y, z) = e^{(i/\hbar)(p_x x + p_z z)}\, \phi(y). \tag{53.3}$$


The momentum components $p_x$ and $p_z$ in the direction of the x-axis and z-axis
are conserved:

$$\hat{p}_x \hat{H} - \hat{H} \hat{p}_x = 0, \qquad \hat{p}_z \hat{H} - \hat{H} \hat{p}_z = 0,$$


and can run over a continuous sequence of values. 
Substituting (53.3) into eq. (53.1), we obtain 


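$$\frac{\hbar^2}{2m}\, \phi''(y) + \left[ \varepsilon - \tfrac{1}{2} m \omega^2 (y - y_0)^2 \right] \phi(y) = 0, \tag{53.4}$$

where

$$\varepsilon = E - \frac{p_z^2}{2m} + \mu_0 s_z H, \qquad \omega = \frac{eH}{mc}, \qquad \mu_0 = \frac{e\hbar}{2mc}, \qquad y_0 = -\frac{c p_x}{eH}. \tag{53.5}$$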


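Eq. (53.4) is formally the same as the equation of motion of a linear harmonic
oscillator with frequency ω (equal to the cyclotron frequency), oscillating
about the equilibrium position $y_0$. Hence, without reproducing the calculations
of §10 of Part V, we can write

$$\varepsilon_n = (n + \tfrac{1}{2})\, \hbar\omega \tag{53.6}$$

or

$$E = \hbar\omega (n + \tfrac{1}{2}) + \frac{p_z^2}{2m} - \mu_0 s_z H = 2\mu_0 H (n + \tfrac{1}{2}) + \frac{p_z^2}{2m} - \mu_0 s_z H \tag{53.7}$$

and

$$\phi(y) = H_n\left[ \left( \frac{m\omega}{\hbar} \right)^{1/2} (y - y_0) \right] e^{-m\omega (y - y_0)^2 / 2\hbar}, \tag{53.8}$$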


where $H_n$ are Hermite polynomials.

Since the functions (53.8) decrease rapidly with increasing $|y - y_0|$ and
become small for $|y - y_0| > (\hbar/m\omega)^{1/2}$ (see (10.2) and (10.15) of Part V),
in a uniform magnetic field the particle moves freely in the direction
of the z-axis and performs a motion in the limited region

$$y_0 - (\hbar/m\omega)^{1/2} \le y \le y_0 + (\hbar/m\omega)^{1/2}. \tag{53.9}$$


The motion in this limited region corresponds to the classical motion of a 
charge in a circle with its cyclotron frequency. To the motion along the field
there corresponds an energy $p_z^2/2m$, while to the motion in the xy-plane there
corresponds the quantized energy $\varepsilon_n$. The energy of the electron does not
depend on the value of the momentum $p_x$, so that the states are degenerate.
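
To give a feeling for the magnitudes involved, the following sketch evaluates the level spacing ħω = 2μ0H, the size (ħ/mω)^{1/2} of the region of localization, and the ratio μ0H/kT for an assumed field of 10⁴ G at 300 K. The field and temperature are illustrative choices; Gaussian units are used throughout and the function names are hypothetical.

```python
import math

# Gaussian (CGS) constants.
HBAR = 1.0546e-27      # erg s
M_E = 9.1094e-28       # g
E_CHARGE = 4.8032e-10  # esu
C_LIGHT = 2.9979e10    # cm/s
K_B = 1.3807e-16       # erg/K
MU_0 = E_CHARGE * HBAR / (2 * M_E * C_LIGHT)  # Bohr magneton, erg/G


def landau_level(n, field):
    """Quantized energy (n + 1/2) * hbar * omega with omega = eH/mc, in erg."""
    omega = E_CHARGE * field / (M_E * C_LIGHT)
    return (n + 0.5) * HBAR * omega


def oscillator_length(field):
    """Half-width (hbar / m omega)^(1/2) of the region (53.9), in cm."""
    return math.sqrt(HBAR * C_LIGHT / (E_CHARGE * field))


H = 1.0e4   # assumed field, gauss
T = 300.0   # assumed temperature, kelvin
spacing = landau_level(1, H) - landau_level(0, H)
weak_field_ratio = MU_0 * H / (K_B * T)
print(f"level spacing  = {spacing:.3e} erg")
print(f"region size    = {oscillator_length(H):.3e} cm")
print(f"mu_0 H / kT    = {weak_field_ratio:.3e}")
```

At these values the level spacing is small compared with kT, so the weak-field condition μ0H/kT ≪ 1 is comfortably satisfied.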

We shall apply these results to the motion of an electron in the limited
region (53.9) of space along the y-axis. In this case the momentum component
$p_x$, in contrast to the component $p_z$, cannot take on arbitrary values.
Indeed, the position of the ‘equilibrium’ point $y_0$ cannot be outside the limits
of the size L of the region in the direction of the y-axis. Therefore, from
(53.5) it follows that


$$0 \le |p_x| \le \frac{eH}{c}\, L. \tag{53.10}$$


Knowing the energy of an individual electron, which is composed of two in- 
dependent parts: the energy of orbital motion, and the energy associated with 
the intrinsic magnetic moment, we can write the free energy of the electron 
gas. That is, we can set 


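$$F = F_0 + F_s + F_{\mathrm{orb}}, \tag{53.11}$$


where $F_s$ is the part of the free energy due to the spin magnetic moment already
found, and $F_{\mathrm{orb}}$ is the contribution of the orbital motion to the free energy

$$F_{\mathrm{orb}} = -kT \int d\gamma' \sum_n \Omega(\varepsilon_n) \ln\left\{ 1 + \exp\left[ \left( \mu - 2\mu_0 H (n + \tfrac{1}{2}) - \frac{p_z^2}{2m} \right) \Big/ kT \right] \right\}. \tag{53.12}$$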


The peculiarity of the system being considered lies in the fact that the momentum
$p_z$ in the direction of the field may vary continuously, whereas the
motion in the xy-plane is quantized. Hence the integration is carried out with
respect to the momentum $p_z$, and $d\gamma' = dz\, dp_z / 2\pi\hbar$. For given energy $\varepsilon_n$, the
states of orbital motion in the xy-plane are degenerate. The multiplicity of
degeneracy $\Omega(\varepsilon_n)$ is equal to

$$\Omega(\varepsilon_n) = 2 \int \frac{dx\, dy\, dp_x\, dp_y}{(2\pi\hbar)^2} = \frac{4\pi}{(2\pi\hbar)^2} \iint dx\, dy \int_{p_1}^{p_2} p_\perp\, dp_\perp,$$


where the momentum $p_\perp$ corresponds to the motion in the xy-plane for a
given energy $\varepsilon_n$.

In other words, it can be said that the multiplicity of degeneracy of the
energy level $\varepsilon_n$ is determined by a large but discrete (for finite size of the
metal sample in the direction of the x-axis) number of possible values of $p_x$.

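Writing the obvious relation

$$2\mu_0 H n \le \frac{p_\perp^2}{2m} \le 2\mu_0 H (n + 1),$$

we see that $p_\perp$ may vary (for given $\varepsilon_n$) in the interval from $p_1 = (4m\mu_0 H n)^{1/2}$
to $p_2 = [4m\mu_0 H (n + 1)]^{1/2}$. Hence we find

$$\Omega(\varepsilon_n) = \frac{8\pi m \mu_0 H}{(2\pi\hbar)^2} \iint dx\, dy. \tag{53.13}$$
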


The same result follows from (53.10). 
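
The equivalence of the two ways of counting can be verified numerically. The sketch below counts the states once from the area of the ring in the $p_x p_y$-plane lying between $p_1$ and $p_2$, and once from the number of admissible values of $p_x$ allowed by (53.10). The factor 2 for the two spin orientations, the sample dimensions and the function names are assumptions made for the illustration.

```python
import math

# Gaussian (CGS) constants.
HBAR = 1.0546e-27      # erg s
M_E = 9.1094e-28       # g
E_CHARGE = 4.8032e-10  # esu
C_LIGHT = 2.9979e10    # cm/s
MU_0 = E_CHARGE * HBAR / (2 * M_E * C_LIGHT)  # Bohr magneton, erg/G


def degeneracy_from_ring(n, field, lx, ly):
    """States with 2 mu0 H n <= p^2/2m <= 2 mu0 H (n+1), both spin orientations."""
    p1_sq = 4 * M_E * MU_0 * field * n
    p2_sq = 4 * M_E * MU_0 * field * (n + 1)
    ring_area = math.pi * (p2_sq - p1_sq)
    return 2 * lx * ly * ring_area / (2 * math.pi * HBAR) ** 2


def degeneracy_from_px(field, lx, ly):
    """Number of p_x values with |p_x| <= (eH/c) * ly, both spin orientations."""
    px_range = E_CHARGE * field / C_LIGHT * ly
    return 2 * lx * px_range / (2 * math.pi * HBAR)


H, LX, LY = 1.0e4, 1.0, 1.0  # assumed field (G) and sample size (cm)
ring_count = degeneracy_from_ring(3, H, LX, LY)
px_count = degeneracy_from_px(H, LX, LY)
print(f"ring count = {ring_count:.4e}, p_x count = {px_count:.4e}")
```

The count does not depend on the level number n, as expected from (53.13).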
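Substituting the values of $d\gamma'$ and $\Omega(\varepsilon_n)$ into (53.12), we find

$$F_{\mathrm{orb}} = -kT\, \frac{8\pi m \mu_0 H V}{(2\pi\hbar)^3} \int dp_z \sum_{n=0}^{\infty} \ln\left\{ 1 + \exp\left[ \left( \mu - 2\mu_0 H (n + \tfrac{1}{2}) - \frac{p_z^2}{2m} \right) \Big/ kT \right] \right\}. \tag{53.14}$$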


Let us consider the case of weak fields, when inequality (52.5) is fulfilled.
In this case the logarithmic function of the argument $\mu_0 H/kT$ changes only slightly
when n is replaced by n + 1. Hence the summation can be replaced by integration
using Euler’s formula

$$\sum_{n=0}^{\infty} f(n + \tfrac{1}{2}) \approx \int_0^\infty f(x)\, dx + \frac{1}{24}\, f'(0).$$
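
Euler’s formula in the half-integer form, Σ f(n + ½) ≈ ∫ f(x) dx + f′(0)/24, can be tested numerically on a function of the same type as the logarithm in (53.14). In the sketch below the parameters a and h (playing the roles of μ/kT and 2μ0H/kT) are arbitrary illustrative values, and the integral is evaluated with a simple Simpson rule written out for the purpose.

```python
import math


def f(x, a=5.0, h=0.1):
    """Test function of the type met in (53.14): ln(1 + exp(a - h*x))."""
    return math.log1p(math.exp(a - h * x))


def f_prime_at_zero(a=5.0, h=0.1):
    """Analytic derivative f'(0) = -h / (1 + exp(-a))."""
    return -h / (1.0 + math.exp(-a))


def simpson(func, lo, hi, steps):
    """Composite Simpson rule; steps must be even."""
    width = (hi - lo) / steps
    total = func(lo) + func(hi)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * func(lo + i * width)
    return total * width / 3


X_MAX = 400  # beyond this point the terms are below 1e-15
midpoint_sum = sum(f(n + 0.5) for n in range(X_MAX))
integral = simpson(f, 0.0, float(X_MAX), 40000)
euler_estimate = integral + f_prime_at_zero() / 24
print(f"sum = {midpoint_sum:.8f}")
print(f"integral = {integral:.8f}, with correction = {euler_estimate:.8f}")
```

The correction term f′(0)/24 accounts for practically the whole difference between the sum and the integral when h is small, which is the situation assumed in the weak-field case.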


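By means of Euler’s formula we get

$$\sum_{n=0}^{\infty} \ln\left\{ 1 + \exp\left[ \left( \mu - 2\mu_0 H (n + \tfrac{1}{2}) - \frac{p_z^2}{2m} \right) \Big/ kT \right] \right\} \approx \int_0^\infty dx\, \ln\left\{ 1 + \exp\left[ \left( \mu - 2\mu_0 H x - \frac{p_z^2}{2m} \right) \Big/ kT \right] \right\} - \frac{2\mu_0 H}{24 kT}\, \frac{1}{\exp\left[ \left( \frac{p_z^2}{2m} - \mu \right) \big/ kT \right] + 1}.$$

Hence

$$F_{\mathrm{orb}} = -kT\, \frac{8\pi m \mu_0 H V}{(2\pi\hbar)^3} \int dp_z \int_0^\infty dx\, \ln\left\{ 1 + \exp\left[ \left( \mu - 2\mu_0 H x - \frac{p_z^2}{2m} \right) \Big/ kT \right] \right\} + \frac{2\mu_0 H \cdot 8\pi m \mu_0 H V}{24 (2\pi\hbar)^3} \int \frac{dp_z}{\exp\left[ \left( \frac{p_z^2}{2m} - \mu \right) \big/ kT \right] + 1}. \tag{53.15}$$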


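In the first integral we introduce the new variable $u = 2\mu_0 H x$, and in the second
we set $\varepsilon = p_z^2/2m$. Then we easily find

$$F_{\mathrm{orb}} = -kT\, \frac{2\pi (2m) V}{(2\pi\hbar)^3} \int dp_z \int_0^\infty du\, \ln\left\{ 1 + \exp\left[ \left( \mu - u - \frac{p_z^2}{2m} \right) \Big/ kT \right] \right\} + \frac{2\pi (2m)^{3/2} (\mu_0 H)^2 V}{6 (2\pi\hbar)^3} \int_0^\infty \frac{\varepsilon^{-1/2}\, d\varepsilon}{e^{(\varepsilon - \mu)/kT} + 1}. \tag{53.16}$$
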


The first term of (53.16) does not depend on the magnetic field and gives 
no contribution to the magnetic susceptibility. According to (80.7) of Part 
III, the integral in the second term for a strongly degenerate electron gas can 
be written as 


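$$\int_0^\infty \frac{\varepsilon^{-1/2}\, d\varepsilon}{e^{(\varepsilon - \mu)/kT} + 1} \approx 2\mu^{1/2}.$$


The susceptibility of an electron gas associated with the orbital motion turns
out to be equal to

$$\chi_{\mathrm{orb}} = -\frac{1}{VH} \frac{\partial F_{\mathrm{orb}}}{\partial H} = -\frac{4\pi (2m)^{3/2} \mu_0^2\, \mu^{1/2}}{3 (2\pi\hbar)^3}. \tag{53.17}$$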





We see that the orbital magnetic susceptibility is negative, i.e. corresponds 
to the diamagnetism of the electron gas. 
Comparing (53.17) and (52.8), it is easily seen that 


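$$|\chi_{\mathrm{orb}}| = \tfrac{1}{3}\, |\chi_s|.$$

Hence the total susceptibility of an electron gas turns out to be equal to

$$\chi = \chi_s + \chi_{\mathrm{orb}} = \frac{8\pi (2m)^{3/2} \mu_0^2\, \mu^{1/2}}{3 (2\pi\hbar)^3}. \tag{53.18}$$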


This value of χ proves to be in good agreement with experimental data.

A more detailed treatment taking into account the effect of the crystal 
lattice does not affect the general result. We stress that in calculating the dia- 
magnetic susceptibility we assumed the magnetic field to be weak. In strong 
magnetic fields, when inequality (52.5) is not fulfilled, the magnetic suscep- 
tibility turns out to be a function of the field strength. This function has an 
oscillatory character (de Haas-van Alphen effect). The oscillations are associated
with the change of energy $\varepsilon_n$ with the field and the corresponding
change in occupations of the Fermi levels*.


* See, for example, J.M.Ziman, Principles of the theory of solids (Cambridge Univer- 
sity Press, Cambridge, 1964). 


§54. Ferromagnetism 


As was stressed in Part IV, ferromagnetic properties are possessed by only 
a relatively small number of metals. That is, only some metals with unfilled 
inner shells (transition metals) and some alloys turn out to be ferromagnetic. 

Among the macroscopic properties of ferromagnetic bodies the existence 
of a very large permanent magnetization is characteristic. 

The main feature of a more detailed description of ferromagnetic bodies is 
the correlation of the orientations of magnetic moments. 

From this point of view, such properties are possessed by antiferromagnetic
and ferrimagnetic substances as well as by ferromagnetics. In other words, in
the broad sense all strongly magnetized substances are ferromagnetic. Measurements
of the ratio of the magnetic moment to angular momentum have
shown that for ferromagnetic substances it is equal to e/mc, i.e. that it corresponds
to the spin nature of the magnetic moment. This suggests that strong
magnetization is associated with the interaction of the spin moments of the
electrons.

These two basic facts, the spin nature of magnetization and the decisive
role of open atomic shells, lead to the following picture of the phenomenon:
the energy spectrum of electrons in ferromagnetic metals has two
bands. One of these, formed by strongly interacting valence electrons, has a 
large width and is a conduction band. The other arises from the interaction of 
electrons in unfilled shells. This interaction is relatively weak, since the over- 
lapping of the wave functions of the inner electrons is small. In correspond- 
ence with formula (49.13), the width of this band is small and it cannot be 
responsible for conduction. 

However, we have already seen (§19 of Part IV) that the spin of electrons 
of unfilled shells is easily reversed. Hence a relatively weak interaction be- 
tween the electrons may lead to alignment of their spins. 

Detailed consideration of this interaction is a complex problem. We shall 
confine ourselves to the discussion of the following model of a ferromagnetic 
material. Let there be a lattice of regularly arranged atoms, having one elec- 
tron in an s-state whose spin can have two orientations. In the absence of in- 
teraction the symmetrized wave function of a system of N electrons has the 
form (see (65.6) of Part V) 


$$\Psi = (N!)^{-1/2} \sum_P (-1)^P \hat{P} \prod_j \psi_j(\mathbf{r}_j)\, \varphi_j(\uparrow), \tag{54.1}$$


where the vector index j denotes the number of the atom, and $\varphi(\uparrow)$ denotes
the spin wave function corresponding to the ‘regular’ spin orientation.





Z 


282 SOLID-STATE THEORY Ch. 5 


Suppose that in the ground state of the system all the spins have one and 
the same orientation, for example +. Such a completely ordered orientation 
of spins corresponds to the maximum magnetization of the system. We shall 
see below under what condition such a state of a system corresponds to its 
minimum energy. 

The interaction between electrons leads to a disturbance of the regularity 
of the orientation of their spins. Some of the spins turn out to be oppositely 
oriented. The perturbation being small means that the number of oppositely 
oriented spins is very small in comparison with the total number of spins. 

Let us consider a perturbed state in which the spin of one electron, for in- 
stance of the mth atom, is reversed. 

The wave function of such a state can be written in the form 


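$$\Psi_m = (N!)^{-1/2} \sum_P (-1)^P \hat{P}\, \psi_m(\mathbf{r}_m)\, \varphi_m(\downarrow) \prod_j{}' \psi_j(\mathbf{r}_j)\, \varphi_j(\uparrow), \tag{54.2}$$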


where the prime on the product sign denotes the absence of one factor. Since 
the system is degenerate (i.e. the reversed spin can belong to any atom) then 
according to the general rules of perturbation theory for degenerate systems 
the wave function must be written in the form 


$$\Psi = \sum_m C_m \Psi_m. \tag{54.3}$$


The subsequent calculation is the same as that given in §49. The overall dif- 
ference amounts to taking into account the antisymmetry of the wave func- 
tion (54.2). Since the overlap of the wave functions is small, we shall restrict 
ourselves to taking into account the interaction of electrons of only the near- 
est neighbouring atoms (i.e. to the interaction of the electron of the mth atom 
with the electrons of the (m—1)th and (m+1)th atoms). 

Then, analogously to (49.8), we can write the system of linear equations 


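$$(E' - E_0 + \alpha)\, C_m + \sum_{m'} \beta(m')\, C_{m+m'} = 0, \tag{54.4}$$


where E′ is the energy of the perturbed state, $E_0$ is the energy of the ground
state, and α and β are the Coulomb and exchange integrals,

$$\alpha = -\int \psi_m^*\, U'\, \psi_m\, dV,$$

$$\beta = \beta_{m,m+1} = \beta_{m,m-1} = -\int \psi_{m+1}^*(\mathbf{r}_m)\, \psi_m^*(\mathbf{r}_{m+1})\, U'\, \psi_{m+1}(\mathbf{r}_{m+1})\, \psi_m(\mathbf{r}_m)\, dV_m\, dV_{m+1}.$$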


Here U’ is the energy of interaction between the electron of the mth atom 
and the electrons (atom core) of the (m+1)th or (m—1)th atom. 
The solution of (54.4) reads (see (49.9)) 


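$$C_m = e^{i \mathbf{k} \cdot \mathbf{m}}, \tag{54.5}$$


where the vector k is defined by the conditions of periodicity. Hence the
wave function is

$$\Psi = \sum_m e^{i \mathbf{k} \cdot \mathbf{m}}\, \Psi_m. \tag{54.6}$$

For the excited state energy we obtain in the case of a simple cubic crystal

$$E' = E_0 - \alpha - 2\beta (\cos k_x a + \cos k_y a + \cos k_z a), \tag{54.7}$$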


where a is the lattice spacing. 


To small excitation energies there correspond small values of the vector k,
so that

$$E' = E_0 - \alpha - 6\beta + \beta a^2 k^2 = E_0 - \alpha - 6\beta + \frac{\hbar^2 k^2}{2m^*}, \qquad m^* = \frac{\hbar^2}{2\beta a^2}. \tag{54.8}$$
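
The passage from the cosine form (54.7) to a quadratic dependence on k can be checked directly. The sketch below measures the energy from its value at k = 0 and compares it with the leading term βa²k² obtained by expanding the cosines. Units with β = a = 1, and the function names, are assumptions made only for the illustration.

```python
import math


def excitation_energy(kx, ky, kz, beta=1.0, a=1.0):
    """Energy (54.7) measured from its k = 0 value E_0 - alpha - 6*beta."""
    return 2 * beta * (3 - math.cos(kx * a) - math.cos(ky * a) - math.cos(kz * a))


def small_k_energy(kx, ky, kz, beta=1.0, a=1.0):
    """Leading term of the expansion of the cosines: beta * a^2 * k^2."""
    return beta * a**2 * (kx**2 + ky**2 + kz**2)


k_small = (0.01, 0.02, -0.015)
exact = excitation_energy(*k_small)
approx = small_k_energy(*k_small)
band_top = excitation_energy(math.pi, math.pi, math.pi)
print(f"exact = {exact:.6e}, quadratic = {approx:.6e}, band width = {band_top:.1f} beta")
```

For ka ≪ 1 the two expressions agree to within a relative error of order (ka)²/12, while at the zone boundary the excitation energy reaches 12β.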


The wave function (54.6) and energy (54.8) permit an obvious interpretation. 

We see that the excitation energy is connected with the exchange interac- 
tion of electrons. For the excited state energy to be higher than the ground 
state energy it is necessary for the following condition to be fulfilled, B > 0, 
i.e. the exchange integral must be positive. There corresponds to the excited 
state of the crystal the presence of a deviation of the spin orientation. Since 
all atoms of the crystal are equivalent, the spin deviation is not fixed to a de- 
finite atom but may wander throughout the crystal. 

The displacement of the spin deviation in the crystal from one atom to 
another is described by the wave function (54.6); the energy spectrum of the 
crystal is given by formula (54.7). The wave function (54.6) is called the spin 
wave. Formally the displacement of the spin deviation can be compared to 
the motion of a certain quasiparticle, called a magnon. Indeed, formally, the 
spin wave describes the propagation of a certain quasiparticle with wave vector
k in the crystal. The energy of this quasiparticle is given by formula
(54.8), where m* is its effective mass. Magnons represent elementary excitations
in a system of oriented spins. The energy of excitation of the entire
crystal can be considered as the sum of such elementary excitations or as the 
energy of an ideal gas of magnons filling the entire volume of the crystal. 

It is clear that such a description of the excited state is approximate and is 
reasonable only for small degrees of excitation. This means that the number 
of magnons (the average number of spins excited into a reversed state) must 
be sufficiently small. Somewhat later we shall give quantitative character to 
this statement. 

All that we said in §47 apropos the description of the excited state of a 
system of particles by means of quasiparticles also applies to magnons. Mag- 
nons can interact with other quasiparticles, for example with phonons; or 
with real particles possessing magnetic moments, for example with neutrons. 
Magnons must be considered to obey Bose—Einstein statistics. Indeed, any 
number of electron spins belonging to different atoms may be in a state with 
given orientation. The number of magnons, like the number of phonons, is 
not conserved. To the excitation of the spin of one or another electron into a 
‘reversed’ state there corresponds the appearance (or disappearance) of a 
magnon. 

Hence the distribution function for the number of magnons with a given 
wave vector is represented by a Planck distribution 


$$n(\mathbf{k})\, d\mathbf{k} = \frac{1}{(2\pi)^3}\, \frac{d\mathbf{k}}{e^{\hbar^2 k^2 / 2m^* kT} - 1}. \tag{54.9}$$





Knowing that the character of the excitations in a crystal is associated with 
the exchange interaction of electrons with a free spin, we can come back to 
the discussion of the magnetic properties of such crystals. 

If the exchange integral is positive, then in the ground state all spins are 
oriented in one direction. In this case the crystal possesses the spontaneous 
saturation magnetic moment 


$$M_0 = \mu_0 N,$$


where N is the number of electrons per unit volume. 

At T ≠ 0 some of the spins will be reversed. Clearly, the number of reversed
spins is equal to the number of magnons excited. The latter, according
to (54.9), is equal to 


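$$N_{\mathrm{magn}} = \frac{4\pi}{(2\pi)^3} \int \frac{k^2\, dk}{e^{\hbar^2 k^2 / 2m^* kT} - 1}.$$

For low temperatures the integral converges rapidly, and the range of integration
can be replaced by an infinite range. This gives

$$N_{\mathrm{magn}} = \frac{4\pi}{(2\pi)^3} \int_0^\infty \frac{k^2\, dk}{e^{\hbar^2 k^2 / 2m^* kT} - 1} = \frac{1}{2\pi^2} \left( \frac{2m^* kT}{\hbar^2} \right)^{3/2} \int_0^\infty \frac{x^2\, dx}{e^{x^2} - 1} \approx \frac{1}{\pi^2 a^3} \left( \frac{kT}{\beta} \right)^{3/2} \approx \frac{1}{\pi^2}\, N \left( \frac{kT}{\beta} \right)^{3/2}. \tag{54.10}$$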


Here we have made use of an approximate value of the integral and replaced
$1/a^3$ by the number of electrons per unit volume, and substituted for $m^*$ its
value from (54.8).
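
The dimensionless integral ∫₀^∞ x² dx/(e^{x²} − 1) that arises after the substitution x² = ħ²k²/2m*kT can be evaluated numerically. The sketch below does this with a Simpson rule and compares the result with the closed form (√π/4) ζ(3/2), which follows from the substitution t = x²; the helper functions are written out here for the illustration and are not part of the original derivation.

```python
import math


def integrand(x):
    """x^2 / (exp(x^2) - 1), with the limiting value 1 at x = 0."""
    if x == 0.0:
        return 1.0
    return x * x / math.expm1(x * x)


def simpson(func, lo, hi, steps):
    """Composite Simpson rule; steps must be even."""
    width = (hi - lo) / steps
    total = func(lo) + func(hi)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * func(lo + i * width)
    return total * width / 3


def zeta_three_halves(terms=1000):
    """zeta(3/2) from a partial sum plus Euler-Maclaurin tail corrections."""
    s = 1.5
    partial = sum(n ** -s for n in range(1, terms + 1))
    tail = terms ** (1 - s) / (s - 1) - 0.5 * terms ** -s + s * terms ** (-s - 1) / 12
    return partial + tail


# The integrand is negligible beyond x = 8 (it falls off like exp(-x^2)).
numeric = simpson(integrand, 0.0, 8.0, 8000)
closed_form = math.sqrt(math.pi) / 4 * zeta_three_halves()
print(f"numeric = {numeric:.6f}, closed form = {closed_form:.6f}")
```

The integral is a pure number of order unity, so that the temperature dependence of the number of magnons is carried entirely by the factor (kT/β)^{3/2}.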


The excitation of $N_{\mathrm{magn}}$ magnons reduces the magnetic moment of the
crystal. It turns out to be equal to

$$M = M_0 \left[ 1 - \frac{2}{\pi^2} \left( \frac{kT}{\beta} \right)^{3/2} \right]. \tag{54.11}$$


Formula (54.11) defines the spontaneous magnetic moment of a ferromagnet
as a function of temperature. At a certain temperature $T_0$, called the Curie
point, the spontaneous magnetic moment must reduce to zero. It is clear that


$$T_0 \approx \pi\beta / k.$$


If the exchange integral β is assumed to be of the order of magnitude of e²/a,
where a is the lattice constant, then $T_0 \sim 10^3$ K, which is in agreement with
experimental data. The temperature dependence of the spontaneous magnetic
moment given by formula (54.11) is also in good agreement with experimental
data for low temperatures ($T \ll T_0$).

For temperatures $T \sim T_0$ this theory loses any quantitative meaning, since
the number of magnons becomes of the same order of magnitude as the total 
number of electrons. 

In spite of its extremely sketchy character, the theory of ferromagnetism 
given above not only explains the essence of the phenomenon but also allows 
one to draw certain quantitative conclusions. 

We have left out of account the anisotropy of the system associated with 
the anisotropy of the wave function of electrons in the unfilled d-shell. Also, 
we have not taken into consideration the spin-spin and spin-orbit interactions
and the interaction with the electric field of the lattice, which also give
rise to anisotropy and lead to the appearance of directions of easy magnetization.
These effects are experimentally rather well investigated, but their theory,
which is associated with great mathematical difficulties, is to a great extent
still not worked out. Referring the reader for details to more specialized
texts†, we shall deal with only one problem here, that of the sign of the exchange
integral.

As we have seen above, the condition β > 0 is of decisive importance for
the existence of ferromagnetism. The exchange integral β can be written in
the form

$$\beta = e^2 \int \psi_m^*(1)\, \psi_n^*(2) \left[ \frac{1}{r_{12}} - \frac{1}{r_{m2}} - \frac{1}{r_{n1}} \right] \psi_m(2)\, \psi_n(1)\, dV_1\, dV_2.$$


The first term in the bracket gives the Coulomb interaction between the elec- 
trons, and the two remaining terms give the interaction of the electrons with 
the nuclei. For the integral to be positive, the intrinsic exchange energy must 
be large. This means that the wave function must be large at a large distance 
from the nucleus and small close to it. This condition is satisfied by electrons 
in d-states, which have a relatively large angular momentum. 

At the same time, the radius of the orbits of those electrons which are res- 
ponsible for the interactions must be small in comparison with the lattice 
constant. Otherwise the electrons will closely approach ‘strange’ nuclei and 
the negative terms will give a large contribution to the integral. 

These conditions exist for the atoms of the iron group, which are the most 
characteristic representatives of ferromagnetic substances. However, these 
rather stringent conditions are more often not fulfilled and the integral turns 
out to be negative. In this case the exchange interaction of the electrons of 
open shells leads to another kind of ground state. That is, in the ground state, 
neighbouring spins turn out to be antiparallel and spontaneous magnetization 
is absent. 

Such substances are called antiferromagnetic. Antiferromagnetics, possess- 
ing no spontaneous magnetization, display a number of characteristic proper- 
ties. We cannot, however, dwell on a consideration of these here, and refer 
the reader to more specialized texts††.


† S.V.Tyablikov, Metody kvantovoi teorii ferromagnetizma (Methods of the quantum
theory of ferromagnetism) (Nauka, Moscow, 1965); J.M.Ziman, Principles of the
theory of solids (Cambridge University Press, Cambridge, 1964).

†† J.M.Ziman, Principles of the theory of solids (Cambridge University Press,
Cambridge, 1964).





§55. The interaction of electrons with lattice vibrations 


We have discussed above the problem of the interaction of an electron with 
a crystal for a regular arrangement of ions at lattice points. The motion in the 
lattice of an electron and the hole associated with a quasiparticle does not 
differ from the motion of a single electron. The quasiparticle in the lattice is 
described by the same Bloch function. 

In what follows we shall discuss the problem of the interaction of the 
quasiparticle with lattice vibrations. In accordance with what was said at the 
end of §50, we shall call the quasiparticle the electron without further speci- 
fying it. 

Thermal lattice vibrations violate the periodicity of the potential acting on 
the electron. As the lattice vibrates the atoms undergo displacements from 
their equilibrium positions, which we characterize by the vector $\boldsymbol{\xi}_n$.

We are interested in the value of the potential energy U of the electron at 
a certain point r of the lattice. It can be found for two limiting cases: in the 
approximation of deformable ions and in the approximation of rigid, unde- 
formable ions. The two models lead to very similar results. We shall make use 
of the first model which appears to be somewhat closer to the true state of 
affairs. In the model of deformable ions the entire lattice is replaced by a
vibrating continuum. At a given instant of time the potential energy of the
electron at the point r is determined by the instantaneous configuration of
this continuum. To the displacements of the continuum there will correspond
changes in the potential energy of the electron at the point r. That point of
the continuum which in the equilibrium position had the coordinate r, after
the displacement will go over into the point r + ξ(r). The value of the potential
energy of the electron which in equilibrium corresponded to the radius vector
r − ξ(r), after the displacement will correspond to the radius vector r. In other
words, that potential energy which was originally ‘located’ at the point
r − ξ(r) will ‘move’ to the point r. Obviously, the change in the potential energy
of the electron at the point r will be


$$U_{\mathrm{int}} = U(\mathbf{r} - \boldsymbol{\xi}) - U(\mathbf{r}) = -\boldsymbol{\xi}(\mathbf{r}) \cdot \nabla U, \tag{55.1}$$


if the displacement is assumed to be sufficiently small. It is natural to assume
that for the vector ξ(r) one can write an expression in which the vector n is
replaced by the vector r.

Let us consider by means of the method of second quantization a system 
of electrons each of which is acted upon by the perturbation $U_{\mathrm{int}}$. We shall
disregard the Coulomb interaction of the electrons with one another. In correspondence
with the results given in §99 of Part V, we can write the Hamiltonian
of the system of electrons in the external lattice field. Namely, we
shall characterize the state of the system of electrons by the occupation numbers
of states with a given value of k.

Using the second quantization representation (formula (99.23) of Part V) 
we write the energy of the system of electrons in the form 

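$$\hat{H} = \sum_{\mathbf{k}} \varepsilon_{\mathbf{k}}\, \hat{a}_{\mathbf{k}}^{+} \hat{a}_{\mathbf{k}} + \sum_{\mathbf{k}, \mathbf{k}'} \hat{a}_{\mathbf{k}'}^{+} \hat{a}_{\mathbf{k}}\, (\mathbf{k}' | U_{\mathrm{int}} | \mathbf{k}) = \hat{H}_0 + \hat{H}_{\mathrm{int}}.$$

Here $\varepsilon_{\mathbf{k}}$ represents the energy of an electron moving in a strictly periodic lattice
field. The operators $\hat{a}_{\mathbf{k}}^{+}$ and $\hat{a}_{\mathbf{k}}$ satisfy the usual anticommutation relations
(99.29) of Part V.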

The interaction operator is determined by the matrix element of the energy
of interaction of the electrons with an external field, in the given case the
field of a vibrating lattice. According to formula (99.23) of Part V

$$(k'|U_{\text{latt}}|k) = \int\psi_{k'}^{*}\,U_{\text{latt}}\,\psi_k\,dV.$$

Hence the Hamiltonian is of the form

$$\hat H = \sum_{k} E_k\,\hat a_k^{+}\hat a_k + \sum_{k}\sum_{k'}\hat a_{k'}^{+}\hat a_k\int\psi_{k'}^{*}\,U_{\text{latt}}\,\psi_k\,dV. \tag{55.2}$$


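Substituting into (55.2) the wave functions $\psi_k$ defined by formula (48.11)
and the operator $U_{\text{latt}}$ from (55.1), we obtain for the interaction operator

$$\hat H_{\text{int}} = -\frac{1}{N}\sum_{k}\sum_{k'}\hat a_{k'}^{+}\hat a_k\int u_{k'}^{*}e^{-i\mathbf k'\cdot\mathbf r}\,(\hat{\boldsymbol\xi}\cdot\nabla U)\,u_k e^{i\mathbf k\cdot\mathbf r}\,dV.$$

Making use of the explicit form of the operator $\hat{\boldsymbol\xi}$ (47.14), we write the
interaction operator in the form

$$\hat H_{\text{int}} = -\frac{1}{N}\sum_{k}\sum_{k'}\hat a_{k'}^{+}\hat a_k\int u_{k'}^{*}e^{-i\mathbf k'\cdot\mathbf r}\sum_{f,j}\left(\frac{\hbar}{2MN\omega_{fj}}\right)^{1/2}\left(\hat b_{fj}e^{i\mathbf f\cdot\mathbf r} + \hat b_{fj}^{+}e^{-i\mathbf f\cdot\mathbf r}\right)(\mathbf e_{fj}\cdot\nabla U)\,e^{i\mathbf k\cdot\mathbf r}u_k\,dV. \tag{55.3}$$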


We now pass from integration over the basic region to integration over a
unit cell of the crystal. By means of relation (49.18) we obtain for the two
integrals involved in (55.3)

$$\int_{G}u_{k'}^{*}\,\nabla U\,u_k\,e^{i(\mathbf k-\mathbf k'\pm\mathbf f)\cdot\mathbf r}\,dV = \sum_{n}e^{i(\mathbf k-\mathbf k'\pm\mathbf f)\cdot\mathbf n}\int_{\tau_0}u_{k'}^{*}\,\nabla U\,u_k\,dV. \tag{55.4}$$


Making use of formula (49.19), we see that integral (55.4) differs from zero
only if the equality

$$\mathbf k - \mathbf k' + \mathbf f = 0 \tag{55.5}$$

or, in the more general case, the equality

$$\mathbf k - \mathbf k' + \mathbf f = \mathbf K \tag{55.6}$$

is fulfilled. We shall confine ourselves to the consideration of the first case.
Formula (55.5) shows that in the process of the scattering of an electron by
a vibrating lattice a conservation law holds analogous to the momentum
conservation law in the collision of free particles. Hence the process of scattering
can formally be considered as the process of absorption or emission of a
phonon by the electron. Before the collision the electron had wave number k;
after the collision it has wave number $\mathbf k' = \mathbf k \pm \mathbf f$, where f is the wave vector
of the phonon. It can be said that the absorption (for $\mathbf k' = \mathbf k + \mathbf f$) or emission
(for $\mathbf k' = \mathbf k - \mathbf f$) of the phonon occurs in the transition $\mathbf k \to \mathbf k'$.

We shall see below that in this case the energy conservation law holds, so
that in the scattering the energy of the electron increases or decreases by the
value of the phonon energy $\hbar\omega_f$,

$$E_{k'} = E_k \pm \hbar\omega_f. \tag{55.7}$$

This intuitive treatment of the process of the scattering of electrons by
thermal lattice vibrations has turned out to be very useful.

Collisions for which equality (55.6) holds are called Umklapp processes. It
is easily seen that for small values of f and values of the electron wave vector
$|\mathbf k|, |\mathbf k'| < \pi/a$ the Umklapp process corresponds to a change of direction of
the electron vector into the opposite direction.

The Umklapp processes do not play any special role in the phenomena of 
electric conduction considered below but are important for establishing ther- 
mal equilibrium in metals, particularly at low temperatures. 
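The difference between the two conservation laws (55.5) and (55.6) can be illustrated in one dimension. The sketch below is only an illustration under stated assumptions: a chain with lattice constant a = 1, dimensionless wave numbers, and hypothetical function names. It adds a phonon wave number to an electron wave number and, where needed, subtracts a reciprocal lattice vector K = 2π/a to return to the first zone.

```python
import math

def fold_to_first_zone(k, a=1.0):
    # Reduce a 1D wave number to the first Brillouin zone [-pi/a, pi/a).
    zone_width = 2.0 * math.pi / a  # smallest reciprocal lattice vector K
    return (k + math.pi / a) % zone_width - math.pi / a

def absorb_phonon(k, f, a=1.0):
    # Electron wave number after absorbing a phonon f, together with the
    # number of reciprocal lattice vectors subtracted (0 means a normal process).
    k_raw = k + f
    k_new = fold_to_first_zone(k_raw, a)
    n_umklapp = round((k_raw - k_new) * a / (2.0 * math.pi))
    return k_new, n_umklapp

print(absorb_phonon(0.5, 0.3))  # normal process: k' = k + f
print(absorb_phonon(3.0, 0.3))  # Umklapp: k' = k + f - K, direction reversed
```

In the second call the electron starts near the zone boundary, so even a small phonon wave number carries it across the boundary and the folded wave number has the opposite sign.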








By means of relation (49.19), we obtain for (55.4)

$$\int_{G}u_{k'}^{*}\,e^{i(\mathbf k-\mathbf k'\pm\mathbf f)\cdot\mathbf r}\,(\nabla U)\,u_k\,dV = N\int_{\tau_0}u_{k\pm f}^{*}\,(\nabla U)\,u_k\,dV.$$


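Substituting this value into (55.3), we find

$$\hat H_{\text{int}} = -\sum_{k,f,j}\left(\frac{\hbar}{2MN\omega_{fj}}\right)^{1/2}\mathbf e_{fj}\cdot\left\{\left[\int_{\tau_0}u_{k+f}^{*}\,\nabla U\,u_k\,dV\right]\hat a_{k+f}^{+}\hat a_k\hat b_{fj} + \left[\int_{\tau_0}u_{k-f}^{*}\,\nabla U\,u_k\,dV\right]\hat a_{k-f}^{+}\hat a_k\hat b_{fj}^{+}\right\}$$

$$= -\sum_{k,f,j}\left(\frac{\hbar}{2MN\omega_{fj}}\right)^{1/2}\left(\hat a_{k+f}^{+}\hat a_k\hat b_{fj}\,S_{+} + \hat a_{k-f}^{+}\hat a_k\hat b_{fj}^{+}\,S_{-}\right), \tag{55.8}$$

where $S_\pm$ denotes integrals over a unit cell

$$S_\pm = \mathbf e_{fj}\cdot\int_{\tau_0}u_{k\pm f}^{*}\,\nabla U\,u_k\,dV. \tag{55.8'}$$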

Formula (55.8) again allows an intuitive treatment of the scattering of the
electron as a process in which the absorption or emission of a phonon occurs.
Indeed, from the meaning of the operators $\hat a^{+}$, $\hat a$, $\hat b^{+}$ and $\hat b$ it is clear that the
matrix element of the operator $\hat a_{k+f}^{+}\hat a_k\hat b_{fj}$ corresponds to the appearance of
an electron with wave number k + f and to the disappearance of an electron
with wave number k and of a phonon with wave number f. Analogously, the
matrix element of $\hat a_{k-f}^{+}\hat a_k\hat b_{fj}^{+}$ corresponds to the appearance of an electron with
wave number k − f and of a phonon with wave number f and to the
disappearance of an electron with wave number k. Using the periodicity
of the potential U and of the functions $u_k$, $u_{k\pm f}$, as well as the Schrödinger
equation for $u_k$, the unknown potential energy U can, on the basis of certain
simplifying assumptions, be eliminated from $S_\pm$. Calculations which we
shall carry out below give for $S_\pm$

$$S_\pm = \pm\frac{i\hbar^{2}}{3m}\,(\mathbf f\cdot\mathbf e_{fj})\int_{\tau_0}|\nabla u_k|^{2}\,dV. \tag{55.9}$$


It follows from (55.9) that the quantity $S_\pm$ differs from zero only for
lattice waves with longitudinal polarization. Indeed,

$$\mathbf e_{f1}\cdot\mathbf f = f, \qquad \mathbf e_{f2}\cdot\mathbf f = \mathbf e_{f3}\cdot\mathbf f = 0,$$

where index 1 refers to longitudinal polarization, and indices 2 and 3 refer to
transverse polarization. Thus, within the framework of the assumptions made
in calculating S (see below), the scattering of electrons takes place only on
lattice waves with longitudinal polarization.
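The selection of longitudinal waves rests only on the scalar products of the polarization vectors with the phonon wave vector. A minimal sketch, assuming an arbitrary (hypothetical) wave vector f and an orthonormal triad built from it with cross products, shows that only the polarization parallel to f gives a non-zero product:

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def normalize(a):
    norm = math.sqrt(dot(a, a))
    return tuple(x / norm for x in a)

f = (0.3, -0.2, 0.6)                          # hypothetical phonon wave vector
e1 = normalize(f)                             # longitudinal polarization, parallel to f
e2 = normalize(cross(e1, (1.0, 0.0, 0.0)))    # first transverse polarization
e3 = cross(e1, e2)                            # second transverse polarization

f_length = math.sqrt(dot(f, f))
couplings = [dot(e, f) for e in (e1, e2, e3)]
print(f_length, couplings)
```

The first product equals |f| and the other two vanish to rounding accuracy, so under the assumptions behind (55.9) the transverse modes drop out of the interaction.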


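Substituting the value of $S_\pm$ from (55.9) into the interaction energy
operator (55.8), one can write it in the form

$$\hat H_{\text{int}} = -\frac{i\hbar^{2}}{3m}\sum_{k,f}f\left(\frac{\hbar}{2MN\omega_f}\right)^{1/2}\left(\int_{\tau_0}|\nabla u_k|^{2}\,dV\right)\left(\hat a_{k+f}^{+}\hat a_k\hat b_f - \hat a_{k-f}^{+}\hat a_k\hat b_f^{+}\right). \tag{55.10}$$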





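For brevity we omit the index j = 1 on the operators $\hat b$. A further
simplification is obtained if $\omega_f$ and f are related by formula (47.18) which is valid for
small f, i.e. if frequency dispersion is disregarded. Then we obtain for $\hat H_{\text{int}}$

$$\hat H_{\text{int}} = -i\sum_{k,f}D_f\left(\hat a_{k+f}^{+}\hat a_k\hat b_f - \hat a_{k-f}^{+}\hat a_k\hat b_f^{+}\right), \tag{55.11}$$

where

$$D_f = \left(\frac{2\hbar g^{2}f}{9MNc}\right)^{1/2}, \qquad g = \frac{\hbar^{2}}{2m}\int_{\tau_0}|\nabla u_k|^{2}\,dV. \tag{55.12}$$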


In the next section we shall make use of expression (55.11) to calculate the 
transition probability. 

We note that in order of magnitude g represents the mean kinetic energy 
of an electron moving in the lattice. 

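In conclusion we have to prove formula (55.9). For this we consider the
integral

$$\int_{\tau_0}\nabla\cdot(\mathbf e_{fj}\,u_{k'}^{*}u_kU)\,dV = \oint_{S}u_{k'}^{*}u_kU\,\mathbf e_{fj}\cdot d\mathbf S = 0,$$

which reduces to zero by virtue of the properties of periodicity of the
functions $u_k$ and of the potential U at the surface. On the other hand, this integral
is equal to

$$\int_{\tau_0}\nabla\cdot(\mathbf e_{fj}\,u_{k'}^{*}u_kU)\,dV = \mathbf e_{fj}\cdot\int U\,\nabla(u_{k'}^{*}u_k)\,dV + \mathbf e_{fj}\cdot\int u_{k'}^{*}u_k\,(\nabla U)\,dV = 0. \tag{55.13}$$

Thus for the quantity S (55.8') we obtain

$$S = \mathbf e_{fj}\cdot\int u_{k'}^{*}u_k\,(\nabla U)\,dV = -\mathbf e_{fj}\cdot\int U\,\nabla(u_{k'}^{*}u_k)\,dV. \tag{55.14}$$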


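The potential energy U can now be eliminated by means of the Schrödinger
equation. Using formula (48.9) for the function $u_k$, we have

$$\frac{\hbar^{2}}{2m}\left[\nabla^{2}u_k + 2i\mathbf k\cdot\nabla u_k - k^{2}u_k\right] + [E(\mathbf k) - U]\,u_k = 0. \tag{55.15}$$

For the function $u_{k'}^{*}$ we write the analogous equation

$$\frac{\hbar^{2}}{2m}\left[\nabla^{2}u_{k'}^{*} - 2i\mathbf k'\cdot\nabla u_{k'}^{*} - k'^{2}u_{k'}^{*}\right] + [E(\mathbf k') - U]\,u_{k'}^{*} = 0. \tag{55.16}$$

We multiply eq. (55.15) by the quantity $\mathbf e_{fj}\cdot\nabla u_{k'}^{*}$ and eq. (55.16) by
$\mathbf e_{fj}\cdot\nabla u_k$, and add the equations obtained. As a result we easily find

$$\begin{aligned}
S = -\mathbf e_{fj}\cdot\int U\,\nabla(u_{k'}^{*}u_k)\,dV
= &-\frac{\hbar^{2}}{2m}\int\mathbf e_{fj}\cdot\left[\nabla^{2}u_k\,\nabla u_{k'}^{*} + \nabla^{2}u_{k'}^{*}\,\nabla u_k\right]dV \\
&-\frac{i\hbar^{2}}{m}\,\mathbf k\cdot\int\nabla u_k\,(\mathbf e_{fj}\cdot\nabla u_{k'}^{*})\,dV + \frac{i\hbar^{2}}{m}\,\mathbf k'\cdot\int\nabla u_{k'}^{*}\,(\mathbf e_{fj}\cdot\nabla u_k)\,dV \\
&-\mathbf e_{fj}\cdot\int\left[E(\mathbf k)\,u_k\nabla u_{k'}^{*} + E(\mathbf k')\,u_{k'}^{*}\nabla u_k\right]dV \\
&+\frac{\hbar^{2}}{2m}k^{2}\int u_k\,\mathbf e_{fj}\cdot\nabla u_{k'}^{*}\,dV + \frac{\hbar^{2}}{2m}k'^{2}\int u_{k'}^{*}\,\mathbf e_{fj}\cdot\nabla u_k\,dV.
\end{aligned} \tag{55.17}$$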

The integration is carried out over a unit cell of the crystal, which is the basic
region of periodicity of the functions $u_k$ and $u_{k'}^{*}$. Hence on applying Green's
formula the sum of the first two terms (those containing the Laplacians) is
converted into surface integrals which reduce to zero.

The fifth and seventh terms can be integrated by parts, writing, taking into
account the properties of periodicity,

$$\int u_k\,\nabla u_{k'}^{*}\,dV = -\int u_{k'}^{*}\,\nabla u_k\,dV.$$

Then we finally have

$$\begin{aligned}
S = &-\frac{i\hbar^{2}}{m}\,\mathbf k\cdot\int\nabla u_k\,(\mathbf e_{fj}\cdot\nabla u_{k'}^{*})\,dV + \frac{i\hbar^{2}}{m}\,\mathbf k'\cdot\int\nabla u_{k'}^{*}\,(\mathbf e_{fj}\cdot\nabla u_k)\,dV \\
&+\mathbf e_{fj}\cdot\int\left[E(\mathbf k) - E(\mathbf k') + \frac{\hbar^{2}}{2m}(k'^{2} - k^{2})\right]u_{k'}^{*}\nabla u_k\,dV.
\end{aligned} \tag{55.18}$$


The last integral contains the quantities $E(\mathbf k) - E(\mathbf k')$ and $(\hbar^{2}/2m)(k^{2} - k'^{2})$,
which in order of magnitude are equal to the difference between the energies
of the electron before and after scattering, i.e. are of the order of the phonon
energy $\hbar\omega_f$. As we shall see below, the first integral is in order of magnitude
equal to the kinetic energy of the electron. Hence it is substantially larger
than the second integral which can be dropped.

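Assuming in a rather rough approximation that $u_k = u_{k'}^{*}$ and that these
functions possess spherical symmetry, we can write the integral over a unit
cell in the form

$$\mathbf A \equiv \int_{\tau_0}\nabla u_k\,(\mathbf e_{fj}\cdot\nabla u_k)\,dV = \int_{\tau_0}\frac{\partial u_k}{\partial r}\,\frac{\mathbf r}{r}\left(\mathbf e_{fj}\cdot\frac{\mathbf r}{r}\right)\frac{\partial u_k}{\partial r}\,dV = \int_{\tau_0}\left(\frac{\partial u_k}{\partial r}\right)^{2}\frac{\mathbf r\,(\mathbf e_{fj}\cdot\mathbf r)}{r^{2}}\,dV. \tag{55.19}$$

In integrating, the vector r runs over all possible values, hence it is clear that
the vector A can be directed only along the unique vector $\mathbf e_{fj}$. We then have

$$\mathbf A = \mathbf e_{fj}\,|\mathbf A|.$$

The modulus of the vector A can be determined by multiplying it by the
vector $\mathbf e_{fj}$:

$$|\mathbf A| = \int_{\tau_0}\frac{1}{r^{2}}\left(\frac{\partial u_k}{\partial r}\right)^{2}(\mathbf e_{fj}\cdot\mathbf r)^{2}\,dV = \frac{1}{3}\int_{\tau_0}|\nabla u_k|^{2}\,dV. \tag{55.20}$$

After the transformations which have been carried out we can now write
the expression for S as

$$S = -\frac{i\hbar^{2}}{3m}\,(\mathbf k - \mathbf k')\cdot\mathbf e_{fj}\int_{\tau_0}|\nabla u_k|^{2}\,dV. \tag{55.21}$$

Setting $\mathbf k' = \mathbf k + \mathbf f$ and $\mathbf k' = \mathbf k - \mathbf f$, we obtain the expressions for $S_{+}$ and $S_{-}$ given
by formula (55.9).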


§56. The total Hamiltonian of a solid 


Making use of the results obtained above, we can write the total Hamil- 
tonian of a solid in the form 


Jog ase 2 
H=- Div) O V+ 2U,-R,) + DU(R;-R)) + 


F3 D Da oîk, EK) + 


zy Ir; 2 
+ Leen Gji Di D De podoPt— Âk-soôkobh + 


2 2 
+ = £ (56.1) 


— + —, 
Iri-rjl>Amin Tim] iri- rji<Amin Mi~ Tyl 


Here $\varepsilon(\mathbf k)$ is the energy of the electron (quasiparticle) with wave vector k.
The operators $\hat a_{k\sigma}^{+}$ and $\hat a_{k\sigma}$ represent the operators of creation and
destruction of the electron (quasiparticle) with wave vector k and spin index σ. The
energy of the other quasiparticle, a phonon, is $\varepsilon(\mathbf f) = \hbar\omega_f$.

The Coulomb interaction between electrons at large distances amounts to 
zero-point plasma oscillations, while at small distances it amounts to the 
screened interaction and the formation of quasiparticles. As to the interaction 
between a vibrating lattice and electrons (quasiparticles), it is formally des- 
cribed by the collision operator between two independent quasiparticles, a 
fermion and a boson. This interaction is the cause of transitions of the elec- 
tron from one state into another. 

The Hamiltonian of a solid in the second quantization representation was
first obtained by Fröhlich. As an example of the application of the
mathematical technique of the second quantization method, we shall find an
important formula for the probability of transition of an electron from state k
into state k' with the emission or absorption of a phonon. In our problem the
stationary states of the unperturbed system represent states with a definite
number of phonons $n_f$ and a definite number of electrons $n_{k\sigma}$. The latter
numbers, by virtue of the exclusion principle, are equal to one or zero.

If we confine ourselves to the first approximation of perturbation theory,
then it turns out that only transitions with the absorption or emission of only
one phonon are possible. Indeed, the perturbation operator involves the
operators $\hat b_f$ or $\hat b_f^{+}$ but not their products. The matrix elements of the operators $\hat b_f$
and $\hat b_f^{+}$ differ from zero only for transitions in which the number of phonons
changes by one (see (55.3)).

Let us for definiteness consider the case where the electron passes from a
state with wave number k and energy E(k) to a state with wave number k'
and a larger energy. Then a phonon with wave number f is absorbed, so that
$\mathbf k' = \mathbf k + \mathbf f$. Since the total energy of the system (electrons + phonons) must
remain constant, the energy of the electron after collision is

$$E(\mathbf k') = E(\mathbf k) + \hbar\omega_f.$$

In the second quantization representation, there corresponds to the process of
transition a decrease by one in the number of electrons in state $\mathbf k\sigma$ and of the
number of phonons in state f and an increase by one of the number of
electrons in state $\mathbf k'\sigma$.

To find the probability of such a transition, we have to calculate the
matrix element

$$(H_{\text{int}})_{k'k} = (n_{k'\sigma} = 1,\ n_{k\sigma} = 0,\ n_f - 1\,|\hat H_{\text{int}}|\,n_{k'\sigma} = 0,\ n_{k\sigma} = 1,\ n_f). \tag{56.2}$$

Applying the rule for calculating the matrix elements of a product of
operators and making use of the form of the matrix elements of $\hat a^{+}$, $\hat a$ and $\hat b^{+}$, $\hat b$,
we easily find that of all the sum (56.1) only one matrix element remains.
Namely,

$$(H_{\text{int}})_{k'k} = -iD_f\,(n_{k'\sigma} = 1|\hat a_{k'\sigma}^{+}|n_{k'\sigma} = 0)\,(n_{k\sigma} = 0|\hat a_{k\sigma}|n_{k\sigma} = 1)\,(n_f - 1|\hat b_f|n_f). \tag{56.3}$$

Using formulae (99.15) of Part V, we see that the product of matrix elements
of $\hat a^{+}$, $\hat a$ is equal to plus or minus one, depending on the occupation numbers.
The matrix elements of the operator $\hat b_f$ are defined by formula (47.17).




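The probability of a transition with the absorption of a phonon is equal to

$$dW_{-} = \frac{2\pi}{\hbar}\,|(H_{\text{int}})_{k'k}|^{2}\,\delta[E(\mathbf k) - E(\mathbf k + \mathbf f) + \hbar\omega_f] = \frac{2\pi}{\hbar}\,D_f^{2}\,n_f\,\delta[E(\mathbf k) - E(\mathbf k + \mathbf f) + \hbar\omega_f].$$

If we disregard dispersion and substitute the value of $D_f$ from (55.12), we
easily find

$$dW_{-} = \frac{4\pi g^{2}n_f f}{9MNc}\,\delta[E(\mathbf k) - E(\mathbf k + \mathbf f) + \hbar\omega_f].$$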

The total probability of the transition of an electron with the absorption of a 
phonon is obtained by integrating over all possible values of f and V. 


W = an sve [Pry PE) —BIK + nE. (56.4) 


The probability of a transition with the emission of a phonon can be found in 
exactly the same way: 


Da 





W, = 4n SNE med fing+ 1) 8{— E(k) + Elk- Daian (56.5) 
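The absorption probability carries the factor n_f and the emission probability the factor n_f + 1. As a minimal sketch, assuming that the phonons are in thermal equilibrium so that n_f is given by the Bose distribution (an assumption not stated at this point of the text; the function names are hypothetical), the ratio of the two occupation factors can be evaluated as follows. Only the occupation factors are compared; the delta functions and the available final electron states are not included.

```python
import math

def phonon_occupation(x):
    # Equilibrium (Bose) phonon number n_f for x = hbar*omega_f / kT.
    return 1.0 / math.expm1(x)

def emission_to_absorption_ratio(x):
    # Ratio (n_f + 1) / n_f of the occupation factors in the emission
    # and absorption probabilities.
    n_f = phonon_occupation(x)
    return (n_f + 1.0) / n_f

for x in (0.1, 1.0, 5.0):
    print(f"hbar*omega/kT = {x}: n_f = {phonon_occupation(x):.4f}, "
          f"(n_f + 1)/n_f = {emission_to_absorption_ratio(x):.4f}")
```

For an equilibrium phonon gas the ratio equals exp(ħω_f/kT): at high temperatures the two factors are nearly equal, while at low temperatures spontaneous emission (the "+1") dominates.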


Processes with the absorption or emission of one phonon can be presented
in an intuitive way in the form of a Feynman diagram (fig. VI.10). Here the
wavy line represents the phonon, and the solid line stands for the electron. If
one does not confine oneself to transitions of the first order, then other
processes (shown in the Feynman diagrams of fig. VI.11 and fig. VI.12) are also
possible.

Fig. VI.10











Fig. VI.11

Fig. VI.12


An electron may emit a virtual phonon and absorb it on the spot. Because 
of this process the electron turns out to be surrounded by a cloud or jacket 
of virtual phonons with different frequencies. As the electron moves in a lat- 
tice the phonon cloud follows it, changing its mass. A quantitative calculation 
of this effect is beyond the scope of this book. 

Another important second order effect is shown by the diagram in
fig. VI.12. One of the electrons emits phonon $\mathbf f_1$ and absorbs phonon $\mathbf f_2$. The
other electron, on the contrary, absorbs the first phonon and emits the second
phonon. This phonon exchange leads under certain conditions to a weak
attraction between the electrons. In spite of the smallness of the effect, it plays
a very important role in phenomena occurring in metals at low temperatures
(see §65).

It should be noted that the above derivation of the interaction Hamil- 
tonian involves the implicit assumption that, as a result of interaction with a 
phonon, the electron may always pass to a new state. This holds for electrons 
in the upper energy levels of the Fermi distribution. 

In considering kinetic effects, for example electric conduction, we shall be 







interested in the interaction with just these electrons. However, in studying 
some other properties of metals, such as the effect of the electrons on the 
energy spectrum of a system of phonons, it turns out to be essential to take 
into account the behaviour of the entire system of electrons, including the 
electrons of the filled band. These electrons cannot take part in transitions 
between states inside the band. Hence a description of effects involving the 
behaviour of the entire system of electrons cannot be obtained on the basis 


of the Hamiltonian (55.11). 





The Kinetic Properties of Solids 


§57. The kinetic equation for electrons in metals 


The statement that physical kinetics has achieved its greatest successes in 
the field of solid-state theory and made it possible to understand and describe 
both qualitatively and quantitatively a very large number of diverse and de- 
tailed effects is hardly an exaggeration. Within the framework of this book we 
can consider only the basic results in this rapidly developing branch of 
physics. 

We shall begin by considering kinetic phenomena in solids by finding the
kinetic equation for charge carriers in metals. We have seen above that
elementary excitations in a system of electrons in metals can be likened to a
gas of free fermions. We shall call them conduction electrons and shall
describe them by a one-particle distribution function $f(\mathbf p, \mathbf r, t)$. For brevity we
shall call the quasi-momentum, p, the momentum. This terminology cannot
cause misunderstanding.

As conduction electrons move they undergo scattering on the vibrating 
atoms of the lattice (collisions with phonons) and on all kinds of inhomoge- 
neities of the lattice which are called impurities. Sufficiently good results are 
obtained if this scattering is considered to be elastic (see §58). On the basis of 
what we said in §50, we shall disregard collisions between electrons. 

If an external electromagnetic field is applied to the metal, then the
distribution function satisfies the kinetic equation

$$\frac{\partial f}{\partial t} + \mathbf v\cdot\frac{\partial f}{\partial\mathbf r} + \dot{\mathbf p}\cdot\frac{\partial f}{\partial\mathbf p} = I. \tag{57.1}$$


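The group velocity v can be written in the form

$$\mathbf v = \frac{\partial\varepsilon(\mathbf p)}{\partial\mathbf p}. \tag{57.2}$$

Within the framework of the statistical description, the change in the
quasi-momentum under the action of an external force can be taken into account
in the quasi-classical approximation, i.e.

$$\dot{\mathbf p} = e\left(\boldsymbol{\mathcal E} + \frac{1}{c}[\mathbf v\times\mathbf H]\right). \tag{57.3}$$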


The collision integral for particles obeying Fermi statistics can be written in
the approximation of perturbation theory.
The probability of a transition from state p into state p' can be written as

$$w_{p\to p'} = [1 - f(\mathbf p')]\,F(\mathbf p, \mathbf p', n_p), \tag{57.4}$$

where the first factor gives the probability that state p' is not occupied. The
factor $F(\mathbf p, \mathbf p', n_p)$ represents the probability of a transition into the vacant
state, due to all the processes of elastic interaction of the electrons with other
particles: lattice phonons or impurities.

By virtue of the principle of microscopic reversibility, the probability for
an inverse transition is of the form

$$w_{p'\to p} = [1 - f(\mathbf p)]\,F(\mathbf p', \mathbf p, n_p). \tag{57.5}$$


We shall come back to the formulation of the kinetic equation in §62, where
the expression for the transition probability will be refined. For the present
we shall leave out the dependence of $F(\mathbf p, \mathbf p', n_p)$ on the occupation
numbers $n_p$ and shall set

$$F(\mathbf p, \mathbf p', n_p) = F(\mathbf p, \mathbf p').$$

Taking into account (57.4) and (57.5), the collision integral can be written in
the form

$$I = \int\{f(\mathbf p')[1 - f(\mathbf p)] - f(\mathbf p)[1 - f(\mathbf p')]\}\,\delta(\varepsilon - \varepsilon')\,F(\mathbf p, \mathbf p')\,d\mathbf p', \tag{57.6}$$

as for collisions between electrons and ordinary atoms. Since the scattering is
considered to be elastic, we can write

$$F(\mathbf p, \mathbf p')\,\delta(\varepsilon - \varepsilon')\,d\mathbf p' = F(\alpha)\,d\Omega, \tag{57.7}$$

where $\alpha = (\mathbf p, \mathbf p')$ is the angle between the vectors p and p'.


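Thus the kinetic equation assumes the form

$$\frac{\partial f}{\partial t} + \mathbf v\cdot\frac{\partial f}{\partial\mathbf r} + e\left(\boldsymbol{\mathcal E} + \frac{1}{c}[\mathbf v\times\mathbf H]\right)\cdot\frac{\partial f}{\partial\mathbf p} = \int\{f(\mathbf p')[1 - f(\mathbf p)] - f(\mathbf p)[1 - f(\mathbf p')]\}\,F(\alpha)\,d\Omega. \tag{57.8}$$

Relation (57.8) represents a linearized Boltzmann equation, very similar in its
structure to eq. (27.4).

We note that for a uniform system in a stationary state, in the absence of
external forces, eq. (57.8) turns into

$$f(\mathbf p)[1 - f(\mathbf p')] - f(\mathbf p')[1 - f(\mathbf p)] = 0. \tag{57.9}$$

Taking into account isotropy, where the distribution function depends only
on the absolute value of the momentum or, which is the same, on the energy,
the solution of this functional equation is the Fermi distribution

$$f(\varepsilon) = \frac{1}{e^{(\varepsilon - \mu)/kT} + 1}. \tag{57.10}$$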


The consideration of a large class of kinetic phenomena can be carried out
without specifying the form of the function $F(\alpha)$. Hence in subsequent
sections we shall assume $F(\alpha)$ to be a certain given function. We shall find the
solution of the kinetic equation by the same methods as in the classical
kinetic theory of gases. Then, making use of the expressions for the transition
probabilities (56.9) and (56.10), we shall find the relaxation time or the free
path length. Finally, in order to convince ourselves of the correctness of these
approximations, we shall, in §63, carry out a more consistent statement and
solution of Boltzmann's equation for a metal.

In the case of weak fields and small departures from an equilibrium state,
we shall, as in §27, try to find the solution of (57.8) in the form

$$f = f_0 + f_1, \tag{57.11}$$

where $f_0 \gg f_1$.
Then for $f_1$ we obtain an equation which is the same as eq. (27.10):
of, yu dfo 
Seve ie ae neat 
of 
e ie ' 7 , 
+é [vx H]: S tve fS DFO, (57.12) 


where $\boldsymbol{\mathcal E}$ is the electric field.


§58. The electrical conductivity of metals 


If a constant electric field $\boldsymbol{\mathcal E}$ is applied to a metal, then a stationary current
will exist in it. It is necessary to have a clear idea as to why the motion of
conduction electrons turns out to be stationary. If the collision integral in
eq. (57.8) could be dropped, then the equation for a homogeneous specimen
would assume the form

$$\frac{\partial f}{\partial t} + e\boldsymbol{\mathcal E}\cdot\frac{\partial f}{\partial\mathbf p} = 0,$$

and the distribution function would depend on time. In fact this would mean
that the electrons were performing an accelerated motion under the action of
the force $e\boldsymbol{\mathcal E}$; their velocity, and hence also the current, would increase
continuously with time.

The origin of the momentum loss compensating for the accelerating action 
of the field is in collisions between the electrons and phonons or impurities. 

Boltzmann’s equation for the time-independent distribution function can 
be written only by taking into account the collision integral. 

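Without reproducing the calculations of §27, we can write for the current
density the expression

$$\mathbf j = e\int\mathbf v f\,d\mathbf p = -\frac{e^{2}}{m}\int\frac{\Lambda_{\text{tr}}}{v}\,\frac{\partial f_0}{\partial\varepsilon}\,\mathbf p\,(\mathbf v\cdot\boldsymbol{\mathcal E})\,d\mathbf p. \tag{58.1}$$

Thus the electrical conductivity turns out to be equal to

$$\sigma = -e^{2}\int\Lambda_{\text{tr}}\,\frac{v_x^{2}}{v}\,\frac{\partial f_0}{\partial\varepsilon}\,d\mathbf p. \tag{58.2}$$

In crystals with cubic symmetry the field $\boldsymbol{\mathcal E}$ and current j are parallel to each
other. Choosing the direction of the vector $\boldsymbol{\mathcal E}$ to be the x-axis, one can
rewrite (58.1) in the form

$$j = -e^{2}\mathcal E\int\Lambda_{\text{tr}}\,\frac{\partial f_0}{\partial\varepsilon}\,v\cos^{2}\theta\,d\mathbf p, \tag{58.3}$$

where θ is the angle between the vectors v and $\boldsymbol{\mathcal E}$.
In formula (58.3) it is convenient to change to integration over energies
and angles. This gives

$$\sigma = -\frac{16\pi e^{2}m}{3(2\pi\hbar)^{3}}\int\Lambda_{\text{tr}}\,\varepsilon\,\frac{\partial f_0}{\partial\varepsilon}\,d\varepsilon. \tag{58.4}$$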


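We now make use of the property of the function $\partial f_0/\partial\varepsilon$ for the Fermi
distribution in a metal considered in §80 of Part III.
We have, analogously to (80.5) of Part III,

$$L = \int_{0}^{\infty}\Lambda_{\text{tr}}\,\varepsilon\,\frac{\partial f_0}{\partial\varepsilon}\,d\varepsilon \approx \mu\int_{-\mu/kT}^{\infty}\Lambda_{\text{tr}}(\mu + kTx)\,\frac{\partial f_0}{\partial x}\,dx,$$

where $x = (\varepsilon - \mu)/kT$.

If the free path length $\Lambda_{\text{tr}}$ depends on the energy, then, since according to
the data of §80 of Part III the derivative $\partial f_0/\partial x$ behaves as a δ-function, we
have

$$L = \mu\int_{-\infty}^{\infty}\Lambda_{\text{tr}}(\mu + kTx)\,\frac{\partial f_0}{\partial x}\,dx \approx -\mu\,\Lambda_{\text{tr}}(\mu). \tag{58.5}$$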
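The statement that the derivative of the Fermi function acts as a δ-function can be checked numerically. The sketch below is illustrative only: the values of μ and kT, the model energy dependence Λ(ε) = ε² and the function names are assumptions. It integrates a smooth function of energy against -∂f₀/∂x with the trapezoidal rule.

```python
import math

def fermi_window(x):
    # -df0/dx for f0 = 1/(exp(x) + 1); equals 1 / (4 cosh^2(x/2)).
    return 0.25 / math.cosh(0.5 * x) ** 2

def average_over_fermi_window(func, mu, kT, half_width=40.0, steps=4000):
    # Trapezoidal estimate of the integral of func(mu + kT*x) * (-df0/dx) dx.
    h = 2.0 * half_width / steps
    total = 0.0
    for i in range(steps + 1):
        x = -half_width + i * h
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * func(mu + kT * x) * fermi_window(x)
    return total * h

mu, kT = 1.0, 0.01  # hypothetical values with kT << mu
norm = average_over_fermi_window(lambda eps: 1.0, mu, kT)
path = average_over_fermi_window(lambda eps: eps ** 2, mu, kT)  # model Lambda(eps)
print(norm, path, mu ** 2)
```

The weight integrates to unity and the average of the model Λ(ε) differs from Λ(μ) only by a correction of order (kT/μ)², which is why the free path can be taken at the Fermi surface.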





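Hence for the electrical conductivity we obtain the expression

$$\sigma = \frac{16\pi e^{2}m}{3(2\pi\hbar)^{3}}\,\mu\,\Lambda_{\text{tr}}(\mu), \tag{58.6}$$

or, on the basis of (80.10) of Part III,

$$\sigma = \frac{16\pi e^{2}m\,\Lambda_{\text{tr}}(\mu)}{3(2\pi\hbar)^{3}}\,\varepsilon_{\text{max}} = \frac{16\pi e^{2}m\,\Lambda_{\text{tr}}(\mu)}{3(2\pi\hbar)^{3}}\,\frac{(2\pi\hbar)^{2}}{2m}\left(\frac{3}{8\pi}\right)^{2/3}\left(\frac{N}{V}\right)^{2/3}. \tag{58.7}$$

This final formula can also be written in another form by introducing the
velocity $v(\mu)$ of an electron having energy $\varepsilon = \mu$. By means of (79.3) of Part III
we find

$$\sigma = \frac{e^{2}\Lambda_{\text{tr}}(\mu)}{m\,v(\mu)}\,\frac{N}{V} = \frac{e^{2}\Lambda_{\text{tr}}(\mu)}{m\,v(\mu)}\,n. \tag{58.8}$$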


We see that the electrical conductivity depends on the number of electrons 
per unit volume, n, as well as on the velocity of an electron at the Fermi sur- 
face, v(u), and on the mean free path of an electron having an energy on the 
Fermi surface. 

The first important feature of the expression obtained for the electrical 
conductivity of a metal is the fact that it turns out not to be proportional to 
the number of electrons per unit volume; in formula (58.8) the velocity, v(u), 
is expressed in terms of n. The cause of this is clear: only electrons whose 
states lie in the upper levels of the Fermi distribution take part in electrical 
conduction. Only these electrons are conduction electrons. Electrons in states 
lying in the filled band cannot perform a systematic motion and carry current.

Formula (58.8) for the electrical conductivity contains two unknown 
quantities, the number of electrons per unit volume n, and the mean free 
path of electrons having an energy lying on the Fermi surface, A(u). As we 
have already said, the velocity of electrons v(u) is expressed in terms of n by 
the formulae of §80 of Part III. 

In the next section it will be shown that the quantity n can be defined
independently of resistance. Then (58.8) allows the mean free path to be
expressed in terms of the electrical conductivity σ. The value of the latter can
be measured directly. It turns out that the value of $\Lambda_{\text{tr}}$ substantially exceeds
the interatomic distance (the lattice constant). Furthermore, $\Lambda_{\text{tr}}$ depends
strongly on the temperature. For example, for copper, for which it can be
assumed that there is one free electron to each atom of the lattice, $\Lambda_{\text{tr}}$ varies
from $7\times10^{-7}$ cm at T = 1300 K up to $4\times10^{-5}$ cm at T = 100 K, which is
larger by a factor of 300 to 2000 than the distance between atoms in the
lattice.
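The estimate of the mean free path from a measured conductivity can be sketched numerically with formula (58.8), σ = e²Λn/(mv). The input values below (electron density and room-temperature conductivity of copper, free-electron Fermi velocity, SI units) are commonly quoted handbook-style figures used here as assumptions; they are not taken from the text, and the function names are hypothetical.

```python
import math

E_CHARGE = 1.602176634e-19   # elementary charge, C
E_MASS = 9.1093837e-31       # electron mass, kg
HBAR = 1.054571817e-34       # reduced Planck constant, J s

def fermi_velocity(n):
    # Free-electron velocity at the Fermi surface for density n (m^-3).
    return HBAR * (3.0 * math.pi ** 2 * n) ** (1.0 / 3.0) / E_MASS

def mean_free_path(sigma, n):
    # Solve sigma = e^2 * Lambda * n / (m * v) for Lambda (metres).
    return sigma * E_MASS * fermi_velocity(n) / (n * E_CHARGE ** 2)

n_cu = 8.47e28      # m^-3, assumed: one conduction electron per copper atom
sigma_cu = 5.96e7   # S/m, assumed room-temperature conductivity of copper

path = mean_free_path(sigma_cu, n_cu)
print(f"v_F = {fermi_velocity(n_cu):.3e} m/s")
print(f"Lambda = {path:.2e} m = {path * 100:.2e} cm")
```

With these inputs the free path comes out at a few times 10⁻⁶ cm, that is, between the two values quoted above for 1300 K and 100 K and far larger than the interatomic distance.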

It turns out that the temperature dependence of $\Lambda_{\text{tr}}$, and with it of the
electrical conductivity σ, differs for the cases of temperatures which are high
and low relative to the Debye temperature Θ.

At high temperatures $T \gg \Theta$, $\sigma \sim 1/T$. At low temperatures σ shows a rapid
increase, proportional to $1/T^{5}$, with decreasing temperature. At T = 0 in a
perfectly pure metal $\sigma \to \infty$, and the ohmic resistance tends to zero.

In samples containing impurities, as well as in samples with mechanical 
defects, dislocations, residual mechanical stresses, etc., at very low tempera- 
tures (of the order of a few degrees Kelvin) there is a residual resistance which 
does not depend on the temperature and is proportional to the concentration 
of impurities. 

Thus the electrical conductivity of a metal at $T \neq 0$ has a finite value, and
the resistance differs from zero. Only at T=0 and in the case of a sample 
containing no impurities and having no defects does a metallic conductor 
show no resistance to the flow of current. Such a conductor is said to be 
ideal. Since it is impossible to obtain a temperature equal to zero, and since 
it is also impossible to produce a completely pure sample, the ideal conductor 
represents a limit to which one can aspire by lowering the temperature and 
purifying the metal. 


§59. The Hall effect 


Important information about the properties of an electron gas can be ob- 
tained by studying the behaviour of metals placed in an external magnetic 
field. 

If a metallic sample in which a current flows (the direction of which is
taken to be the x-axis) is placed in a magnetic field directed along the z-axis,
then an electric field $\mathcal E_y$ will arise in the sample.

The origin of this effect, called the Hall effect, is quite clear.

Under the action of the Lorentz force the electrons which form the
current are deflected in the magnetic field into the negative direction of the
y-axis. They accumulate at the lower face of the metal until the electric field
they produce compensates for the action of the deflecting force. The
electrons subsequently are in a steady state.

From the above it is clear that the direction of the field $\mathcal E_y$ is defined by
the sign of the charge carriers. We shall also see that its magnitude is directly
related to the number of charge carriers per unit volume. These two facts
make the Hall effect one of the most important methods of studying the
properties of metals and, as will be seen in the following, semiconductors.
Let us turn to the calculation of the field $\mathcal E_y$.


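We write Boltzmann's kinetic equation, taking into account that in our
case the force F has the components

$$F_x = e\mathcal E_x + \frac{e}{c}Hv_y, \qquad F_y = e\mathcal E_y - \frac{e}{c}Hv_x.$$

Then the kinetic equation assumes the form

$$e\left(\mathcal E_x + \frac{H}{c}v_y\right)\frac{\partial f}{\partial p_x} + e\left(\mathcal E_y - \frac{H}{c}v_x\right)\frac{\partial f}{\partial p_y} = I. \tag{59.1}$$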

As in §27, we shall follow the Lorentz method and seek the solution of 
Boltzmann’s equation (59.1) in the form 


Ap) = fo(Po) + Pxf\(P) + Py fP) + (59.2) 


since there are electric field components in the directions of the x- and y- 
axes. We then obtain 


I=- = (f; cos 0 + fa sind), (59.3) 
tr 


where @ is the angle between the direction of the momentum and the x-axis. 
Substituting the expansion (59.2) into (59.1), we find to within the sec- 
ond order of small quantities: 


e (2, + t= z =) (e+ RT = 







oe) C of, 
=e (Ep12 Je” thi tPs —) ~ 
x ðe xX x apy 


c 


~ (z fy PE +F fi +k i 

AEI TE € Wa xfi + Expy ap, a 
H vyPxH əfi ) ( $ dfo vyH ) 

S TE ap, xe Bn ae o 


The terms of the second order of small quantities E Py Of, /pP,., (v,p,/c) x 
ðfı/ðPy and higher orders are omitted. The term (v,H/c)f, cannot be 
dropped, as will be seen from subsequent calculations. Analogously we find 


=<) ( dfo 2) 
e (e, z apy + fat Py ap, zx% 


dfo UxVy, dfo vH 
~e (Ey e HE- == ata) 


dfo ev H 
~ CB yYy e eT mee 


Hence the left-hand side of (59.1) assumes the form 


of dfo Hf, 
ve (2, 52-24) cosd + ve (E, 224-1) sing ; 


Equating this expression with the collision integral (59.3) and combining all 
terms containing cos @ and sin 0, we find 


afo ı up 
[ve (e, Je -1 np) +2 4] cos + 
fo H 
+ [e(r 52+ 2s) Eg] sind=0. 


In view of the arbitrariness of the angle θ, we obtain two equations
defining the functions $f_1$ and $f_2$ sought:


af, 
£p 0 eH 


mx de mef2=~y, FI? 


of, 
e 0, cH 
S= e 
t 


— — + — 
m Y ðe me 


Solving this system gives 


Aye dfo ArH =f ArH a 
Met oe u By) 
Aue fo ArH [ E 
a ae R) suena) | 2 





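where

$$\omega_H = \frac{eH}{mc}.$$

Knowing the corrections to the distribution function, one can find the
current in the direction of the x-axis and the field $\mathcal E_y$.
Namely, we have

$$j_x = e\int v_x f\,d\mathbf p = -C(\mathcal E_xL_1 - \mathcal E_yL_2), \tag{59.4}$$

$$j_y = 0 = e\int v_y f\,d\mathbf p = -C(\mathcal E_yL_1 + \mathcal E_xL_2), \tag{59.5}$$

where

$$L_1 = \int\frac{\Lambda\,\varepsilon}{1 + (\Lambda\omega_H/v)^{2}}\,\frac{\partial f_0}{\partial\varepsilon}\,d\varepsilon, \tag{59.6}$$

$$L_2 = \int\frac{\Lambda\,\varepsilon\,(\Lambda\omega_H/v)}{1 + (\Lambda\omega_H/v)^{2}}\,\frac{\partial f_0}{\partial\varepsilon}\,d\varepsilon, \tag{59.7}$$

$$C = \frac{16\pi me^{2}}{3(2\pi\hbar)^{3}}. \tag{59.8}$$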
For brevity we drop the subscript tr of Λ. It is obvious that for $H \to 0$

$$L_1 \to \int\Lambda\,\varepsilon\,\frac{\partial f_0}{\partial\varepsilon}\,d\varepsilon \quad\text{and}\quad L_2 \to 0.$$

In this case $j_x$ goes over into (58.3).
To calculate $L_1$ and $L_2$ use can be made of the properties of the function
$\partial f_0/\partial\varepsilon$. In the first approximation
M(H) wy)?! 
) , (59.9) 


nof (e 


; SE (eee z 1 Aw 
neea ela] 


H 
=o z= L, S910 
v(t) DG) ge OP) 
where $\Lambda(\mu)$ and $v(\mu)$ are taken at the Fermi surface.
From formulae (59.4) and (59.5) it is possible to find the conductivity of
a metal placed in a magnetic field, $\sigma(H) = j_x/\mathcal E_x$, and the transverse field $\mathcal E_y$.
Let us first find this last quantity:


PJA Ly h= La Ix 1 
z pod ale 
GUARDS) a Meaty) 





Ey 


ND on (+ AW wylo) 
= Gin e O 





OH eHjy 


= ouyn T mepo) C ` (59.11) 


Substituting here the value $\mu \approx \varepsilon_{\text{max}}$ and $v(\mu)$ and C from (59.8), we obtain
after some simple transformations

$$\mathcal E_y = RHj_x, \tag{59.12}$$

where R denotes the quantity

$$R = \frac{1}{nec}. \tag{59.13}$$










Formula (59.12) shows that the transverse field is proportional to the
current $j_x$ and to the strength of the magnetic field H.

The factor of proportionality R, called the Hall constant, depends, as is
seen from its definition, on only two quantities, the charge of the current carrier
e and the number of particles per unit volume. The sign of R and,
consequently, also the direction of the transverse field $\mathcal E_y$, are necessarily the same as the
sign of the current carrier.
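The dependence of the Hall constant on the carrier charge and density alone can be sketched numerically. The example below is a hedged illustration: it uses the SI form R = 1/(nq) rather than the Gaussian form with the factor 1/c used in the text, and the copper electron density, the current density and the field strength are assumed values; the function names are hypothetical.

```python
E_CHARGE = 1.602176634e-19   # elementary charge, C

def hall_constant_si(n, q):
    # Hall constant in SI units, R = 1 / (n q), in m^3/C.
    # (The Gaussian-unit expression carries an extra factor 1/c.)
    return 1.0 / (n * q)

def hall_field(n, q, j_x, b_z):
    # Transverse field E_y = R * B * j_x in V/m for current density j_x
    # (A/m^2) and magnetic induction b_z (T).
    return hall_constant_si(n, q) * b_z * j_x

n_cu = 8.47e28                                  # m^-3, assumed density for copper
r_electrons = hall_constant_si(n_cu, -E_CHARGE)
r_holes = hall_constant_si(n_cu, +E_CHARGE)     # same density, positive carriers
e_y = hall_field(n_cu, -E_CHARGE, 1.0e6, 1.0)   # assumed j_x = 1e6 A/m^2, B = 1 T

print(f"R (electrons) = {r_electrons:.3e} m^3/C")
print(f"R (holes)     = {r_holes:.3e} m^3/C")
print(f"E_y           = {e_y:.3e} V/m")
```

The sign of the result follows the sign of the carrier charge, and the magnitude is inversely proportional to the carrier density, which is what makes the measurement useful for determining n.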

The conductivity in a magnetic field is 


2 2 
_ GED 
ce) Ly L; RH mev(u) R’ 





(59.14) 


Since R does not, in the first approximation, depend on the strength of the
magnetic field (as regards calculation of the integrals $L_1$ and $L_2$), the
right-hand side does not depend on H. This means that in this approximation
$\sigma(H) = \sigma$, i.e. no change of resistance arises in a magnetic field. In higher
approximations the conductivity $\sigma(H)$ turns out to depend on the field strength.

A set of measurements of the conductivity σ and of the Hall constant
makes it possible to find the two unknown quantities: n and Λ. Conversely,
by defining the number of free electrons per atom one can calculate R.

The sign of the Hall constant is negative when the transport is carried out 
by electrons. This applies to monovalent metals. 

In the case of divalent metals of the transition groups, where band overlap 
takes place, holes as well as electrons take part in the conductivity. Hence the 
sign often turns out to be positive. Anisotropic behaviour of the Hall con- 
stant, particularly pronounced in the case of metals such as Bi, is also ob- 
served. 

We shall come back to the question of the sign of the Hall effect in §68. 

In the next approximation $\sigma(H)$ turns out to be inversely proportional to
the square of the magnetic field strength in a relatively weak field and tends
to a constant value in a very strong field. However, in strong fields the
behaviour of $\sigma(H)$ and its numerical value are in poor agreement with
experimental data. This is associated with the crudeness of the model used
above. Taking into account the effects of anisotropy, which always occur in real
crystals, the agreement of theory with experiment can be considerably improved.


§60. The optical properties of a system of conduction electrons 


We shall base the treatment of the optical properties of a metal on the
assumption that the interactions of the electromagnetic field of a light wave
with the conduction electrons and with the electrons of the atomic cores take
place independently. The interaction of atoms with an electromagnetic field
was discussed in Part V.

Therefore we shall confine ourselves to a discussion of the behaviour of a 
system of conduction electrons in the field of a light wave. We note that if 
the frequency, $\omega$, of the field is not the same as one of the natural atomic
frequencies, then the basic optical characteristics of the metal are determined 
only by the behaviour of the conduction electrons. In this case it is simplest 
to find the complex conductivity as a function of the frequency of the field 
acting on the metal. 

As in calculating the electrical conductivity, we shall make use of Boltz- 
mann’s kinetic equation ($8.12), which in our case takes the form 


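$$\frac{\partial f_1}{\partial t} + \mathbf{v}\cdot\frac{\partial f_1}{\partial\mathbf{r}} + e(\boldsymbol{\mathcal{E}}\cdot\mathbf{v})\frac{\partial f_0}{\partial E} = -\frac{f_1}{\tau} . \qquad (60.1)$$

We have retained in it the term $\partial f_1/\partial\mathbf{r}$, but dropped the term with the
magnetic field (since it is small, of order $v/c$).
Suppose the external field varies according to

$$\boldsymbol{\mathcal{E}} = \boldsymbol{\mathcal{E}}_0\, e^{i(\mathbf{k}\cdot\mathbf{r} - \omega t)} .$$

Then it is natural to try to find the solution of eq. (60.1) in the form

$$f_1 = a(\mathbf{v}\cdot\boldsymbol{\mathcal{E}}_0)\, e^{i(\mathbf{k}\cdot\mathbf{r} - \omega t)} . \qquad (60.2)$$

Substituting into (60.1) gives

$$e\frac{\partial f_0}{\partial E}(\mathbf{v}\cdot\boldsymbol{\mathcal{E}}_0) = -a(\mathbf{v}\cdot\boldsymbol{\mathcal{E}}_0)\left[\frac{1}{\tau} - i\omega + i(\mathbf{k}\cdot\mathbf{v})\right] .$$

Hence

$$f_1 = -\frac{e(\mathbf{v}\cdot\boldsymbol{\mathcal{E}})\,\partial f_0/\partial E}{\tau^{-1} - i\omega + i(\mathbf{k}\cdot\mathbf{v})} . \qquad (60.3)$$

Since $k \sim \omega/c$ and $v \ll c$, the term arising from the derivative $\partial f_1/\partial\mathbf{r}$ is small
and one can write

$$f_1 \approx -\frac{e\tau(\mathbf{v}\cdot\boldsymbol{\mathcal{E}})\,\partial f_0/\partial E}{1 - i\omega\tau} . \qquad (60.4)$$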





We see that the correction to the distribution function turns out to be
complex. Substituting (60.4) into (58.1) and reproducing the calculations of §58,
it is easy to find for the conductivity the expression

$$\sigma(\omega) = \sigma(0)\,\frac{1 + i\omega\tau}{1 + \omega^2\tau^2} , \qquad (60.5)$$

where $\sigma(0)$ is the conductivity in a constant field. Since $\tau$ is known from the
electrical conductivity in a constant field, formula (60.5) gives a complete
quantitative description of the optical properties of a system of conduction
electrons. The formula for conductivity in a high-frequency field is valid in
the region of the skin depth, which is given by formula (30.4) of Part IV. It is
necessary that the skin depth $\delta$ be large in comparison with the mean free
path $\lambda_{\mathrm{ph}}$.
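
The frequency dependence of the conductivity can be checked with a few lines of code. The sketch below assumes the relaxation-time result $\sigma(\omega)/\sigma(0) = 1/(1 - i\omega\tau)$ and verifies that it coincides with the split form $(1 + i\omega\tau)/(1 + \omega^2\tau^2)$; the numerical values of $\omega$ and $\tau$ are assumed for illustration only, and the function names are hypothetical.

```python
def sigma_ratio(omega, tau):
    """sigma(omega)/sigma(0) in the relaxation-time model: 1/(1 - i*omega*tau)."""
    return 1.0 / (1.0 - 1j * omega * tau)


def sigma_ratio_split(omega, tau):
    """The same ratio with real and imaginary parts separated."""
    x = omega * tau
    return (1.0 + 1j * x) / (1.0 + x * x)


tau_assumed = 1.0e-14          # relaxation time in seconds (assumed value)
for omega in (0.0, 1.0e13, 2.0e14, 1.0e16):
    r = sigma_ratio(omega, tau_assumed)
    # The real part governs absorption; the imaginary part governs the phase lag.
    print(omega, r.real, r.imag)
```

For $\omega\tau \ll 1$ the ratio tends to unity, while for $\omega\tau \gg 1$ the real part falls off as $1/(\omega\tau)^2$.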


§61. The photoelectric effect 


When a metal is irradiated by light of a sufficiently high frequency,
photoelectrons are emitted from its surface. This phenomenon, called the
photoelectric effect, has been well investigated and has found well-known
practical applications. However, a complete and consistent theory of the
photoelectric effect has only recently* been developed. This theory serves as an
illustration of a certain general approach to the study of threshold phenomena
in quantum mechanics. These threshold phenomena are transitions at energies
close to the energy threshold of a given process (see §93 of Part V).

Let us consider a metal contiguous to an optically transparent medium.
Such a medium could be a solid dielectric, a solution or vacuum. When the
surface of the metal is irradiated by light of frequency $\omega$, a one-photon
transition with the emission of an electron from the metal can occur only for
$\omega > \omega_0$, where the threshold frequency $\omega_0$ is connected with the work
function $W$ by the obvious relation $\omega_0 = W/\hbar$. We shall confine ourselves to
frequencies $\omega$ close to $\omega_0$ and shall seek the probability of emission of
photoelectrons as a function of the frequency $\omega$ of the light.


* The treatment in this section is based on a study of A.M.Brodsky, Yu.Ya.Gurevich
and B.G.Levich, Sov. Phys. Solid State 40 (1970) 139.

The photoelectric effect in metals may in principle have two different 
mechanisms: 

(1) The so-called surface photoelectric effect in which photons collide with 
electrons in the surface layer of the metal. The surface layer is taken to be the 
region in which the potential energy varies from the value W inside the metal 
down to zero at its surface. An electron in the surface layer is in a field of 
force varying from point to point. This ensures fulfillment of the laws of con- 
servation of energy and momentum in a collision between an electron and a 
photon accompanied by the emission of an electron from the metal. 

(2) The volume photoelectric effect, taking place in the region of optical 
transparency of metals. This region usually lies in the ultraviolet part of the
spectrum. 

In the volume photoelectric effect the interaction of photons with elec- 
trons takes place inside the metal (in the region of constant potential energy). 
The role of the third body ensuring the fulfillment of conservation laws is in 
this case played by phonons or impurity atoms. 

For photon energies lower than 8—10 eV the metals are not optically 
transparent and only the surface photoelectric effect may take place. The 
alkali metals, which are optically transparent in the visible region are an ex- 
ception to this. 

We shall confine ourselves to a treatment of the surface photoelectric
effect. In this case the energy, $E$, of the photoelectrons emitted is small in
comparison with $\hbar\omega$, so that the photoelectrons can be considered to be slow
particles. We note that the energy of photoelectrons usually amounts to about
1 eV, so that it is still large compared to $kT$.

We shall assume that the metal occupies the half-space $-\infty < z < 0$ and
that it has a uniform surface. The electron moves in a potential well with
potential energy $U(z)$, which for sufficiently large values of $|z|$ goes over into a
strictly periodic field inside the crystal. The mean value of the potential
energy inside the metal is equal to


U(z) = —W=const. (61.1) 


In the region $0 < z < \delta$, i.e. in the surface layer of the medium adjacent to
the metal, the electron is in a complex, unknown, potential field.

We shall assume the width of the layer $\delta$ to be small, so that the following
inequality is fulfilled:

$$p\delta/\hbar \ll 1 , \qquad (61.2)$$

where $p$ is the momentum of an electron emitted in the direction of the
$z$-axis. In other words, we shall consider the electron wavelength $\lambda \gg \delta$. Finally,
for $z > \delta$ the potential energy in the medium can be written as


e2 


UD)=— srr 





(61.3) 


The first term has the meaning of the potential energy of an electron in a
medium at a large distance from the surface (we assume its energy in vacuum
to be equal to zero), and the second term represents the image force in a
medium with effective dielectric constant $\varepsilon_{\mathrm{eff}}$. Since the electron moves in
the medium with an energy considerably larger than the thermal energy, $\varepsilon_{\mathrm{eff}}$ is
not the same as the static dielectric constant but approaches rather the optical
dielectric constant.

Let us write the Schrödinger equation for an electron inside and outside a
metal. Since the metal is assumed to be uniform in the $xy$-plane, the
unperturbed wave function can be written in the form

$$\Psi = \psi_0(E, p_\parallel, z)\, e^{i\mathbf{p}_\parallel\cdot\boldsymbol{\rho}/\hbar} ,$$

where $\mathbf{p}_\parallel$ and $\boldsymbol{\rho}$ are the momentum and radius vector in the $xy$-plane, $E$ is the
energy of the electron, and $p$ is the momentum along the $z$-axis.
The unperturbed Schrödinger equation has the form

$$-\frac{\hbar^2}{2m}\frac{\partial^2\psi_0}{\partial z^2} + \left[U(z) - E + \frac{p_\parallel^2}{2m}\right]\psi_0 = 0 . \qquad (61.4)$$

When a radiation field, which can be characterized by the operator $U_A(z)\,e^{-i\omega t}$,
is applied, one can write in the first approximation of perturbation theory

$$\psi = \psi_0 + \psi' ,$$

where $\psi'$ satisfies the equation

$$\left[-\frac{\hbar^2}{2m}\frac{\partial^2}{\partial z^2} + U(z) - (E + \hbar\omega) + \frac{p_\parallel^2}{2m}\right]\psi' = -U_A(z)\,\psi_0 . \qquad (61.5)$$

Here $(E + \hbar\omega)$ is the energy of an electron which has absorbed a photon.
Making use of the energy conservation law





2 2 
E+h po tp, 
eerie © S 


we rewrite (61.5) in the form 
~~ 2-4 U@) + Vo | Y'=- UE) Vo - (61.6) 


Let us consider the solution of eq. (61.6) in the regions $z > \delta$ and $z < 0$. In
the first region, taking into account the rapid fall of $\psi_0$ outside the metal, the
term $U_A\psi_0$ on the right-hand side of (61.6) can be dropped. The energy
conservation law





2 
2 p 
p i] 
= ees A ay : 
2m 2m UE) —hw 
2 
Zia 2 
p i] e 
= + Ei Ei 
ma O EZ hw = const . 


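allows one to write (61.6) in the form

$$\left[-\frac{\hbar^2}{2m}\frac{\partial^2}{\partial z^2} - \frac{p^2}{2m} - \frac{e^2}{4\varepsilon_{\mathrm{eff}} z}\right]\psi'(z) = 0 . \qquad (61.7)$$

Eq. (61.7) is the same as the equation describing the motion of an electron
in a one-dimensional Coulomb field. The asymptotic expression we need for
$\psi'$ in the Coulomb field when $z \to \infty$, corresponding to a particle going off to
infinity, has the form

$$\psi' = \chi\big|_{z\to\infty} \sim \exp\left[i\left(pz/\hbar - \eta\ln(2pz/\hbar) + \delta_0\right)\right] , \qquad (61.8)$$

where

$$\eta = \frac{me^2}{4\varepsilon_{\mathrm{eff}}\hbar p} , \qquad (61.9)$$

$$\delta_0 = \arg\Gamma(1 - i\eta) . \qquad (61.10)$$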


When $z \to 0$, one can use for $\psi'$ formula (38.6) of Part V, setting


1—e-27™]} pz 
Heath sccm | me ae E£ 
v'~xt o [ = ] E. (61.11) 





Hence $\psi'(z)$ can be written in the form

$$\psi'(z) = C(p, \omega)\,\chi(pz/\hbar, \eta) , \qquad (61.12)$$

where $\chi$ is the Coulomb function with the asymptotic behaviour mentioned,
and $C(p, \omega)$ is a constant independent of $z$. To define it, (61.12) must be
matched with the solution inside the metal. As was done in §93 of Part V,
use must be made of the condition (61.2). Instead of carrying out the
matching for $z = \delta$, we shall match (61.12) with the solution inside the metal for
$z = 0$. The wave function of a particle with a wavelength $\lambda \gg \delta$ has no time to
change substantially over the length $\delta$. Examining the region $z < 0$, we note
that inside the metal the potential energy $U(z) + V_0$ is very large in comparison
with the quantity $p^2/2m$. Hence for $z < 0$ the term $p^2/2m$ in the Schrödinger
equation can be dropped. As a result the wave function inside the metal turns
out to be independent of the quantity $p$.

At the point $z = 0$ one has to write the condition for matching the
function (61.12), which depends in general on $p$, with a function independent of
$p$. It is clear that the condition can be fulfilled for all values of $p$, if for $z \to 0$
the $\psi'$ defined by (61.12) also does not depend on $p$.

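For this one has to set

$$C(p, \omega) = \left[\frac{2\pi\eta}{1 - e^{-2\pi\eta}}\right]^{1/2} f(\omega) , \qquad (61.13)$$

where $f(\omega)$ is a function independent of $p$. Thus for $z \to \infty$ we obtain

$$|\psi'|^2 = \frac{2\pi\eta}{1 - e^{-2\pi\eta}}\,|f(\omega)|^2 . \qquad (61.14)$$

Knowing the wave function, one can find the photoelectric current. Namely,
the probability current density is

$$j = \frac{i\hbar}{2m}\left(\psi'\nabla\psi'^{*} - \psi'^{*}\nabla\psi'\right) = \left(\frac{2E_0}{m}\right)^{1/2}|f(\omega)|^2\left[1 - \exp\left(-(E_0/E)^{1/2}\right)\right]^{-1} , \qquad (61.15)$$

where $E_0 = 33.5/\varepsilon_{\mathrm{eff}}^2$ eV.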
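
The quoted value of the characteristic energy can be checked numerically. The sketch below assumes that the Coulomb parameter has the form $\eta = me^2/(4\varepsilon_{\mathrm{eff}}\hbar p)$, appropriate to an image potential $-e^2/(4\varepsilon_{\mathrm{eff}} z)$, and that $E_0$ is defined through $2\pi\eta = (E_0/E)^{1/2}$; both are our reading of the formulae, and the function names are hypothetical.

```python
import math

# CGS constants
M_E = 9.109e-28         # electron mass, g
E_CHARGE = 4.803e-10    # elementary charge, esu
HBAR = 1.0546e-27       # erg s
ERG_PER_EV = 1.602e-12


def coulomb_parameter(energy_ev, eps_eff):
    """eta = m e^2 / (4 eps_eff hbar p), with p = sqrt(2 m E) (assumed form)."""
    p = math.sqrt(2.0 * M_E * energy_ev * ERG_PER_EV)
    return M_E * E_CHARGE ** 2 / (4.0 * eps_eff * HBAR * p)


def characteristic_energy_ev(eps_eff, energy_ev=1.0):
    """E_0 defined by 2*pi*eta = sqrt(E_0/E); the result does not depend on E."""
    return energy_ev * (2.0 * math.pi * coulomb_parameter(energy_ev, eps_eff)) ** 2


print(characteristic_energy_ev(1.0))    # emission into vacuum
print(characteristic_energy_ev(10.0))   # medium with eps_eff ~ 10
```

With these assumptions $E_0$ equals $\pi^2/4$ times the Rydberg energy divided by $\varepsilon_{\mathrm{eff}}^2$, which reproduces the coefficient of about 33.5 eV. For $\varepsilon_{\mathrm{eff}} \sim 10$ it drops to a fraction of an electron volt.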
The total photoelectric current from unit surface of the metal is given by 
the formula 


T=e | if) P(E, Py) 0p) dE dp, = 













2E\+* > 7 EI 1 
= e= (E—p)/kT)-1 _e—Eo/E): n 
e (25) J {ite uUe ] dEX 


[2m(E+ V—ħw)) } 


x f p(E, pı) 27lp;l dip yl (61.16) 
0 


where $f(E)$ is the Fermi distribution, and $\rho(E, p_\parallel)$ is the number of states
with given $E$ and $p_\parallel$. The chemical potential $\mu$ is connected with the work
function and the potential $V_0$ by the obvious relation $\mu = -(W - V_0)$.

Assuming $\rho(E, p_\parallel)$ to be a slowly varying function and taking it out of the
integral sign, we have

co 


T= p(E,pye f KE)S(EAE {27 \p\l dlpyl - (61.17) 


w 


Calculations for the two cases of emission of electrons into a vacuum and
into a medium lead to the following results.
In the first case the energy of the photoelectrons is $E \ll E_0$, and

$$I \sim (\omega - \omega_0)^2 . \qquad (61.18)$$

This law for the dependence of photoelectric current on frequency (Fowler's
law) is in good agreement with experiment. On the other hand, when electrons
are emitted into a dielectric or into a solution of an electrolyte for which
$\varepsilon_{\mathrm{eff}} \sim 10$, $E \gg E_0$ and $j \sim E^{1/2}$. In this case

$$I \sim (\omega - \omega_0)^{5/2} . \qquad (61.19)$$

This frequency dependence has been well confirmed by experimental data
for the photocurrent at a metal-solution interface.
§62. The mean free path of electrons in metals 


As in §27, in the approximation (57.11) one can introduce the concept of
the mean free path $\lambda_{\mathrm{el}}$,

$$\lambda_{\mathrm{el}} = \frac{v_{\mathrm{el}}}{W} ,$$

where $v_{\mathrm{el}}$ is the mean velocity of the electron, and $W$ is the probability of its
collision (per unit time) with the scatterer. The quantities $v_{\mathrm{el}}$ and $W$ have a
direct meaning. Of course, the mean free path cannot be understood literally
as the distance between successive collisions.

As has already been mentioned above, in a metal collisions of electrons
with phonons and with impurities in the lattice may occur. We shall denote
the corresponding mean free paths by $\lambda_{\mathrm{ph}}$ and $\lambda_{\mathrm{imp}}$. It is obvious that the
electrical conductivity and other kinetic quantities are determined by the
value of the smaller of these two mean free paths (i.e. by the more probable
collision). As will be seen from subsequent calculations, this mean free path
is usually $\lambda_{\mathrm{ph}}$. Hence we shall restrict our considerations to a calculation of
$\lambda_{\mathrm{ph}}$.

The total probability for an electron to undergo a collision with a phonon
is, obviously, defined by the sum of the probabilities $W = W_+ + W_-$, i.e. by the
sum of the probabilities of collision with the emission and the absorption of a
phonon. The quantities $W_+$ and $W_-$ are given by formulae (56.4) and (56.5).
For an actual calculation of $W$ it is necessary to make the following
simplifying assumptions concerning the quantities $\omega_f$ and $E(\mathbf{k})$ involved in these
formulae:

(1) We shall assume that for all frequencies

$$\omega_f = cf , \qquad (62.1)$$

where $c$ is the velocity of sound, independent of $f$. This assumption cannot
always be justified, particularly for high temperatures. However, it does not
lead to any substantial error in the final result even in those cases where
relation (62.1) is not accurately fulfilled.

(2) We shall assume that $E(|\mathbf{k}|)$ is a quadratic function of $|\mathbf{k}|$,

$$E(|\mathbf{k}|) = \alpha k^2 , \qquad (62.2)$$

where $\alpha = \hbar^2/2m^*$. This assumption is valid for electrons moving in an almost
filled or an almost empty band, i.e. for strongly bound and for nearly free
electrons.

Let us first consider the case of temperatures which are high in comparison
with the Debye temperature $\Theta_c$. Then in formulae (56.4) and (56.5) one can
set

$$n_f \approx n_f + 1 \approx \frac{kT}{\hbar c f} \gg 1 . \qquad (62.3)$$

Indeed, the number of phonons is defined by the Bose distribution

$$n_f = [\exp(\hbar c f/kT) - 1]^{-1} .$$

For high temperatures the exponential may be expanded in a series. We then
get formula (62.3). For the probability $W$ we now obtain
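
The quality of the high-temperature approximation for the phonon occupation number is easy to check. The sketch below compares the Bose distribution with its leading term $kT/\hbar cf$ as a function of the dimensionless ratio $x = \hbar cf/kT$; the function names are hypothetical.

```python
import math


def bose_occupation(x):
    """Exact phonon occupation n_f for x = hbar*c*f/(k*T)."""
    return 1.0 / math.expm1(x)


def high_temperature_occupation(x):
    """Leading term of the expansion for x << 1: n_f ~ kT/(hbar*c*f) = 1/x."""
    return 1.0 / x


for x in (0.01, 0.1, 0.5, 2.0):
    exact = bose_occupation(x)
    approx = high_temperature_occupation(x)
    # The relative error of the leading term is about x/2 for small x.
    print(x, exact, approx, (approx - exact) / exact)
```

For $x \ll 1$ the two expressions agree to within a fraction $x/2$, and $n_f + 1$ may be replaced by $n_f$. For $x \gtrsim 1$ the approximation fails.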


2n 2g°(kT) 
w= > ô(E E, + hef) + 
h call a 
+ô(Ek+f— E,—hef)| E Lr (62.4) 


A 


We shall carry out in detail the transformations with the delta-functions,
which are of methodological interest.
Let us consider the first integral

$$I_1 = \int \delta(E_{\mathbf{k}-\mathbf{f}} - E_{\mathbf{k}} + \hbar c f)\, d\mathbf{f} . \qquad (62.5)$$

In a metal at high temperatures, because of the degeneracy of states, actually
only electrons in the spread-out region of the Fermi distribution are mobile.
The energy of the electrons is $E_{\mathrm{el}} \sim E_{\max} \gg kT$, where $E_{\max}$ is the Fermi
level; the wave numbers of these electrons are $k \sim k_{\max} = \pi/a$.

All phonons up to those with wave number $f = f_{\max} = \pi/a$, the latter being
represented in the largest number, are excited in the lattice. The energy of the
phonons is $\hbar\omega_f < \hbar(\omega_f)_{\max} \sim kT$. Thus the wave vectors of the electrons and
the phonons are quantities of the same order. On the contrary, the energy of
the phonons is much smaller than that of the electrons. When an electron
collides with a phonon the electron energy changes relatively little, while its
wave vector changes considerably. This means that a considerable change in
the direction of motion of the electron takes place in each collision.
Inequality (62.3) is automatically fulfilled for all phonons; it is weaker than the
inequality $f < f_{\max}$.

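Passing to the calculation of $I_1$, we write

$$I_1 = \int \delta\{\alpha(\mathbf{k} - \mathbf{f})^2 - \alpha k^2 + \hbar c f\}\, f^2\, df \sin\theta\, d\theta\, d\varphi = 2\pi\int \delta(-2\alpha k f\cos\theta + \alpha f^2 + \hbar c f)\, f^2\, df \sin\theta\, d\theta . \qquad (62.5')$$

We carry out the integration over the angle by introducing the new
variable $u = -2\alpha k f\cos\theta + \alpha f^2 + \hbar c f$. Then

$$I_1 = \frac{\pi}{\alpha k}\int f\, df \int_{u_1}^{u_2}\delta(u)\, du ,$$

where
$u_1 = -2\alpha k f + \alpha f^2 + \hbar c f$ (which corresponds to $\theta = 0$),
$u_2 = 2\alpha k f + \alpha f^2 + \hbar c f$ (which corresponds to $\theta = \pi$).

It is obvious that

$$\int_{u_1}^{u_2}\delta(u)\, du = \begin{cases} 0, & \text{if } u_1 \text{ and } u_2 \text{ have the same sign}, \\ 1, & \text{if } u_1 \text{ and } u_2 \text{ have different signs}. \end{cases}$$

The value of $u_2$ is essentially positive. Hence the value of the integral
$I_1 \neq 0$, if

$$u_1 = -2\alpha k f + \alpha f^2 + \hbar c f < 0$$

or

$$f < 2k - (\hbar c/\alpha) . \qquad (62.6)$$

Since $k \approx k_{\max} = \pi/a$ is always true, inequality (62.6) does not impose any
restrictions upon the integration over $f$. By virtue of this we obtain

$$I_1 = \frac{\pi}{\alpha k}\int_0^{f_{\max}} f\, df = \frac{\pi f_{\max}^2}{2\alpha k} = \frac{\pi m^* f_{\max}^2}{\hbar^2 k} .$$

Exactly the same calculation gives for the integral $I_2$

$$I_2 = \int \delta(E_{\mathbf{k}+\mathbf{f}} - E_{\mathbf{k}} - \hbar c f)\, d\mathbf{f} = I_1 .$$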
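
The delta-function integration can be mimicked numerically. The sketch below assumes the reduced form $I_1 = (\pi/\alpha k)\int f\,df\int\delta(u)\,du$, in which the inner integral is 1 when $u$ changes sign between $\theta = 0$ and $\theta = \pi$ and 0 otherwise. It works in dimensionless units with assumed parameter values, and the function names are hypothetical.

```python
import math


def delta_angle_integral(alpha, k, f, hbar_c):
    """Integral of delta(u) du between u(theta=0) and u(theta=pi): 1 or 0."""
    u1 = -2.0 * alpha * k * f + alpha * f * f + hbar_c * f   # theta = 0
    u2 = 2.0 * alpha * k * f + alpha * f * f + hbar_c * f    # theta = pi
    return 1.0 if u1 * u2 < 0.0 else 0.0


def i1_numeric(alpha, k, f_max, hbar_c, steps=100000):
    """Midpoint-rule evaluation of (pi/(alpha k)) * int_0^f_max f * [0 or 1] df."""
    h = f_max / steps
    total = 0.0
    for i in range(steps):
        f = (i + 0.5) * h
        total += f * delta_angle_integral(alpha, k, f, hbar_c) * h
    return math.pi / (alpha * k) * total


def i1_closed(alpha, k, f_max):
    """Closed form pi f_max^2 / (2 alpha k), valid when (62.6) holds for all f."""
    return math.pi * f_max ** 2 / (2.0 * alpha * k)


# Assumed dimensionless values: k = f_max, slow sound (hbar_c << alpha * k).
print(i1_numeric(1.0, 1.0, 1.0, 0.01), i1_closed(1.0, 1.0, 1.0))
# If k were much smaller than f_max, condition (62.6) would cut the integral off.
print(i1_numeric(1.0, 0.3, 1.0, 0.01))
```

With $k \approx f_{\max}$ the restriction (62.6) never becomes active and the numerical value agrees with the closed form. The second call shows how the cut-off at $f = 2k - \hbar c/\alpha$ would reduce the integral for a hypothetical small $k$.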


Substituting the values of $I_1$ and $I_2$ into the expression for $W$, we find


A 8n°g (KT) Vm f2 x 


“OMN(2T)? hke? 


We introduce the electron velocity $v = \hbar k/m^*$, substitute $f_{\max} = \pi/a$
and set the volume of the crystal equal to $Na^3$. For the mean free path we
then obtain the expression


Ve 9Mc*v2n? 
Aa Sima” (62.7) 
P W ng-(kT) a 


It is useful to estimate numerically the quantities involved in $\lambda_{\mathrm{ph}}$.

The velocity of sound $c$ is usually of the order of $2\times10^5$ cm/sec. The
energy of an electron near the Fermi surface is $E_{\max} \sim p_{\max}v_{\mathrm{el}} \sim (\pi\hbar/a)\,v_{\mathrm{el}} \approx$
2–3 eV. Hence $v_{\mathrm{el}}$ is of the order of $10^8$ cm/sec. The quantity $g$, defined by
formula (55.12), is in order of magnitude the same as the kinetic energy of a
nearly free electron. But the energy of nearly free electrons is close to the
Fermi energy (since mobile electrons are in the spread-out region of the
Fermi distribution), so that $g \sim$ 2–3 eV. Finally, for metals of medium
atomic weight $Mc^2 \sim 1$ eV. Consequently, in order of magnitude





7 3Mc2 (=) 1 Ave, Shva Se (62.8) 


IT A TIR KA Ye 
For a Fermi energy $E_{\max} \sim 2$ eV and room temperature ($kT = 0.025$ eV)

$$\lambda_{\mathrm{ph}} = 200\text{–}250\, a .$$

This value of $\lambda_{\mathrm{ph}}$ is in good agreement with experimental data.

As is seen from (62.7), $\lambda_{\mathrm{ph}} \sim 1/T$. Other quantities involved in (62.8) are
in practice constant for a given metal.

Let us now consider the case of low temperatures $kT \ll \Theta_c$. Then the
energy of excited phonons is $\hbar\omega \sim kT$, and their wave numbers are

$$f \approx \frac{\hbar\omega}{\hbar c} \sim \frac{kT}{\hbar c} \ll f_{\max} = \frac{\pi}{a} .$$

The energy of the electrons lies near the Fermi surface, i.e. is, as before, of
the order of $E \sim E_{\max}$. The wave numbers of the electrons are $k \sim k_{\max} \sim \pi/a$.







Since $k \gg f$, the energy conservation law for the collision between an
electron and a phonon can be written in the form

$$\frac{\hbar^2}{2m^*}\left[(\mathbf{k} + \mathbf{f})^2 - k^2\right] - \hbar\omega \approx \frac{\hbar^2}{m^*}\,kf\cos\theta - \hbar\omega = 0 .$$

Hence it follows that

$$\cos\theta = \frac{c}{\hbar k/m^*} = \frac{c}{v_{\mathrm{el}}} \ll 1 ,$$

i.e. $\theta \approx \tfrac12\pi$. This means that the phonon is emitted perpendicular to the
direction of motion of the electron. The latter is deflected through a small angle
$\vartheta$, the angle between the vector $\mathbf{k}$ and the vector $\mathbf{k} + \mathbf{f}$. In order of
magnitude the angle $\vartheta$ is equal to

$$\vartheta \sim \frac{f}{k} \sim \frac{kT}{\hbar c f_{\max}} = \frac{kT}{\hbar\omega_{\max}} = \frac{T}{\Theta_c} .$$

The change of the electron wave vector in each collision is of the order of

$$\Delta k \sim k\cos\vartheta - k \sim \tfrac12 k\vartheta^2 \sim k\left(\frac{T}{\Theta_c}\right)^2 .$$

Since each collision with a phonon leads to only a small change in the
direction of motion, the electron must undergo a large number of collisions $P$
to be scattered through a large angle. $P$ is defined by the relation

$$P\cdot\Delta k \sim k$$

or

$$P \sim (\Theta_c/T)^2 . \qquad (62.9)$$


Let us now find the probability of scattering with a small change in the en- 
ergy and momentum of an electron colliding with a phonon. The probability 
of collision with the absorption of a phonon is found very simply 


4ng2V V 


nf? f? dfsin@ d0 0 dy _ 
W= 9MN JZ ay OO ħwp— Jira A 


(27)? 










Sr2e2v p nP af 2 
| EUS RUA f A2 y » 
9MNe J (2m è ( Taea no) sin 0 dd 


fmax 











I if Peay e l aata Nefmax/kT 2247 
T OMNch?k ecHkT 1 n(ħe)? 9MNch?k 0 e-1 
Since

$$\frac{\hbar c f_{\max}}{kT} = \frac{\hbar\omega_{\max}}{kT} = \frac{\Theta_c}{T} \gg 1 ,$$

the upper limit of the integral can be replaced by infinity. Then





_ 1 Adi (SZ) if z2dz 
Z 9m Meh?k \he e1’ 


This last integral is calculated in Appendix IV (vol. 2):

$$\int_0^\infty \frac{z^2\, dz}{e^z - 1} = 2\zeta(3) \approx 2.40 .$$
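
The value of the integral $\int_0^\infty z^2\,dz/(e^z - 1)$ can be confirmed independently of the appendix. The sketch below uses the term-by-term expansion $1/(e^z - 1) = \sum_n e^{-nz}$, which gives $\sum_n 2/n^3 = 2\zeta(3)$, and cross-checks it by direct quadrature; the cut-off and step count are assumed numerical parameters, and the function names are hypothetical.

```python
import math


def bose_integral_series(terms=100000):
    """Sum of 2/n^3: each term is the integral of z^2 exp(-n z) from 0 to infinity."""
    return sum(2.0 / n ** 3 for n in range(1, terms + 1))


def bose_integral_quadrature(z_max=40.0, steps=200000):
    """Midpoint rule for the integral of z^2/(e^z - 1) from 0 to z_max."""
    h = z_max / steps
    total = 0.0
    for i in range(steps):
        z = (i + 0.5) * h
        total += z * z / math.expm1(z) * h
    return total


print(bose_integral_series())
print(bose_integral_quadrature())
```

Both evaluations give approximately 2.404, so replacing the upper limit $\Theta_c/T$ by infinity turns the integral into a pure number and leaves the temperature dependence of the probability entirely in the prefactor.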


Hence 





w m EAE E)E ( kT Nee cx 
T  QugMhitet 9 \Mc?/ Bde \Cfmax/ M 


SOE 
9 \Me2?/ \hva/ \hcfmax/ 4 a3 GNON 


In calculating the probability of transition with the emission of a phonon 
it is necessary to take into account the fact that for the electron a transition 
into an occupied state is forbidden. This means that the electron cannot emit 
phonons with an energy exceeding the width of the spread-out region of the 
Fermi distribution. 

If $\hbar\omega_f = \hbar c f > kT$, then, by emitting the corresponding phonon, the
electron would be obliged to pass into an occupied state. Hence only the emission
of phonons with wave numbers $f < kT/\hbar c$ is possible. Taking this fact into
account, we have








+ 


4ng2V JE df sin 8 dé dy 5 (tar cose 


nef) zW. (62.10) 
9MN(27)3 c ehickT _ | É 


m 


The mean free path of an electron before the first collision is 


v v, e,)\3 
a) _ Yel (=) (=) 3 
oh w~ iNe rT): (62.11) 


Besides the mean free path between two collisions, one can introduce the
transport mean free path $\lambda_{\mathrm{tr}}$, which takes into account the effectiveness of the
collisions. The corresponding cross section is defined as

$$\sigma_{\mathrm{tr}} = \int \sigma(1 - \cos\vartheta)\, d\Omega \approx \int \tfrac12\sigma\vartheta^2\, d\Omega .$$

The transport mean free path, i.e. the mean free path before an effective
change of momentum, is defined in classical kinetics as $\lambda_{\mathrm{tr}} = 1/N\sigma_{\mathrm{tr}}$. In our
approximation we introduce the quantity $\lambda_{\mathrm{tr}}$, defining it by the relation


Vel ave (O-\> 
dy slp el (Ze) ; (62.12) 


It increases very rapidly, as $1/T^5$, with decreasing temperature. At very low
temperatures, of the order of a few degrees, $\lambda_{\mathrm{tr}}$ is very large compared to $a$
and reaches macroscopic values.
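
The origin of the $1/T^5$ law can be made explicit with a short scaling sketch. It assumes only the proportionalities stated in the text: the mean free path before the first collision varies as $T^{-3}$, and the number of collisions needed for a large-angle deflection varies as $(\Theta_c/T)^2$. The numerical temperatures are illustrative assumptions, and the function names are hypothetical.

```python
def collisions_for_large_angle(theta_debye, temperature):
    """P ~ (Theta/T)^2: small-angle collisions needed to turn the electron."""
    return (theta_debye / temperature) ** 2


def transport_path_ratio(temperature, reference_temperature):
    """lambda_tr(T)/lambda_tr(T_ref), assuming lambda_tr ~ P * lambda^(1) ~ T^-5."""
    first_collision = (reference_temperature / temperature) ** 3   # lambda^(1) ~ T^-3
    extra_collisions = (reference_temperature / temperature) ** 2  # P ~ T^-2
    return first_collision * extra_collisions


theta_assumed = 300.0  # Debye temperature in kelvin (assumed value)
for T in (30.0, 10.0, 3.0):
    print(T, collisions_for_large_angle(theta_assumed, T),
          transport_path_ratio(T, 30.0))
```

Halving the temperature lengthens the transport mean free path by a factor of 32, which is why it reaches macroscopic values at a few degrees.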

Knowing the mean free path, one can find the electrical conductivity by 
formula (58.8). 

The electrical conductivity of metals o is expressed by the formulae 


3 ehon” 2 9ne?Mc hwn 





o for high temperatures , (62.13 
mv nmmg*(kT) a p ) 
ne?hħMc?va (O, \> 

o = A ——_—— |=] for low temperatures , (62.14) 

mene NET. 


where A is a numerical coefficient. 

All quantities involved in (62.13) and (62.14) are known and can be found
from independent measurements. Agreement between theory and experiment
turns out to be good. This is somewhat unexpected, taking into account the
large number of simplifications made in the calculations and the fact that the
effect of collisions between electrons was neglected. It should be noted that
for $T \to 0$ the electrical conductivity of real (non-superconducting) metals
does not increase up to infinitely large values but tends to a constant limit
independent of temperature. This is the so-called residual resistance, due to the
scattering of electrons by impurities and non-uniformities. Its value is
determined by the concentration of impurities and is obtained by substituting
$\lambda_{\mathrm{imp}}$ for $\lambda_{\mathrm{ph}}$ in the formula for $\sigma$.

Knowing the mean free path, one can find the electronic thermal conduc- 
tivity of a metal. For high temperatures it is expressed by the formula 


>] 
of z= n ~ const (62.15) 
max 


a ies 
Ko ~ VeA phy ~A 


and does not depend on the temperature. 

It is obvious that for $T \gg \Theta_c$ the so-called Wiedemann-Franz law $T\sigma/\kappa =$
const holds. For low temperatures the thermal conductivity is determined by
the mean free path $\lambda_{\mathrm{ph}}^{(1)}$. Indeed, for a single collision with a phonon the
energy of an electron changes by an amount of the order of the phonon energy,
i.e. of the order of $kT$. This change of energy corresponds to the possible
amount of energy transport by electrons. The transport of large energies is
impossible since the electron cannot give up an energy considerably
exceeding $kT$. Hence the mechanism of energy transport by electrons amounts to
transfers of energies of the order of $kT$ in each collision with a phonon, which
corresponds to deflections of the electron through small angles.

The thermal conductivity of electrons for $kT \ll \Theta_c$ is expressed by the
formula





~ rel, 0) AL 
Ka © Aon Cyr Ni 


el’ p Emax el 


no~ D d (62.16) 
T2 

where $\lambda_{\mathrm{ph}}^{(1)}$ is the mean free path before the first collision, which is given by
(62.11).

It is clear that in this case the Wiedemann-Franz law does not hold. The
ratio of electrical conductivity to thermal conductivity turns out to depend
on the temperature. We cannot dwell in this book on other phenomena
associated with the electron gas in metals, and refer the reader to the specialist
literature*.


*See footnote on p. 326. 





§63. The collision integral for electrons in a metal 


We now pass to a logical treatment of the kinetic equation and to the 
proof of the existence of a mean free path for electrons in a metal. For this 
we must first of all obtain the expression for the collision integral. We shall 
take into account only collisions between electrons and phonons. 

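Let us consider a certain state $\mathbf{k}$ of an electron. The electron may leave the
given state $\mathbf{k}$ in two ways: by absorbing a phonon $\mathbf{f}$ or by emitting a phonon
$\mathbf{f}$, i.e.

$$\mathbf{k} \to \mathbf{k} + \mathbf{f} \quad\text{or}\quad \mathbf{k} \to \mathbf{k} - \mathbf{f} .$$

The number of electrons leaving the state $\mathbf{k}$ in unit time by emitting a phonon
is equal to

$$\Phi(\mathbf{k})\,(1 - \Phi(\mathbf{k} - \mathbf{f}))\, dW_+(\mathbf{k} \to \mathbf{k} - \mathbf{f}) ,$$

where $dW_+(\mathbf{k} \to \mathbf{k} - \mathbf{f})$ is given by formula (56.5), and the factor $(1 - \Phi(\mathbf{k} - \mathbf{f}))$
is introduced to take into account the Pauli principle. Because of this factor
no electrons pass into the occupied state for which $\Phi(\mathbf{k} - \mathbf{f}) = 1$.

The number of electrons leaving the state $\mathbf{k}$ by absorbing a phonon is equal
to

$$\Phi(\mathbf{k})\,(1 - \Phi(\mathbf{k} + \mathbf{f}))\, dW_-(\mathbf{k} \to \mathbf{k} + \mathbf{f}) ,$$

where $dW_-(\mathbf{k} \to \mathbf{k} + \mathbf{f})$ is defined by (56.4). The total number of electrons
leaving the state $\mathbf{k}$ per second is obtained by integrating the sum of the above
expressions over all possible values of $\mathbf{f}$, i.e.

$$I_- = \int\{\Phi(\mathbf{k})\,(1 - \Phi(\mathbf{k} - \mathbf{f}))\, dW_+(\mathbf{k} \to \mathbf{k} - \mathbf{f}) + \Phi(\mathbf{k})\,(1 - \Phi(\mathbf{k} + \mathbf{f}))\, dW_-(\mathbf{k} \to \mathbf{k} + \mathbf{f})\} .$$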


* See, for example, R.Peierls, Quantum theory of solids (Clarendon Press, Oxford,
1966); A.H.Wilson, The theory of metals (Cambridge University Press, 1953);
A.Sommerfeld and H.Bethe, Handbuch der Physik, Vol. 24/2 (Springer, Berlin, 1933);
C.Kittel, Quantum theory of solids (Wiley, New York, London, 1969); J.M.Ziman,
Principles of the theory of solids (Cambridge University Press, Cambridge, 1964).




The integration is carried out over all values of |f| and over all orientations of 
the vector f. 

Let us write down the number of electrons arriving per unit time into the
state $\mathbf{k}$ as a result of the emission and absorption of phonons by electrons in
other states. Reasoning analogous to the above gives

$$I_+ = \int\{\Phi(\mathbf{k} + \mathbf{f})\,(1 - \Phi(\mathbf{k}))\, dW_+(\mathbf{k} + \mathbf{f} \to \mathbf{k}) + \Phi(\mathbf{k} - \mathbf{f})\,(1 - \Phi(\mathbf{k}))\, dW_-(\mathbf{k} - \mathbf{f} \to \mathbf{k})\} .$$

Setting up the balance of electrons, we can write

$$I_{\mathrm{coll}} = I_+ - I_- .$$

Substituting the values of $I_+$ and $I_-$ and of the transition probabilities $dW_+$
and $dW_-$ involved in them, according to the formulae (56.4) and (56.5) we
can write the integral $I_{\mathrm{coll}}$ in the form


Teon = an z (2 ) ev f EUo 1) O(k + H) (1 — (kK) (Ex yp 
— By — hess) +n (1 — (kK) Pk = f) 5(Eq_p— Er + wp} — 
—{(np+ 1) (k) (1 — ®(k— f)) EEk- Ey_¢— hwy) + 

df 
+ nP(k) (1 — ®(k +A) Ey Ekart ho 


Combining processes with the same delta-functions and substituting $I_{\mathrm{coll}}$ into
the kinetic equation, we obtain


ab, ab, F db_4nV 
par ROOK ah (i ii) s of 


X {[O(k + f) (1 — PK) (np 1) — 0K) (1 — &(k + f) nd X 


X (Epe Ek- hes) + [nfl — 0(k)) (k-i) 










df 


—(ngt DIKA -Pk ~ f))] 5 Ey + hop} z. (63.1) 
(27) 


Here in combining the terms we have made use of the even nature of the
delta-function

$$\delta(E_{\mathbf{k}+\mathbf{f}} - E_{\mathbf{k}} - \hbar\omega_f) = \delta(E_{\mathbf{k}} - E_{\mathbf{k}+\mathbf{f}} + \hbar\omega_f) .$$

The kinetic equation represents the quantum analogue of eq. (57.8).
Comparison with the classical equation (57.8) shows that: (1) the transition
probability containing the unknown cross section is replaced by known relations for
$dW_+$ and $dW_-$; (2) the Pauli principle is taken into account by introducing
factors of the type $(1 - \Phi(\mathbf{k}))$ into the number of transitions, while the
factors $n_f$ and $(1 + n_f)$ take into account the statistical weights of states of the
system of phonons.

Let us first of all convince ourselves of the fact that in an equilibrium
state the distribution function of electrons $\Phi(\mathbf{k}) = \Phi_0$ represents a
Fermi-Dirac distribution, and that the distribution function of phonons $n_f = n_f^0$
represents a Bose-Einstein distribution. In an equilibrium state $\partial\Phi/\partial t = 0$,
$\partial\Phi/\partial\mathbf{r} = 0$, $\mathbf{F} = 0$, so that

$$I_{\mathrm{coll}} = 0 . \qquad (63.2)$$

It is easily seen that condition (63.2) is fulfilled if each of the brackets in the
integrand of (63.1) reduces to zero, i.e.

$$\Phi_0(\mathbf{k} + \mathbf{f})\,(1 - \Phi_0(\mathbf{k}))\,(n_f + 1) - \Phi_0(\mathbf{k})\,(1 - \Phi_0(\mathbf{k} + \mathbf{f}))\, n_f = 0 , \qquad E(\mathbf{k} + \mathbf{f}) = E(\mathbf{k}) + \hbar\omega_f , \qquad (63.3)$$

$$\Phi_0(\mathbf{k} - \mathbf{f})\,(1 - \Phi_0(\mathbf{k}))\, n_f - \Phi_0(\mathbf{k})\,(1 - \Phi_0(\mathbf{k} - \mathbf{f}))\,(n_f + 1) = 0 , \qquad E(\mathbf{k} - \mathbf{f}) = E(\mathbf{k}) - \hbar\omega_f . \qquad (63.4)$$

The solution of the functional equations (63.3) and (63.4) with the
subsidiary relations expressing energy conservation is carried out according to the
usual scheme. One then obtains

$$\Phi_0 = \frac{1}{e^{(E - \mu)/kT} + 1} , \qquad n_f^0 = \frac{1}{e^{\hbar\omega_f/kT} - 1} .$$
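
That the Fermi-Dirac and Bose-Einstein distributions make each bracket of the collision integral vanish can be verified numerically. The sketch below evaluates the left-hand sides of the balance conditions (63.3) and (63.4) for arbitrary assumed energies (in electron volts); the function names and the numerical values are illustrative, not taken from the text.

```python
import math


def fermi(energy, mu, kT):
    """Fermi-Dirac occupation of an electron state."""
    return 1.0 / (math.exp((energy - mu) / kT) + 1.0)


def bose(hw, kT):
    """Bose-Einstein occupation of a phonon with energy hw."""
    return 1.0 / math.expm1(hw / kT)


def balance_absorption(energy, hw, mu, kT):
    """Left-hand side of (63.3), with E(k+f) = E(k) + hw."""
    n = bose(hw, kT)
    p_k, p_kf = fermi(energy, mu, kT), fermi(energy + hw, mu, kT)
    return p_kf * (1.0 - p_k) * (n + 1.0) - p_k * (1.0 - p_kf) * n


def balance_emission(energy, hw, mu, kT):
    """Left-hand side of (63.4), with E(k-f) = E(k) - hw."""
    n = bose(hw, kT)
    p_k, p_kf = fermi(energy, mu, kT), fermi(energy - hw, mu, kT)
    return p_kf * (1.0 - p_k) * n - p_k * (1.0 - p_kf) * (n + 1.0)


# Assumed values: mu = 2 eV, kT = 0.025 eV, phonon energy 0.01 eV.
for energy in (1.95, 2.0, 2.03):
    print(energy, balance_absorption(energy, 0.01, 2.0, 0.025),
          balance_emission(energy, 0.01, 2.0, 0.025))
```

Both expressions vanish to rounding accuracy for any electron energy, which is the detailed-balance property used in the text.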







Knowing the equilibrium distribution functions of electrons and phonons 
and assuming the departure from the equilibrium state to be small, one can 
solve the integro-differential equation (63.1). 


§64. A solution of the kinetic equation 


In what follows we shall confine ourselves to the calculation of the
electrical conductivity of a metal for high temperatures $T \gg \Theta_c$, where $\Theta_c$ is the
Debye temperature. In a time-independent uniform electric field the kinetic
equation can be written as


e€ a 


Tope (64.1) 
We shall seek its solution in the form

$$\Phi = \Phi_0 + \Phi' , \qquad (64.2)$$

where the correction to the distribution function is equal to

$$\Phi'(\mathbf{k}) = \Phi_1(|\mathbf{k}|)\, k_{\mathcal{E}} = \Phi_1(|\mathbf{k}|)\, k\cos\vartheta ; \qquad (64.3)$$

$\vartheta$ is the angle between the directions of the vector $\mathbf{k}$ and of the field $\boldsymbol{\mathcal{E}}$.
In calculating the integral it is necessary to make certain assumptions
about the form of the functions $E(\mathbf{k})$ and $\omega_f$. We shall assume that $\omega_f = cf$,
where $c$ is the velocity of sound.
Further, we shall assume that the electron energy is of the form

$$E = \alpha k^2 , \qquad (64.4)$$

as in the case of nearly free electrons and strongly bound electrons near the
boundary of a zone.

We transform the collision integral under these simplifying assumptions,
substituting into it œ as defined by formula (47.16). In carrying out the in- 
tegration we introduce polar coordinates with the polar axis directed along 
the vector k. 

Let us write the collision integral. In order to abbreviate the formulae we 
shall confine ourselves to its first part which contains the delta-function cor- 
responding to the emission process 


Í 
f 
f 










T coll 


4k ( hi 
MN 


= MOT ia) gV +h), 
ULT 


maf L leyk +f) + OK + D] [1 9k) — PK )M(ry+ 1) - 


— [Po(k) + &'(k)] ] 1 — Pok + f)— &'(k + f)] ny} X 
X ô(Ek+f— Ek - hopp df sin 8 dé dy. (64.5) 


Further we shall make use of the equilibrium conditions (63.3) and (63.4),
and shall disregard the products of the functions $\Phi'(\mathbf{k}+\mathbf{f})\,\Phi'(\mathbf{k})$.
We shall substitute into the delta-function the expression (64.4) for the energy


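$$I = \int \bigl\{\Phi'(\mathbf{k}+\mathbf{f})\,[1-\Phi_0(\mathbf{k})]\,(n_f+1) - \Phi'(\mathbf{k})\,\Phi_0(\mathbf{k}+\mathbf{f})\,(n_f+1) - \Phi'(\mathbf{k})\,[1-\Phi_0(\mathbf{k}+\mathbf{f})]\,n_f + \Phi'(\mathbf{k}+\mathbf{f})\,\Phi_0(\mathbf{k})\,n_f\bigr\}$$
$$\times\ \delta(2\alpha kf\cos\theta + \alpha f^2 - \hbar cf)\,\sin\theta\,d\theta\,d\varphi\,df .$$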


In what follows we shall restrict ourselves to the case of high temperatures.
For high temperatures $n_f \gg 1$. Under this assumption one can make the
substitution
$$n_f + 1 \approx n_f, \qquad n_f = kT/\hbar cf .$$
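
The quality of this substitution is easy to check numerically. The following is a minimal sketch, not part of the original derivation: it assumes the Planck form $n_f = 1/(e^{x}-1)$ with $x = \hbar cf/kT$, and the function names are illustrative only.

```python
import math

def bose_occupation(x):
    """Planck occupation number n = 1/(exp(x) - 1), with x = hbar*c*f/(k*T)."""
    return 1.0 / math.expm1(x)

def high_temperature_occupation(x):
    """High-temperature approximation n ~ k*T/(hbar*c*f) = 1/x."""
    return 1.0 / x

for x in (0.5, 0.1, 0.01):
    exact = bose_occupation(x)
    approx = high_temperature_occupation(x)
    # Relative error made when n and n + 1 are both replaced by 1/x
    err_n = abs(approx - exact) / exact
    err_n1 = abs(approx - (exact + 1.0)) / (exact + 1.0)
    print(f"x = {x:5.2f}  n = {exact:9.3f}  1/x = {approx:9.3f}  "
          f"err(n) = {err_n:.4f}  err(n+1) = {err_n1:.4f}")
```

Both relative errors are of order $x/2$, so the replacement is justified only when $\hbar cf \ll kT$ for all phonons that contribute, i.e. for temperatures well above the Debye temperature.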
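Let us now write the expansion (64.2) in more detail:
$$\Phi(\mathbf{k}) = \Phi_0(E_k) + k_{\mathcal{E}}\,\Phi_1(E_k) = \Phi_0(E_k) + k\cos\vartheta\,\Phi_1(E_k) ,$$
$$\Phi(\mathbf{k}+\mathbf{f}) = \Phi_0(E_{k+f}) + k_{\mathcal{E}}\,\Phi_1(E_{k+f}) + f_{\mathcal{E}}\,\Phi_1(E_{k+f}) =$$
$$= \Phi_0(E_k+\hbar\omega_f) + k\cos\vartheta\,\Phi_1(E_k+\hbar\omega_f) + f\cos\gamma\,\Phi_1(E_k+\hbar\omega_f) .$$

Here $k_{\mathcal{E}}$ and $f_{\mathcal{E}}$ are the projections of the vectors $\mathbf{k}$ and $\mathbf{f}$ onto the direction
of the electric field $\boldsymbol{\mathcal{E}}$. We then find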


= [Puce 
= pe {(k cos) + f cosy) P (Ey) — 
he? 


—k cos 3 &,(E,)}6(2akf cos 0 + af? — icf) sin @ dé dydf. 


Since $E_{k+f} = E_k + \hbar\omega_f$ and the electron energy is of the order of $E_{\max}$, we can
set

$$\Phi_1(E_{k+f}) \approx \Phi_1(E_k) .$$
Hence


| h= a (Ey) f fcos75(2akf cos0 + af? — hef) sin 0 d0 dy df. 





To carry out the integration over angles, we express $\cos\gamma$ in terms of $\cos\theta$
and $\cos\vartheta$:

$$\cos\gamma = \cos\vartheta\cos\theta + \sin\vartheta\sin\theta\cos\varphi .$$
Then
hess hen fh (cos @ cos + sind sin 8 sin y) X 


X 5(2akf cos0 + af?— hcf) sin 8 dd dy df= 


= 2nkT 


T 
he P (Ep) cos 3 J 7:08 0 5(2akf cos 6 + af? — hef) sin 8 dé df. 
c 
0 


In integrating over $\varphi$ the term containing the azimuthal angle reduces to
zero. The integration over the angle $\theta$ can be carried out by introducing
the new variable

$$2\alpha kf\cos\theta + \alpha f^2 - \hbar cf = u .$$

Under this transformation we obtain


ug 
Qntk D4 m n ia Oe 
I= oe ae aED fea f (Gare Jokf ALOLT 
uy 







As we have already seen in §62,

$$\int_{u_1}^{u_2}\delta(u)\,du = \int_{-\infty}^{+\infty}\delta(u)\,du = 1 .$$

Furthermore, since for all possible values of $f$, $u_1$ and $u_2$ have different signs,

$$\int_{u_1}^{u_2}u\,\delta(u)\,du = \int_{-\infty}^{+\infty}u\,\delta(u)\,du = 0 .$$

Taking these equalities into account, we obtain finally





fmax 
ED SOLED FP (te af) papo 
ie ak f 2ak ~ 2ak) N 








3 
_m™kT) cos? (Ean fme) g (Ey) 
yo Dale \ Bele TT aes 


Calculating the second part of the collision integral gives 
=k 
=f oy thy — &(k) &(k — f) — (ng + 1) (Kk) X 


X (1—0(k—f))]} 5(Ey_p— Ey + hws) df = 





nya) 
| _m™kT) cos 0 ( RCo ax = Zina ) (E) 
hc2 2ak 3ak 4k Daki 


By virtue of (64.5) and the expressions found for its two parts, we find the
expression for the collision integral


2 2 74 
Tas UCL) S 0 / Vi 
Lo) = — —>————— (uw) cos Y ®,(E£,) . 64.6 
coll he 2n)3 42 N 16 9) ( ) 


We now transform the left-hand side of the kinetic equation. Assuming the 
force F acting on the electron to be weak, we have 










ab dbo JE, BC ab 
Roce 0 k, dk _e& cos), C°0 


— æ = 


VO. dip SEE CS Me ae IFS 





Equating this last expression and the collision integral (64.6), we obtain the
value of the correction to the equilibrium distribution function


p =—e& 





IDy (etd (7 ) E Tm (64.7) 
Ek \ nT Vif ; 


Comparison of (64.7) with (27.11) shows that 


14407k4(Mc?) nN 
Aph = E S 





F (64.8) 
KT 8° Vf thax 

In accordance with (80.13) of Part III the value of the electron energy is
assumed to be equal to the Fermi energy, $\alpha k^2 = \mu \approx E_{\max}$, where $\mu$ is the
partial potential of the equilibrium distribution of the electrons. We shall
further show that the result obtained for $\Lambda_{\mathrm{ph}}$ is in agreement with formula (62.8).





For this we write the expression for $\Lambda_{\mathrm{ph}}$ in the form
2 5 5 
` E = aniy eee aoe 
ph ENEIT gill) T Tieamie Iie 


This value of $\Lambda_{\mathrm{ph}}$ is in very good agreement with the qualitative results
obtained in §62.

The solution of the kinetic equation for the case of temperatures lower
than the Debye temperature can be found in an analogous way. Then the
same expression as (62.12) is obtained for $\Lambda$. We shall not dwell on these
rather cumbersome calculations. The theory is in good agreement with
experiment for metals having a crystal lattice with a high symmetry.

Effects of anisotropy turn out to be very important in metals having a
lattice with low symmetry. The difference between the tensor components
$\sigma_{ik}$ in different crystallographic directions sometimes turns out to be
considerable. To construct a theory of kinetic phenomena taking into account
anisotropy it is necessary, first of all, to study the character of the Fermi
surface in more detail.

In the case of metals with a relatively small amount of impurity, or in the
case of alloys with a small concentration of one of the components or
ordered solid solutions, the approximations of the theory turn out to be valid.
However, in the case of strongly alloyed metals, disordered solid solutions
and liquid metals, the basic assumption of the theory, the existence of a
regular periodic structure, is not fulfilled.

As we have pointed out in §39, the electrical conductivity of such systems 
is calculated by means of the mathematical apparatus of time correlation 
functions. 


§65. Superconductivity 


In the preceding sections we have considered certain features of metallic 
conductors. Our considerations would be incomplete if we did not discuss, at 
least briefly, the phenomenon of superconductivity. The macroscopic theory 
of superconductivity has already been discussed (§21 of Part IV). 

There has recently been success in constructing an adequate microscopic 
theory of superconductivity, which we shall outline here. 

For some time it has been assumed that there is an analogy between the
phenomena of superfluidity and superconductivity. That is, the current in a
superconductor, which is not weakened by ohmic resistance, has naturally
been thought of as a superfluid current of electrons in a lattice.

As was explained in §5, superfluidity arises in a system of particles if the 
energy spectrum of its collective excitations satisfies certain requirements. 
These requirements are not related directly to the statistics of the particles 
constituting the system. However, until recently the spectrum of collective 
excitations satisfying the condition of superfluidity could be obtained only 
for an imperfect Bose gas. Qualitatively the reason for this can be understood 
by the following reasoning which allows the difference between the Fermi 
and Bose systems to be elucidated. 

The particles of a Bose gas in the superfluid state form a condensate
accumulating in a state with momentum equal to zero. Repulsive forces
between the particles ensure the appearance of collective motion in the system.
The excitations have an energy spectrum satisfying the condition of
superfluidity (5.23). To small momenta of excitation of the system there
corresponds a small energy.

In the case of a system of electrons, an ideal gas of fermions, the situation
is essentially different. In such a system the condensation of particles in
momentum space is impossible. The particles successively occupy lower quantum
states down to the Fermi level. The appearance of a very small excitation in
such a system means that one of the particles leaves a state at the Fermi
surface and goes over into an unoccupied (excited) state. Two unpaired
‘particles’ appear in the system: an electron in an unoccupied state and a
hole, with momenta close to the momentum of particles at the Fermi surface,
i.e. with momenta having a very large absolute value.

Thus in a system of fermions the condition of superfluidity $|v| < \varepsilon/p$ for
small $\varepsilon$ and large $p$ is not fulfilled if the value of the velocity is appreciable.
The electrostatic interaction between fermions cannot change this situation.
As has been said, condensation is impossible in a system of interacting
fermions.

Electrons in a metal form a gas of fermions interacting with each other ac- 
cording to the Coulomb law, i.e. they undergo mutual repulsion. It seemed 
incomprehensible that a system of electrons could move in a metal without 
interacting with the crystal lattice of the metal. 

One of the important stages in understanding the nature of superconduc- 
tivity was the discovery of the isotope effect (see §21 of Part IV). From the 
existence of the isotope effect it followed that the interaction of electrons 
with lattice vibrations plays an important role in the phenomenon of super- 
conductivity. 

In this connection it is interesting to note that superconductivity does not 
arise in systems possessing good conductivity, for example in metals of the 
first group of the Mendeleev periodic system. On the contrary, metals which 
at ordinary temperatures have a considerably larger resistance, possess the 
property of superconductivity. Thus the phenomenon of superconductivity is 
observed in metals with a relatively strong interaction between the electron 
gas and phonon gas. 

We have already mentioned in §56 that the absorption and emission of
virtual phonons by electrons leads to a certain effective interaction between
the electrons*. In 1956 L.N.Cooper published a note in which he pointed out
that owing to the existence of a weak interaction (attraction) between
electrons they may form certain bound states: electron pairs. These pairs
possess an integer spin and, roughly speaking, the set of them can be
considered as a Bose gas, which at low temperatures may possess the property
of superfluidity. Proceeding from the concept of pairs, J.Bardeen, L.N.Cooper
and J.R.Schrieffer developed a theory of the phenomenon of superconductivity**.


* H.Fröhlich, Phys. Rev. 79 (1950) 845; J.Bardeen, Phys. Rev. 80 (1950) 567; L.N.
Cooper, Phys. Rev. 104 (1956) 1180.


** J.Bardeen, L.N.Cooper and J.R.Schrieffer, Phys. Rev. 106 (1957) 162.


In its most rigorous form the theory of this phenomenon was given by
Bogoliubov*. We shall first present the qualitative picture of the formation of
electron pairs, following the intuitive derivation of Cooper.

Let us consider the motion of two electrons. We write the wave function 
of the system in the form 


WR, r) = z eK- y(r, k), (65.1) 


2 


where in correspondence with (14.14) of Part V the wave function of the
system is equal to the product of wave functions characterizing the relative
motion and the motion of the center of mass. Here $\mathbf{R} = \tfrac{1}{2}(\mathbf{r}_1 + \mathbf{r}_2)$, $\mathbf{r} = \mathbf{r}_1 - \mathbf{r}_2$,
and $\mathbf{K}$ is the wave vector of the system as a whole.

We write the wave function of the relative motion in the momentum
representation


$$\psi = \sum_{k>k_0} a_k\,e^{i\mathbf{k}\cdot\mathbf{r}} . \tag{65.2}$$


We normalize the plane waves to a volume $V$ and impose conditions of
periodicity upon them. Because electron states with an energy lower than the
Fermi energy are occupied, the summation in (65.2) is correspondingly
bounded from below. Of course, such a bound from below on the summation
is not a strictly consistent procedure. In fact, it is necessary to consider the
many-electron problem. Cooper reduces this problem to that of two
interacting electrons on the background of the occupied Fermi sphere. The
background electrons are taken into account in the form of a restriction on the
summation, i.e. in expression (65.2) one sets $a_k = 0$ for $k < k_0$.

The Schrödinger equation for two particles in the momentum representa- 
tion has the form (see (48.12') of Part V) 


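$$(\mathcal{E}_K + \varepsilon_k - E)\,a_k + \sum_{k'} a_{k'}\,(k|H'|k') = 0, \tag{65.3}$$
where
$$(k|H'|k') = \frac{1}{V}\int e^{-i\mathbf{k}\cdot\mathbf{r}}\,H'\,e^{i\mathbf{k}'\cdot\mathbf{r}}\,d\mathbf{r}$$
and
$$\mathcal{E}_K = \hbar^2K^2/4m , \qquad \varepsilon_k = \hbar^2k^2/m . \tag{65.4}$$


* N.N.Bogoliubov, V.V.Tolmachev and D.V.Shirkov, Fortschr. der Physik 6 (1958)
605.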


As has already been mentioned, the absorption and emission of phonons 
by electrons leads to a certain effective interaction between them. This inter- 
action is analogous to the Coulomb interaction in electrodynamics due to 
photon exchange. 

The exchange interaction might be obtained by means of Fröhlich’s
Hamiltonian (55.11). However, the derivation based on Fröhlich’s Hamiltonian is
complicated. Hence we shall consider a simplified Hamiltonian which makes it
possible to draw the basic qualitative conclusions correctly and requires no
complex calculations.

We assume that the energies of interacting electrons in the pair lie in the
interval $\Delta\varepsilon = \varepsilon_m - \varepsilon_0 = \hbar^2(k_m^2 - k_0^2)/m$ above the Fermi surface $E_{\max}$, so
that we set

$$(k|H'|k') = \begin{cases} -F & \text{if } k_0 < k,\ k' < k_m , \\ 0 & \text{if } k \text{ or } k' \text{ lies outside the region mentioned.} \end{cases} \tag{65.5}$$

Here $F$ is a constant, and $(\hbar^2/m)(k_m^2 - k_0^2) \sim \hbar\omega \sim 0.2$ eV ($\hbar\omega$ is of the
order of a certain effective phonon energy, and $\varepsilon_0 = \hbar^2k_0^2/m$ is the energy of
an electron at the Fermi surface).


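We substitute expression (65.5) into eq. (65.3). We then have

$$a_k = \frac{F\sum_{k'} a_{k'}}{\mathcal{E}_K + \varepsilon_k - E} , \qquad k_0 < k < k_m , \tag{65.6}$$

and $a_k = 0$ for $k$ lying outside the interval mentioned. Further, we calculate
the sum $\sum_k a_k$. It is equal to

$$\sum_k a_k = F\sum_{k'} a_{k'} \sum_k \frac{1}{\mathcal{E}_K + \varepsilon_k - E} . \tag{65.7}$$

Hence we obtain

$$1 = F\int_{\varepsilon_0}^{\varepsilon_m} \frac{N(K, \varepsilon)\,d\varepsilon}{\mathcal{E}_K + \varepsilon - E} . \tag{65.8}$$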


Here, the energies $\varepsilon_0$ and $\varepsilon_m$ correspond to the momenta $\hbar k_0$ and $\hbar k_m$, and
we have passed from a summation to integration over the corresponding
energy interval. $N(K, \varepsilon)$ denotes the density of two-electron states with energy $\varepsilon$
and total momentum $K$. Eq. (65.8) defines the energy of the system $E$. Owing
to the narrowness of the energy interval in which the interaction takes place,
the density of states $N(K, \varepsilon)$ can be replaced by $N(K, \varepsilon_0)$. After this the
integration may be carried out in an elementary way. Solving the equation
obtained for $E$, we have $E = E_0 = \mathcal{E}_K + \varepsilon_0 - \Delta$, where

$$\Delta = \frac{\varepsilon_m - \varepsilon_0}{e^{1/\beta} - 1} \tag{65.9}$$

and
$$\beta = N(K, \varepsilon_0)\,F . \tag{65.10}$$


It is obvious that the state found with energy $E = E_0$ corresponds to a bound
state. Indeed, eq. (65.8) has a solution only for a definite value of $E$. It can be
seen that the continuous spectrum corresponding to the disintegration of the
pair is separated from the bound state energy level by a gap of width $\Delta$. The
quantity $\Delta$ depends strongly on the density of states $N(K, \varepsilon_0)$. The function
$N$ in its turn depends strongly on the total momentum $K$ of the pair. To
illustrate this dependence, let us examine fig. VI.13. The centres of the Fermi
spheres of radius $k_0$ of the two electrons are separated by the value of the
total momentum $K$ of the system. The value of $\delta k$ corresponds to that region
in momentum space in which the interaction takes place. Since the total
momentum of the electrons equals $K$ and each of the momenta of the individual
electrons must lie in the region $\delta k$, it is obvious that the volume of the
hatched region is proportional to the number of states $N(K, \varepsilon_0)$. Hence it is
seen that $N(K, \varepsilon_0)$ has a sharp peak for a total momentum equal to zero. In
this case the spheres coincide and the volume of the hatched region becomes
a maximum. This corresponds to the fact that electrons with opposite
momenta are actually bound into pairs. Thus the bound state of a system of
two electrons arises even when there is a very weak interaction. We note that,
in general, two microparticles can form bound states only when there is a
sufficiently strong interaction. In §37 of Part V we have shown that a level
corresponding to a bound state can be formed in a spherical potential well only
under the condition that the depth of the potential well is larger than a
certain critical value ((37.9) of Part V). In the present case a bound state can be
formed when there is an arbitrarily small interaction (small $F$). Formally this
is due to the fact that the summation in (65.2) is bounded by the condition
$k > k_0$. Physically this is associated with the effect of background electrons.

Fig. VI.13

Thus in a metal pairing of electrons may take place. The electron pairs
have an integer spin and obey Bose statistics. An imperfect Bose gas possesses
the property of superfluidity. Since the pairs are charged, the superfluid
motion of electrons corresponds to the appearance of superconductivity. We
note that the function (65.9) cannot be expanded in a series about the point
$\beta = 0$ and, consequently, no calculations based on perturbation theory could
lead to an explanation of the phenomenon of superconductivity.
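
The non-analytic dependence on the coupling can be made concrete with a short numerical sketch. It assumes that (65.9) reads $\Delta = (\varepsilon_m - \varepsilon_0)/(e^{1/\beta} - 1)$ and that, for a constant density of states, (65.8) reduces to $\beta \ln[(\varepsilon_m - \varepsilon_0 + \Delta)/\Delta] = 1$; the function names and the choice of units are illustrative and not taken from the text.

```python
import math

def cooper_gap(beta, width):
    """Binding energy Delta = width / (exp(1/beta) - 1),
    with width = eps_m - eps_0 and beta = N(K, eps_0) * F."""
    return width / math.expm1(1.0 / beta)

def gap_equation_rhs(delta, beta, width):
    """Right-hand side of the bound-state condition for a constant density
    of states: beta * ln((width + Delta) / Delta). It must equal 1."""
    return beta * math.log1p(width / delta)

width = 1.0  # energy interval eps_m - eps_0, taken as the unit of energy
for beta in (0.5, 0.3, 0.2, 0.1):
    delta = cooper_gap(beta, width)
    print(f"beta = {beta:4.2f}  Delta/width = {delta:.3e}  "
          f"bound-state condition = {gap_equation_rhs(delta, beta, width):.6f}")
```

For small $\beta$ the gap behaves as $e^{-1/\beta}$: it stays positive for any attraction, however weak, yet every coefficient of a power series in $\beta$ about $\beta = 0$ vanishes.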

The phenomenon of electron pairing is particularly clearly shown in the 
behaviour of doubly connected (hollow) superconductors in a magnetic field. 

Let us consider a hollow superconducting cylinder placed in a uniform
external magnetic field directed along its axis. We shall be interested in the
wave function $\psi_H(\mathbf{R})$ characterizing the motion of the centre of mass of an
electron pair in the magnetic field. We need not take into account here the
internal motion of the components of the pair.

Since the spin of the pair is equal to zero, and the charge and mass are
twice as large as for one electron, the Schrödinger equation for the pair can be
written in the form (see §102, Part V)


$$\frac{1}{4m}\left[\frac{\hbar}{i}\nabla - \frac{2e}{c}\mathbf{A}\right]^2\psi_H = E\psi_H . \tag{65.11}$$


We note that when a magnetic field is applied, a superconducting current
completely screening the external magnetic field arises at the surface of the
superconductor. Hence inside the superconductor
$$\mathbf{H} = \nabla\times\mathbf{A} = 0 .$$


This means that the vector potential $\mathbf{A}$ in a hollow cylinder can be written in
the form


$$\mathbf{A} = \nabla\chi(\mathbf{r}) , \tag{65.12}$$


where $\chi$ is the scalar potential of the magnetic field, which is a multivalued
function of the coordinates.
The magnetic flux through the cavity in the cylinder is equal to


$$\Phi = \int\mathbf{H}\cdot d\mathbf{S} = \int\nabla\times\mathbf{A}\cdot d\mathbf{S} = \oint\mathbf{A}\cdot d\mathbf{r} = \oint\nabla\chi\cdot d\mathbf{r} = \Delta\chi , \tag{65.13}$$


where $\Delta\chi$ is the jump of the function $\chi$ in going round a closed contour
inside the superconducting cylinder.
Making use of expression (65.12), eq. (65.11) can be rewritten in the form


$$\frac{1}{4m}\left[\frac{\hbar}{i}\nabla - \frac{2e}{c}\nabla\chi(\mathbf{r})\right]^2\psi_H = E\psi_H . \tag{65.14}$$
It is easily seen that the wave function, $\psi_H$, of the pair in the magnetic field is
expressed in terms of the wave function, $\psi_0$, of the pair in the absence of
magnetic field by the relation


$$\psi_H = \psi_0\,e^{(2ie/\hbar c)\chi(\mathbf{r})} = A\,e^{i\mathbf{K}\cdot\mathbf{r} + (2ie/\hbar c)\chi(\mathbf{r})} . \tag{65.15}$$
Of course, the wave function $\psi_H$ must be a single-valued function of
coordinates, as must the wave function $\psi_0$.

Let us imagine displacing an electron pair inside the superconductor
along an arbitrary closed curve. In such a displacement neither the function
$\psi_0$ nor the function $\psi_H$ can change. However, according to (65.15), the
multivalued function $\chi(\mathbf{r})$ changes by an amount $\Delta\chi$ in going round the closed
contour. In order for the wave function $\psi_H$ not to change when $\chi$ increases by
the amount $\Delta\chi$, it is necessary that the following equality be fulfilled:


$$\frac{2e}{\hbar c}\,\Delta\chi = \frac{2e\Phi}{\hbar c} = 2\pi n ,$$


where $n = 0, \pm 1, \pm 2, \ldots$ Thus the magnetic flux through the cavity of the
cylinder can run over the discrete series of values


$$\Delta\Phi = \frac{\hbar c}{2e}\,2\pi n , \tag{65.16}$$


which are multiples of the quantity $\hbar c/2e$ and depend on the charge ($2e$). The
total magnetic flux through the cavity is obtained by multiplying $\Delta\Phi$ by the
number of pairs in the volume of the superconducting cylinder.

The fact of quantization of the magnetic flux through the cavity of the 
cylinder, as well as its dependence on the pair charge (2e), have been experi- 
mentally confirmed. 
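
To give an idea of the magnitudes involved, the step of the series of flux values can be evaluated numerically. The sketch below assumes Gaussian units and rounded values of $\hbar$, $c$ and $e$ that are not quoted in the text; the function name is illustrative only.

```python
import math

# Rounded constants in Gaussian (CGS) units; assumed values, not from the text
HBAR = 1.054571817e-27      # erg s
C = 2.99792458e10           # cm / s
E_CHARGE = 4.80320471e-10   # esu (absolute value of the electron charge)

def quantized_flux(n, carrier_charge=2.0 * E_CHARGE):
    """Flux (hbar*c/q) * 2*pi*n in G cm^2 for carriers of charge q.
    The default q = 2e corresponds to electron pairs."""
    return n * 2.0 * math.pi * HBAR * C / carrier_charge

print(f"pair charge 2e  : step = {quantized_flux(1):.4e} G cm^2")
print(f"single charge e : step = {quantized_flux(1, E_CHARGE):.4e} G cm^2")
```

The step obtained for charge $2e$ is half of that for charge $e$, which is why a measurement of the flux step distinguishes paired carriers from single electrons.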

It is useful to supplement the above qualitative consideration of the
phenomenon of superconductivity by a more consistent discussion of the
properties of a particular simplified model described by the Hamiltonian

$$\hat{H} = \sum_p [\varepsilon(p) - \mu]\,(\hat{a}^+_{p\uparrow}\hat{a}_{p\uparrow} + \hat{a}^+_{-p\downarrow}\hat{a}_{-p\downarrow}) + \Delta\sum_p (\hat{a}_{p\uparrow}\hat{a}_{-p\downarrow} + \hat{a}^+_{-p\downarrow}\hat{a}^+_{p\uparrow}) , \tag{65.17}$$

where
$$\varepsilon(p) = \frac{p^2}{2m}$$
is the energy of an electron of momentum $p$, $\mu = p_0^2/2m$ is the chemical
potential, and $p_0$ is the momentum at the Fermi surface. The first term of the
Hamiltonian (65.17) represents the Hamiltonian of a system of non-interacting
particles outside the Fermi surface. The second sum characterizes the
interaction of electrons outside the Fermi surface with the condensate, i.e. with
electrons in occupied states. This interaction leads to the creation or annihilation
of pairs of electrons with oppositely directed momenta and spins.

For reasons which will become clear from what follows, the quantity A, 
called the energy gap, represents the work of removing a pair from the con- 
densate to the Fermi surface. 

In choosing the Hamiltonian (65.17) we have made use of the fact that
the number of particles in the condensate is large. Hence, just as was done in
§5, instead of the four-fermion interaction operator figuring in Fröhlich’s
Hamiltonian we have introduced simplified two-fermion operators: the
operator $\hat{a}_{p\uparrow}\hat{a}_{-p\downarrow}$, describing the destruction of two electrons, one with momentum
$p$ and spin up and the other with momentum ($-p$) and spin down, and the
operator $\hat{a}^+_{-p\downarrow}\hat{a}^+_{p\uparrow}$, describing the creation of such a pair.



We carry out a linear transformation on the Hamiltonian (65.17), passing
from the operators $\hat{a}$ and $\hat{a}^+$ to new operators $\hat{\alpha}$ and $\hat{\alpha}^+$, analogous to that
carried out in §5.

Namely, we set


$$\hat{a}_{p\uparrow} = u_p\hat{\alpha}_{p\uparrow} + v_p\hat{\alpha}^+_{-p\downarrow} , \tag{65.18}$$
$$\hat{a}_{-p\downarrow} = u_p\hat{\alpha}_{-p\downarrow} - v_p\hat{\alpha}^+_{p\uparrow} . \tag{65.19}$$


Here $u_p$ and $v_p$ are transformation coefficients which are assumed to be real
c-numbers, so that


$$u_p = u_p^* , \qquad v_p = v_p^* .$$


Furthermore, the coefficients $u_p$ and $v_p$ are assumed to satisfy the condition


$$u_p^2 + v_p^2 = 1 . \tag{65.20}$$


The coefficients $u_p$ and $v_p$ are not completely defined by formulae (65.18)
and (65.19), and we can subsequently impose upon them one more condition
at will.

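It is easily seen directly that transformations (65.18) and (65.19) are
canonical, i.e. that the new operators $\hat{\alpha}$ and $\hat{\alpha}^+$, as well as the old operators $\hat{a}$
and $\hat{a}^+$, satisfy the Fermi commutation relations. By substituting into the
Hamiltonian (65.17) and transforming it to the new Fermi operators, we
obtain, taking into account (65.20),


$$\hat{H} = 2\sum_p [\varepsilon(p) - \mu]\,v_p^2 - 2\Delta\sum_p u_pv_p +$$
$$+ \sum_p \{[\varepsilon(p) - \mu](u_p^2 - v_p^2) + 2\Delta u_pv_p\}\,(\hat{\alpha}^+_{p\uparrow}\hat{\alpha}_{p\uparrow} + \hat{\alpha}^+_{-p\downarrow}\hat{\alpha}_{-p\downarrow}) +$$
$$+ \sum_p \{[\varepsilon(p) - \mu]\,2u_pv_p - \Delta(u_p^2 - v_p^2)\}\,(\hat{\alpha}^+_{p\uparrow}\hat{\alpha}^+_{-p\downarrow} + \hat{\alpha}_{-p\downarrow}\hat{\alpha}_{p\uparrow}) .$$
We now require that the coefficients $u_p$ and $v_p$ satisfy the condition
$$2[\varepsilon(p) - \mu]\,u_pv_p - \Delta(u_p^2 - v_p^2) = 0 .$$
Then the expression for $\hat{H}$ is considerably simplified and assumes the form


$$\hat{H} = E_0 + \sum_p E(p)\,(\hat{\alpha}^+_{p\uparrow}\hat{\alpha}_{p\uparrow} + \hat{\alpha}^+_{-p\downarrow}\hat{\alpha}_{-p\downarrow}) , \tag{65.21}$$

where
$$E_0 = 2\sum_p \{[\varepsilon(p) - \mu]\,v_p^2 - \Delta u_pv_p\} , \tag{65.22}$$
$$E(p) = [\varepsilon(p) - \mu](u_p^2 - v_p^2) + 2\Delta u_pv_p . \tag{65.23}$$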


The Hamiltonian (65.21) has an obvious interpretation. It corresponds to a
gas of elementary excitations. The elementary excitations possess energy $E(p)$
and are associated with the motion of two fermions with momenta ($p$) and
($-p$). The value of $E(p)$ can be expressed in terms of $\varepsilon(p)$, $\mu$ and $\Delta$, if eqs.
(65.22) and (65.23) are solved for $u_p$ and $v_p$.

Obviously, from (65.22), (65.21) and (65.20) we have


$$u_p^2 = \frac{1}{2}\left(1 + \frac{\varepsilon(p) - \mu}{\sqrt{[\varepsilon(p) - \mu]^2 + \Delta^2}}\right) , \qquad v_p^2 = \frac{1}{2}\left(1 - \frac{\varepsilon(p) - \mu}{\sqrt{[\varepsilon(p) - \mu]^2 + \Delta^2}}\right) .$$


Substituting the values of $u_p$ and $v_p$ into (65.23), we obtain for $E(p)$
$$E(p) = \{[\varepsilon(p) - \mu]^2 + \Delta^2\}^{1/2} . \tag{65.24}$$
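
The algebra leading to (65.24) can be verified numerically. The following minimal sketch assumes the readings $u_p^2, v_p^2 = \tfrac{1}{2}(1 \pm \xi/E)$ with $\xi = \varepsilon(p) - \mu$ and $E = (\xi^2 + \Delta^2)^{1/2}$, together with $u_pv_p > 0$; the function name is illustrative only.

```python
import math

def coherence_factors(xi, delta):
    """Return (u, v, E) for xi = eps(p) - mu and gap delta > 0,
    taking u and v non-negative so that u*v = delta / (2*E)."""
    energy = math.hypot(xi, delta)
    u = math.sqrt(max(0.0, 0.5 * (1.0 + xi / energy)))
    v = math.sqrt(max(0.0, 0.5 * (1.0 - xi / energy)))
    return u, v, energy

delta = 1.0
for xi in (-2.0, -0.5, 0.0, 0.5, 2.0):
    u, v, energy = coherence_factors(xi, delta)
    normalization = u * u + v * v                              # should equal 1
    off_diagonal = 2.0 * xi * u * v - delta * (u * u - v * v)  # should vanish
    diagonal = xi * (u * u - v * v) + 2.0 * delta * u * v      # should equal E
    print(f"xi = {xi:5.2f}  u^2+v^2 = {normalization:.6f}  "
          f"off-diagonal = {off_diagonal:+.1e}  E = {diagonal:.6f}  "
          f"sqrt(xi^2+Delta^2) = {energy:.6f}")
```

The off-diagonal coefficient vanishes for every $\xi$, and the diagonal coefficient never falls below $\Delta$, its value at $\xi = 0$.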


Thus the minimum energy of the elementary excitations (for $\varepsilon(p) = \mu$) turns
out to be equal to $\Delta$. A gap of width $\Delta$ separates the energy of an elementary
excitation from the energy of particles at the Fermi surface. Near the Fermi
surface, $E(p)$ can be written in the form


$$E(p) = \{v^2(p - p_F)^2 + \Delta^2\}^{1/2} . \tag{65.25}$$


As we have seen in §5, the gas of elementary excitations with an energy spec- 
trum given by formulae (65.21) and (65.25) possesses the property of super- 
fluidity. In our case the motion of the elementary excitation is connected 
with the charge transfer. Therefore the gas of elementary excitations has the 
property of superconductivity. 

As was shown in §5, the system of particles can move as a whole without
any dissipation if the inequality


$$|v^*| = \min\frac{E(p)}{p} > 0 \tag{65.26}$$


is fulfilled. In our case this inequality is just fulfilled if the quantity $\Delta$ is
positive: from formulae (65.24) and (65.25) we have

$$|v^*| = \frac{\Delta}{p_F\,(1 + \Delta^2/v^2p_F^2)^{1/2}} \approx \frac{\Delta}{p_F} . \tag{65.27}$$

Therefore if the system of electrons moves as a whole with velocity less than
$\Delta/p_F$ such motion will be nondissipative. In other words electrons will move
through the crystalline body without any resistance.
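
The minimum of $E(p)/p$ can also be located by a direct scan. The sketch below assumes the spectrum $E(p) = [v^2(p - p_F)^2 + \Delta^2]^{1/2}$ and compares the scanned minimum with the closed form $\Delta/[p_F(1 + \Delta^2/v^2p_F^2)^{1/2}]$; the units ($p_F = v = 1$) and the function names are illustrative choices.

```python
import math

def excitation_energy(p, p_f, v, delta):
    """Spectrum near the Fermi surface: sqrt(v^2 (p - p_F)^2 + Delta^2)."""
    return math.sqrt((v * (p - p_f)) ** 2 + delta ** 2)

def critical_velocity_scan(p_f, v, delta, points=200001):
    """Minimum of E(p)/p on a uniform grid over 0.5 p_F <= p <= 1.5 p_F."""
    best = float("inf")
    for i in range(points):
        p = p_f * (0.5 + i / (points - 1))
        best = min(best, excitation_energy(p, p_f, v, delta) / p)
    return best

def critical_velocity_formula(p_f, v, delta):
    """Closed form Delta / (p_F * sqrt(1 + Delta^2 / (v p_F)^2))."""
    return delta / (p_f * math.sqrt(1.0 + (delta / (v * p_f)) ** 2))

p_f, v, delta = 1.0, 1.0, 0.01
print("scan    :", critical_velocity_scan(p_f, v, delta))
print("formula :", critical_velocity_formula(p_f, v, delta))
print("Delta/pF:", delta / p_f)
```

The minimum is reached slightly above $p_F$, at $p = p_F(1 + \Delta^2/v^2p_F^2)$, which is why the exact value lies a little below $\Delta/p_F$.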

It may be shown that the width of the gap A is dependent on the temper- 
ature and decreases as the latter increases. For some temperature T the width 
of the ‘gap’ A tends towards zero. Here the property of superconductivity 
disappears and a metal becomes a normal conductor. 

It is necessary to emphasize that besides the effective attraction between
electrons the usual Coulomb repulsion exists. If the latter dominates, the
system of electrons cannot be in a superconducting state.

Thus we see that all conclusions about the property of isolated Fermi
systems remain the same. Such a system cannot have the property of
superconductivity. A system of interacting electrons which possesses the collective
states and collective motion only arises in a crystalline body where a mixture
of electrons and phonons exists: under certain conditions such a system can
exist in a superconducting state.

It should also be noted that the interaction via pair exchange of phonons 
is apparently not the only type of interaction between electrons which is res- 
ponsible for superconductivity. This is indicated by the fact that there are 
superconductors (ruthenium, osmium) which do not display the isotope ef- 
fect. However, from the preceding calculations it is seen that the theory of 
superconductivity we have is not very sensitive to the detailed character of 
the interaction forces (attractive forces) between electrons which is responsi- 
ble for the appearance of the property of superconductivity. 


§66. Theory of the Fermi fluid 


A few years ago Landau put forward a phenomenological theory of the
Fermi fluid, in which the existence of a strong interaction between fermions
is assumed from the very beginning. This theory subsequently obtained a
statistical substantiation which, however, is too complex to be presented in
this book*.


* A.A.Abrikosov, L.P.Gorkov and I.E.Dzyaloshinskii, Methods of quantum field
theory in statistical physics (Prentice-Hall, Englewood Cliffs, N.Y., 1963). In presenting the
theory of the Fermi fluid we follow this book.




The theory was applied to the description of the behaviour of the liquid
helium isotope $^3$He, whose nuclei have spin $\tfrac{1}{2}$ and obey Fermi statistics. If the
assumption is made that the energy spectrum of a system of electrons in a
crystal lattice differs relatively little from the spectrum of the electron fluid
filling the corresponding volume, then it applies equally to electrons in metals.

The theory is based on the assumption that no matter how strong the in- 
teraction between the particles may be, it cannot violate the exclusion prin- 
ciple. Hence the occupation numbers of energy states in a liquid, as well as in 
a gas, can be equal only to either zero or one. This means that in a Fermi 
fluid at absolute zero all energy levels are occupied up to the Fermi boundary 
surface. 

The energy distribution has the character of a step function


$$f = \begin{cases} 0 , & \varepsilon > \varepsilon_F , \\ 1 , & \varepsilon < \varepsilon_F . \end{cases}$$


The energy and momentum at the Fermi surface are connected with the
number of particles by the relation (see §79 of Part III)
$$\varepsilon_F = \frac{p_F^2}{2m} = \frac{(2\pi\hbar)^2}{2m}\left(\frac{3N}{8\pi V}\right)^{2/3} . \tag{66.1}$$


At a temperature different from zero but sufficiently low, collective excita- 
tions arise and the step-wise distribution is somewhat smeared. If the temper- 
ature is sufficiently low, then it can be assumed that the energy of these exci- 
tations is close to the limiting energy €p. 

The appearance of excitations is always accompanied by the formation of 
vacancies (holes) in occupied states within the Fermi surface. The disappear- 
ance of excitations is associated with the occupation of vacant states, the an- 
nihilation of ‘holes’ and ‘particles’. Hence excitations appear and disappear in 
pairs. 

If the excitation energy is measured from the Fermi surface and the exci- 
tations are assigned a definite value of momentum, then they can be treated 
as a pair of quasiparticles; properly speaking a quasiparticle and a hole. 

The excitation energy for small excitations can be written in the form of
an expansion in terms of a small parameter $(p - p_F)$


$$\varepsilon = \varepsilon_F + \left(\frac{\partial\varepsilon}{\partial p}\right)_{p=p_F}(p - p_F) . \tag{66.2}$$




Analogously, the energy of a hole (measured, as always, downward from
the Fermi surface) is


$$-\varepsilon = -\varepsilon_F + \left(\frac{\partial\varepsilon}{\partial p}\right)_{p=p_F}(p_F - p) . \tag{66.3}$$

In accordance with the general propositions of the theory of quasiparticles,
the two forms of particles are equivalent and their properties can be described
by the quantity


$$\left(\frac{\partial\varepsilon}{\partial p}\right)_{p=p_F} = \frac{p_F}{m^*} , \tag{66.4}$$


where $m^*$ is the effective mass.
At low temperatures the width of the region of energies of excited
particles is


$$\varepsilon - \varepsilon_F \sim kT . \tag{66.5}$$


For the approximation of free quasiparticles to make sense, it is necessary 
that the concept of momentum makes sense, i.e. that condition (47.19) be 
fulfilled. 

Since excitations in the Fermi fluid may appear and disappear only in
pairs, the number of encounters of a pair ‘particle’-‘hole’ is proportional to
$N^2$, where $N$ is the total number of excitations at a given temperature.

At low temperatures $N$ increases proportionally to $T$. Correspondingly, the
lifetime of the excitations is


$$\tau \sim \frac{1}{N^2} \sim \frac{1}{T^2} . \tag{66.6}$$

Hence $\Delta\varepsilon$, the uncertainty in the energy of the quasiparticle, is, according to
(66.6), equal to


$$\Delta\varepsilon \sim \hbar/\tau \sim \hbar T^2 . \tag{66.7}$$


Comparing (66.7) and (66.5) we see that for a sufficiently low
temperature we always have


$$(\varepsilon - \varepsilon_F) \gg \Delta\varepsilon , \tag{66.8}$$


and the concept of momentum for independent quasiparticles makes sense.

Thus quasiparticles in a Fermi fluid have momentum and effective mass.
They appear or disappear only in pairs in rare collisions and, consequently,
obey the exclusion principle.


The properties of elementary excitations enumerated allow one to treat
them as a Fermi system described by the distribution function


$$f(\varepsilon) = \frac{1}{\exp[(\varepsilon - \varepsilon_F)/kT] + 1} . \tag{66.9}$$


We shall normalize the distribution function of quasiparticles by the condition


$$\int f\,\frac{d\tau}{(2\pi\hbar)^3} = \frac{N}{V} , \tag{66.10}$$


where $N$ is the number of real particles of the fluid in volume $V$.

However, and here lies the essential difference between a Fermi liquid and
a Fermi gas, the energy $\varepsilon$ of a given quasiparticle depends on the density $n(\varepsilon)$
and, consequently, also on the temperature.

In the approximation of the self-consistent field each quasiparticle moves
in the field produced by all the other quasiparticles. Hence, if the distribution
function of the excitations changes by amount $\delta f$, then the energy of the
quasiparticle will change correspondingly. This change can be written as


$$\delta\varepsilon = \int \hat{F}(\mathbf{p}, \mathbf{p}')\,\delta f(\mathbf{p}')\,\frac{d\mathbf{p}'}{(2\pi\hbar)^3} , \tag{66.11}$$
where the operator $\hat{F}(\mathbf{p}, \mathbf{p}')$ describes the change we are interested in. (For
simplification of the formulae we do not write here the dependence of the
operator $\hat{F}$ on spins.) This operator, introduced formally, plays the principal
role in the theory of the Fermi fluid. Formula (66.11) shows that the energy
$\varepsilon$ is a functional of the distribution $f$.

The properties of the Fermi distribution determine the temperature de- 
pendence of the thermodynamic characteristics of the ideal gas. Namely, 
using the formulae of §80 of Part III, we have 


: € m A [ Sm ( kT y] 
E= | ———_ — = 3’, | 1 + =— = 66.12 
J e(e-HY/KT + | (27h)? SSE 12 \ & ( ) 


and 


kT 


Cy = kN TE 





(66.13) 





348 THE KINETIC PROPERTIES OF SOLIDS Ch. 6 


The linear temperature dependence of the heat capacity is associated solely 
with the step-like character of the Fermi distribution. 
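
The link between the energy and the heat capacity can be checked numerically. The sketch below assumes the standard low-temperature expansion of the ideal Fermi gas energy, E = (3/5) N ε_F [1 + (5π²/12)(kT/ε_F)²], and compares its numerical derivative with the linear law C_V = (π²/2) kN (kT/ε_F). The function names and the dimensionless variable t = kT/ε_F are illustrative choices, not notation from the text.

```python
import math


def energy_per_particle(t):
    """E / (N * eps_F) as a function of t = kT / eps_F (assumed expansion)."""
    return 0.6 * (1.0 + (5.0 * math.pi ** 2 / 12.0) * t ** 2)


def heat_capacity_per_particle(t):
    """C_V / (k * N): the linear law, (pi^2 / 2) * t."""
    return 0.5 * math.pi ** 2 * t


def numerical_heat_capacity(t, h=1e-6):
    """Central-difference derivative of the energy with respect to t."""
    return (energy_per_particle(t + h) - energy_per_particle(t - h)) / (2.0 * h)


if __name__ == "__main__":
    t = 0.05  # hypothetical value, kT much smaller than eps_F
    print("analytic :", heat_capacity_per_particle(t))
    print("numerical:", numerical_heat_capacity(t))
```

The two numbers agree to rounding error, which only confirms that the linear heat capacity is the temperature derivative of the quadratic correction to the energy.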

The kinetic properties of a Fermi fluid turn out to be more remarkable. 
According to (66.6) the mean free path of quasiparticles is 


$$ \lambda \approx v_F\tau \sim \frac{1}{T^2} $$ (66.14)


and increases rapidly with decreasing temperature. This fact determines the 
temperature dependence of the kinetic coefficients which increase with de- 
creasing temperature. For example, the viscosity in order of magnitude is 


$$ \eta \sim (m^* v_F)\,n\lambda \sim \frac{1}{T^2} $$ (66.15)

and, correspondingly, the thermal conductivity is

$$ \kappa \sim (C_V n)\,v_F\lambda \sim \frac{1}{T} . $$ (66.16)


The large mean free path of quasiparticles makes the propagation of ordinary
sound in a Fermi fluid impossible. As we have seen in §26, when the mean free
path exceeds the wavelength, very strong damping of sound waves occurs in the
medium.

However, it turns out that periodic perturbations of high frequency
$\omega \gg 1/\tau$, having a character essentially different from that of ordinary sound
waves, may propagate in a Fermi fluid.

Let us write the kinetic equation for the non-equilibrium distribution 
function 


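$$ \frac{\partial f}{\partial t} + \mathbf{v}\cdot\frac{\partial f}{\partial\mathbf{r}} + \dot{\mathbf{p}}\cdot\frac{\partial f}{\partial\mathbf{p}} = I $$ (66.17)

and substitute in it

$$ \mathbf{v} = \frac{\partial\varepsilon}{\partial\mathbf{p}} , \qquad \dot{\mathbf{p}} = -\frac{\partial\varepsilon}{\partial\mathbf{r}} . $$ (66.18)

Then the kinetic equation assumes the form

$$ \frac{\partial f}{\partial t} + \frac{\partial\varepsilon}{\partial\mathbf{p}}\cdot\frac{\partial f}{\partial\mathbf{r}} - \frac{\partial\varepsilon}{\partial\mathbf{r}}\cdot\frac{\partial f}{\partial\mathbf{p}} = I . $$ (66.19)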







The collision integral is $I \sim 1/\tau$ and for sufficiently low temperatures is
small. In other words, at sufficiently low temperatures collisions between 
quasiparticles become so rare that they do not affect the distribution function 
much. 

Under these conditions the transmission of perturbations in an ideal gas is 
stopped. As can be seen from (66.11), the interaction existing in the Fermi
fluid leads to a change in the energy of quasiparticles as the distribution func- 
tion changes. This is a specific form of the long-range action between particles 
in a Fermi fluid. Owing to the long-range forces, a perturbation of the distri- 
bution function arising at a particular instant of time propagates throughout 
the fluid. 


In the kinetic equation we set 
$$ f = f_0(\varepsilon) + f_1 , $$ (66.20)

where $f_0(\varepsilon)$ is the equilibrium distribution, and $f_1$ is the perturbation.
Substituting (66.20) into the kinetic equation, we find 


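$$ \frac{\partial f_1}{\partial t} + \mathbf{v}\cdot\frac{\partial f_1}{\partial\mathbf{r}} - \frac{\partial f_0}{\partial\varepsilon}\,\frac{\partial\varepsilon}{\partial\mathbf{p}}\cdot\frac{\partial\varepsilon}{\partial\mathbf{r}} = 0 . $$ (66.21)

We note that in an equilibrium fluid $\partial\varepsilon/\partial\mathbf{r} = 0$. Hence in the last term we have
retained only small quantities of the first order.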


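Since the equilibrium distribution $f_0$ is of the form of a step function, we
have (see § 80 of Part III)

$$ \frac{\partial f_0}{\partial\varepsilon} = -\delta(\varepsilon - \varepsilon_F) . $$

The perturbation $f_1$ also must be proportional to this delta function. We shall
seek it in the form

$$ f_1 \sim \alpha\,\delta(\varepsilon - \varepsilon_F)\,e^{i(\mathbf{k}\cdot\mathbf{r} - \omega t)} . $$ (66.22)

Choosing the vector $\mathbf{k}$ to be the polar axis and assuming that $\alpha$ depends
only on the angle $\theta$, we obtain

$$ i(v_F k\cos\theta - \omega)\,\alpha + \mathbf{v}\cdot\frac{\partial\varepsilon}{\partial\mathbf{r}} = 0 . $$ (66.23)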







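Making use of (66.11), one can write for the change of energy $\partial\varepsilon/\partial\mathbf{r}$

$$ \frac{\partial\varepsilon}{\partial\mathbf{r}} = \frac{\partial}{\partial\mathbf{r}}\int F(\mathbf{p},\mathbf{p}')\,f_1(\mathbf{p}')\,\frac{d\mathbf{p}'}{(2\pi\hbar)^3} = i\mathbf{k}\int F(\mathbf{p},\mathbf{p}')\,f_1(\mathbf{p}')\,\frac{d\mathbf{p}'}{(2\pi\hbar)^3} . $$ (66.24)

In substituting (66.24) into eq. (66.23) one has to take the values of momentum
at the Fermi surface, so that the integration over $d\mathbf{p}'$ is transformed
to an integration over angles. We thus obtain

$$ (k v_F\cos\theta - \omega)\,\alpha(\theta) + \frac{k p_F^2\cos\theta}{(2\pi\hbar)^3}\int F\,\alpha(\theta')\,d\Omega' = 0 . $$ (66.25)

The dependence of $F$ on the angle is unknown. If it is assumed that $F$ does
not depend on angles at all and that it is a constant, then

$$ \frac{p_F^2}{v_F(2\pi\hbar)^3}\,F\int\alpha(\theta')\,d\Omega' = \mathrm{const} = A $$

and

$$ \left(\frac{\omega}{k v_F} - \cos\theta\right)\alpha(\theta) = A\cos\theta $$

or

$$ \alpha(\theta) = \frac{A\cos\theta}{\omega/(k v_F) - \cos\theta} . $$ (66.26)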


Substituting this value of $\alpha$ into (66.25), we obtain the equation for defining
$\omega/k v_F = u/v_F$, where $u$ is the velocity of propagation of perturbations

$$ \frac{u}{v_F}\,\ln\left(\frac{u v_F^{-1} + 1}{u v_F^{-1} - 1}\right) = 2 + \frac{(2\pi\hbar)^3 v_F}{2\pi F p_F^2} . $$ (66.27)

From this expression it is seen directly that the velocity of propagation of
perturbations can be real only if it exceeds the velocity of particles at the
Fermi surface $v_F = p_F/m^*$. Then, as is seen from (66.22), the perturbed
distribution function turns out to be distended in the direction of propagation
of the perturbation.
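
A dispersion relation of the type (66.27) has no closed-form solution, but it is easy to solve numerically. The sketch below assumes it has been brought to the dimensionless form (s/2) ln[(s+1)/(s-1)] - 1 = 1/g, where s = u/v_F and g is a dimensionless coupling constant proportional to F; the symbol g and the function names are hypothetical, introduced only for this illustration. The left-hand side equals s·artanh(1/s) - 1 and decreases monotonically from infinity to zero for s > 1, so simple bisection is sufficient.

```python
import math


def zero_sound_lhs(s):
    """Left-hand side (s/2) ln[(s+1)/(s-1)] - 1, written via artanh for s > 1."""
    return s * math.atanh(1.0 / s) - 1.0


def zero_sound_speed_ratio(g, tol=1e-12):
    """Solve zero_sound_lhs(s) = 1/g for s = u / v_F > 1 by bisection."""
    if g <= 0.0:
        raise ValueError("a real root s > 1 needs a positive coupling g")
    target = 1.0 / g
    lo = 1.0 + 1e-12
    if zero_sound_lhs(lo) < target:
        # For very weak coupling the root lies exponentially close to s = 1.
        raise ValueError("coupling too weak for this bracket")
    hi = 2.0
    while zero_sound_lhs(hi) > target:  # expand until the root is bracketed
        hi *= 2.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if zero_sound_lhs(mid) > target:
            lo = mid
        else:
            hi = mid
        if hi - lo < tol * hi:
            break
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    for g in (0.5, 1.0, 10.0):
        print(g, zero_sound_speed_ratio(g))
```

In every case the root satisfies s > 1, in line with the statement that the perturbation must travel faster than the quasiparticles at the Fermi surface.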




The result obtained, as may be shown by a more detailed treatment, is of 
a general character and is not associated with the assumption that F = const. 
Taking into account the angular dependence of F leads to an even more asym- 
metric distribution of the perturbed function. 

The propagation of the perturbations considered is a specific effect, associated
with the interaction of quasiparticles. This effect is called ‘zero sound’.
As is seen from the foregoing, zero sound represents an essentially non-equilibrium
process. A more complete calculation shows that the propagation
of zero sound is accompanied by a rapid damping over a length $\sim u\tau$.

The existence of ‘zero sound’ has recently been completely confirmed by 
experiment. The propagation and damping of zero sound excited by oscilla- 
tions of the walls of a container were observed in liquid helium 3He filling the 
container. The experiment confirmed all the basic conclusions of the theory. 

The properties of the electron Fermi fluid turn out to be in many respects 
similar to those of a Fermi fluid of neutral particles. This applies to the gener- 
al character of the onset of excitations, to the form of the energy spectrum, 
etc. At the same time the presence of the Coulomb interaction and the inter- 
action with lattice phonons leads to the appearance of essential, although 
quantitative differences. The theory of the electron Fermi fluid is too com- 
plex and insufficiently developed to be discussed within the framework of 
this book. 


§67. Electrons in dielectric crystals 


As we have already stressed, the difference between metals, semiconduc- 
tors and dielectrics is associated mainly with the different character of the 
electron energy level spectrum. 

In dielectric crystals the empty band is separated from the filled band by a 
broad (of the order of one or several electron volts) band of forbidden 
energies. 

Suppose that the absorption of light takes place with the excitation of one
of the lattice atoms in a dielectric. Since the crystal possesses translational
symmetry and the wave function of the excited state $\psi_n$ must satisfy the
condition of translational symmetry

$$ \hat{T}\psi_n = c_n\psi_n , $$ (67.1)


the excited state cannot be localized at a definite atom. On the contrary, it
must move across the crystal and represents an excited state of the crystal as
a whole. 

The electron wave function describing the crystal in an excited state can 
be written in the form (see eq. (49.11)) 


$$ \psi = \sum_j e^{i\mathbf{k}\cdot\mathbf{r}_j}\,\psi_j , $$ (67.2)

where $\psi_j$ is the symmetrized wave function corresponding to the excitation of
the $j$th atom in the lattice. The wave function (67.2) satisfies both the
Schrödinger equation (for proper choice of $\varepsilon(\mathbf{k})$) and the requirement of
translational symmetry (67.1).

We see that a collective excitation, an excitation transmitted in a lattice
from one atom to another, can be considered as a quasiparticle with
quasi-momentum $\mathbf{p} = \hbar\mathbf{k}$. This quasiparticle is called an exciton. All that was said
in §47 on the properties of quasiparticles applies to the exciton. Since excitons
appear and disappear one at a time, they must be considered as particles
obeying the Bose–Einstein statistics.

In contrast to phonons, which represent collective excitations of the
vibrational states of the nuclei of a lattice, excitons represent an excited
electron state. However, the displacement of an exciton in a crystal is not
associated with the motion of an electric charge. Herein lies the difference
between an exciton and a charged quasiparticle, an electron or hole in a metal or
semiconductor. The energy spectrum of an exciton $\varepsilon(\mathbf{k})$ can be found in two
limiting cases.

The first of these is the case of the so-called exciton of small radius or the 
Frenkel exciton. To such an exciton there corresponds an excited state of the 
crystal which is like an ordinary excitation in an isolated atom. This means 
that the excited electron is in the main localized near a certain atom of the 
lattice. Owing to the interaction between atoms, the excitation is transmitted 
to neighbouring atoms and thus migrates through the crystal. The Frenkel 
exciton can be pictured simply as an electron—ion residue pair whose radius is 
small in comparison with the lattice constant a. 

The wave function $\psi_j$ of an exciton of small radius is defined by the
Schrödinger equation with the Coulomb interaction potential

$$ U = -e^2/r $$ (67.3)

between the electron and ion residue.
The energy spectrum of an exciton of small radius can be found in the
approximation of interaction with nearest neighbours.










The corresponding calculations do not differ from those of §49 and lead
to an expression for $\varepsilon(\mathbf{k})$ which is in principle the same as (49.13) (we recall
that the specific form of formula (49.13) refers to an electron in the s-state;
this assumption is often unjustified for excited states). The spectrum of excitons
has a band character. At the edges of the band the energy is connected
with the momentum by the relation

$$ \varepsilon_n(\mathbf{k}) \approx \frac{\hbar^2 k^2}{2m^*} , $$ (67.4)

where $m^*$ is expressed in terms of the exchange integral of interaction between
the atoms, and $n$ is the number of the band. In order of magnitude the
effective mass of an exciton $m^*$ is close to the analogous quantity for strongly
bound electrons.

The other limiting case is that of an exciton of large radius or the Wannier 
exciton. The production of an exciton of large radius corresponds to the ioni- 
zation of an isolated atom. This means that the excitation energy is sufficient- 
ly large to bring the electron from the filled into the vacant band of the crys- 
tal. Such a transition is accompanied by the simultaneous appearance of a 
hole in the filled band. The electron and hole interact with each other accord- 
ing to the Coulomb law in a medium 


$$ U = -\frac{e^2}{\varepsilon_\infty\,|\mathbf{r}_e - \mathbf{r}_p|} , $$ (67.5)

where $\varepsilon_\infty$ is the dielectric constant of the medium. Somewhat later we shall
define what is meant exactly by $\varepsilon_\infty$.

Formula (67.5) with the macroscopic constant $\varepsilon_\infty$ means that the interaction
of the electron and hole is described by the averaged field of a rather
large number of atoms and, consequently, that the distance between the components
of the pair is rather large in comparison with the lattice constant.

It is the interacting electron–hole pair in a crystal that is called an exciton
of large radius. The Schrödinger equation for an exciton, i.e. for a
two-particle system, can be written in the form of (14.5) of Part V.

Introducing the radius vector of the centre of mass and that of the relative
motion, its solution can be written in the form (see (14.14) of Part V)

$$ \Psi = \varphi\,\psi , $$

where $\varphi$ satisfies the Schrödinger equation for relative motion

$$ \left(-\frac{\hbar^2}{2\mu}\nabla_r^2 - \frac{e^2}{\varepsilon_\infty r}\right)\varphi = \varepsilon_n\varphi , $$ (67.6)

and $\psi$ satisfies the equation of motion of the centre of mass in the crystal

$$ \left(-\frac{\hbar^2}{2M}\nabla_R^2 + U\right)\psi = \varepsilon\psi . $$ (67.7)

Eq. (67.6) is the same as the Schrödinger equation for the hydrogen atom,
in which $Z = 1/\varepsilon_\infty$. Its solution has the form of the hydrogen functions $\varphi$ with
energies $\varepsilon_n$, which in the region of the discrete part of the spectrum are given
by formula (38.17) of Part V

$$ \varepsilon_n = -\frac{\mu e^4}{2\hbar^2\varepsilon_\infty^2 n^2} . $$ (67.8)

The solution of the Schrödinger equation for the motion of the centre of
mass is the function (67.2).
The total energy of an exciton forms a band. At the edge of the band it 
can be written in the form 


$$ \varepsilon = \frac{\hbar^2 k^2}{2M} - \frac{\mu e^4}{2\hbar^2\varepsilon_\infty^2 n^2} . $$ (67.9)


Thus an exciton of large radius moves in the crystal as a whole and at the 
same time has an internal degree of freedom. 

A remarkable feature of an exciton of large radius is just the presence in 
its energy spectrum of a hydrogen-like term corresponding to the internal 
degree of freedom. In the spectrum of absorption of light in a dielectric, dis- 
crete levels corresponding to the hydrogen spectrum must appear. This effect 
has been observed in a number of crystals, and can be considered as the most 
direct proof of the existence of excitons of large radius. The size of a large
exciton, $r_W$, is defined by the value of the Bohr radius for $Z = 1/\varepsilon_\infty$, i.e. is
given by the formula

$$ r_W = \frac{\hbar^2\varepsilon_\infty}{\mu e^2} . $$

For the concept of the exciton of large radius to make sense, it is necessary
that the following inequality be fulfilled

$$ \frac{\hbar^2\varepsilon_\infty}{\mu e^2} \gg a . $$ (67.10)
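
The hydrogen-like scaling makes order-of-magnitude estimates straightforward. The sketch below rescales the hydrogen ionization energy (about 13.6 eV) and the Bohr radius (about 0.529 Å) with the reduced mass of the pair and the high-frequency dielectric constant. The numerical inputs (a reduced mass of 0.1 electron masses, a dielectric constant of 10, a lattice constant of 5 Å) and the factor used to decide what counts as "much larger" are hypothetical values chosen only for illustration.

```python
RYDBERG_EV = 13.605693          # ionization energy of hydrogen, eV
BOHR_RADIUS_ANGSTROM = 0.529177  # Bohr radius of hydrogen, angstrom


def exciton_level_ev(n, mu_ratio, eps_inf):
    """Hydrogen-like level: -Ry * (mu/m_e) / (eps_inf^2 * n^2), in eV."""
    return -RYDBERG_EV * mu_ratio / (eps_inf ** 2 * n ** 2)


def exciton_radius_angstrom(mu_ratio, eps_inf):
    """Bohr radius rescaled by eps_inf / (mu/m_e), in angstrom."""
    return BOHR_RADIUS_ANGSTROM * eps_inf / mu_ratio


def is_large_radius(mu_ratio, eps_inf, lattice_constant_angstrom, factor=3.0):
    """Crude check of the large-radius condition; 'factor' is an arbitrary margin."""
    return exciton_radius_angstrom(mu_ratio, eps_inf) > factor * lattice_constant_angstrom


if __name__ == "__main__":
    mu_ratio, eps_inf, a = 0.1, 10.0, 5.0  # hypothetical material parameters
    print("binding energy, eV:", -exciton_level_ev(1, mu_ratio, eps_inf))
    print("radius, angstrom  :", exciton_radius_angstrom(mu_ratio, eps_inf))
    print("large radius?     :", is_large_radius(mu_ratio, eps_inf, a))
```

With these inputs the binding energy is of the order of 0.01 eV and the radius is roughly ten lattice constants, which is the regime in which the macroscopic dielectric constant may be used.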


We note that besides discrete states, eq. (67.6) has continuous spectrum
solutions. These states correspond to the disintegration of the exciton into 
an electron and a hole moving independently of each other in the crystal. In 
contrast to the motion of an exciton, in this case not only energy transport 
but also charge transport (photoconductivity of the dielectric, if the excita- 
tion is produced by irradiation) arises. 

From the above it is clear that the difference between excitons of large 
and small radii is of a somewhat relative character. Intermediate cases are pos- 
sible, where the size of the exciton is comparable with the lattice constant. 

We can now discuss what value of dielectric constant of the medium should 
be inserted in formula (67.5). The electron and hole rotate about their com- 
mon centre of mass with a velocity very large compared with the velocity of 
motion of heavy nuclei. At the same time their velocity is comparable with 
the velocities of valence electrons. Hence valence electrons have time to be- 
come polarized, and their polarization follows the motion of the component 
of the pair, whereas the nuclei remain at rest. This situation corresponds to 
the action of a high-frequency electromagnetic field on the crystal. Hence $\varepsilon_\infty$
should be understood to be $n^2$, the dielectric constant at high frequencies.

In conclusion we note that the concept of an exciton has no meaning for 
metals. In metals the lifetime, $\Delta\tau$, of excitons turns out to be so small that the
concept of a quasiparticle makes no sense:

$$ \Delta\varepsilon \sim \frac{\hbar}{\Delta\tau} \sim \varepsilon_n . $$


Superconductors, having a gap in the energy spectrum, could be an exception 
to this. 


§68. The energy spectrum and the distribution function of electrons in 
semiconductors 


Semiconductors, which in their electronic properties occupy an inter- 
mediate place between metals and insulators, have assumed very great impor- 
tance in modern physics and technology. 







The electronic properties of semiconductors, as well as analogous properties
of metals and insulators, are defined in the first place by the character of
their energy spectrum. Let us consider a body whose lower energy band is
completely filled at absolute zero, whereas the upper one is completely empty.
We assume, however, that the spacing $\Delta\varepsilon$ between the filled and empty bands
is very small. In such a body there is a certain probability of exciting an
electron and of its transition into the empty band. At a temperature $T > \Delta\varepsilon/k$
the thermal energy is comparable with the width of the gap $\Delta\varepsilon$, and the
transition probability becomes of the order of unity. An electron which gets into
the empty band can move under the action of an external field and can take
part in conduction.

In addition to thermal excitation, other mechanisms giving rise to the 
transition of electrons from the filled into the empty energy band may exist. 
As an example photoexcitation can be mentioned, in which the necessary en- 
ergy is imparted to electrons by irradiating the sample. 

The transfer of electrons from the filled band is accompanied by the ap- 
pearance of vacancies in it. The presence of vacancies allows the possibility of 
electrons of the filled band changing energy states. A vacancy in the filled 
band is called a hole, and the above mentioned process of filling vacancies is 
said to be the motion of holes. In an external field the electrons transfer 
to the conduction band and the holes in the filled band move. 

Bodies with such an electron spectrum are called intrinsic semiconductors. 

Semiconductors containing impurities of other substances and thus called 
extrinsic semiconductors play a particularly important role in modern physics. 

Let the atoms of the impurities be introduced into the crystal lattice of a 
semiconductor. The number of these atoms is considered sufficiently small to 
make it possible to disregard the interaction between them. 

Suppose the energy level arising from the impurity lies very close to the
lower edge of the empty band. Then at $T \neq 0$ the thermal ionization of the
atoms of the impurity will occur with high probability. We stress that the
thermal ionization of the atoms of the impurity in semiconductors occurs at
substantially lower temperatures than thermal ionization in gases. The latter
usually occurs only at temperatures of the order of several thousand degrees,
when the energy of thermal motion becomes comparable with the binding
energy of electrons in atoms (the ionization energy). In semiconductors the
situation is fundamentally different, since in a continuous medium with
dielectric constant $\varepsilon$ the ionization energy decreases by a factor of $\varepsilon^2$, i.e.
150–250 times. Thus, for example, the ionization energy of hydrogen atoms
in germanium falls to about 0.016 eV ~ 150 K.

The impurity atoms which provide electrons in the empty band are called
donors, and the semiconductor described is said to be an n-type extrinsic
semiconductor. The index n emphasizes the fact that there are only negative 
free charges in the semiconductor. 

If the impurity atoms have an affinity for electrons and easily form nega- 
tive ions, the energy level arising from the impurity lies very close to the up- 
per edge of the valence band. Then electrons from the filled band pass to the 
levels of the impurity atoms. Such transitions are carried out at $T \neq 0$ with
considerable probability, if the spacing between the energy levels of the im- 
purity and the upper edge of the filled band is sufficiently small. 

The vacancies in the filled band ensure the possibility of motion of the 
holes in it and thus of charge transport in an external electric field. The im- 
purity atoms which can capture electrons from the filled band are called ac- 
ceptors. Semiconductors containing an acceptor impurity are called p-type 
extrinsic semiconductors. The index p emphasizes that the conductivity of 
the semiconductor is associated with the motion of positive holes in it (hole- 
type conductivity). 

In real semiconductors the intrinsic conductivity is very often comparable 
with the extrinsic conductivity (n- or p-type); the semiconductor contains 
both acceptor and donor impurities. Finally, a very important role is played 
by spatially nonuniform conductors in which one part contains mainly an ac- 
ceptor impurity, whereas the other part contains mainly a donor impurity. 
However, we shall confine ourselves to the consideration of the three limiting 
cases mentioned. 

In the case of a semiconductor with intrinsic conductivity, one can write 
the current density in the form 


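$$ \mathbf{j} = -e\sum_i f_i\mathbf{v}_i - e\sum_k f_k\mathbf{v}_k , $$ (68.1)

where the summation over $i$ refers to the quantum states of the empty band,
and the summation over $k$ to the states of the filled band.
We can write (68.1) in the form

$$ \mathbf{j} = -e\sum_i f_i\mathbf{v}_i - e\sum_k \mathbf{v}_k + e\sum_k (1 - f_k)\,\mathbf{v}_k = -e\sum_i f_i\mathbf{v}_i + e\sum_k (1 - f_k)\,\mathbf{v}_k = -e\sum_i f_n\mathbf{v}_i + e\sum_k f_p\mathbf{v}_k . $$ (68.2)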
i k P 







The sum $e\sum_k \mathbf{v}_k$ is equal to zero, since in the filled band the number of
particles moving with the field is equal to that moving against the field. Formula
(68.2) shows that the total current density is composed of the current of
carriers in the empty band and the current of carriers in the filled band. The first
are electrons with charge $(-e)$, the second are holes with effective charge
$(+e)$.

We have denoted the distribution function of the electrons of the empty
band by $f_n$ and the distribution function of holes in the filled band by $f_p$.

To find the probability of finding an electron in the state with energy $\varepsilon$
one can write the Fermi distribution

$$ f_n = \frac{1}{\exp[(\varepsilon - \mu)/kT] + 1} . $$ (68.3)

Analogously, for the probability that an electron is absent from the state $\varepsilon$,
i.e. that there is a hole, we have

$$ f_p = 1 - f_n = \frac{1}{\exp[(\mu - \varepsilon)/kT] + 1} . $$ (68.4)

Introducing the new variable $\varepsilon' = -\varepsilon - \Delta E$, where $\Delta E$ is the width of the
forbidden zone, and measuring the energy from the lower edge of the conduction
band, we rewrite (68.4) in the form

$$ f_p = \frac{1}{\exp[(\varepsilon' + \mu + \Delta E)/kT] + 1} . $$


The value of the chemical potential $\mu$ (often called the Fermi level) is
defined by the requirement of the principle of detailed balance

$$ \int f_n\,d\mathbf{p} = \int f_p\,d\mathbf{p}' , $$ (68.5)


so that the semiconductor as a whole remains electrically neutral. 
In most intrinsic semiconductors the number of electrons in the empty 
band is so small that the electron gas is non-degenerate. Hence one can write 


$$ f_n \approx e^{(\mu - \varepsilon)/kT} . $$ (68.6)

Correspondingly, for the distribution function of holes we have

$$ f_p \approx e^{-(\mu + \Delta E + \varepsilon')/kT} . $$ (68.7)

Substituting (68.7) and (68.6) into (68.5), we find

$$ \frac{2e^{\mu/kT}}{h^3}\int e^{-\varepsilon/kT}\,4\pi p^2\,dp = \frac{2e^{-(\mu + \Delta E)/kT}}{h^3}\int e^{-\varepsilon'/kT}\,4\pi p'^2\,dp' $$

or

$$ \mu = -\tfrac{1}{2}\Delta E - \tfrac{3}{4}kT\ln(m_n/m_p) . $$ (68.8)

Correspondingly, the distribution functions of electrons and holes take the
forms

$$ f_n = \left(\frac{m_p}{m_n}\right)^{3/4} e^{-\Delta E/2kT}\,e^{-\varepsilon/kT} , $$ (68.9)

$$ f_p = \left(\frac{m_n}{m_p}\right)^{3/4} e^{-\Delta E/2kT}\,e^{-\varepsilon'/kT} . $$ (68.9')

Here we have assumed that the masses of the electrons in the conduction
band and the masses of holes have different values $m_n$ and $m_p$.
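
The neutrality condition that fixes the Fermi level of an intrinsic semiconductor can be checked numerically. The sketch below takes the result in the form μ = -ΔE/2 - (3/4) kT ln(m_n/m_p) and verifies that the non-degenerate electron and hole densities then coincide. Densities are expressed in units of 2(2π m_0 kT/h²)^{3/2}, with masses in units of an arbitrary reference mass m_0; this normalisation, the function names and the numerical values are assumptions made for the illustration.

```python
import math


def intrinsic_chemical_potential(gap, kT, m_n, m_p):
    """mu = -gap/2 - (3/4) kT ln(m_n / m_p), measured from the conduction band edge."""
    return -0.5 * gap - 0.75 * kT * math.log(m_n / m_p)


def carrier_densities(mu, gap, kT, m_n, m_p):
    """Non-degenerate electron and hole densities in reduced units."""
    n = m_n ** 1.5 * math.exp(mu / kT)
    p = m_p ** 1.5 * math.exp(-(mu + gap) / kT)
    return n, p


if __name__ == "__main__":
    gap, kT = 0.7, 0.025   # eV; hypothetical gap and thermal energy
    m_n, m_p = 0.5, 0.3    # hypothetical effective masses in units of m_0
    mu = intrinsic_chemical_potential(gap, kT, m_n, m_p)
    n, p = carrier_densities(mu, gap, kT, m_n, m_p)
    print("mu =", mu, "eV;  n =", n, " p =", p)
```

For equal masses the level sits exactly in the middle of the forbidden zone; unequal masses shift it towards the band with the lighter carriers by an amount of order kT.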

Let us now turn to the case of an extrinsic semiconductor. For definiteness 
we shall assume that the semiconductor contains impurity atoms of only the 
donor type. 

Let $N_D$ be the total number of such atoms per unit volume of the
semiconductor, and $(-E_D)$ the energy level measured from the lower edge of the
conduction band. Then the condition for electric neutrality can be written in
the form

$$ N_D = \int f_n\,\frac{dN}{d\varepsilon}\,d\varepsilon , $$ (68.10)

where the integration is carried out over the conduction band ($\varepsilon > 0$) as well
as over the impurity atoms.
For the latter one can obviously write

$$ \frac{dN}{d\varepsilon} \sim N_D\,\delta(\varepsilon + E_D) , $$

where $\delta(\varepsilon + E_D)$ is the delta-function of the energy. Then

$$ N_D = N_D\int_{-\infty}^{0} [e^{(\varepsilon - \mu)/kT} + 1]^{-1}\,\delta(\varepsilon + E_D)\,d\varepsilon + \frac{8\pi}{h^3}\int_0^\infty [e^{(\varepsilon - \mu)/kT} + 1]^{-1}\,p^2\,dp $$
$$ \phantom{N_D} = N_D\{\exp[-(E_D + \mu)/kT] + 1\}^{-1} + \frac{8\pi}{h^3}\int_0^\infty (e^{(\varepsilon - \mu)/kT} + 1)^{-1}\,p^2\,dp . $$ (68.11)

If the number of electrons in the conduction band is small enough so that the
electron gas may be considered non-degenerate, then

$$ N_D = N_D\{\exp[-(E_D + \mu)/kT] + 1\}^{-1} + \frac{8\pi e^{\mu/kT}}{h^3}\int_0^\infty e^{-\varepsilon/kT}\,p^2\,dp $$
$$ \phantom{N_D} = N_D\{\exp[-(E_D + \mu)/kT] + 1\}^{-1} + 2\left(\frac{2\pi mkT}{h^2}\right)^{3/2} e^{\mu/kT} . $$ (68.12)

The condition (68.12) is quadratic with respect to $e^{\mu/kT}$ and makes it possible
to express the chemical potential easily in terms of the characteristic
quantities $E_D$ and $N_D$. Solving eq. (68.12) gives

$$ \mu = -E_D + kT\ln\tfrac{1}{2}\left\{\left[1 + \frac{2N_D h^3\exp(E_D/kT)}{(2\pi mkT)^{3/2}}\right]^{1/2} - 1\right\} . $$

For

$$ \frac{N_D}{(2\pi mkT/h^2)^{3/2}}\,\exp(E_D/kT) \gg 1 $$

we have

$$ \mu \approx -\tfrac{1}{2}E_D + \tfrac{1}{2}kT\ln\frac{N_D}{2(2\pi mkT/h^2)^{3/2}} . $$ (68.13)







Hence we obtain for the distribution function of electrons in the conduction
band

$$ f_n = \left[\frac{N_D}{2(2\pi mkT/h^2)^{3/2}}\right]^{1/2}\exp\left(-\frac{E_D + 2\varepsilon}{2kT}\right) . $$ (68.14)

The number of conduction electrons increases with temperature according to
the law $\exp(-E_D/2kT)$.
When the inverse inequality

$$ \frac{N_D}{(2\pi mkT/h^2)^{3/2}}\,\exp(E_D/kT) \ll 1 $$

is fulfilled, it can easily be shown that

$$ \mu = kT\ln\frac{N_D}{2(2\pi mkT/h^2)^{3/2}} . $$ (68.15)

It can easily be seen that this case corresponds to the position of the Fermi
level ($\mu$) being below the impurity level. Then all the atoms of the impurity
must be fully ionized. The distribution function of electrons in the conduction
band takes the form

$$ f_n = \frac{N_D}{2(2\pi mkT/h^2)^{3/2}}\,e^{-\varepsilon/kT} . $$ (68.16)

The total number of electrons in the conduction band is then equal to $N_D$, so
that there is saturation. A further increase in the number of free electrons
with increasing temperature is possible only by means of the transfer of
electrons from the filled band.
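
The two limiting regimes can be compared with the exact root of the quadratic neutrality condition. The sketch below assumes the condition in the non-degenerate form N_D = N_D / {exp[-(E_D + μ)/kT] + 1} + N_c exp(μ/kT), where N_c stands for 2(2πmkT/h²)^{3/2} and is passed in directly as a number. The symbol N_c, the function names and the sample values are illustrative assumptions; the square root is rearranged to avoid cancellation when the donors are almost fully ionized.

```python
import math


def donor_chemical_potential(E_D, kT, N_D, N_c):
    """Exact root of the quadratic in x = exp(mu/kT); returns mu."""
    a = math.exp(-E_D / kT)
    y = 4.0 * N_D / (N_c * a)
    # x = (a/2) * (sqrt(1 + y) - 1), rewritten in a cancellation-free form
    x = 2.0 * N_D / (N_c * (math.sqrt(1.0 + y) + 1.0))
    return kT * math.log(x)


def neutrality_residual(mu, E_D, kT, N_D, N_c):
    """Relative mismatch of the neutrality condition for a given mu."""
    on_donors = N_D / (math.exp(-(E_D + mu) / kT) + 1.0)
    in_band = N_c * math.exp(mu / kT)
    return (on_donors + in_band - N_D) / N_D


def mu_low_temperature(E_D, kT, N_D, N_c):
    """Limit N_D exp(E_D/kT) / N_c >> 1: mu = -E_D/2 + (kT/2) ln(N_D/N_c)."""
    return -0.5 * E_D + 0.5 * kT * math.log(N_D / N_c)


def mu_saturation(kT, N_D, N_c):
    """Opposite limit (all donors ionized): mu = kT ln(N_D/N_c)."""
    return kT * math.log(N_D / N_c)


if __name__ == "__main__":
    # Hypothetical values: energies in eV, densities in cm^-3.
    print(donor_chemical_potential(0.05, 0.002, 1e16, 1e18),
          mu_low_temperature(0.05, 0.002, 1e16, 1e18))
    print(donor_chemical_potential(0.01, 0.025, 1e16, 1e19),
          mu_saturation(0.025, 1e16, 1e19))
```

In a real calculation N_c itself varies as T^{3/2}; it is kept as an independent input here only to keep the sketch short.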

The distribution functions (68.15) and (68.16) contain the unknown quantities
$N_D$, $E_D$ and $m$, which are found from experimental data.

In the case of impurities of the acceptor type, all the calculations are
carried out in exactly the same way. The resulting formulae are obtained by
the replacement $N_D \to N_A$, $E_D \to E_A$, where the index A refers to the acceptor.

If the semiconductor contains acceptor and donor impurities at the same
time, and if the width of the forbidden zone $\Delta\varepsilon$ is comparable with $E_D$ or $E_A$,
then cumbersome expressions are obtained for the distribution functions.
However, in principle they do not differ from the limiting expressions
considered.


§69. The electrical conductivity and the Hall effect in semiconductors 


To calculate the electrical conductivity and the Hall constant in
semiconductors use can be made of the corresponding formulae of §58 and §59,
which are of a general character. The specific characteristic of semiconductors
is the possibility of the existence of current carriers of two signs.
Furthermore, in most cases the electron gas and the hole gas in a semiconductor can
both be considered non-degenerate. Then (58.3) gives

$$ j = -\frac{16\pi m e^2 E}{3h^3}\,\frac{n_{\rm eff}}{2(2\pi mkT/h^2)^{3/2}}\int \lambda_v\,\varepsilon\,\frac{\partial}{\partial\varepsilon}\,e^{-\varepsilon/kT}\,d\varepsilon . $$

If the mean free path $\lambda_v$ is assumed to be independent of energy and equal to
$\lambda_0$, then

$$ j = \frac{4}{3}\,\frac{e^2\lambda_0 n_{\rm eff}}{(2\pi mkT)^{1/2}}\,E . $$ (69.1)


Correspondingly, we find for the electrical conductivity of a semiconductor

$$ \sigma = \frac{4}{3}\,\frac{e^2\lambda_0 n_{\rm eff}}{(2\pi mkT)^{1/2}} . $$ (69.2)


Since $n_{\rm eff}$ depends on the temperature basically according to an exponential
law, the electrical conductivity in semiconductors will be

$$ \sigma \sim \exp(-\mathrm{const}/T) . $$ (69.3)

In front of the exponent there are factors containing different powers of the
temperature. These powers are different for intrinsic and extrinsic
semiconductors. However, they only affect the general temperature dependence of $\sigma$
slightly.
We can also write the conductivity in the form

$$ \sigma = \alpha\,\frac{e^2\lambda_0 n_{\rm eff}}{m\bar{v}} = \alpha\,\frac{e^2\tau}{m}\,n_{\rm eff} , $$ (69.4)

where $\bar{v}$ is the mean velocity of thermal motion, $\tau = \lambda_0/\bar{v}$ is the relaxation
time, and $\alpha$ is a numerical coefficient.
In the case of two kinds of carriers, electrons and holes, the expression for
electrical conductivity can be written

$$ \sigma = \alpha e^2\left(\frac{\lambda_0^{\rm (el)}\,n_{\rm eff}^{\rm (el)}}{m_{\rm el}\bar{v}_{\rm el}} + \frac{\lambda_0^{\rm (hole)}\,n_{\rm eff}^{\rm (hole)}}{m_{\rm hole}\bar{v}_{\rm hole}}\right) . $$ (69.5)


Measurements of the Hall effect play an important role in determining the 
sign of current carriers in semiconductors. In calculating the Hall constant use 
can again be made of the general formulae of §59. From (59.14) we find 








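$$ R = -\frac{L_2}{L_1}\,\frac{1}{H\sigma(H)} , $$ (69.6)

where

$$ \frac{L_2}{L_1} = \frac{\displaystyle\int \frac{\lambda_v(\lambda_v\omega_H/v)}{1 + (\lambda_v\omega_H/v)^2}\,\frac{\partial f_0}{\partial\varepsilon}\,\varepsilon\,d\varepsilon}{\displaystyle\int \frac{\lambda_v}{1 + (\lambda_v\omega_H/v)^2}\,\frac{\partial f_0}{\partial\varepsilon}\,\varepsilon\,d\varepsilon} . $$

For real fields in semiconductors this relation can always be considered to be
equal to

$$ \frac{L_2}{L_1} = \frac{\displaystyle\int \lambda_v\,\frac{\lambda_v\omega_H}{v}\,\frac{\partial f_0}{\partial\varepsilon}\,\varepsilon\,d\varepsilon}{\displaystyle\int \lambda_v\,\frac{\partial f_0}{\partial\varepsilon}\,\varepsilon\,d\varepsilon} = \frac{eH\lambda_0}{4mc}\left(\frac{2\pi m}{kT}\right)^{1/2} . $$ (69.7)

In the first approximation, which is well fulfilled for semiconductors,

$$ \sigma(H) = \sigma(0) = \sigma , $$

where $\sigma(0) = \sigma$ is the electrical conductivity in the absence of a field as given
by formula (69.2).
From (69.6), (69.7) and (69.2) we find the following value for the Hall
constant:

$$ R = -\frac{3\pi}{8}\,\frac{1}{ec\,n_{\rm eff}} . $$ (69.8)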
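
The numerical factor 3π/8 follows from simple algebra, which the sketch below checks with arbitrary test numbers. It assumes the conductivity in the form σ = (4/3) e² λ₀ n_eff / (2πmkT)^{1/2}, the weak-field ratio L₂/L₁ = (eHλ₀/4mc)(2πm/kT)^{1/2}, and R = -(L₂/L₁)/(Hσ) for electrons. The function names are hypothetical and the input values carry no physical meaning; only the dimensionless product R·e·c·n_eff is of interest.

```python
import math


def conductivity(e, lam0, n_eff, m, kT):
    """sigma = (4/3) e^2 lam0 n_eff / sqrt(2 pi m kT), constant mean free path."""
    return (4.0 / 3.0) * e ** 2 * lam0 * n_eff / math.sqrt(2.0 * math.pi * m * kT)


def l2_over_l1(e, H, lam0, m, c, kT):
    """Weak-field ratio (e H lam0 / 4 m c) * sqrt(2 pi m / kT)."""
    return e * H * lam0 / (4.0 * m * c) * math.sqrt(2.0 * math.pi * m / kT)


def hall_constant(e, H, lam0, n_eff, m, c, kT):
    """R = -(L2/L1) / (H sigma) for carriers of charge -e."""
    ratio = l2_over_l1(e, H, lam0, m, c, kT)
    return -ratio / (H * conductivity(e, lam0, n_eff, m, kT))


if __name__ == "__main__":
    # Arbitrary positive test numbers, not physical data.
    e, H, lam0, n_eff, m, c, kT = 1.3, 0.7, 2.1, 5.0, 0.9, 3.0, 0.4
    R = hall_constant(e, H, lam0, n_eff, m, c, kT)
    print(R * e * c * n_eff, -3.0 * math.pi / 8.0)
```

The mean free path, the field and the temperature all cancel, so a measurement of R gives the carrier density directly, up to this numerical factor.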





This is the value of R for a semiconductor with one type of current carrier. If 
both electrons and holes take part in the conductivity, then, obviously, one 
can write for R 


$$ R = -\frac{3\pi}{8c|e|}\left(\frac{1}{n_{\rm eff}^{\rm el}} - \frac{1}{n_{\rm eff}^{\rm hole}}\right) . $$ (69.9)


The effective numbers of electrons and holes depend on the temperature, so 
that the Hall constant is not only a function of the temperature, but under 
certain conditions may change sign. 

The relations obtained are valid only for semiconductors in which the 
number of carriers per cm³ is small and so can be considered as a classical ideal
gas. For some semiconductors, in particular for germanium, this assumption 
is not fulfilled. In this case the carriers must be considered as a weakly degen- 
erate Fermi gas. In other words, the deviation of the distribution function 
from Maxwellian must be taken into account. We shall not dwell on a consid- 
eration of such semiconductors. Neither can we elucidate a number of effects, 
very important in practice, associated with the behaviour of semiconductors 
which are non-uniform in their properties. The theory of these effects is given 
in the specialist literature*. 


* See R.D.Middlebrook, An introduction to junction transistor theory (Wiley, New 
York, 1957); W.Shockley, Electrons and holes in semiconductors, with application to 
transistor electronics (Van Nostrand, New York, 1950); A.F.Ioffe, Physik der Halbleiter
(translation from Russian) (Akademie-Verlag, Berlin, 1958). 







Interaction of Radiation 


with a Free-Electron Gas 


§70. Low-density plasma in a low-frequency radiation field 


Recent years have seen the discovery of a number of important astronomi- 
cal objects that are the sources of extremely intense radiation. It turns out 
that the characteristic radiation spectrum of these objects is essentially
different from the equilibrium (Planck) distribution, and has its intensity maximum
in the region of radio or infrared waves. These objects, moreover, are sur- 
rounded by clouds (atmospheres) of a very low density ionised plasma. 
Apart from these powerful extra-terrestrial sources, radiating systems of a 
similar type have been developed under terrestrial conditions. These include 
lasers and modern UHF transmitters, where high-intensity radiation is created 
in the optical or radio wavebands. 

Therefore the behaviour of a low-density plasma in a field of high-density 
radiation has lately caused a great deal of interest. It emerged that not even 
such a simple system as that of free electrons in a radiation field was paid suf- 
ficiently close attention. On the other hand, a kinetic approach has just per- 
mitted a relatively complete investigation of the properties of this system. 
We shall therefore analyze in detail the behaviour of a free-electron gas in a 
radiation field*. 




We consider a low-density plasma in a field of intense, low-frequency 
radiation. We restrict ourselves from the outset to the regime of non- 
relativistic electron energies. 

Any interaction of radiation with the heavy nuclei is obviously unimpor- 
tant as compared with direct interaction with electrons. Clearly, the condi- 
tion of electrical neutrality of the plasma cannot be violated: every change in 
the distribution of electron density induces a corresponding displacement of 
nuclei. If, however, the motion of the electrons is caused by a radiation field
with a frequency that is large compared with the Langmuir frequency of the
plasma, $\omega_0$, then the collective motion of nuclei and electrons turns out to be
insignificant. (More precisely, it can be shown that the frequencies $\omega$ of the
radiation field must be greater than $\omega_0 c/u$, where $u$ is the thermal velocity of
electrons.) With motion in this type of field, the electrons acquire a certain
independence of the nuclei, and interact with the radiation as free particles.
It is therefore sufficient for our purposes to consider an idealised system:
a low-density electron gas in a radiation field. We shall ignore the more
subtle effects associated with the collective motions in the plasma.

In examining the behaviour of free electrons in a low-frequency radiation
field, two approaches are available. First, it is possible to perform the classical
calculation for the motion of individual electrons in a wave field, and then to
consider the statistical behaviour of a gas composed of these electrons. The
second possibility is to discuss from the outset the behaviour of a system
consisting of a free-electron gas and a photon gas. The properties of this
system in the classical approximation can be obtained by taking the classical
limit $\hbar \to 0$.

The classical analysis is possible and valid in a region of sufficiently low 
frequencies and high radiation densities. Here the occupation numbers n(k) of 
photons (cf. §101 of Part V) are large, and, as usual for large quantum num- 
bers, the classical limit can be taken in the quantum mechanical formulae 
(see § 100 of Part V). 

* The contents of this chapter are based on the following papers: J. Zeldowich and
E. Levich, Soviet Physics JETP 55 (1968) 2423; E. Levich, Soviet Physics JETP 60
(1971) 112; T. Peraud, J. de Phys. 29 (1968) 88, 306, 872; A. Kompaneez, Soviet Phys.
JETP 91 (1960) 876; T. Zeldowich and E. Levich, Soviet Physics JETP Letters 11
(1970) 57.

It should be emphasised, first of all, that the motion of an electron in a
spectral radiation field (composed of a continuous set of frequencies and
wave vectors) is essentially different from its motion in the field of an
individual monochromatic wave. To illustrate this, consider the simple example
of motion of an electron in a field of two waves which have the same frequency
but differ in phase. The equation of motion of a non-relativistic electron
can be written in the form

$$ m\frac{d\mathbf{v}}{dt} = e\left\{\mathbf{E}_1 + \mathbf{E}_2 + \frac{1}{c}\,[\mathbf{v}\times(\mathbf{H}_1+\mathbf{H}_2)]\right\}. \tag{70.1} $$


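We solve (70.1) by successive approximations; that is, we put

$$ \mathbf{v} = \mathbf{v}_0 + \mathbf{v}_1, \qquad |\mathbf{v}_1| \ll |\mathbf{v}_0|, $$

$$ m\frac{d\mathbf{v}_0}{dt} = e(\mathbf{E}_1+\mathbf{E}_2), \qquad m\frac{d\mathbf{v}_1}{dt} = \frac{e}{c}\,[\mathbf{v}_0\times(\mathbf{H}_1+\mathbf{H}_2)]. \tag{70.2} $$

If we represent the field as

$$ \mathbf{E}_1 = \mathbf{E}_0 e^{i\omega t} + \text{complex conjugate}, \tag{70.3} $$

$$ \mathbf{E}_2 = \mathbf{E}_0 e^{i\omega t + i\alpha} + \text{complex conjugate}, \tag{70.4} $$

$$ \mathbf{H}_1 = \mathbf{E}_1\times\mathbf{n}_1, \tag{70.5} $$

$$ \mathbf{H}_2 = \mathbf{E}_2\times\mathbf{n}_2, \tag{70.6} $$

we immediately obtain

$$ \mathbf{v}_0 = \frac{e\,(\mathbf{E}_0 e^{i\omega t} + \mathbf{E}_0 e^{i\omega t+i\alpha})}{i\omega m} + \text{complex conjugate}. \tag{70.7} $$

For the average (over a period) of $d\mathbf{v}_1/dt$ we have

$$ \overline{\frac{d\mathbf{v}_1}{dt}} = \frac{e}{mc}\,\overline{[\mathbf{v}_0\times(\mathbf{H}_1+\mathbf{H}_2)]} \sim \frac{2e^2E_0^2}{\omega c m^2}\,\sin\alpha, \tag{70.8} $$

where only the cross terms between the two waves ($\mathbf{E}_1$ with
$\mathbf{H}_2$ and $\mathbf{E}_2$ with $\mathbf{H}_1$) survive the averaging.

It is clear that if the phase shift $\alpha$ is non-zero, the electron
experiences a mean constant (systematic) force of order
$2e^2E_0^2\sin\alpha/\omega c m$ (the mass times the mean acceleration (70.8)).
In contrast with the periodic motion in the field of a single wave, the electron
now executes both oscillations and systematic motion. Thus an electron
systematically accumulates energy. The electron performs strictly periodic
motion only when the two waves are parallel, or when the total magnetic field
$(\mathbf{H}_1+\mathbf{H}_2)$ has a phase shift of $\tfrac{1}{2}\pi$ relative
to the total electric field $(\mathbf{E}_1+\mathbf{E}_2)$. It should be stressed
that the electron interacts simultaneously with the two waves, and therefore
the constant force is proportional to the product of the field intensities.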
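The two-wave example can be checked numerically. The following is a minimal sketch, not taken from the original text: it assumes units with $e = m = c = \omega = 1$, equal real amplitudes $E_0$, a common linear polarisation along the y axis, and propagation directions $\mathbf{n}_1$, $\mathbf{n}_2$ in the x-z plane. The function name `mean_drift_acceleration` and all parameter names are hypothetical. It averages the right-hand side of the second equation in (70.2) over one period.

```python
import numpy as np

def mean_drift_acceleration(alpha, n1, n2, E0=1.0, e=1.0, m=1.0, c=1.0,
                            omega=1.0, steps=4096):
    """Period average of dv1/dt = (e/mc) [v0 x (H1 + H2)], cf. (70.2).

    Assumption: both waves are linearly polarised along y, so n1 and n2
    must lie in the x-z plane.
    """
    pol = np.array([0.0, 1.0, 0.0])
    n1 = np.asarray(n1, dtype=float)
    n2 = np.asarray(n2, dtype=float)
    t = np.linspace(0.0, 2.0 * np.pi / omega, steps, endpoint=False)
    # Real fields: E0 exp(i w t) + c.c. = 2 E0 cos(w t)
    E1 = 2.0 * E0 * np.cos(omega * t)[:, None] * pol
    E2 = 2.0 * E0 * np.cos(omega * t + alpha)[:, None] * pol
    # H = E x n, as in (70.5) and (70.6)
    H = np.cross(E1, n1) + np.cross(E2, n2)
    # Oscillatory solution of m dv0/dt = e (E1 + E2)
    v0 = (2.0 * e * E0 / (m * omega)) * (
        np.sin(omega * t) + np.sin(omega * t + alpha))[:, None] * pol
    return (e / (m * c)) * np.cross(v0, H).mean(axis=0)
```

Under these assumptions the average is proportional to $\sin\alpha\,(\mathbf{n}_2 - \mathbf{n}_1)$, so it vanishes both for zero phase shift and for parallel waves, in line with the statement above.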

If we now consider the real spectral field, composed of an infinite set of
harmonics $\omega_j$, it is apparent that the motion of an electron in a
radiation field becomes a statistical problem. The only significant quantities
are averages over all values of the random phases. Thus, even the motion of a
single electron in a spectral electromagnetic field requires statistical
treatment. The classical approach necessitates the superposition of an infinite
number of frequencies, followed by averaging with respect to random phases.
This is fairly cumbersome. Therefore we turn to the second of the above
approaches and consider from the very beginning the behaviour of the quantum
statistical system (free electron gas and photon gas).

In a system consisting of free electrons and photons there are two funda- 
mental interaction processes: (1) Compton scattering, and (2) free—free tran- 
sitions (bremsstrahlung or absorption of photons in collisions between elec- 
trons). This system (electron gas + photon gas) also involves radiation-free 
collisions between electrons. In a real plasma these processes are supplemented 
by collisions of electrons with nuclei and the corresponding free—free radia- 
tion transitions, the latter being more significant than free—free transitions in 
the case of electron collisions. 

The Compton scattering of a photon by a free electron is accompanied by
changes both in the wave vector and in the frequency of the photon. It is
simplest to describe the change of these quantities in a reference frame $K_0$
in which the electron was initially at rest. In this reference frame we can
express the change of momentum of an electron as

$$ \Delta\mathbf{p}_0 = \frac{\hbar\omega_0}{c}(\mathbf{l}-\mathbf{l}') + \frac{\hbar}{c}(\omega_0-\omega_0')\,\mathbf{l}' = \Delta\mathbf{p}_1 + \Delta\mathbf{p}_2, \tag{70.9} $$

where

$$ \Delta\mathbf{p}_1 = \frac{\hbar\omega_0}{c}(\mathbf{l}-\mathbf{l}'); \tag{70.10} $$

$\mathbf{l}$ and $\mathbf{l}'$ are respectively unit vectors in the directions
of propagation of the incident and scattered photons, and $\omega_0$ and
$\omega_0'$ are the corresponding frequencies. The quantity $\Delta\mathbf{p}_1$
represents the change in momentum of an electron due to scattering without a
change in photon frequency, i.e., due to a simple change in its direction of
motion. The quantity $\Delta\mathbf{p}_2$ defines the change in momentum of an
electron due to transfer to it of energy from a photon in the act of inelastic
scattering. Using the expression for the change in frequency in the Compton
effect (cf. eq. (17.11) of Part II), we can write

$$ \omega_0' - \omega_0 = \Delta\omega_0 = -\frac{\hbar\omega_0^2}{mc^2}\,(1-\mathbf{l}\cdot\mathbf{l}'), \tag{70.11} $$

from which it follows that

$$ \Delta\mathbf{p}_2 = \frac{\hbar\omega_0}{c}\,\frac{\hbar\omega_0}{mc^2}\,(1-\mathbf{l}\cdot\mathbf{l}')\,\mathbf{l}'. \tag{70.12} $$

Obviously the ratio

$$ \frac{|\Delta\mathbf{p}_2|}{|\Delta\mathbf{p}_1|} \sim \frac{\hbar\omega}{mc^2} \ll 1 $$

is very small in the non-relativistic approximation.
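To get a feeling for the size of this ratio, the kinematic relations (70.10) to (70.12) can be evaluated directly. The sketch below is illustrative only: the function names are hypothetical, photon energies are assumed to be given in eV, and the electron rest energy is taken as 510 998.95 eV. The scattering angle must be non-zero (`cos_theta` strictly less than 1), since otherwise $\Delta\mathbf{p}_1$ vanishes.

```python
import math

MC2_EV = 510_998.95  # electron rest energy m c^2 in eV

def compton_shift_fraction(photon_energy_ev, cos_theta):
    """Relative frequency change (omega_0' - omega_0) / omega_0 from (70.11)."""
    return -(photon_energy_ev / MC2_EV) * (1.0 - cos_theta)

def momentum_ratio(photon_energy_ev, cos_theta):
    """|Delta p_2| / |Delta p_1| from (70.10) and (70.12).

    Both momenta are expressed in units of hbar*omega_0/c.
    """
    dp1 = math.sqrt(2.0 * (1.0 - cos_theta))               # |l - l'|
    dp2 = (photon_energy_ev / MC2_EV) * (1.0 - cos_theta)  # |Delta p_2|
    return dp2 / dp1
```

For back-scattering (`cos_theta = -1`) the ratio is exactly $\hbar\omega/mc^2$, which for optical or radio photons is many orders of magnitude below unity.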
We note further that in a laboratory reference frame the change in frequency
of a photon due to scattering by an electron with momentum $\mathbf{p}$ has
the form

$$ \Delta\omega \approx \omega_0'\left(1+\frac{\mathbf{p}\cdot\mathbf{l}'}{mc}\right) - \omega_0\left(1+\frac{\mathbf{p}\cdot\mathbf{l}}{mc}\right) \approx -\frac{\hbar\omega^2}{mc^2}\,(1-\mathbf{l}\cdot\mathbf{l}') - \frac{\omega}{mc}\,\mathbf{p}\cdot(\mathbf{l}-\mathbf{l}'). \tag{70.13} $$

In the same non-relativistic approximation, we can write $\omega_0 = \omega$.
The probability of Compton scattering of a photon with wave vector variation
$\mathbf{k}\to\mathbf{k}'$ into the solid angle $d\Omega$ can be written as

$$ dw_{\mathbf{k}\to\mathbf{k}'} = cN_e\,[1+n(\mathbf{k}')]\,n(\mathbf{k})\,d\sigma, \tag{70.14} $$

where $n(\mathbf{k})$ is the number of photons with wave vector $\mathbf{k}$
and $N_e$ is the number of electrons per unit volume. The factor
$[1+n(\mathbf{k}')]$ (cf. eq. (103.7) of Part V) is connected with the Bose
statistics of photons. In the long-wave approximation, which is relevant for
our purposes, the effective cross section $d\sigma$ in the rest frame $K_0$ is
given by the Thomson formula (cf. §36 of Part I)

$$ d\sigma = \sigma\,d\Omega = \left(\frac{e^2}{mc^2}\right)^2 \tfrac{1}{2}(1+\cos^2\alpha)\,d\Omega, \tag{70.15} $$

where $\alpha$ is the angle of scattering.
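Integrating the Thomson formula (70.15) over all scattering directions gives the total cross section $\frac{8}{3}\pi(e^2/mc^2)^2$ used later in the chapter. The short sketch below checks this numerically; it is an illustration only, with a hypothetical function name, a simple midpoint rule in $\mu = \cos\alpha$, and the classical electron radius $r_e = e^2/mc^2$ passed in as a parameter (set to 1 by default).

```python
import math

def thomson_total_cross_section(r_e=1.0, n=20000):
    """Integrate r_e^2 * (1 + cos^2 a) / 2 over the full solid angle.

    The azimuthal integral gives 2*pi; the polar integral is done with a
    midpoint rule in mu = cos(a) on [-1, 1].
    """
    h = 2.0 / n
    polar = sum(0.5 * (1.0 + (-1.0 + (i + 0.5) * h) ** 2) for i in range(n)) * h
    return 2.0 * math.pi * r_e ** 2 * polar
```

With `r_e = 1` the result agrees with $8\pi/3$ to within the accuracy of the quadrature.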

We see that it is possible to distinguish between two types of scattering:
spontaneous, proportional to the number of photons $n(\mathbf{k})$, and induced,
proportional to $n(\mathbf{k})\,n(\mathbf{k}')$. In the classical limit of large occupation numbers the
probability of induced scattering is much greater than that of spontaneous 
scattering. This is a general relationship between spontaneous and induced 
processes. The latter always correspond to the classical behaviour of a system. 
It follows that, in the classical limit discussed above, the mean motion of an 
electron in a spectral radiation field can be found by considering the induced 
scattering of photons by the electron. 

Bearing in mind the classical limit $\hbar \to 0$, we replace the number of
photons corresponding to a given wave vector by their spectral distribution
$\rho(\omega, \mathbf{l})$. In accordance with eq. (76.9) of Part III, this
spectral distribution can be related to the number of photons by the formula

$$ \rho(\omega,\mathbf{l}) = \frac{\hbar\omega^3}{\pi^2c^3}\,n(\omega,\mathbf{l}). \tag{70.16} $$

The spectral distribution determines the mean radiation energy at a given
frequency. Obviously $\rho(\omega, \mathbf{l})$ is independent of the value of
Planck's constant $\hbar$. We see that in the classical limit

$$ dw \approx cN_e\,\frac{\pi^4c^6}{\hbar^2\omega^6}\,\rho(\omega,\mathbf{l})\,\rho(\omega,\mathbf{l}')\,d\sigma. \tag{70.17} $$

It should be emphasised that the number of photons remains unchanged in
the scattering process.
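The statement that the spectral distribution does not depend on Planck's constant when the occupation numbers are large can be illustrated with a familiar special case. The sketch below is an assumption-laden example, not part of the original argument: it takes the Planck occupation number at low frequency ($\hbar\omega \ll kT$) as the large-$n$ distribution, uses arbitrary units with $c = 1$, and the function names are hypothetical. Relation (70.16) then reproduces the classical Rayleigh-Jeans value $\omega^2 kT/\pi^2c^3$ whatever value is chosen for $\hbar$.

```python
import math

def planck_occupation(omega, kT, hbar):
    """Equilibrium occupation number 1 / (exp(hbar*omega/kT) - 1)."""
    return 1.0 / math.expm1(hbar * omega / kT)

def spectral_density(omega, n, hbar, c=1.0):
    """rho = hbar * omega^3 * n / (pi^2 c^3), relation (70.16)."""
    return hbar * omega ** 3 * n / (math.pi ** 2 * c ** 3)
```

For example, with `omega = kT = 1` the values obtained for `hbar = 1e-4` and `hbar = 1e-6` both lie very close to $1/\pi^2$.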

Free-free bremsstrahlung represents the radiation or absorption of photons
by an electron which is moving in the field of a nucleus or of another
electron. The probability $W_{ff}$ of a bremsstrahlung process is proportional
to the square of the density $N_e$ of electrons, and has the form*

$$ W_{ff} \sim N_e^2/\omega^2. \tag{70.18} $$

* See, for example, W. Heitler, Quantum theory of radiation (Clarendon Press,
Oxford, 1954).

The probability of bremsstrahlung processes in the case of a low-density
electron gas and high radiation density is, in general, much smaller than the
probability of induced Compton scattering. For this reason we shall neglect
bremsstrahlung except at low frequencies. Since $W_{ff}$ increases very rapidly
with decreasing frequency, bremsstrahlung processes become predominant when
the frequency is sufficiently small. Hence we have a second limitation on the
frequency range under consideration, namely

$$ \omega > \omega_1, \tag{70.19} $$


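where $\omega_1$ is the frequency at which (for fixed densities of the electron
and photon gases) the change in the photon distribution function due to
bremsstrahlung becomes greater than that due to the Compton effect. Since
$W_{ff}$ is proportional to $N_e^2$ while the Compton probability (70.17) is
proportional to $N_e$ and to the radiation density, the frequency $\omega_1$
increases with increasing density $N_e$ and decreases with increasing
$\rho(\omega, \mathbf{l})$.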

We have assumed so far that the low-density plasma interacts with radiation
of a known spectral distribution $\rho(\omega, \mathbf{l})$. It is however
possible to formulate the problem in a different way, namely, to ask how the
spectral distribution function of the radiation changes as a result of its
interaction with the electron gas. For example, in the case of interaction with
an electron gas which is in an equilibrium state, the spectral radiation
distribution changes (evolves) with time, while the state of the electrons
remains practically invariant. This process will also be discussed below, with
the same approximations as before.


§71. Kinetic equations for electrons and photons 


First of all we consider the formulation of the kinetic equations for
electrons and photons. In the kinetic equation

$$ \frac{df}{dt} = I, \tag{71.1} $$

where $f = f(\mathbf{p}, t)$ is the distribution function for electrons, the
collision integral can be expressed as

$$ I = I_{ee} + I_C. \tag{71.2} $$

Here $I_{ee}$ is the collision integral for collisions between electrons, and
is given by the Landau formula (34.7).

The change in distribution function of electrons due to Compton scattering
is determined by the integral $I_C$. In what follows we shall assume that the
system is homogeneous in space. Thus we shall suppose the distribution
function $f(\mathbf{p}, t)$ for electrons to depend only on the momentum vector
and the time. Correspondingly, the distribution function for photons, i.e.
their occupation numbers, will be written as $n = n(\mathbf{k}, t)$. Taking
this into account as well as formula (70.15) for the scattering probability, we
can write the collision integral for electrons with photons $I_C$ in the form

$$ I_C = -\int dW_{\mathbf{k}\to\mathbf{k}'}\,f(\mathbf{p},t) + \int dW_{\mathbf{k}'\to\mathbf{k}}\,f(\mathbf{p}',t), \tag{71.3} $$

where

$$ dW_{\mathbf{k}\to\mathbf{k}'} = c\,[1+n(\mathbf{k}',t)]\,n(\mathbf{k},t)\,d\sigma\,d\mathbf{k} = \frac{dw_{\mathbf{k}\to\mathbf{k}'}}{N_e}\,d\mathbf{k}. \tag{71.3'} $$

The first integral in (71.3) expresses the reduction in the number of electrons
having momentum $\mathbf{p}$, while the second represents the arrival of
electrons in this state as a result of collisions with photons.

For the case in which we are interested, that of low radiation frequencies,
the change in momentum of an electron through a single collision with a
photon is small compared with its mean value. Therefore, the distribution
function is a slowly-varying function of its argument, and, in accordance with
the results of §10, the kinetic equation reduces to an equation of the
Fokker-Planck type. Without repeating the calculations of §10, we can write
down the Fokker-Planck equation for three-dimensional momentum space, that is,
for $\lambda = p_x, p_y, p_z$, in the form

$$ \frac{\partial f}{\partial t} = \frac{\partial}{\partial p_i}\left[\frac{1}{2}\frac{\partial}{\partial p_k}(D_{ik}f) - a_i f\right], \qquad i, k = x, y, z, \tag{71.4} $$

where the diffusion coefficient $D_{ik}$ and the mobility $a_i$ in momentum
space are given by the formulae

$$ D_{ik} = \int \Delta p_i\,\Delta p_k\,dW_{\mathbf{k}\to\mathbf{k}'}, \tag{71.5} $$

$$ a_i = \int \Delta p_i\,dW_{\mathbf{k}\to\mathbf{k}'}. \tag{71.6} $$

Substituting for $dW_{\mathbf{k}\to\mathbf{k}'}$ from (71.3') we obtain

$$ D_{ik} = \langle\Delta p_i\,\Delta p_k\rangle = \int \Delta p_i\,\Delta p_k\,n(\mathbf{k})\,[1+n(\mathbf{k}')]\,c\,d\mathbf{k}\,d\sigma, \tag{71.7} $$

$$ a_i = \langle\Delta p_i\rangle = \int \Delta p_i\,n(\mathbf{k})\,[1+n(\mathbf{k}')]\,c\,d\mathbf{k}\,d\sigma, \tag{71.8} $$

where $\langle\ \rangle$ denotes the average over the photon distribution
function. Note that the diffusion coefficients in momentum space characterise
the random accumulation of energy by electrons, while the mobility
characterises the systematic force acting on an electron. In later sections we
shall calculate the kinetic coefficients (71.7) and (71.8), and find the
solution of eq. (71.3) under various conditions.

It turns out to be expedient to calculate the kinetic coefficients in the
reference frame $K_0$ in which the electron is at rest, since in this frame the
scattering cross section has a particularly simple form. Moreover, as we have
noted earlier, there is a simple expression for the momentum transferred.

Before proceeding to discuss the kinetic equation for electrons, we obtain
the corresponding kinetic equation for photons. In this case we restrict
ourselves to an isotropic distribution of photons, i.e. $n = n(\omega, t)$.

Obviously

$$ \frac{dn}{dt} = I^{(\mathrm{ph})}, \tag{71.9} $$

where $I^{(\mathrm{ph})}$ is the collision integral for collisions between
photons and free electrons:

$$ I^{(\mathrm{ph})} = -\int n(\omega,t)\,[1+n(\omega',t)]\,c\,d\sigma\,f(\mathbf{p},t)\,d\mathbf{p} + \int n(\omega',t)\,[1+n(\omega,t)]\,c\,d\sigma\,f(\mathbf{p}',t)\,d\mathbf{p}. \tag{71.10} $$

The first term gives the decrease in the number of photons of frequency
$\omega$ due to collisions with an electron of momentum $\mathbf{p}$, while the
second term gives the arrival of photons in this state.

In the equilibrium state $dn/dt = df/dt = 0$. As we saw earlier, the integral
$I_{ee}$ for collisions of electrons among themselves vanishes for a Maxwell
distribution. It can easily be verified that the integrals $I_C$ and
$I^{(\mathrm{ph})}$ vanish in the case when the electrons have a Maxwell
distribution with temperature $T$ and the photons have the distribution

$$ n(T,\omega) = \frac{1}{e^{(\hbar\omega-\mu)/kT} - 1}, \tag{71.11} $$

i.e. a Bose distribution function with the same temperature $T$ and chemical
potential $\mu$ (not, in general, equal to zero). This last result allows a
simple interpretation: in our approximation, when radiation and absorption of
photons are neglected, their total number in the system is fixed. We recall
(cf. eq. (76.6) of Part III) that, in general, when the number of photons is
not fixed, the value $\mu = 0$ corresponds to a minimum of free energy. It is
clear, therefore, that the processes of absorption and radiation, acting over
a sufficiently long period of time, transform (71.11) into the equilibrium
Planck distribution, and $\mu \to 0$.

Passing now to a consideration of the kinetic equation for photons (71.9),
we shall show that it can also be reduced to a differential equation of the
Fokker-Planck type. In the case of non-relativistic electrons and sufficiently
soft radiation, $\hbar\omega \ll mc^2$. Thus formula (70.11) shows that the
change in frequency of a photon through scattering is small. We suppose that
the electrons can be characterised by an equilibrium Maxwell distribution with
temperature $T$. Then, introducing a new variable $x = \hbar\omega/kT$, such
that the quantity $\Delta x = \hbar\Delta\omega/kT$ is small, we can expand the
photon distribution function in a series of powers of $\Delta x$. Restricting
ourselves to the first terms of this expansion we now obtain

$$ I^{(\mathrm{ph})} = -\int d\mathbf{p}\,[1+n(x+\Delta x)]\,n(x)\,c\,d\sigma\,f(\varepsilon) + \int d\mathbf{p}\,n(x+\Delta x)\,[1+n(x)]\,c\,d\sigma\,f(\varepsilon') $$

$$ = \left[\frac{\partial n}{\partial x} + n(1+n)\right]\int d\mathbf{p}\,f(\varepsilon)\,\Delta x\,c\,d\sigma + \frac{1}{2}\left[\frac{\partial^2 n}{\partial x^2} + 2(1+n)\frac{\partial n}{\partial x} + n(1+n)\right] c\int d\mathbf{p}\,d\sigma\,(\Delta x)^2 f(\varepsilon). \tag{71.12} $$

The evaluation of the second integral in (71.12) is easy enough:

$$ I_2^{(\mathrm{ph})} = \frac{I}{2}\left[\frac{\partial^2 n}{\partial x^2} + 2(1+n)\frac{\partial n}{\partial x} + n(1+n)\right], $$

where

$$ I = c\int d\mathbf{p}\,d\sigma\,(\Delta x)^2 f(\varepsilon) = 2\sigma_T N_e\,\frac{kT}{mc}\,x^2. \tag{71.13} $$

Here, in the expression for $\Delta x = \hbar\Delta\omega/kT$, we have used
(70.13), and have denoted by $\sigma_T$ the total Thomson cross section,
$\sigma_T = \tfrac{8}{3}\pi(e^2/mc^2)^2$.
The evaluation of the first integral in (71.12) is much more difficult.
However, this calculation can be replaced by the following general argument.
The law of conservation of the number of particles in the process of scattering
(in our case, the number of photons) requires that the kinetic equation
should have the form

$$ \frac{\partial n}{\partial t} = -\operatorname{div}\mathbf{j}, \tag{71.14} $$

where $\mathbf{j}$ is the photon flux in momentum space. Because of the
isotropy of the photon distribution function, the divergence of the flux has
the form

$$ \operatorname{div}\mathbf{j} = \frac{1}{x^2}\frac{\partial}{\partial x}\{x^2 j(x)\}. \tag{71.15} $$

There must be a linear relationship between the flux $j$ and
$\partial n/\partial x$. Moreover in the equilibrium state (for a Planck
distribution) the flux $j(x)$ should be equal to zero. Thus we may write

$$ j(x) = \left[\frac{\partial n}{\partial x} + h(x)\right] g(x), $$

where $g(x)$ and $h(x)$, two unknown functions, are to be found. It is easily
verified by a direct calculation that for a Planck distribution the equality

$$ (1+n)\,n = -\frac{\partial n}{\partial x} \tag{71.16} $$

holds. Hence both of the above requirements on $j$ will be satisfied if we
write $j(x)$ in the form

$$ j(x) = \left[\frac{\partial n}{\partial x} + n(1+n)\right] g(x). \tag{71.17} $$
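The equality (71.16) is easy to confirm numerically, and the same check shows that it holds for a Bose distribution with non-zero chemical potential as well, so that the flux (71.17) vanishes for the whole family (71.11). The sketch below is only an illustration: the function names are hypothetical, $x = \hbar\omega/kT$, the chemical potential is passed as the dimensionless ratio $\mu/kT \le 0$, and the derivative is approximated by a central difference.

```python
import math

def bose(x, mu_over_kT=0.0):
    """Occupation number 1 / (exp(x - mu/kT) - 1), cf. (71.11)."""
    return 1.0 / math.expm1(x - mu_over_kT)

def flux_bracket(x, mu_over_kT=0.0, h=1e-5):
    """Numerical value of dn/dx + n(1 + n), the bracket in (71.17)."""
    dn_dx = (bose(x + h, mu_over_kT) - bose(x - h, mu_over_kT)) / (2.0 * h)
    n = bose(x, mu_over_kT)
    return dn_dx + n * (1.0 + n)
```

For any $x > 0$ the returned value is zero to within the finite-difference error, both for `mu_over_kT = 0` (Planck) and for negative values of `mu_over_kT`.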


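Correspondingly we find for $I^{(\mathrm{ph})}$ the general expression

$$ I^{(\mathrm{ph})} = -\frac{1}{x^2}\frac{\partial}{\partial x}\left\{x^2 g(x)\left[\frac{\partial n}{\partial x} + n(1+n)\right]\right\}. \tag{71.18} $$

In order that the two expressions for $I^{(\mathrm{ph})}$, i.e. (71.18) and
(71.12), should coincide, it is necessary to put
$g(x) = -x^2\sigma_T N_e\,kT/mc$. Thus we have finally

$$ \frac{\partial n(x,t)}{\partial t} = \frac{kT}{mc}\,N_e\sigma_T\,x^{-2}\frac{\partial}{\partial x}\left\{x^4\left[\frac{\partial n(x,t)}{\partial x} + n(x,t) + [n(x,t)]^2\right]\right\}. \tag{71.19} $$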

Notice that although (71.19) is similar to a Fokker—Planck equation, it is not 
quite the usual equation of this type. The presence of the factor 1 +n in the 
transition probability has resulted in the fact that (71.19) is nonlinear with 
respect to the unknown function n(x). 
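Two properties of equation (71.19) can be seen in a simple discretisation: the right-hand side vanishes for a Bose distribution, and its flux form conserves the photon number $\int n x^2\,dx$ apart from the flux through the ends of the frequency interval. The following is a rough sketch under stated assumptions, not a production solver: the function name is hypothetical, the grid in $x$ is uniform, time is measured in units of $mc/(kT N_e\sigma_T)$, and the flux is evaluated at cell interfaces with second-order differences.

```python
import numpy as np

def kompaneets_rhs(x, n):
    """Discrete x^-2 d/dx { x^4 [dn/dx + n + n^2] } on a uniform grid.

    Returns (rhs, flux): rhs is defined at interior grid points (the two
    end points are left at zero); flux = x^4 [dn/dx + n + n^2] is
    evaluated midway between neighbouring grid points.
    """
    dx = x[1] - x[0]
    x_half = 0.5 * (x[1:] + x[:-1])
    n_half = 0.5 * (n[1:] + n[:-1])
    flux = x_half ** 4 * ((n[1:] - n[:-1]) / dx + n_half + n_half ** 2)
    rhs = np.zeros_like(n)
    rhs[1:-1] = (flux[1:] - flux[:-1]) / (dx * x[1:-1] ** 2)
    return rhs, flux
```

Because the interior sum of `x**2 * rhs * dx` telescopes to the difference of the two boundary fluxes, the photon number in this scheme can change only through the ends of the interval, mirroring the conservation law expressed by (71.14).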

The kinetic equations for electrons, (71.4), and photons, (71.19), obviously
constitute a coupled system. To find the general solution of this system is
an extremely complex problem, so that we shall consider only two limiting
cases: (1) to find the electron distribution $f(\mathbf{p}, t)$ for an
arbitrary, prescribed, photon distribution; (2) to find the photon distribution
function $n(\omega, t)$ for a given equilibrium distribution of electrons.


§72. Kinetics of Bose condensation in a photon gas 


We shall first consider the second of the above problems: the investigation
of the change in the properties of non-equilibrium radiation as a result of
interaction with free electrons. In other words, we shall find the law of
evolution in time of a photon distribution $n(\omega, t)$ due to interaction
with an electron gas. As we remarked earlier, it will be assumed here that the
state of the electron gas is prescribed and stationary. Namely, we shall
suppose that the electron gas is in an equilibrium state for all time, and is
characterised by some temperature $T_e$.

We shall completely neglect free—free transition processes, and assume 
that scattering is the only form of interaction between photons and electrons. 
The role of absorption will be considered later. 

The kinetic equation (71.19), which describes the time variation of
$n(\omega, t)$, must be supplemented by initial conditions. Let $n_P(\omega)$
be the equilibrium (Planck) distribution of photons corresponding to the
temperature $T_e$. Two cases are then possible:

(1) at the initial moment, $t = 0$, $n(\omega, 0)$ is such that the total
number of photons in the system

$$ N \sim \int_0^\infty n(x,0)\,x^2\,dx $$

is larger than the total number of photons in equilibrium radiation

$$ N_P \sim \int_0^\infty n_P(x)\,x^2\,dx, \tag{72.1} $$

that is, $N > N_P$;

(2) at the initial instant

$$ N < N_P. \tag{72.2} $$

It is clear that the evolution of $n(x, t)$ will be different in the two cases.
In the first case the photons will tend to lose energy in collisions with
electrons, and their spectral evolution will, on average, consist of a trend
downwards along the energy axis ('cooling' of the photon gas). In the other
case the photons will in the mean gain energy through collisions ('heating' of
photons).

Since we neglect the absorption of photons, their total number will remain 
constant during the evolution process. We begin by considering the ‘cooling’ 
of the photon gas. As was pointed out earlier, we are interested here in the 
region of comparatively low frequencies. In this region the exchange of ener- 
gy between a photon and an electron during the process of scattering is small, 
and the kinetic equation (71.19) is valid. Moreover, the occupation numbers 
n(x, t) will be large. As was pointed out earlier, this is the region where clas- 
sical electrodynamics holds. 

If we assume that $n \gg 1$, and, furthermore, that the inequality

$$ n^2 \gg \left|\frac{\partial n}{\partial x}\right| \tag{72.3} $$

is fulfilled, then eq. (71.19) can be substantially simplified, i.e. we can
omit the terms $n$ and $\partial n/\partial x$, which are small compared with
$n^2$. The validity of assumption (72.3) can be verified a posteriori by direct
calculation. Thus, we have instead of (71.19)

$$ \frac{\partial n}{\partial t'} = \frac{1}{x^2}\frac{\partial}{\partial x}(x^4 n^2), \tag{72.4} $$

where we write $t' = (\sigma_T N_e\,c\,kT/mc^2)\,t$. Introducing a new unknown
function $f = x^2 n(x, t)$, we find

$$ \frac{\partial f}{\partial t'} = 2f\,\frac{\partial f}{\partial x}. \tag{72.5} $$

The solution of this equation in characteristics is

$$ x = F(f) - 2ft', \tag{72.6} $$

where the form of the function $F(f)$ is determined by the initial conditions.
The meaning of this result is best understood by reference to the example
depicted in fig. VI.14. According to (72.6), all points located on the initial
curve $f(x, 0) = x^2 n(x, 0)$ (curve 1) move along characteristics, which are
straight lines parallel to the $x$-axis, in the direction of decreasing $x$
(curve 2 in fig. VI.14), with velocity proportional to $f$. The time required
for a given point to reach the $f$-axis is obviously given by the expression

$$ \tau = F(f)/2f. \tag{72.7} $$
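The solution in characteristics lends itself to a direct numerical illustration. In the sketch below (an illustration under stated assumptions, with hypothetical function names) the initial curve is given by sample points `x0`, `f0`, so that $F(f)$ is simply the initial abscissa of each point; time is the dimensionless $t'$ of (72.4). The third function estimates the moment at which neighbouring characteristics first cross, i.e. when $\partial x/\partial x_0 = 1 - 2t'\,\partial f_0/\partial x_0$ first vanishes.

```python
import numpy as np

def evolve_characteristics(x0, f0, t):
    """Shift each point of the initial curve along x = x0 - 2*f0*t, cf. (72.6).

    The value of f is carried unchanged along each characteristic.
    """
    return x0 - 2.0 * f0 * t, f0

def arrival_time(x0, f0):
    """Time tau = F(f) / (2 f) for a point to reach x = 0, cf. (72.7)."""
    return x0 / (2.0 * f0)

def breaking_time(x0, f0):
    """Earliest t' at which the curve f(x, t') stops being single valued.

    Assumption: the initial slope is estimated with np.gradient; if the
    slope is nowhere positive the profile never overturns.
    """
    max_slope = np.gradient(f0, x0).max()
    return np.inf if max_slope <= 0.0 else 1.0 / (2.0 * max_slope)
```

For a single-humped initial curve the points on the rising (low-$x$) side of the hump are overtaken by the faster points near the maximum, which is the steepening of the front discussed in the text that follows.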


The solution (72.6) is formally applicable for all values of $x$, both positive
and negative. In the course of time, therefore, $f(x, t)$ must take the form
represented by curve 3 of fig. VI.14. It is clear, however, that since the
variable $x$ represents the energy of the photons, it is impossible for $x$ to
attain negative values. The particles which have arrived at the state $x = 0$,
i.e. at the zero-energy state, cease to move. In other words, the photons
accumulate in a state with zero energy, that is, in a Bose condensation state.

Fig. VI.14

The number of particles passing to this state is formally determined by the 
shaded area in fig. VI.14. It must be emphasised, however, that this picture of 
the kinetics of photon transition to a zero-energy state is incomplete, and 
must not be accepted literally. In a very low frequency region absorption al- 
ways plays an important role (cf. §70), no matter how small the concentra- 
tion of the electron gas (or, more precisely, of the plasma). 

If the system (electron gas + photon gas) is closed for a sufficiently long
time, it will always evolve to a statistical equilibrium and the photon
distribution function will become Planckian. Therefore it would be senseless to
look for the stationary solutions of eq. (71.19) without taking bremsstrahlung
into account. On the other hand, formula (72.4) gives the dynamics of the
change in the photon spectrum during the time $\tau^*$ and in that frequency
range where bremsstrahlung is of no importance. The time-variation of the
initial distribution $n(x, 0)$ is illustrated by the curves drawn in
fig. VI.14. Each of these is a kind of snapshot of the distribution for $t$
lying in the interval $0 < t < \tau^*$. Since the kinetics of the process
depends essentially on the form of the initial distribution, it is interesting
to examine the case when $f(x, 0)$ has the form shown in fig. VI.15. According
to (72.6), the function $f(x, t)$ will deform in the course of time as shown in
fig. VI.15. The rate of approach to the ordinate axis will be the greater, the
greater the value of $f$. The upper parts of the curve advance while the lower
ones lag behind. As a result, the distribution function can be deformed to such
an extent that the curve of $f(x, t)$ turns out not to be single valued; this
is illustrated in fig. VI.15.

Fig. VI.15







A similar situation arises in the case of the generation of shock waves in
the uniform flow of an inviscid fluid. In reality, of course, $f(x, t)$ does
not become multivalued. Near the shock front the derivative
$\partial n/\partial x$ increases without limit; inequality (72.3) is violated
and eq. (72.4) becomes inapplicable. Moreover, the change in the distribution
function becomes abrupt in the vicinity of the front, and the general equation
(71.19) is also inapplicable. A detailed picture of the motion of the front
$f(x, t)$ in this case becomes much more complicated.

The second stage of evolution begins when $t > \tau^*$. Due to absorption, an
equilibrium Planck distribution is established in the photon system, with the
temperature of the photons equal to the temperature $T_e$ of the electrons.

Let us now consider the second case, where inequality (72.2) is satisfied at
the initial instant. The evolution of the photon spectrum again takes place in
two stages. In the first stage, with $t < \tau^*$, a Bose-Einstein distribution
with non-zero chemical potential is set up in the photon system. Here the mean
energy of photons becomes equal to the mean energy of electrons.

Next, absorption and radiation processes come into play. The chemical
potential begins to vary with time, and after the elapse of time
$t > \tau^*$, $\mu \to 0$, i.e. a Planck distribution is established in the
system.


§73. Mobility of an electron in a radiation field 


Before proceeding to consider the behaviour of a system of electrons in a 
radiation field, we must deal with the important question of the behaviour of 
a single free electron in such a field. 

The behaviour of an electron in the field of an electromagnetic wave was
discussed in §29 of Part I. There we saw that the electron scatters an
electromagnetic wave impinging on it. It is clear, however, that, since the
process of scattering involves transfer of momentum from the wave to the
electron, some mean force is generated which acts on a particle radiating
secondary scattered waves. By eq. (29.2) of Part I, the mean force (over a
period) acting on a radiating particle (in a reference frame in which the
particle is at rest; henceforth we shall omit the subscript zero for this
reference frame, so long as no ambiguity arises from doing so) is

$$ \mathbf{F}_0 = \frac{2e^4}{3m^2c^4}\,\overline{[\mathbf{E}\times\mathbf{H}]} = \frac{8\pi}{3}\left(\frac{e^2}{mc^2}\right)^2 \frac{\overline{E^2}}{4\pi}\,\mathbf{l}, $$

where $\mathbf{l}$ is the unit vector in the direction of propagation of the
wave. Introducing the effective Thomson scattering cross section and the energy
density of radiation $u_0$, we arrive at the formula for the radiation pressure
force

$$ \mathbf{F}_0 = \sigma_T u_0\,\mathbf{l} = \frac{\sigma_T}{c}\,\mathbf{q}, \tag{73.1} $$


where $\mathbf{q}$ is the energy flux in a single wave. The force acting on the
electron (the radiation reaction force) is independent of the frequency.
Therefore, the force acting on the electron due to a spectral field generated
by a collection of waves with different frequencies can also be expressed in
the form

$$ \mathbf{F}_0 = \frac{\sigma_T}{c}\,\mathbf{q}. \tag{73.2} $$

Here $\mathbf{q}$ denotes the total energy flux in the radiation field

$$ \mathbf{q} = \int c\hbar\omega\,n(\omega,\mathbf{l})\,\mathbf{l}\,d\mathbf{k} = \pi^2c^4\int \rho(\omega,\mathbf{l})\,\frac{\mathbf{l}}{\omega^2}\,d\mathbf{k}. \tag{73.3} $$

In place of the energy density we have introduced the spectral function
$\rho(\omega, \mathbf{l})$.
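For orientation, the radiation pressure force of (73.1) and (73.2) is easily put into numbers. The sketch below is an illustration only: it works in CGS units, uses the standard values of the Thomson cross section and the speed of light as constants, and the function name is hypothetical. It covers only the force proportional to the energy flux; any contribution from induced scattering is outside its scope.

```python
SIGMA_T = 6.6524587e-25   # Thomson cross section, cm^2
C_LIGHT = 2.99792458e10   # speed of light, cm/s

def radiation_pressure_force(energy_flux):
    """Force sigma_T * q / c on a free electron, cf. (73.1) and (73.2).

    energy_flux is the magnitude of q in erg cm^-2 s^-1; the result is in
    dynes and is directed along the energy flux.
    """
    return SIGMA_T * energy_flux / C_LIGHT
```

As an example, a flux of about $1.4\times10^{6}$ erg cm$^{-2}$ s$^{-1}$ (roughly that of sunlight at the Earth's orbit) gives a force of the order of $3\times10^{-29}$ dyn on a single electron.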

In the transition from (73.1) to (73.2), i.e. from the force on an electron
due to a monochromatic wave to the equivalent force in a spectral radiation
field, it has been implicitly assumed that additivity of the forces due to the
individual waves is valid. In this summation the effect of induced scattering,
mentioned previously, has been ignored. Here the following fact turns out to
be significant: if we do not take into account that part of the change in
momentum, $\Delta\mathbf{p}_2$, which is associated with the change in
frequency at scattering, and whose contribution to the total change in momentum
we know to be small (cf. (70.9)), and if we also ignore any change in the
spectral form of the distribution function due to change in frequency, then the
contribution of the induced scattering to the reaction is identically zero.

On the other hand, the preceding discussion makes it clear that if we do 
take into account frequency variation the formula for the reaction is essen- 
tially different. 

Thus, we shall have to find an exact expression for the reaction without
any arbitrary assumptions. By an 'exact' expression we mean an expression
for the reaction in the classical non-relativistic approximation (without
quantum or relativistic corrections). For this calculation it is convenient
(although in principle not obligatory) to use a quantum mechanical approach,
based on the application of the general formula for the mobility $a_i$. Going
to the classical limit $\hbar \to 0$ in the resultant formulae we obtain the
classical expression for the mean force. This example illustrates the
usefulness of the technique discussed in the previous section: finding
classical formulae from quantum formulae in the limit $\hbar \to 0$.

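In accordance with (71.8) the mean force or the mobility is given by the
general formula

$$ \langle\Delta\mathbf{p}\rangle = \int c\,(\Delta\mathbf{p}_1 + \Delta\mathbf{p}_2)\,n(\mathbf{k})\,[1+n(\mathbf{k}')]\,d\sigma\,d\mathbf{k}. \tag{73.4} $$

We now use the fact that the change in frequency $\Delta\omega$ of a photon in
a single scattering by an electron is small. (It is convenient here to use the
frame $K_0$; the subscript zero is omitted for simplicity.) We write

$$ n(\mathbf{k}) = n(\omega,\mathbf{l}); \qquad n(\mathbf{k}') = n(\omega',\mathbf{l}'), $$

where we make the approximation

$$ n(\omega',\mathbf{l}') \approx n(\omega,\mathbf{l}') + \frac{\partial n(\omega,\mathbf{l}')}{\partial\omega}\,(\omega'-\omega) = n(\omega,\mathbf{l}') + \frac{\partial n(\omega,\mathbf{l}')}{\partial\omega}\,\Delta\omega. $$

Replacing the occupation number $n(\omega, \mathbf{l})$ by the spectral
distribution $\rho(\omega, \mathbf{l})$ we have

$$ \langle\Delta\mathbf{p}\rangle = \int \left[\hbar\omega(\mathbf{l}-\mathbf{l}') - \hbar(\Delta\omega)\,\mathbf{l}'\right] \frac{\pi^2c^3}{\hbar\omega^3}\,\rho(\omega,\mathbf{l}) \left\{1 + \frac{\pi^2c^3}{\hbar\omega^3}\,\rho(\omega,\mathbf{l}') + \frac{\pi^2c^3}{\hbar}\,\frac{\partial}{\partial\omega}\left[\frac{\rho(\omega,\mathbf{l}')}{\omega^3}\right]\Delta\omega\right\} d\sigma\,d\mathbf{k} $$

$$ = \int \left[\hbar\omega(\mathbf{l}-\mathbf{l}') + \frac{\hbar^2\omega^2}{mc^2}\,(1-\mathbf{l}\cdot\mathbf{l}')\,\mathbf{l}'\right] \frac{\pi^2c^3}{\hbar\omega^3}\,\rho(\omega,\mathbf{l}) \left\{1 + \frac{\pi^2c^3}{\hbar\omega^3}\,\rho(\omega,\mathbf{l}') - \frac{\pi^2c^3}{\hbar}\,\frac{\partial}{\partial\omega}\left[\frac{\rho(\omega,\mathbf{l}')}{\omega^3}\right]\frac{\hbar\omega^2}{mc^2}\,(1-\mathbf{l}\cdot\mathbf{l}')\right\} \frac{2\omega^2\,d\omega}{(2\pi c)^3}\,d\Omega_k\,d\sigma, \tag{73.5} $$

where $d\Omega_k$ is an element of solid angle for the incident photon and
where we have substituted the value (70.11) for $\Delta\omega$.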
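We now take the limit $\hbar \to 0$ in formula (73.5). To do this it is
necessary to retain all the terms which do not contain $\hbar$. The terms
$\sim 1/\hbar$ cancel exactly after integration over angles. As a result we
obtain for the total force acting on the electron

$$ \langle\Delta\mathbf{p}\rangle = \int \sigma\,\frac{\pi^2c^3}{\omega^2}\,\rho(\omega,\mathbf{l})\,(\mathbf{l}-\mathbf{l}')\,d\mathbf{k}\,d\Omega + \frac{\pi^4c^4}{m}\int \sigma\,\frac{\rho(\omega,\mathbf{l})\,\rho(\omega,\mathbf{l}')}{\omega^4}\,(1-\mathbf{l}\cdot\mathbf{l}')\,\mathbf{l}'\,d\mathbf{k}\,d\Omega $$

$$ - \frac{\pi^4c^4}{m}\int \sigma\,\rho(\omega,\mathbf{l})\,\frac{\partial}{\partial\omega}\left[\frac{\rho(\omega,\mathbf{l}')}{\omega^3}\right](\mathbf{l}-\mathbf{l}')\,(1-\mathbf{l}\cdot\mathbf{l}')\,d\mathbf{k}\,d\Omega = \mathbf{F}_{sp} + \mathbf{F}_{ind}. \tag{73.6} $$

It should be emphasised that $\langle\Delta\mathbf{p}\rangle$ represents the
mean force in the classical approximation. Therefore exactly the same
expression could have been obtained by corresponding calculations with the
relations of classical electrodynamics. At the same time it should be stressed
that in the classical approximation we do not use an expansion in terms of the
powers of some small parameters. In this sense (73.6) is an exact formula.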

We examine separately each of the terms in (73.6). The first term can be 
written in the form 


FS = fot We (eo, 1)(1—1') dk dQ = 


= opne? (foc. 1) saat) ISE (73.7) 
C 4n 


We see that Fy is the force due to spontaneous scattering. It coincides with 
the radiation pressure Fp obtained above. It is evident from its structure that 
the second term 


Fia= = floor p(w, ee l’),, 


om I'l -1-1') dw dQ, dQ — 


, (73.8) 
-SEE p(w, 2 [eD] MRA dw dQ, da, 
Ww 


represents the force due to induced scattering. This expression contains under
the integral sign the product of the spectral functions $\rho(\omega, \mathbf{l})$ and $\rho(\omega, \mathbf{l}')$ of
incident and scattered radiation.

The character of the force $F_{\mathrm{ind}}$ depends on the form of the function
$\rho(\omega, \mathbf{l})$. If in particular the radiation is strictly monochromatic, i.e. the radiation
density is non-zero only for a fixed value of the frequency and wave
vector, then $\rho(\omega, \mathbf{l}') = 0$, and the induced force vanishes identically.

If $\rho(\omega, \mathbf{l})$ is the equilibrium Planck distribution $\rho_P(\omega, T)$, we obtain, after
substituting for $\rho(\omega, \mathbf{l})$ and performing some simple calculations,


kT 
i= 
me me2 


ar F (13.9) 


sp“ 


Since in the non-relativistic region the radiation temperature satisfies the
inequality $kT \ll mc^2$, $F_{\mathrm{ind}}$ represents a small correction to the spontaneous
force $F_{\mathrm{sp}}$.

Conversely, in an arbitrary high-intensity radiation field $F_{\mathrm{ind}}$ can be
comparable with and even substantially larger than $F_{\mathrm{sp}}$. We shall consider the
induced reaction force for some concrete forms of spectral distribution.
Suppose, for example, we take as the radiating element the uniform surface of a
disc of radius $R$. Since all points of the disc radiate in the same way, the
function $\rho(\omega, \mathbf{l})$ can be written in the form


$$\rho(\omega, \mathbf{l}) = G(\omega)\,\psi(\mathbf{l}), \tag{73.10}$$


where the first factor represents the frequency distribution and the second the spatial
distribution of radiation. The expression for $F_{\mathrm{ind}}$ takes on a particularly revealing
form if we choose a simple function $G(\omega)$, for example,


$$G(\omega) = \begin{cases} a/\omega^{\gamma}, & \omega_0 \le \omega < \infty, \\ 0, & \omega < \omega_0 . \end{cases} \tag{73.11}$$


We may note that this spectral distribution is associated with certain
astronomical sources of intense radiation. Substitution of (73.10) and (73.11) now
gives


o0 


F. saae f do FAN AN — H) dQ, d2. (73-12) 


ind 4m 2y+2 
og Z y 








The second integral in (73.12) vanishes, since its principal angular part changes
sign under the transformation $\mathbf{l} \to -\mathbf{l}'$.


For a disc at a large distance $r \gg R$ from the surface we can write

$$\psi(\mathbf{l}) = \psi(\beta, \varphi) = \Theta(\cos\beta - \cos\beta_0) \approx \Theta(\beta_0^2 - \beta^2), \tag{73.13}$$

where $\beta$ and $\varphi$ are the polar and azimuthal angles, $\Theta$ is a step function, $\beta_0$ is the
angle of the cone, and $\beta_0 \approx R/r \ll 1$. After evaluating (73.12) we obtain for
the force


30o O- BG 


dims qq (73.14) 
T 8mo (2y aye Des 


We see that $|F_{\mathrm{ind}}|$ turns out to be proportional to $q^2$, and for large values of
radiation flux (and small values of $\omega_0$) it can be substantially larger than $|F_{\mathrm{sp}}|$.
Note, however, that this ratio falls off rapidly with distance from the radiating
surface (like $\beta_0^6 = (R/r)^6$).

A second interesting example is the reaction force in a highly anisotropic 
field. We take as its source two streams of radiation in opposite directions. 
Let one of these be coherent (monochromatic), and directed along the posi- 
tive z-axis: 


ni (w) = 78 (w — wp). (73.15) 


The second beam is directed along the negative z-axis and is characterised by
some spectral distribution $n_2(\omega)$. Using the general formula (73.8) for the
induced reaction, we arrive, after a few calculations, at an expression for the
z-component of the induced force:





h2o Atl) 3 
FO ~—F ho? is in) (73.16) 


me> 





WFWo 


Formula (73.16) shows that this component can have either positive or negative sign,
depending on the value of the derivative


- [n2(w) wii 


The spontaneous force $F_{\mathrm{sp}}$ in this example is always in the direction of that
beam which has the higher density, i.e. whose energy flux $q$ is greater. Thus,
depending on the sign of the above quantity and on the relation between the
absolute values of $F_{\mathrm{ind}}$ and $F_{\mathrm{sp}}$, two situations can arise: either the resultant
reaction is in the direction of the total energy flux of the two radiation beams, or
it is in a direction opposite to the total radiation flux. This seemingly
paradoxical result again emphasises the importance of the induced scattering effect.

In conclusion we obtain an expression for the force on a moving electron
with $v \ll c$. The spontaneous force can be deduced from (73.6) by a Lorentz
transformation for the frequency with $v \ll c$ (Doppler effect!). It gives


4o7Ugp 
PS ae (73.17) 


It turns out that the induced force is given by the formula 


a orn? 13 r[ow)]2 14 {a [ow)])\2. 6 
maa PIS SJR] dw- is Sam (AS |] dw]. (3.18) 


m? w3 





§74. System of electrons in an arbitrary radiation field 


We can now proceed to find the stationary, non-equilibrium distribution
of a system of electrons located in a given radiation field. We shall assume
that the field is isotropic, so that the photon distribution function
$n$ depends only on the frequency, and not on the direction of the wave
vector, i.e. $n = n(\omega)$. In this simple case it is convenient to analyse the
principal assertions of the theory.

From (71.4) we find for the stationary distribution of electrons 


ð 
Tag aif = ji, (74.1) 


where $j_i$ is defined by formula (34.8).

To solve (74.1) it is necessary to know, apart from the mobility which we
found before, the diffusion coefficient $D_{ii} = D$. Like the mobility, the
diffusion coefficient is in general a function of the momentum.

For our purposes the expression for the mobility $a_i$ can be limited to the
approximation

$$a_i = \langle \Delta p_i \rangle = F_i^{\mathrm{sp}}, \tag{74.2}$$





where $F_i^{\mathrm{sp}}$ is given by (73.17). The reason for this is as follows: as we shall see
later, the electrons in the radiation field may arrive at a stationary state with a
very high temperature. If the induced reaction force
$F_{\mathrm{ind}}$ is of the same order of magnitude as the spontaneous force $F_{\mathrm{sp}}$, the
electrons will acquire a mean energy exceeding $mc^2$. But, as has been
emphasised from the outset, the present theory is restricted to the case of
non-relativistic electrons.

In calculating the diffusion coefficient from formula (71.5) we can confine
ourselves to taking into account the change in momentum due to perfectly
elastic scattering alone. Inelastic scattering gives only a small quantum
correction. It is convenient to calculate the diffusion coefficient in the rest
frame of the electron, $K_0$.


Here we have for D: 


D=(Ap? = 4c fon (e9) [1 + ng(24)] (Ap)? dkg d2 = 


hwy 
=S) 
= te fo(™ 


where the subscript zero refers to the number of photons and their frequency 
in this coordinate system. 

To go over to a laboratory reference frame it is necessary to perform a
Lorentz transformation. It turns out, however, that in calculating the
diffusion coefficient in the non-relativistic approximation we need take into
account only terms of order of magnitude zero with respect to the quantity
$p/mc$. This is connected with the fact that the diffusion coefficient in eq.
(74.1) is multiplied by the derivative $\partial f/\partial p_i$. As a result, as we shall see below,
if we include terms of order $p/mc$ in $a_i$, we need include only terms
independent of $p$ in $D$. In this case $D(\partial f/\partial p_i)$ will automatically be of the same order
of magnitude as $a_i f$.

In keeping with this we can put 





2 
) (ad— 1n (o [1+ n (29) dk, dQ , (74.3) 





is a p cos@ ðn p 
= = + = PES rh 
No(Wo) = n(w) =n (0 Wo wie )~ n(Wo) 5 ace FT cos 0 


and we obtain for D 


a, 1 hw 2 "2 = 
D= } 4c Gar (cos @ — cos 8) n(w) [1 + n(w)] o dQ dk = 





= $nore see) n(w) [1 + n(w)] e 23. (74.4) 




On substituting (74.2) and (74.4) into (74.1) we have finally 


$ ħw\ 2 2d a 
[sore f (42) aw) oes pt 


OyT 2 
a iN At fron! dw 
3 cm (27c)? 








| pf+j=0. (74.5) 


We shall try to find the solution of (74.5) in the form of a Maxwell distribution
with some effective temperature $\theta$, i.e. we put

$$f \sim e^{-p^2/2m\theta}. \tag{74.6}$$


As was shown in §34, if we put the Maxwell distribution (with any temperature
$\theta$) into the integral $j_i$, the latter vanishes identically. Therefore
eq. (74.5) can be satisfied if we set


$$\theta = \frac{\int n(\omega)\,[1 + n(\omega)]\,(\hbar\omega)^2\,\omega^2\,d\omega}{4\int n(\omega)\,\hbar\omega\,\omega^2\,d\omega}. \tag{74.7}$$





We see that when the effective temperature is given by formula (74.7) the 
Maxwell distribution becomes the exact solution of the equation (74.1) even 
for the case of a stationary state of the system of electrons in a non-equilibrium 
radiation field. 
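
As a quick numerical check of (74.7), the following sketch evaluates the two integrals for a Planck occupation number in the dimensionless variable $x = \hbar\omega/kT$. The function names, the integration limits and the midpoint rule are illustrative choices, not part of the text; the expected outcome is $\theta/kT \approx 1$.

```python
import math

def occupation_planck(x):
    """Planck occupation number n = 1/(exp(x) - 1), with x = hbar*omega/(k*T)."""
    return 1.0 / math.expm1(x)

def effective_temperature(n, x_min=1e-6, x_max=60.0, steps=200_000):
    """Evaluate (74.7) in dimensionless form; returns theta/(k*T_ref).

    x = hbar*omega/(k*T_ref). The midpoint rule is used so that n(x) is never
    evaluated at x = 0. The limits x_min and x_max are assumed cut-offs that
    are adequate for a Planck-like spectrum.
    """
    h = (x_max - x_min) / steps
    numerator = 0.0
    denominator = 0.0
    for i in range(steps):
        x = x_min + (i + 0.5) * h
        nx = n(x)
        numerator += nx * (1.0 + nx) * x**4   # n(1+n) (hbar w)^2 w^2
        denominator += nx * x**3              # n (hbar w) w^2
    return numerator / (4.0 * denominator)

theta_over_kT = effective_temperature(occupation_planck)
print(theta_over_kT)
```

Any other isotropic occupation number can be passed in place of `occupation_planck` to see how the effective temperature responds to the shape of the spectrum.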


§75. General discussion of the results and range of the applicability of the 
theory 


It is now possible to sum up the results of our calculations. It is necessary 
first of all to emphasize the remarkable peculiarity of the induced pressure 
force: this systematic force arises as a result of the action of a stochastic elec- 
tromagnetic field. A clearer understanding of the interrelationship between 
spontaneous and induced pressure forces of the radiation can be gained by 
the introduction of the so-called brightness temperature, the latter being de- 
fined by the formula 


12c3p(w, 1) 
=: 


KT, (@) = 





In the case of equilibrium radiation the brightness temperature coincides with
the radiation temperature. In this case $T_{\mathrm{br}}$ is proportional to the energy
density of radiation.

Let us now consider a non-equilibrium spectrum of radiation, and assume
that the intensity of the non-equilibrium radiation in the low-frequency
region is considerably higher than that of equilibrium radiation.
Then $T_{\mathrm{br}}$ will be equal to the temperature of the equilibrium radiation for
which the spectral distribution in the low-frequency region coincides with
that of the non-equilibrium spectrum. When the intensity in the low-frequency
region is much higher than in the equilibrium spectrum, the brightness
temperature will be very high, and independent of the mean energy of the
radiation.

As is clear from (73.14) we have, in ascending order of magnitude 


kT (w) kT» 
Bil toupee (75.1) 
mes me~ 





For equilibrium radiation we always have $|F_{\mathrm{ind}}| \ll |F_{\mathrm{sp}}|$. On the other hand, if
the non-equilibrium spectrum has a brightness temperature $T_{\mathrm{br}}$ such that
$kT_{\mathrm{br}} \gtrsim mc^2$, the inequality $|F_{\mathrm{ind}}| > |F_{\mathrm{sp}}|$ is fulfilled.

Now we shall turn our attention to the question of heating the electron 
gas in the radiation field. The main result arrived at is the existence of a Max- 
well distribution with an effective temperature given by formula (74.7). 

The physical meaning of this result is that for small momentum transfers
in an elementary interaction, the motion of an electron in momentum space
is a Brownian motion. After a certain relaxation time, a system of electrons
located in an isotropic radiation field reaches a steady state. A random
(Gaussian) distribution of momenta is set up for the electrons for an
arbitrary distribution of quanta with respect to frequency, $n(\omega)$. In an isotropic
field this random distribution is Maxwellian with modulus (temperature) $\theta$.
In an equilibrium radiation field, when $n(\omega)$ is given by the Planck formula,
from (74.7) we obtain the obvious result $\theta = T$, i.e. the radiation and electron
temperatures are equal. If, however, the distribution of quanta is
non-equilibrium, the effective electron temperature depends on the form of the
function $n(\omega)$.
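
The Brownian-motion picture can be illustrated with a minimal one-dimensional sketch. It assumes a generic Fokker-Planck equation of the form $\partial f/\partial t = \partial(\gamma p f)/\partial p + D\,\partial^2 f/\partial p^2$, with a hypothetical friction rate `gamma` and diffusion coefficient `D` that are not taken from the text. Its stationary solution is Gaussian with variance $D/\gamma$, which plays the role of $m\theta$ in (74.6).

```python
import math
import random

def simulate_momenta(gamma, D, dt, steps, particles, seed=0):
    """Evolve an ensemble of momenta under friction and random kicks.

    Uses the exact one-step update of the Ornstein-Uhlenbeck process
    dp = -gamma * p * dt + sqrt(2 * D) * dW, so no time-step bias is
    introduced. gamma and D are illustrative constants.
    """
    rng = random.Random(seed)
    decay = math.exp(-gamma * dt)
    kick = math.sqrt(D / gamma * (1.0 - decay * decay))
    momenta = [0.0] * particles          # all electrons start at rest
    for _ in range(steps):
        momenta = [p * decay + kick * rng.gauss(0.0, 1.0) for p in momenta]
    return momenta

gamma, D = 1.0, 2.0
p = simulate_momenta(gamma, D, dt=0.05, steps=200, particles=4000)
variance = sum(x * x for x in p) / len(p)
print(variance, D / gamma)   # sample variance versus the stationary value D/gamma
```

The ensemble forgets its initial state after a few relaxation times $1/\gamma$, and the width of the resulting Gaussian depends only on the ratio of diffusion to friction, which is the content of the statement that the electrons settle into a Maxwellian whatever the photon spectrum.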

It is interesting to consider the case where the spectral distribution $n(\omega)$
has a maximum in the region of low frequencies. In this region the occupation
numbers $n(\omega)$ are very large compared with unity. Therefore we can let $\hbar \to 0$
in formula (74.7), i.e. we pass to the classical limit. Thus we obtain


[rho w? dw 





Xa = n? 7 (p(w))2 du 
6 7 if ot (75.2) 


4 [n(ies) wo? deo i 

In this case, the induced Compton effect dominates the process of electron 
gas heating. As the process of heating is of a purely classical nature, both 
formulae (75.2) and (73.8) can be derived in a purely classical manner, i.e. in 
a random radiation field. In connection with formula (75.2) we observe that: 

(1) When the radiation density in the low-frequency region is large enough 
the electron temperature becomes very high. This means that the average 
energy of electrons can be much greater than the average energy of photons. 
It is necessary to emphasize that the heating is due to the action of low- 
frequency radiation. The energy transfer per unit act of interaction with a 
photon is very small. But the sheer number of low-frequency photons is so 
large that the total energy transfer becomes quite considerable. 

(2) It is clear that the effective temperature of electrons is of the same
order of magnitude as the highest brightness temperature of radiation, and
always $\theta < T_{\mathrm{br}}$.

Therefore we can afford to neglect the effect of frequency change in the
process of scattering in the expression for the mobility (74.2). Formula
(70.10) for the change of momentum becomes incorrect and formula (74.5)
becomes invalid just when the temperature $\theta$ rises to the relativistic value
$\theta \sim mc^2$. In this case the non-relativistic treatment is inadequate.

(3) When calculating the integral in the numerator of (75.2) it should be
taken into account that in the region of very low frequencies the process of
bremsstrahlung becomes dominant over that of scattering. Therefore the
integral in (75.2) cannot diverge and $\theta$ always has a definite value.
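
Observations (1) and (2) can be made concrete with a small sketch. It assumes a spectrum with frequency-independent brightness temperature up to a sharp cut-off, and the standard Rayleigh-Jeans relation $kT_{\mathrm{br}} = \hbar\omega\,n(\omega)$, so that $n = 1/x$ with $x = \hbar\omega/kT_{\mathrm{br}}$. The cut-off `x_cut` and the function names are hypothetical. Formula (74.7) then gives $\theta/kT_{\mathrm{br}} = 1/4 + 3x_{\mathrm{cut}}/16$, which tends to $1/4$ in the classical limit.

```python
def theta_flat_brightness(x_cut, steps=100_000):
    """theta/(k*T_br) from (74.7) for n(x) = 1/x on 0 < x < x_cut.

    x = hbar*omega/(k*T_br). The choice n = 1/x is an assumed model spectrum
    (Rayleigh-Jeans occupation numbers with constant brightness temperature).
    """
    h = x_cut / steps
    numerator = 0.0
    denominator = 0.0
    for i in range(steps):
        x = (i + 0.5) * h            # midpoint rule avoids x = 0
        n = 1.0 / x
        numerator += n * (1.0 + n) * x**4
        denominator += n * x**3
    return numerator / (4.0 * denominator)

def mean_photon_energy(x_cut):
    """Mean photon energy in units of k*T_br for the same spectrum.

    Integral of n*x*x^2 divided by integral of n*x^2, i.e. 2*x_cut/3.
    """
    return 2.0 * x_cut / 3.0

x_cut = 0.01
theta = theta_flat_brightness(x_cut)
print(theta, mean_photon_energy(x_cut))
```

For a low cut-off the electron temperature stays a fixed fraction of the brightness temperature while the mean photon energy shrinks with `x_cut`, so the electrons end up far more energetic than the photons that heat them.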

The induced scattering has another very surprising and important
peculiarity which has never been mentioned before. As is well known, spontaneous
scattering cannot occur in a homogeneous medium or in a spatially
homogeneous system. This is easy to see from formula (73.1). If in this
formula we let $e \to 0$ and simultaneously $m \to 0$ we shall pass to the case of
scattering by an infinitely small charge with an infinitesimally small mass. This
means that we shall go from the point charge to a continuous homogeneous
charged medium. In this limit, we immediately see that the acceleration of an
infinitesimal element of the medium tends to zero. Thus this element will not
radiate any secondary waves.

This result is easy to understand from the following simple consideration:
let us consider two parallel infinitely thin layers in a homogeneous medium. Let
us assume that these layers are separated by a distance equal to half
the wavelength of the incident radiation. Then the secondary waves radiated by one
layer will be completely cancelled by the secondary waves from the other layer.

As a matter of fact spontaneous scattering can take place only in inhomo- 
geneous media. In particular this inhomogeneity can arise as a result of fluc- 
tuations in homogeneous media. 

The situation is quite different in the case of induced scattering. From
formula (73.14) for the induced pressure force we see that the acceleration of
a unit charge is proportional to $(e/m)^4$. If we pass to the limit $e \to 0$ and
$m \to 0$ (with the ratio $e/m$ held fixed) we obtain the remarkable result: the
acceleration of an infinitesimal element of a homogeneous continuous medium
in a radiation field is not zero. This means that induced scattering exists
even in homogeneous media.
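
The different behaviour of the two accelerations in the continuum limit can be checked with a short scaling sketch. It assumes the Thomson cross section $\sigma_T = (8\pi/3)(e^2/mc^2)^2$, the spontaneous force $F_{\mathrm{sp}} = \sigma_T q/c$, and the order-of-magnitude estimate $F_{\mathrm{ind}} \sim (kT_{\mathrm{br}}/mc^2)\,F_{\mathrm{sp}}$. The units ($c = 1$, unit flux and unit brightness temperature) and the function names are illustrative.

```python
import math

def thomson_cross_section(e, m, c=1.0):
    """Thomson cross section (8*pi/3) * (e^2 / (m c^2))^2 in Gaussian units."""
    return (8.0 * math.pi / 3.0) * (e * e / (m * c * c)) ** 2

def spontaneous_acceleration(e, m, flux=1.0, c=1.0):
    """Acceleration produced by the spontaneous force sigma_T * q / c."""
    return thomson_cross_section(e, m, c) * flux / (c * m)

def induced_acceleration(e, m, kT_br=1.0, flux=1.0, c=1.0):
    """Order-of-magnitude induced acceleration, (kT_br / m c^2) * a_sp.

    This estimate is an assumption made for illustration; it reproduces
    the (e/m)^4 scaling but not the numerical coefficient.
    """
    return (kT_br / (m * c * c)) * spontaneous_acceleration(e, m, flux, c)

# Scale charge and mass together, keeping e/m fixed.
for s in (1.0, 0.1, 0.01):
    print(s, spontaneous_acceleration(s, s), induced_acceleration(s, s))
```

As the scale factor `s` decreases, the spontaneous acceleration falls in proportion to `s` while the induced acceleration stays constant.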

It is possible to interpret this result in the following way: during the act of 
induced scattering one wave virtually forms inhomogeneity in the medium, 
the second wave being scattered by this inhomogeneity. 

We consider now the range of applicability of the formulae we have
obtained. First, however, it is necessary to discuss in more detail the behaviour
of an electron in a strong monochromatic wave. By a 'strong' wave we
understand the following: in the derivation of the Thomson formula, §36, Part I, it
was assumed that the wave was scattered by a mobile electron which acquires
in the wave a velocity $v \sim eE/m\omega \ll c$. The action of the magnetic field on the
electron is entirely neglected here. With increasing electric field strength $E$, the
velocity of the electron in the wave increases and can become relativistic,
$v \sim c$. The behaviour of the electron in the wave is then radically altered. The
magnetic field will give it a rotational motion, so that its complete trajectory
will be a closed curve in the form of a figure-of-eight. In this trajectory the
electron will radiate electromagnetic waves with various frequencies, differing
from the frequency of the incident wave*. The cross section of scattering now
turns out to be different from the Thomson cross section, and can be found
from simple energy considerations. By (73.1) the energy acquired by an
electron per unit time can be represented in the form $\sigma u_E c$.

On the other hand, the loss of electron energy in the magnetic field (when
$v \sim c$) is given by formula (25.12) of Part II. In the steady state


_ [de 
Toloc = ae ane 


* For the radiation spectrum of an electron moving in a magnetic field, see L.D. Landau
and E.M. Lifshitz, Course of theoretical physics, Vol. 2, Classical theory of fields,
§74 (Pergamon, Oxford, 1959).










Hence we find for the cross section 


272 = 
Pal o mete eE )*o a, 
Tuge \dt) ad ugc? ~ (mwc T aL? 





where $a \sim eE/m\omega c$. We see that the scattering cross section must increase
rapidly with increasing field strength. It should be noted that we have used
formula (73.1) for the energy transfer of an electron in its own reference
frame. This was possible because the rotational motion of an electron does
not cause any increase in energy when averaged over a period.

It is however evident that the cross section cannot rise indefinitely, since
the energy acquired by the electron (in unit time) cannot exceed the quantity
$eEc$. This quantity represents the work done by the field per second on an electron
moving with the velocity $v = c$. Hence we always have $a^2\sigma_T cE^2 \lesssim eEc$.
When $a = eE/m\omega c \sim (\lambda/r_0)^{1/3}$, where $\lambda$ is the wavelength and $r_0$ the classical
electron radius, the cross section begins to decrease as $1/E$. For $a \gtrsim 1$ the
Thomson theory of scattering becomes insufficient. The electron in the field
of a strong wave cannot be treated as a free particle. These results can be
interpreted in the language of quantum mechanics in the following way: the
most probable process in a weak wave field is single-quantum scattering, i.e.
the ordinary Compton effect.
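
To get a feeling for the magnitudes involved, the sketch below computes the strength parameter $a = eE/m\omega c$ and the order-of-magnitude value $(\lambda/r_0)^{1/3}$ at which the cross section stops growing. Gaussian (CGS) units and rounded values of the constants are assumed, and numerical factors of order unity are deliberately ignored. The function names are illustrative.

```python
import math

E_CHARGE = 4.803e-10   # electron charge, esu
M_E = 9.109e-28        # electron mass, g
C = 2.998e10           # speed of light, cm/s
R0 = E_CHARGE**2 / (M_E * C**2)   # classical electron radius, cm

def strength_parameter(e_field, omega):
    """Dimensionless wave strength a = e*E/(m*omega*c).

    e_field is the field amplitude in statvolt/cm and omega the angular
    frequency in rad/s.
    """
    return E_CHARGE * e_field / (M_E * omega * C)

def saturation_parameter(wavelength):
    """Order-of-magnitude value of a where the cross section saturates.

    Implements a ~ (lambda/r0)**(1/3); wavelength in cm.
    """
    return (wavelength / R0) ** (1.0 / 3.0)

wavelength = 1.0e-4                        # 1 micron, an arbitrary example
omega = 2.0 * math.pi * C / wavelength
print(R0, strength_parameter(1.0e6, omega), saturation_parameter(wavelength))
```

Because $r_0$ is so small compared with any optical or radio wavelength, the saturation value of $a$ is much greater than unity, so the Thomson description already fails at $a \sim 1$, long before the cross section reaches its upper bound.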

In the field of a strong wave the most probable process becomes
multiple-quantum scattering, in which the number of quanta in the act of scattering is
not conserved (frequencies different from that of the scattered wave are
radiated).

Now we shall pass to a discussion of the applicability of the theory
developed above. For the case of the spectral radiation field the first, obvious,
condition for the applicability of the theory is the requirement that the
characteristic time of energy accumulation by an electron should be large compared
with the inverse of the characteristic frequency of the spectrum. Otherwise
the radiation field cannot be a random field for the electron.

We estimate this characteristic time, which is equivalent to the retardation
time for an electron in a spectral radiation field.

Assuming for simplicity that the latter is isotropic, we have, if $v \sim c$,

$$\tau\,|F_{\mathrm{ind}}| \approx mc. \tag{75.3}$$

Hence we have the inequality

$$\frac{mc}{|F_{\mathrm{ind}}|} \gg \frac{1}{\omega}. \tag{75.4}$$

Or, using the formulae for $|F_{\mathrm{ind}}|$, we obtain the equivalent inequality $a \ll 1$.
The last inequality is identical with the condition for the applicability of the
ordinary theory of spontaneous scattering.

The second condition for the applicability of the theory of induced scat- 
tering is the inequality 


$$|F_{\mathrm{ind}}|\,c < eEc. \tag{75.5}$$


Substituting the value of $F_{\mathrm{ind}}$ we again arrive at inequality (75.4). This result
shows that in the case of induced scattering there cannot be an increase in the
cross section compared with the Thomson cross section, irrespective of the
value of the parameter $a = eE/mc\omega$. It must be emphasised that the above
two conditions for the applicability of the theory of induced scattering in a
spectral field are completely independent. Whereas the first of them expresses
the essential character of the spectral field, the second relates to the existence
of a limiting velocity of propagation of interactions ($v < c$). When the
parameter $eE/m\omega c$ tends to unity, the cross section of induced radiation begins
to decrease with the field according to the law


$$\sigma^{\mathrm{ind}}_{\mathrm{eff}} \sim \frac{1}{E^2}.$$


Thus the induced reaction force attains its maximum possible value when
$a = eE/mc\omega \sim 1$.

There is a series of applications of the preceding theory in astrophysics, 
which we are unable to discuss within the framework of this book*. 

The examples we have given will enable the reader to see how, in the 
presence of high-intensity fields, even in the simplest system, quite new spe- 
cific phenomena appear — phenomena which are wholly different from those 
that arise in weak fields. 

At the same time the reader will be convinced of the inexhaustibility of 
classical physics and the great power of the methods of quantum theory. 


* See E. Levich and R. Sunaev, Astroph. Lett. 7 (1971) 69. 








SUBJECT INDEX 


Absorption energy, 225
Acceptor, 357 

Activity, 11 
Anisotropic metal, 333 
Antiferromagnet, 286 


Band, energy, 250 

Band width, 251 

Bardeen—Cooper—Schrieffer theory, 341 

Barodiffusion coefficient, 39 

Barrier, thermal transport through, 56 

Basic kinetic equation, derivation of, 58 

— — — in closed system, 64 

— — — in quasi-closed system, 65 

— — —, validity of, 61 

Basis vector, 233 

Bloch wave, 244 

Bogoliubov approximation, 26 

— transformation, 28 

Boltzmann’s equation, 77 

— — and Liouville’s equation, 85 

— —, Chapman—Enskog solution of, 109, 
115 

— — for a plasma, 166

— — for a plasma with collisions, 179

— — for electrons in a solid, 300, 327

— — for electrons in a solid, solution
of, 329

— — for neutrons in matter, 153 

— — for polyatomic gases, 145

— — for quasi-gaseous system, 139 

— —, linear, 139 

— —, linear, in external field, 142 

— —, linear, solution of, 142 

— —, linearized, 110 

— —, method of moments solution of, 106 

— —, relaxation approximation of, 124 

Bose—Einstein condensation, 21, 23 




— — — in phonon gas, 379 

— — — temperature, 21 

Bose gas, degenerate ideal, 18 

— —, Hamilton operator of, 25 

— —, imperfect, 24 

Bosons, partition function of ideal 
gas of, 16 

Bragg reflection, 256 

Bravais lattice, 233 

Bremsstrahlung, free—free, 370 

Brightness temperature, 388 

Brillouin zone, 244 

Brownian motion, 56 

Bulk viscosity coefficient, 43 


Chapman—Enskog method, 109, 115 

Charge density in plasma, 263 

Chemical reaction rate, 58 

Classical approximation, quasi-, 14 

Closed system and kinetic equation, 64 

Cohesive energy, 233 

— force, 230 

Collision integral, 74, 76, 78 

— — for electrons in a photon gas, 372 

— — for electrons in a solid, 301, 326,

332 

— invariant, 92 

— time, 83 

Compton scattering, 368 

Condensation and superfluidity, Bose, 32 

— temperature, Bose—Einstein, 21 

Conduction band, 270 

—, electron, 299 

—, equation of thermal, 46, 49 

Conductivity, Boltzmann’s versus 
correlation function expression for, 
206 

—, electrical, 205, 302, 304, 324 







—, —, in light wave field, 311 

—, —, in magnetic field, 310 

—, —, in semiconductor, 362

—, —, tensor in plasma, 172 
—, thermal, 46 
Conductor, ideal, 305 

Conservation law, energy, 44, 97 

— —, mass, 37, 40, 93 

— —, momentum, 42, 95 

Continuity equation, 38 

Continuous medium approximation, 37 
Convective diffusion, equation of, 40 
Correlation function, 79, 195, 203, 227 
Cooper interaction, 335 

— pair, 339 

Cross-section, differential scattering, 74 
— —, transport scattering, 130 

Crystal, covalent, 233 

—, ionic, 233 

— lattice, 232

— — geometry, 233 

Curie point, 285 

Current, electric, 205 

—, —, in plasma, 171 


Damping of plasma waves, 177, 178 

Debye approximation, 167 

Deformation rate tensor, 116 

De Haas—Van Alphen effect, 280 

Density distribution for particles in 
matter, 166 

— matrix, 5 

— —, time evolution of, 7 

Detailed balance, principle of, 61 

Diamagnetic susceptibility, 280 

Diamagnetism, orbital, 275 

Dielectric constant of quantum plasma, 
261, 264 

— crystal, 351 

— permeability tensor, 173 

— susceptibility, 226 

Diffusion coefficient, 131, 203, 221 

— —, baro-, 39 

— —, generalized, 54 

— —, molecular, 39 

— — of electrons in radiation field, 387 

—, thermal, 133, 135 

—, —, coefficient, 134 


— equation, generalized, 56 

— flux, 38,94 

— of light particles through a gas of 
heavy particles, 126 

Dispersion law, 238 

— of longitudinal plasma waves, 174, 177 

— of transverse plasma waves, 178 

— relation for plasma waves, 267 

—, sound, 136 

Dissipative function, 214 

Distribution function, 73, 80 

— —, basic kinetic equation for, 82 

— —, Chapman—Enskog forms for, 119, 

122 

— —, electron equilibrium, 328 

— —, generalized, 50 

— — in perturbed system, 192 

— —, phonon equilibrium, 328 

Donor, 357 

Dynamic perturbation, 191 


Effective mass, 249, 269 

— wave vector, 243 

Electrical conductivity, 204, 206, 302, 
304, 324 

— — in magnetic field, 310 

— current, 205 

Electron as a quasi-particle, 269 

— conduction, 299 

— distribution function in radiation 

field, 386 

— equilibrium distribution function, 328 

— free path length, 304, 318 

— — — — due to phonon collision, 318, 

321, 324, 333 

— gas, free energy of, 272 

— — and free phonon gas, 368 

— — susceptibility, 280 

— group velocity, 257 

— in crystal, energy spectrum of: 
in tight binding approximation, 249 
in nearly free electron 

approximation, 255 

— — —, Schrodinger equation for, 242 

— — —, wave function of, 248 

— in radiation field, 366, 380 

— — — —, diffusion coefficient for, 387 

— — — —, force on, 383, 386







— interaction, screened, 268 

— pair, 335, 339 

— phonon interaction, 269, 287 

— — — Hamiltonian, 288 

Electrons in photon gas, kinetic equation 
for, 373 

—, number of free, 269 

Electron thermal conductivity, 325 

— velocity in electric field, 258 

Energy band, 250 

— conservation law, 44, 97 

— flux density, 45, 98 

— gap in superconductor, 341, 344

Entropy and negative temperature, 69 

— balance equation, 47 

—, definition of, 99 

— flow density, 48 

—, increase of, 99

—, mean, 200 

— production, 48 

— — in stationary state, 219 

Enskog’s equation, 91 

Equation of state, 11 

— — — for ideal Bose gas, 17 

— — — for ideal Fermi gas, 17

Equilibrium distribution, 101 

— —, local, 103 

—, establishment of, 105 

—,— — in ion—electron plasma, 185 

—, local, 36 

—, local statistical, 200 

Euler’s equation, 114 

Exchange integral, sign of, 286 

— interaction, 233, 283 

Exciton, 241, 352 

— energy, large radius, 353 

— —, small radius, 352 

Extrinsic semiconductor, 356 

— —, distribution function of, 359 

— —,n-type, 357 

— —, p-type, 357 


Fermi distribution, 301 

— fluid, 344 

— —, kinetic properties of, 348 

—, quasi-particles in, 346 

—, thermal conductivity of, 348 

—, thermodynamic quantities of, 347 


— —, viscosity of, 348 

Fermions, partition function of ideal 
gas of, 16 

Ferromagnetism, 281 

—, model for, 281 

Feynman diagram, 296 

Fick’s law, 40 

Fluctuation-dissipation theorem, 225 

Flux density in quasi-gaseous system, 142 

—, generalized, 197 

Fokker—Planck equation, 54, 372 

— — —, second form of, 55 

Fowler’s law, 317 

Free path, mean transport, 324 

— — length, electron, 304, 318 

— — —, —, due to phonon collisions, 

321, 324, 333 

Free electron gas and free phonon gas, 
368 

— — — in radiation field, 365 

Free—free bremsstrahlung, 370 

Frenkel exciton, 352 

Fröhlich Hamiltonian, 294 


Generalized transport equation, 91 
Gibbs distribution, 66 

— —, local equilibrium, 200
Group velocity, electron, 257 


Hall constant, 310 

— — in semiconductor, 363 

— effect, 305 

— — in semiconductor, 363 

Heat flux density, 46 
Hermite—Sonine polynomials, 107 
Hole, 345 

—, energy of, 346 

Hydrodynamic time, 83 


Ideal Bose gas, degenerate, 18 

— — —, isotherms of, 23

— gas, 11,72 

Impurity, 299 

Incompressible medium, 38, 48 
Insulator, 270 

Interaction, effective quantum, 14 
Intrinsic semiconductor, 356 

— —, distribution function of, 357 







Ions, size effect of, 258 

Irreversibility, 62, 89, 195 

Irreversible process, 33 

— —, quasi-macroscopic treatment of, 34 
— —, kinetic approach of, 34 

Isotope effect, 335 


Kinetic coefficient, 37 

— equation, basic, derivation of, 58 

— —, —, for closed system, 64 

— —, —, for quasi-closed system, 65 

— —, —, for the distribution function, 83 

— —, —, validity of, 61 

— — and rotational motion, 146 

— — for electrons in a solid, 300, 327 

— — for electrons in a solid, solution of, 
329 

— — for electrons in photon gas, 371 

— — for photons, 373 

— — for polyatomic gases, 145 

— — with self-consistent field, 169 


Landau damping, 177 

Landau’s kinetic equation, 182, 184 
— — — and the H-theorem, 185 

— — — and the Maxwell distribution, 184 
Langmuir frequency, 177 

Laser, 71 

Lattice, Bravais, 233 

— constants, 234 

—, crystal, 232

—, reciprocal, 234 

— vibrations, 235 

— with a basis, 234 

Level population, 68 

Liouville’s equation, 80 

— — and Boltzmann’s equation, 85 
Local equilibrium, 36 

— — distribution, 103 

— — — for polyatomic gases, 147 
— Gibbs distribution, 200 

— Statistical equilibrium, 200 
Lorentz method, 127 


Magnetic flux, quantization of, 341 
— susceptibility, 274 
Magnetization, spontaneous, 285 
Magnon, 241, 283 


— distribution function, 284
Magnons, number of, 285 
Markov chain, 51 
Maser, 71 
Mass concentration, 39 
— conservation law, 37, 40, 93 
Matter, motion of particles in, 151 
—, neutron density distribution in, 165 
Maxwellian molecules, 108 
Maxwell—Lorentz equations in a plasma, 
168 
Maxwell’s distribution, 102 
— —, local, 103 
Mean, quantum mechanical, 5 
—, Statistical, 5, 8, 10 
Metal, 270 
—, anisotropic, 333 
—, good, 269 
—, magnetic properties of, 271 
—, optical properties of, 310 
Metalloid, 269 
Microscopic relaxation time, 83 
Mobility vector, 54, 373 
Moderation of neutrons, 151 
Molecular diffusion coefficient, 39 
Moment approximation, thirteen, 108 
Moments solution of Boltzmann’s 
equation, method of, 106 
Momentum conservation law, 42, 95 


Navier—Stokes equation, 44, 120 
Nearly free electron approximation, 
251, 256 
Negative temperature, 69 
— — and entropy, 69 
— — and radiation, 71 
— —, example of system with, 69 
Neutrons in gas of heavy nuclei, energy 
distribution of, 157 
— in hydrogen, energy distribution of, 
155 
— in matter, Boltzmann’s equation for, 
153 
— — —, density distribution of, 165 
— — —, energy distribution of, 153 
—, Spatial distribution of, 161 
— — — with isotropic scattering, 162 
Newtonian fluid, 42 







Newton’s law of viscosity, 43 
Nyquist’s formula, 228 


Onsager’s hypothesis, 209 

— relations, 213 

— — and microscopic reversibility, 214
— theory, 208 
Optical properties of a metal, 310 
Orbital susceptibility, 280 


Paramagnetic gas, thermal conductivity of, 
150 

Paramagnetism, spin, 274 

Particles in matter, density distribution 
of, 166 

— — —, motion of, 151 

Partition function, canonical, 9 

— —, grand, 10

— — of ideal gas, 13 

Pauli equation, 275 

Peltier heat, 224 

Perturbation, dynamic, 191 

—, periodic, 194 

—, thermal, 191, 199 

Phonon, 238 

— absorption probability, 296

— annihilation, 290 

— — operator, 239 

— creation, 290 

— — operator, 239

— — probability, 295 

— electron interaction, 287 

— — — Hamiltonian, 290 

— equilibrium distribution function, 328 

— momentum, 239

—, virtual, 297 

Photoelectric current, 316 

— effect, 312 

— —, surface, 313 

— —, volume, 313 

Photon distribution in electron gas, 376 

— gas, Bose—Einstein condensation in, 

379 

— —, cooling of, 377 

— —, heating of, 377, 380 

— —, kinetic equation for, 373 

— spectral distribution, 370 

Planck formula, 67 


Plasma, Boltzmann’s equation for, 166 

—, degenerate quantum, 260 

—, dielectric permeability tensor in, 173 

—, electric conductivity tensor in, 172 

—, electric current in, 171 

—, electron-ion, 185 

—, — —, relaxation times in, 188

—, Maxwell—Lorentz equations for, 168 

—, self-consistent field, approximation 
for, 168 

—, — —, solution for, 169 

— wave, 267 

— —, damping of, 169, 177, 178 

— —, dispersion of, 169, 174, 178 

— —, longitudinal, 177 

— —, transverse, 178

— with collisions, Boltzmann’s equation 
for, 179 

Plasmon, 241, 267 

Point charge in plasma, field of, 265 

Poisson bracket, 80 

Polarization vector, 238 

Polaron, 241 

Polyatomic gas, kinetic equation for, 145 

— —, thermal conductivity of, 147 

Pressure of ideal Bose gas, 22 

—, thermomolecular, 217

Probabilities, hypothesis of equal, 9 

Probability distribution, 9 

Pseudopotential, 259 


Quantization of magnetic flux, 341 

Quasi-closed system, 4 

— — — and basic kinetic equation, 65 

Quasi-equilibrium, 35 

Quasi-gaseous system, 138
— — —, flux density in, 142 
— — —, in external field, 142
Quasi-momentum, 240 
Quasi-particle, 29, 239, 269 

— — in Fermi fluid, 347 

— — lifetime, 241 


Radiating particle, force on, 380, 383 

Radiation reaction force, in anisotropic 
field, 385 

— — —, induced, 384 

Random phase condition, 8, 61 







Reciprocal lattice, 234 

Relaxation, 34 

— approximation of Boltzmann’s 

equation, 123 

— time, microscopic, 83 

Resistance, residual, 305, 325 

Response function, 195 

— of a system, classical, 193

— — — —, quantum mechanical, 199 

— to a thermal perturbation, 202

Rotational motion and kinetic equation, 
146 


Scattering cross-section, differential, 74 

— — —, transport, 130 

Screening, 265 

— and electron interaction, 268 

— length, 265 

Seebeck effect, 224 

Semiconductor, 271, 355 

—, conductivity of, 362 

—, extrinsic, 356 

—, —, distribution function, 359 

—, —, n-type, 357 

—, —, p-type, 357 

—, Hall constant of, 363 

—, Hall effect in, 363 

—, intrinsic, 356 

—, —, distribution function, 357 

Senftleben effect, 149 

Shear viscosity coefficient, 43 

Smoluchowski equation, 51 

Solid state, adiabatic approximation in, 
231 

— — Hamiltonian, 231, 294 

— — model, 230 

— — wave function, 231 

Sound dispersion, 136 

— velocity, 137 

Spectral distribution, photon, 370

Spin paramagnetism, 274 

Spin wave, 283 

Stationary non-equilibrium process, 218 

— state, 35 

Statistical matrix, 5 

— operator, 5 

— —, canonical, 10 

— —, grand canonical, 10 


— — in perturbed system, 198 

— —, time evolution of, 7 

Stress, 41 

— tensor, 41, 96

Subsystem, closed, 4 

—, open, 4 

—, quasi-closed, 4 

Superconductivity, 334, 343 
Superconductor in magnetic field, 339 
—, model Hamiltonian for, 341 
Superfluidity, 30 

— and Bose condensation, 32 

—, condition for, 31 

Susceptibility, diamagnetic orbital, 280 
—, generalized complex, 194, 226 

— of electron gas, 280 

—, paramagnetic, 274


Temperature, negative, 69 

Thermal conduction, equation of, 46, 49 

— conductivity, 46, 122 

— —, electronic, 325 

— — of diatomic gas, 147 

— — of Fermi fluid, 348 

— — of paramagnetic gas, 149 

— diffusion, 133, 135, 220 

— — coefficient, 134, 222 

— diffusivity, 49 

— perturbation, 191, 199 

— wavelength, 13 

Thermodynamic force, 201, 209, 216 

Thermoelectric effect, 222 

Thermoelectromotive force, 224 

Thermomechanical effect, 218 

Thermomolecular pressure, 217 

Thirteen-moment approximation, 108 

Thomson cross-section, 374, 393 

— formula, 370, 392 

— relation, second, 224 

Tight binding approximation, 245, 251 

Time correlation function, 195, 203, 227 

Transition metals, 281 

Transport coefficient, generalized, 196, 
203 

— —, Onsager expression, 211 

— —, tensor of, 197 

— equation, 49 

— —, generalized, 91 







— mean free path, 132, 324 
— scattering cross-section, 130 
— through a barrier, thermal, 56 


Umklapp process, 289 
Unit cell, 233 


Valence electron, 230 

Velocity of electron in electric field, 258 
Virtual phonon, 297 

Viscosity coefficient, bulk, 43 

— —, first, 43 

— — of an ideal gas, 119


— —, relaxation time approximation for, 
125 

— —, second, 43 

— —, shear, 43 

— of a Fermi fluid, 348


Wannier exciton, 353 
Wave vector, effective, 243 
Wiedemann—Franz law, 325 


Zero point oscillations in a plasma, 268 
Zero sound, 348 
— — velocity, 350 





